How to reduce the disk space usage and speed up your simulations
Author: D. Goll
Last revision: 2020/02/28, S. Luyssaert
Objective
This item should help you to: (1) use shared disks correctly, (2) avoid excessively large output files, and (3) reduce the size of your stored data.
Although correct disk usage is most pressing on Obelix, a relatively small server shared by many users, these recommendations apply to all computing infrastructure, as there is no use in wasting disk space. Some of these recommendations take hardly any time and can be applied even to very small simulations; they are strongly recommended for long simulations with a large spatial domain and/or a high spatial or temporal resolution of the output files.
Correct use of shared disk
Servers and storage facilities are shared among tens (Obelix) to thousands (Irene) of users. The way you use these shared facilities affects others, and the way others use them affects you. Most of the frustration with shared facilities comes from simulations that crash because other users have filled the disks with their results.
- Never copy driver files from the repositories to your personal folder. Use libIGCM to ensure the model uses the correct input files.
- If your simulation experiments require large input files that are not yet in the repository, discuss with your supervisor or the ORCHIDEE team what should be done to move those files into the repository.
- If the latter is not possible, then don't store different versions of the same files on the shared disks.
- Learn how to correctly use the DEVT, TEST and PROD settings in the config.card. This by itself will not really save disk space but it will help to automatically make space if some of the disks fill up.
Avoid excessively large output files
(1) Prevention is better than cure: don't tell libIGCM to write output files you will never use. Few users need all the output files for the objectives of their simulation experiments. You can tell libIGCM not to produce certain output files in COMP/stomate.card and COMP/sechiba.card. Users not interested in stomate output can change the default settings in COMP/stomate.card to NONE, which ensures the file is not produced. If you don't know what is in the ipcc_history file, you can set it to NONE. Applying this setting takes only a few seconds. In addition to making better use of the storage facilities, it also speeds up the simulations themselves, as input and output operations can take up a substantial share (5 to 20%) of the computing time.
output_level_stomate_history = 2
output_level_stomate_ipcc_history = NONE
output_level_stomate_history_4dim = NONE
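Before launching a long run, it can help to review what each card will actually write. The following is a minimal sketch, not part of libIGCM: the function name show_output_settings is hypothetical, and it assumes the settings appear at the start of a line in the "name = value" form shown above.

```shell
# Print every output_level_* and output_freq_* setting found in the given cards.
# Assumption: settings are written as "name = value" at the start of a line.
show_output_settings() {
    for card in "$@"; do
        if [ -f "$card" ]; then
            echo "== $card"
            grep -E '^[[:space:]]*output_(level|freq)_[A-Za-z0-9_]*[[:space:]]*=' "$card" \
                || echo "(no output settings found)"
        else
            echo "== $card (not found)"
        fi
    done
}
```

From the experiment directory, it could be called as: show_output_settings COMP/stomate.card COMP/sechiba.card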
(2) In the same COMP/stomate.card and COMP/sechiba.card, the user has to specify the temporal resolution of the output files. Daily files take about 365 times the disk space of annual files, and monthly files still take about 12 times the disk space of an annual file. Speed up your simulation and save disk space by selecting the temporal resolution you will actually use.
output_freq_stomate_history = 1mo
output_freq_stomate_ipcc_history = 1y
output_freq_stomate_history_4dim = 1y
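The scaling between output frequencies can be turned into a quick back-of-the-envelope estimate. The sketch below only applies the factors 12 and 365 mentioned above; the function name estimate_output_size is hypothetical, the value 1d for daily output is an assumption (only 1mo and 1y appear in the example), and fixed file overhead such as metadata and coordinates is ignored.

```shell
# Rough size of an output file, given the size of its annual-frequency equivalent.
# $1: size of the annual file (integer, any unit)
# $2: output frequency: 1y, 1mo, or 1d (1d is an assumed value for daily output)
estimate_output_size() {
    annual_size=$1
    case "$2" in
        1y)  factor=1 ;;
        1mo) factor=12 ;;
        1d)  factor=365 ;;
        *)   echo "unknown frequency: $2" >&2; return 1 ;;
    esac
    echo $(( annual_size * factor ))
}
```

For example, if an annual file is 50 MB, estimate_output_size 50 1mo prints 600 and estimate_output_size 50 1d prints 18250, in the same unit.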
(3) Note that restart files can be used to speed up the model while reducing disk usage. For example, during the spin-up, write only those variables that you will check (soil C and N pools) at an annual frequency. After the spin-up, the model is restarted for the actual experiments. These experiments start from the spin-up state but can use a different temporal resolution and write more variables to the output files.
(4) ORCHIDEE makes use of predefined packages of output variables. If in COMP/stomate.card or COMP/sechiba.card (orchidee.card and stomate.card in coupled experiments) you ask for output level 5, the model uses the ORCHIDEE/src_xml/file_def_orchidee.xml file to determine which output variables belong to which level. Level 5 implies that all variables with an output level less than or equal to 5 are written to the output files. If you are about to launch a large experiment, change the output levels in ORCHIDEE/src_xml/file_def_orchidee.xml so that only variables that will be used in post-processing and reporting are written. This requires some careful thinking about the post-processing before launching the simulations, and it takes a bit of time to adjust file_def_orchidee.xml, but the faster simulations will easily offset the time invested.
(5) Pay special attention to the dimensions of the variables. ORCHIDEE writes out redundant variables (for example, variables aggregated to grid level AND on PFT level) and variables with a much finer resolution than needed (for example, light absorbed at every canopy layer). If you have no plans to use these variables, don't write them to the output files.
(6) libIGCM writes several files to make your life easier when the model crashes, but maybe you are confident that the model will not crash. You can control these files in COMP/stomate.card or COMP/sechiba.card (orchidee.card and stomate.card in coupled experiments). First make sure that this list is not duplicated in stomate.card and sechiba.card. If it is duplicated, each file is written once in SRF/Debug and a second time in SBG/Debug. If it is not duplicated, only write the files you think you will need.
[OutputText]
List= (used_run.def, used_orchidee.def, used_orchidee_pft.def, out_orchidee, river_desc.nc)
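Checking for a duplicated [OutputText] list can be done by eye, but a small helper makes it repeatable. This is a sketch under assumptions: the function names are hypothetical, the cards are assumed to use the bracketed section header followed by a "List=" line as in the example above, and the two lines are compared verbatim (so two identical empty lists are also reported as duplicated).

```shell
# Print the "List=" line of the [OutputText] section of a card.
# Assumption: section headers start a line with "[" and the list is on one line.
outputtext_list() {
    awk '
        /^\[/ { in_section = ($0 ~ /^\[OutputText\]/); next }
        in_section && /^[[:space:]]*List[[:space:]]*=/ { print; exit }
    ' "$1"
}

# Report whether two cards carry the same [OutputText] list.
check_outputtext_duplicate() {
    a=$(outputtext_list "$1")
    b=$(outputtext_list "$2")
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "duplicated"
    else
        echo "not duplicated"
    fi
}
```

A typical call would be: check_outputtext_duplicate COMP/stomate.card COMP/sechiba.card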
After you ran the model: remove files which are not needed
You can reduce the disk space usage to about 15% of its original size by removing files you are likely never going to use.
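To see where the disk space of an experiment goes before deleting anything, the standard du and sort tools are enough. The sketch below is illustrative: the function name report_usage is hypothetical, and it simply lists the immediate subdirectories of the directory you pass (for example the experiment's output directory), largest first, in kilobytes.

```shell
# List the disk usage (in KB) of each immediate subdirectory of $1, largest first.
report_usage() {
    for d in "$1"/*/; do
        # skip the literal pattern when the directory has no subdirectories
        [ -d "$d" ] || continue
        du -sk "$d"
    done | sort -rn
}
```

Running it before and after a cleanup gives a direct measure of how much space was freed.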
Restart files
Restart files allow you to rerun the simulation from a certain year. These files are needed in case the model stops before it reaches the end date, and they also allow you to rerun certain periods (for example, if you need additional variables). By keeping the restart files only for every 10th year or less frequently, you can still restart the model without losing too much time while reducing disk usage.
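Before deleting restart files, it is worth listing which years would be kept. The sketch below mirrors the rule used by the cleanup script at the end of this page, where the first kept year is the start year plus the keep frequency; the function name kept_years is hypothetical.

```shell
# Print the years for which files are kept.
# $1: start year, $2: end year, $3: keep frequency in years
kept_years() {
    yr=$(( $1 + $3 ))
    while [ "$yr" -le "$2" ]; do
        echo "$yr"
        yr=$(( yr + $3 ))
    done
}
```

For example, kept_years 1861 1900 10 prints 1871, 1881 and 1891, one per line. Note that under this rule the start year itself is not kept.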
used_run.def
The run.def for each model year is written out. If you did not change parameters between years, it is the same file for every year, and a single used_run.def is sufficient. Keep the used_run.def files only for the years in which a change in parameters was introduced.
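One way to find the used_run.def files that carry no new information is to compare each file with the one from the previous year. The sketch below only lists candidates and deletes nothing; the function name redundant_run_defs is hypothetical, and it assumes the files are passed in chronological order.

```shell
# Print every file that is byte-identical to the file passed just before it.
# Pass the used_run.def files in chronological order.
redundant_run_defs() {
    prev=""
    for f in "$@"; do
        if [ -n "$prev" ] && cmp -s "$prev" "$f"; then
            echo "$f"
        fi
        prev=$f
    done
}
```

If the file names embed the date as YYYYMMDD (an assumption about the naming), a shell glob such as redundant_run_defs /path/to/Debug/*used_run.def already passes them in chronological order. The first file of each run of identical files is never listed, so the year in which a parameter change was introduced is preserved.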
Debug files
ORCHIDEE writes out files that help to debug the model and to see the setup used for the run. After a simulation has finished successfully, some cleaning can be done in this folder. If you are quite confident that the run will succeed (because you have done a similar run before), don't write the debug files (see point (6) in the section above).
Remove the RUN_DIR
If you specified a RUN_DIR, make sure you remove this directory after the simulation has ended if the model crashed; this folder can be very large. If the period finished correctly, libIGCM removes this directory itself. Only specify the RUN_DIR during testing and debugging. Use the default libIGCM settings for the RUN_DIR when running PROD simulations.
Example of a simple script (by D. Goll) to remove files after the run was completed
#!/bin/ksh
# This script removes Restart and Debug files (NOT used_run.defs).
# The user has to specify (1) the experiment name, (2) the path to the output,
# and (3) the start and end years of the simulation.
# Optionally, you can adjust the frequency at which you want to keep files.
set -e

# which user is running this script
myself=$(whoami)

# ========================================
# user specification:
igcm_out=/home/surface7/${myself}/IGCM_OUT/OL2/PROD/MODEL1/
expid=simulation01
# start year:
syear=1861
# end year:
eyear=2099
# we keep files for every Xth year at this frequency:
frq_keep=10
# ========================================

# ============================================
# 1. remove debug and restart files
# 1.1 set time counter
yr=$syear
# first year for which files are kept
(( yr10th = syear + frq_keep ))

# loop over years:
while [[ $yr -le $eyear ]] ; do
    # check if it is a year we keep ...
    if [[ $yr -eq $yr10th ]] ; then
        echo "we keep the files for this year=$yr"
        (( yr10th = yr10th + frq_keep ))
    else
        # ... if not we start removing files:
        for stream in SBG SRF OOL ; do
            if [[ $stream == "SBG" ]] ; then
                # in SBG there is only one folder ...
                folders="Restart"
            else
                # ... in the other streams we have two folders
                folders="Debug Restart"
            fi
            # now check the folders ...
            for folder in ${folders} ; do
                # ... for files for the current year and remove them.
                # Note: the pattern matches any file name containing the digits of the year.
                # rm -f does not fail when nothing matches, so set -e does not abort the script.
                r1=${igcm_out}/${expid}/${stream}/${folder}
                rm -f "${r1}"/*"${yr}"*
            done
        done
        # also remove all files from Out
        r1=${igcm_out}/${expid}/Out
        rm -f "${r1}"/*"${yr}"*
    fi
    (( yr = yr + 1 ))
done