Version 28 (modified by bguenet, 4 years ago)
Configuration ENSEMBLE to run ORCHIDEE at Fluxnet sites
Author: M. McGrath
Last revision: B. Guenet (2021/04/07)
This was tested for ORCHIDEE-CN-CAN (r5678 of ORCHIDEE and r5673 of ORCHIDEE_OL) on obelix.
This was tested for TRUNK 4.0 (r6798 of both ORCHIDEE and ORCHIDEE_OL) on obelix.
Background
First, read Nicolas's page http://forge.ipsl.jussieu.fr/orchidee/wiki/Scripts/FluxnetValidation
Then read the README file in config/ORCHIDEE_OL/ENSEMBLE, and read this whole page before starting to create a run.
Be sure you have checked out the whole TRUNK, including modeles/ORCHIDEE and config/ORCHIDEE_OL (https://forge.ipsl.jussieu.fr/orchidee/wiki/Documentation/UserGuide/InstallingORCHIDEEBasic).
Be sure that ioipsl_debug=.FALSE. in modeles/IOIPSL/src/errioipsl.f90. Otherwise, the output files become huge because of the high frequency writes combined with the debug information.
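A quick way to check this flag before compiling is sketched below. It assumes the flag is assigned on a single line as ioipsl_debug=.TRUE. or ioipsl_debug=.FALSE.; the function name check_ioipsl_debug and the demonstration file are hypothetical.

```shell
# Sketch: warn if ioipsl_debug is switched on in errioipsl.f90.
# Assumption: the assignment sits on one line, possibly with spaces around "=".
check_ioipsl_debug() {
    # $1: path to errioipsl.f90
    if grep -i -q 'ioipsl_debug[[:space:]]*=[[:space:]]*\.TRUE\.' "$1"; then
        echo "WARNING: ioipsl_debug is .TRUE. in $1"
        return 1
    fi
    echo "OK: ioipsl_debug is not .TRUE. in $1"
}

# Demonstration on a throwaway file (contents are illustrative only).
demo_dir=$(mktemp -d)
printf '  LOGICAL :: ioipsl_debug=.FALSE.\n' > "$demo_dir/errioipsl.f90"
check_ioipsl_debug "$demo_dir/errioipsl.f90"
```

On a real checkout the call would be check_ioipsl_debug modeles/IOIPSL/src/errioipsl.f90, run from the top of the installation.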
Start from a clean SVN ENSEMBLE install. Notice that ENSEMBLE/Job_ENSEMBLE is the main driver, and it should not be deleted! This is what I refer to when I say "Nicolas's FLUXNET scripts". It will create jobs based on SPINUP/SUBJOB/OOL_SEC_STO/.
For the following, I refer to config/ORCHIDEE_OL/SPINUP as SPINUP and config/ORCHIDEE_OL/ENSEMBLE as FLUXNET, assuming you have copied the whole ENSEMBLE directory to a new directory called FLUXNET to do the run. This notation is necessary as these scripts make use of several directories in config/ORCHIDEE_OL, including subdirectories.
The general procedure is that the Job_ENSEMBLE script will create and launch the following runs for every site:
- STOI (a fast spinup, length controlled by duree_inistomate)
- ORC-1 (a longer spinup, length controlled by duree_sechiba)
- CLEARCUT (added for CAN: aboveground biomass is removed before the run so that forests have a specific age at the end; this stage lasts a single year)
- FIN (in the TRUNK, this is the final fast production run, length controlled by duree_final; in CAN, this is the time after the CLEARCUT during which the forest regrows, length controlled by the fifth field of the site description in fluxnet.card)
- HIST (in CAN, this is the final fast production run, length controlled by duree_final)
The length of these phases can be modified, and additional longer spinups can be added (by changing n_iter, creating ORC-2, ORC-3, etc.), but they are typically not necessary. The final production data (from FIN) is always saved, and output from the other stages can be saved as well, but it's not recommended. In particular, the data for the ORC run can get pretty large when half-hourly output is used.
I have found that the following files are used. A lot of the work below goes into ensuring that these files do not specify conflicting options. OOL_SEC_STO is selected when you run with both stomate and sechiba active (otherwise, OOL_SEC may be used):
FLUXNET/fluxnet.card
FLUXNET/PARAM/run.def
FLUXNET/PARAM/orchidee.def
FLUXNET/PARAM/orchidee_pft.def_*
SPINUP/COMP/spinup.card
SPINUP/SUBJOB/OOL_SEC_STO/COMP/sechiba.card
SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card
SPINUP/SUBJOB/OOL_SEC_STO/PARAM/run.def
SPINUP/SUBJOB/OOL_SEC_STO/PARAM/orchidee.def
SPINUP/SUBJOB/OOL_SEC_STO/PARAM/orchidee_pft.def_*
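Before editing anything, it can help to confirm that these files are where the scripts expect them. The following is a minimal sketch, not part of the ORCHIDEE scripts: the function name check_input_files is hypothetical, the SUBJOB spelling follows the commands used later on this page, and the orchidee_pft.def_* files are left out because their suffix depends on your configuration.

```shell
# Sketch: report which of the input files used by the ENSEMBLE run are missing.
# $1 is the config/ORCHIDEE_OL directory; FLUXNET and SPINUP follow this page's naming.
check_input_files() {
    missing=0
    for f in \
        FLUXNET/fluxnet.card \
        FLUXNET/PARAM/run.def \
        FLUXNET/PARAM/orchidee.def \
        SPINUP/COMP/spinup.card \
        SPINUP/SUBJOB/OOL_SEC_STO/COMP/sechiba.card \
        SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card \
        SPINUP/SUBJOB/OOL_SEC_STO/PARAM/run.def \
        SPINUP/SUBJOB/OOL_SEC_STO/PARAM/orchidee.def
    do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    echo "$missing file(s) missing"
}

# Demonstration in a throwaway tree that contains only fluxnet.card.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/FLUXNET/PARAM"
: > "$demo_root/FLUXNET/fluxnet.card"
check_input_files "$demo_root"
```

From config/ORCHIDEE_OL/FLUXNET, the equivalent call on a real installation would be check_input_files .. (two dots, the parent directory).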
The following gives a general flow of how the script works, which should give an idea of the priority.
1) ENSEMBLE_initialize/ensemble.ksh reads in the options from fluxnet.card (launching the script as "./Job_ENSEMBLE fluxnet" copies fluxnet.card to ensemble.card...Job_ENSEMBLE always reads all options from ensemble.card). These are global variables while the runs are being set-up. But, when the individual runs (each stage/site) launch, they may get overwritten by PARAM and COMP options in the launch directory.
The following sections are parsed in fluxnet.card:
a) Section [SPINUP]
b) Section [UserChoices]
c) Section [CONFIG] (explicitly looks for ForcingPath, NbPFTs, NbSitesParam, NameSitesParam, Groups)
The Job_ENSEMBLE script and the spinup.driver create multiple new submission directories, using the SPINUP and SPINUP/SUBJOB directories as templates. I use the notation ${}/PARAM/run.def and ${}/COMP/*card to refer to the run.def and various .card files after they have been copied from their original locations.
The following is then done for every site in Groups in fluxnet.card:
2) The directory is created for the spinup (spinup.card is taken from SPINUP/)
3) spinup.card is modified based on the Job_ENSEMBLE script, including adding UserChoices from fluxnet.card
4) the Job_* file for the spinup is modified
5) The FLUXNET/PARAM/run.def is copied to the spinup directory ${}/PARAM/run.def
6) The script checks that all options which appear in fluxnet.card[SubJobParams] also appear in FLUXNET/PARAM/run.def
7) The script writes all of these options into the ${}/COMP/spinup.card and ${}/PARAM/run.def
8) The script checks to make sure that the NbSitesParam variables in the fluxnet.card appear in the ${}/PARAM/run.def
9) The script writes all of these options into the ${}/PARAM/run.def
10) The spinup job is launched
XXXXXXXXXXXXXX
From here, the Job_ENSEMBLE script is not used anymore. Now the independent spinup jobs control the show. Things are a little more difficult to follow here unless you are really used to libIGCM. libIGCM does a very good job of generalizing functions, but that can make it more difficult to find what you are looking for. It helps to remember that each file name in the COMP directory is a "component", and libIGCM treats them all identically: initializing with IGCM_comp_Initialize, for example. As defined in SPINUP/config.card, we only have a single component in a spinup job: SPIN.
A spinup job runs like the following. Notice that all of this is carried out in ${}, which is a new directory the above script has created in the FLUXNET directory (e.g., FLUXNET/FI-HyyFLUXNET), NOT in the SPINUP directory.
11) IGCM_comp_Initialize/libIGCM_comp.ksh reads in the UserChoices from ${}/COMP/spinup.card, which was created in step (2) above and modified in steps (3) and (7)
12) IGCM_comp_Update/libIGCM_comp.ksh calls SPIN_Update from ${}/COMP/spinup.driver, which determines we are on the "start" stage and therefore need to follow STOI instructions for the next step (referred to as SECSTOINI inside spinup.driver)
13) ${}/COMP/spinup.driver copies ${}/SUBJOB/OOL_SEC_STO to the STOI directory
14) ${}/COMP/spinup.driver forces the value of FOREST_MANAGED_FORCED in ${}/STOI/PARAM/run.def. Other values are taken from UserChoices in ${}/COMP/spinup.card.
15) SPIN_prepare from ${}/COMP/spinup.driver sets the values of several variables in ${}/STOI/COMP/sechiba.card, based on UserChoices in ${}/COMP/spinup.card. The same for ${}/STOI/COMP/orchidee_ol.card
16) SPIN_OptionsSechiba from ${}/COMP/spinup.driver sets the values of TimeSeriesVars3D and TimeSeriesVars2D in ${}/STOI/COMP/sechiba.card.
17) SPIN_OptionsStomate from ${}/COMP/spinup.driver sets the values of SPINUP_ANALYTIC, TimeSeriesVars3D and TimeSeriesVars2D in ${}/STOI/COMP/sechiba.card, as well as the values of FORCESOIL_STEP_PER_YEAR, STOMATE_FORCING_NAME, and STOMATE_CFORCING_NAME in ${}/STOI/PARAM/run.def.
18) Execute the STOI run
Sometimes, the scripts replace variables with values found in the various .card and .def files. Other times, the variables are appended to the end of the file. This distinction is important if you get an error saying that a variable appears multiple times.
Practical steps
This section gives step-by-step instructions for getting the simulations working with a clean svn install of ORCHIDEE-CAN r6414. The "General steps" section gives an idea of how to figure things out yourself.
First, duplicate the ENSEMBLE directory to a name of your own choosing (I choose the name FLUXNET here, to match the previous section).
cd config/ORCHIDEE_OL
cp -r ENSEMBLE FLUXNET
cd FLUXNET
As of r6358 (much before, actually, but at least this revision), the CAN branch (and now TRUNK 4.0 and later) of config/ORCHIDEE_OL/ENSEMBLE contains a series of fluxnet*card files. These different files have different configurations, and different sites. Choose one that best matches what you want.
cp fluxnet_28sp.card fluxnet.card
Make sure the following line exists in the [UserChoices] section of the new fluxnet.card (not all of the fluxnet*card files in the ENSEMBLE directory have it, and it causes a problem if this flag is turned on):
CRUP=n
As of around r6358, the Python script in the config/ORCHIDEE_OL/MAKE_RUN_DEF folder started generating only orchidee_pft.def_* in a few directories: OOL_SEC_STO_FG1trans, OOL_SEC_STO_FG2, SPINUP, and some others. You should make sure that your PARAM directory has all the run.defs it needs, as for a normal run.
cd ../MAKE_RUN_DEF/
module load python/2.7 (on obelix)
python Make_orchidee_pft_defs.py
cp ../OOL_SEC_STO_FG2/PARAM/* ../FLUXNET/PARAM/
cp ../OOL_SEC_STO_FG2/PARAM/* ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
I have noticed that the script will complain if a value is specified in fluxnet.card but not the run.def. It will not complain if a value is specified in run.def and not fluxnet.card. Check the [UserChoices] and [SubJobParams] sections of fluxnet.card. Many of the UserChoices are already in SPINUP/COMP/spinup.card, and many of the SubJobParams are in the run.def. It seems that the scripts make decisions based on what is in fluxnet.card, so this should typically take precedence. I will point out the exact changes I make for r6414 below.
Before we get to some specifics, let's create the jobs.
cd ../FLUXNET
vi config.card
Change the following lines (on obelix; on Irene, the ARCHIVE line should be fine):
JobName=FLUXNET
ARCHIVE=/home/scratch01/$LOGIN
then create the job scripts
../../../libIGCM/ins_job
This creates Job_FLUXNET. Notice that this job will pull from the SPINUP directory as well. ins_job used to create Job files in every directory, but that functionality changed a while ago. Therefore, the following is now necessary (OOL_SEC_STO because we will run a job with sechiba and stomate).
cd ../SPINUP
../../../libIGCM/ins_job
cd SUBJOB/OOL_SEC_STO/
../../../../../libIGCM/ins_job
cd ../../../FLUXNET
Now edit the Job_FLUXNET file. Notice that this is the Job file that is copied to all the subjobs when they run, so if you want them to run on a different queue (I use the long queue on obelix, as 500 years can take more than 12 hours), you should do that here. I also modify the run directory so I know where the jobs are running and can go to that directory easily if needed.
vi Job_FLUXNET
(make the queue "medium" or "long" instead of "mediump" on obelix: #PBS -q mediump)
(change RUN_DIR_PATH=/home/scratch01/mmcgrath/RUN_DIR)
(change JobType=DEV if you are not sure this will work)
mkdir /home/scratch01/mmcgrath/RUN_DIR
Now change the options for the sites to run against.
vi fluxnet.card
Always launch a small test run with a single site before doing a production run. Based on the fluxnet.card you copied earlier, the number of species and age classes should already be set up correctly.
The Job_ENSEMBLE script will launch a full spinup job for every site in every group in the fluxnet.card. To limit this to one, do something like the following:
Groups= ( test )
test = ( NL-Loo , NL-Loo_1996-2006.nc , 1996, 11, 80 ,0,0,0,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0,0,0,0,0,0,0 )
COMMENT OUT ANY OTHER LINES THAT BEGIN WITH "Groups". Else, when you submit the job, you will launch a run over all of the sites in all groups, and you have to cancel them one at a time. From experience, this is painful.
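A short check before submitting can catch a forgotten Groups line. The sketch below assumes that commented lines in the card start with "#"; the function name count_active_groups is hypothetical, and the demonstration card is a shortened, illustrative stand-in for a real fluxnet.card.

```shell
# Sketch: count the uncommented "Groups" lines in a card file.
# Assumption: commented lines start with "#" (after optional whitespace).
count_active_groups() {
    # grep -c prints 0 and returns non-zero when nothing matches, hence "|| true"
    grep -c '^[[:space:]]*Groups[[:space:]]*=' "$1" || true
}

# Demonstration card (site description shortened for readability).
demo_card=$(mktemp)
cat > "$demo_card" <<'EOF'
[CONFIG]
#Groups= ( forest crops )
Groups= ( test )
test = ( NL-Loo , NL-Loo_1996-2006.nc , 1996, 11, 80 )
EOF

n=$(count_active_groups "$demo_card")
if [ "$n" -ne 1 ]; then
    echo "Expected exactly one active Groups line, found $n"
else
    echo "One active Groups line, as intended"
fi
```

Running count_active_groups fluxnet.card in the submission directory gives the number of Groups lines that Job_ENSEMBLE would see.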
The length of the spinup also matters. I use the following for production runs at the moment (in fluxnet.card...I also change the values in SPINUP/COMP/spinup.card, even though those should be overwritten by Job_FLUXNET)
n_iter=1
duree_inistomate=1
duree_sechiba=500
duree_final=1 (if this is '${DATE_END_SECTOF}', you can leave it)
This launches a simulation over one loop of the forcing file, then 500 years (regardless of the length of the forcing file), and then one final loop for analysis.
Note that for my test run, I use duree_sechiba=50 and all of the other duree values I set to 0 (except duree_inistomate and duree_final), so that it goes a little faster.
The section in the fluxnet.card with [SubJobParams] deserves special mention. As of a recent version of CAN, the run.def has been restructured to include two files: orchidee.def, orchidee_pft.def. This makes the run.def much neater and matches what is done in the coupled simulations. However, the Job_ENSEMBLE script attempts to change some variables in the run.def that fall under the [SubJobParams] section. To do this, it looks at the actual run.def file, not any included file. If it does not find a line in the run.def corresponding to the lines in [SubJobParams], it will crash. So make sure all the lines you specify under [SubJobParams] in fluxnet.card also explicitly appear in the PARAM/run.def file.
vi PARAM/run.def
(add the following lines from [SubJobParams] in fluxnet.card)
ALMA_OUTPUT=y
SECHIBA_reset_time=y
SPLIT_DT=1
SPINUP_ANALYTIC=y
NBUFF=0
STOMATE_FORCING_NAME=NONE
STOMATE_CFORCING_NAME=NONE
FIRE_DISABLE=y
# ATM_CO2=368 : value for year 2000
ATM_CO2=368
XIOS_ORCHIDEE_OK=n
Nammonium_FILE = CCMI_ndep_nhx_2000.nc
Nnitrate_FILE = CCMI_ndep_noy_2000.nc
Nammonium_VAR = nhx
Nnitrate_VAR = noy
Nfert_FILE = NONE
Nfert_VAR = nfer
Nmanure_FILE = NONE
Nmanure_VAR = Nmanure
Nfert_cropland_FILE = Nfer_cropland_2000.nc
Nfert_cropland_VAR = nfer
Nmanure_cropland_FILE = Nmanure_cropland_2000.nc
Nmanure_cropland_VAR = Nmanure
Nfert_pasture_FILE = Nfer_pasture_2000.nc
Nfert_pasture_VAR = Nfer
Nmanure_pasture_FILE = Nmanure_pasture_2000.nc
Nmanure_pasture_VAR = Nmanure
Nbnf_FILE= bnf_1850.nc
Nbnf_VAR= BNF_MGN_PERM2_PERYR
NINPUT_UPDATE=0Y
NINPUT_SUFFIX_YEAR = n
(make sure the following lines are commented out, otherwise ORCHIDEE will not find a land point for any site outside of this window)
LIMIT_WEST=8
LIMIT_NORTH=48
LIMIT_SOUTH=46
LIMIT_EAST=10
Note that we did not copy SPINUP_PERIOD. This is because it uses a variable that is evaluated during the execution of Job_ENSEMBLE, and therefore we let the script copy the value onto the end of the run.def.
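The [SubJobParams] requirement can be checked mechanically before launching. The following is a minimal sketch under stated assumptions: card sections are written as "[Name]" and entries as "KEY = value", commented lines start with "#", and the helper names card_section_keys and missing_in_run_def are hypothetical (they are not libIGCM functions). The demonstration files are small stand-ins.

```shell
# Sketch: list the keys of one section of a .card file.
card_section_keys() {
    # $1: card file, $2: section name
    awk -v sec="[$2]" '
        /^[ \t]*\[/ { gsub(/[ \t]/, ""); inside = ($0 == sec); next }
        inside && /^[ \t]*#/ { next }
        inside && /=/ { split($0, kv, "="); gsub(/[ \t]/, "", kv[1]); print kv[1] }
    ' "$1"
}

# Print every [SubJobParams] key that has no explicit line in the given run.def.
missing_in_run_def() {
    # $1: card file, $2: run.def
    card_section_keys "$1" SubJobParams | while read -r key; do
        grep -q "^[[:space:]]*$key[[:space:]]*=" "$2" || echo "$key"
    done
}

# Demonstration with small stand-in files.
demo=$(mktemp -d)
cat > "$demo/fluxnet.card" <<'EOF'
[UserChoices]
CRUP=n
[SubJobParams]
ALMA_OUTPUT=y
NBUFF=0
ATM_CO2=368
EOF
cat > "$demo/run.def" <<'EOF'
ALMA_OUTPUT = y
ATM_CO2 = 368
EOF
missing_in_run_def "$demo/fluxnet.card" "$demo/run.def"
```

In the submission directory the call would be missing_in_run_def fluxnet.card PARAM/run.def; any key it prints needs an explicit line in PARAM/run.def, since included files are not searched.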
Note that many of the Nitrogen variables above were also in PARAM/orchidee.def! Remove the following from PARAM/orchidee.def:
Nammonium_FILE = ndep_nhx.nc
Nammonium_VAR = nhx
Nnitrate_FILE = ndep_noy.nc
Nnitrate_VAR = noy
Nfert_FILE = NONE
Nfert_VAR = nfer
Nmanure_FILE = NONE
Nmanure_VAR = Nmanure
Nfert_cropland_FILE = nfert_cropland.nc
Nfert_cropland_VAR = nfer
Nmanure_cropland_FILE = nmanure_cropland.nc
Nmanure_cropland_VAR = Nmanure
Nfert_pasture_FILE = nfert_pasture.nc
Nfert_pasture_VAR = Nfer
Nmanure_pasture_FILE = nmanure_pasture.nc
Nmanure_pasture_VAR = Nmanure
Nbnf_FILE= bnf.nc
Nbnf_VAR= BNF_MGN_PERM2_PERYR
Also remove the ATM_CO2 that was already existing in the PARAM/run.def:
ATM_CO2 = _AUTO_: DEFAULT = 350.
For FLUXNET jobs, we generally impose vegetation at the site. While this is set in fluxnet.card in the UserChoices, this doesn't seem to get passed to the run.def in the spinup unless we also place it in the run.def.
IMPOSE_VEG=y
The addition of the orchidee.def and orchidee_pft.def required adding them to the [ParametersFiles] in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, so that libIGCM copies the new files to the PARAM directory of the running code. It also required changes to the driver, to select the correct orchidee_pft.def file. To fix this, I simply copied OOL_SEC_STO_FG2/COMP/orchidee_ol.* to SPINUP/SUBJOB/OOL_SEC_STO/COMP/.
cp ../OOL_SEC_STO_FG2/COMP/orchidee_ol.* ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/
This also required adding the following to the [UserChoices] section in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, since the SPINUP/COMP/spinup.driver looks for them:
vi ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card
(add the following two lines to [UserChoices] section)
NORESTART=n
TIMELENGTH=y
Notice that the SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card defines the age classes and PFTs that you will be using. For the moment, we have selected our fluxnet.card to have a certain number of PFTs and age classes, but we have not conveyed this choice to libIGCM in any way. We do that by changing the SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card value of DefSuffix:
vi ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card
DefSuffix = 28pft.1ac
Make sure this matches with the fluxnet.card that you copied at the beginning!
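One part of this match can be verified automatically: that a orchidee_pft.def_ file with the chosen suffix actually exists in the PARAM directory. A minimal sketch, assuming the card holds a single uncommented "DefSuffix = value" line; the function name check_def_suffix and the demonstration files are hypothetical.

```shell
# Sketch: check that DefSuffix in orchidee_ol.card names an existing
# orchidee_pft.def_<suffix> file in a PARAM directory.
check_def_suffix() {
    # $1: orchidee_ol.card, $2: PARAM directory
    suffix=$(sed -n 's/^[[:space:]]*DefSuffix[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1)
    if [ -z "$suffix" ]; then
        echo "DefSuffix not found in $1"
        return 1
    fi
    if [ -f "$2/orchidee_pft.def_$suffix" ]; then
        echo "OK: $2/orchidee_pft.def_$suffix exists"
    else
        echo "MISSING: $2/orchidee_pft.def_$suffix"
        return 1
    fi
}

# Demonstration with stand-in files.
demo=$(mktemp -d)
mkdir -p "$demo/PARAM"
echo 'DefSuffix = 28pft.1ac' > "$demo/orchidee_ol.card"
: > "$demo/PARAM/orchidee_pft.def_28pft.1ac"
check_def_suffix "$demo/orchidee_ol.card" "$demo/PARAM"
```

This only confirms that the file exists. Whether the suffix agrees with the number of PFTs and age classes in your fluxnet.card still has to be checked by eye.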
The script adds any variables in the NameSitesParam keyword of fluxnet.card to the PARAM/run.def. SECHIBA_VEGMAX is currently in PARAM/orchidee_pft.def_*. So, depending on the value of DefSuffix in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, you need to remove the following lines from the matching PARAM/orchidee_pft.def_DefSuffix file, and the Job_ENSEMBLE script will add them to the end of the run.def as it copies it around. For the specific case of 28 PFTs that we are using here:
emacs PARAM/orchidee_pft.def_28pft.1ac &
(remove the following)
SECHIBA_VEGMAX__01=0.0357142857143
SECHIBA_VEGMAX__02=0.0357142857143
SECHIBA_VEGMAX__03=0.0357142857143
SECHIBA_VEGMAX__04=0.0357142857143
SECHIBA_VEGMAX__05=0.0357142857143
SECHIBA_VEGMAX__06=0.0357142857143
SECHIBA_VEGMAX__07=0.0357142857143
SECHIBA_VEGMAX__08=0.0357142857143
SECHIBA_VEGMAX__09=0.0357142857143
SECHIBA_VEGMAX__10=0.0357142857143
SECHIBA_VEGMAX__11=0.0357142857143
SECHIBA_VEGMAX__12=0.0357142857143
SECHIBA_VEGMAX__13=0.0357142857143
SECHIBA_VEGMAX__14=0.0357142857143
SECHIBA_VEGMAX__15=0.0357142857143
SECHIBA_VEGMAX__16=0.0357142857143
SECHIBA_VEGMAX__17=0.0357142857143
SECHIBA_VEGMAX__18=0.0357142857143
SECHIBA_VEGMAX__19=0.0357142857143
SECHIBA_VEGMAX__20=0.0357142857143
SECHIBA_VEGMAX__21=0.0357142857143
SECHIBA_VEGMAX__22=0.0357142857143
SECHIBA_VEGMAX__23=0.0357142857143
SECHIBA_VEGMAX__24=0.0357142857143
SECHIBA_VEGMAX__25=0.0357142857143
SECHIBA_VEGMAX__26=0.0357142857143
SECHIBA_VEGMAX__27=0.0357142857143
SECHIBA_VEGMAX__28=0.0357142857143
cp PARAM/*def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
I noticed that the following filenames did not match what is written in the [BoundaryFiles] section of the SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card file, which will cause problems later. Make sure the filenames in the run.def, fluxnet.card, and stomate.card all match, and then copy PARAM/*def to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/ (NOTE: this seems to be fixed in r6798).
emacs PARAM/run.def &
emacs fluxnet.card &
emacs ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card &
(change the following in the fluxnet.card, and copy to the run.def; stomate.card should be okay, but check)
Nammonium_FILE = ndep_nhx.nc
Nnitrate_FILE = ndep_noy.nc
Nfert_FILE = NONE
Nmanure_FILE = NONE
Nfert_cropland_FILE = nfert_cropland.nc
Nmanure_cropland_FILE = nmanure_cropland.nc
Nfert_pasture_FILE = nfert_pasture.nc
Nmanure_pasture_FILE = nmanure_pasture.nc
Nbnf_FILE= bnf.nc
(now copy the files)
cp PARAM/*def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
Similarly, values found in fluxnet.card [UserChoices] seem to be required in SPINUP/COMP/spinup.card, else it crashes. So, assuming that you have made the correct choices in fluxnet.card, just copy the whole [UserChoices] section to the spinup.card.
emacs fluxnet.card &
emacs ../SPINUP/COMP/spinup.card &
(copy all the [UserChoices] variables, making sure none are repeated; I noticed the following had to be added, and the existing values deleted)
CRUP=n
ok_newhydrol=y
impose_veg=y
land_use=n
level_hist=5
Some additional variables which need to be in run.def and not orchidee.def (anything with _AUTO_ or _AUTOBLOCKER_ after it, since the .card files look to run.def to change these values, and they don't look into the included files):
emacs PARAM/orchidee.def &
emacs PARAM/run.def &
(make sure the following are in run.def and not in orchidee.def)
SECHIBA_restart_in = _AUTOBLOCKER_
STOMATE_RESTART_FILEIN = _AUTOBLOCKER_
XIOS_ORCHIDEE_OK = _AUTOBLOCKER_
SECHIBA_HISTFILE2 = _AUTO_
WRITE_STEP = _AUTO_
WRITE_STEP2 = _AUTO_
STOMATE_HIST_DT = _AUTO_
STOMATE_IPCC_HIST_DT = _AUTO_
RIVER_DESC = _AUTO_
VEGET_UPDATE = _AUTO_
SPINUP_ANALYTIC = _AUTO_
SPINUP_PERIOD = _AUTO_
STOMATE_OK_STOMATE = _AUTOBLOCKER_
NINPUT_UPDATE = _AUTO_
STOMATE_IMPOSE_CN = _AUTO_
(remove the following from orchidee.def)
impose_veg=n
(make sure the following is in PARAM/run.def, and make sure it is all capitals!)
IMPOSE_VEG=y
(now copy the files)
cp PARAM/*def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
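Finding the _AUTO_ and _AUTOBLOCKER_ variables that are still sitting in an included file can be done with a one-line filter. The sketch below assumes entries are written as "NAME = _AUTO_..." or "NAME = _AUTOBLOCKER_..."; the function name auto_vars_in and the demonstration file are hypothetical.

```shell
# Sketch: print the names of variables whose value starts with _AUTO_ or _AUTOBLOCKER_.
auto_vars_in() {
    # $1: a .def file, e.g. PARAM/orchidee.def
    sed -n 's/^[[:space:]]*\([A-Za-z0-9_]*\)[[:space:]]*=[[:space:]]*_AUTO[A-Z]*_.*/\1/p' "$1"
}

# Demonstration on a stand-in orchidee.def.
demo_def=$(mktemp)
cat > "$demo_def" <<'EOF'
SECHIBA_restart_in = _AUTOBLOCKER_
WRITE_STEP = _AUTO_
ATM_CO2 = 368
STOMATE_HIST_DT = _AUTO_: DEFAULT = 1.
EOF
auto_vars_in "$demo_def"
```

Anything that auto_vars_in PARAM/orchidee.def prints is a candidate for moving into PARAM/run.def.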
Note that we cannot use the analytical spinup at present (the value is changed in the next step). In order to use the analytical spinup, we need to make sure CyclicBegin and CyclicEnd appear in the ${}/STOI/config.card, as ${}/STOI/COMP/stomate.card checks for these values. I have not yet figured out how to do that.
Some variables appear in fluxnet.card, but they are also special variables having an _AUTO_ value in orchidee.def (which we moved to PARAM/run.def). Therefore, remove the following lines from FLUXNET/fluxnet.card and PARAM/run.def (if they exist).
NINPUT_UPDATE=0Y
XIOS_ORCHIDEE_OK=n
Make sure the following variables exist in the PARAM/run.def, but not in the FLUXNET/fluxnet.card or PARAM/orchidee.def. In theory, fluxnet.card calculates this a better way using a different variable in Job_ENSEMBLE, as opposed to in SPINUP/COMP/spinup.driver, but it doesn't seem to be working with the current setup.
SPINUP_ANALYTIC= _AUTO_
SPINUP_PERIOD = _AUTO_
SECHIBA_reset_time=y
SPLIT_DT=1
Nammonium_VAR=nhx
Nnitrate_VAR=noy
Nfert_VAR=nfer
Nmanure_VAR=Nmanure
Nfert_cropland_VAR=nfer
Nmanure_cropland_VAR=Nmanure
Nmanure_pasture_VAR = Nmanure
Nbnf_VAR=BNF_MGN_PERM2_PERYR
CRUP=n
(now copy the files)
cp PARAM/run.def PARAM/orchidee.def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
Since early 2020, not all modules are properly loaded on obelix. To fix this, add the following lines in libIGCM/libIGCM_sys/libIGCM_sys_obelix.ksh (the line for Python should already be there...add the three lines after):
# Load python
module load python/2.7.5
# Load a couple extras that are needed
module load openmpi/2.1.5
module load gcc/5.2.0
Launch the job (from the README file).
./Job_ENSEMBLE fluxnet > out.Job_ENSEMBLE
BE SURE TO CHECK THE USED RUN.DEFs. These can be found by changing to the RUN_DIR when the job is running. The scripts will add flags to the end of the run.def, and sometimes these may conflict with what you want to run.
Notice that the files are all moved after the complete run ends (after HIST is done for CAN, after FIN for the trunk). It's not easy to find where they go, but the path looks something like
${R_OUT}/${config_UserChoices_TagName}/${config_UserChoices_SpaceName}/${config_UserChoices_ExperimentName}/${simulation_name}/${site}${simulation_name}HIST
/home/scratch01/mmcgrath/IGCM_OUT/OL2/PROD/ensemble/FLUXNET/NL-LooFLUXNETHIST/
General steps
If you are more interested in understanding what is going on, if you are using a version of ORCHIDEE not used in the "Practical steps" section, or if the steps in the "Practical steps" section didn't work for you, this section provides general guidance on how to get things up and running. It is complemented by the "Debugging" section below.
A good first test is to see if you can get a SPINUP job working. In other words
In the end, the run.def that gets placed in the run directory is the most important input file, and everything is just processing to get it there. If you end up with a crash of your run and a FLUXNET/FI-HyyFLUXNET/STOI/Debug, this likely means something is wrong in your input file. Find the run directory, and open up the run.def to check that all values have been properly replaced by the libIGCM tools. For example...
grep Cd FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001
gives the run directory of
IGCM_sys_Cd : /home/scratch01/mmcgrath/RUN_DIR/FI-HyyFLUXNETSTOI.1820
Opening the run.def in that directory with vi or emacs shows the following line:
SPINUP_PERIOD='${TIME_YEAR}'
As SPINUP_PERIOD should be an integer, the libIGCM scripts (notably the .card and .driver files) are not properly finding and replacing this value. Searching for SPINUP_PERIOD in the current directory shows two things: that this text is present in the PARAM/run.def and fluxnet.card, and that libIGCM tried to set the value, but failed.
From the file out.Job_ENSEMBLE
For parameter file run.def
SPINUP_PERIOD=${TIME_YEAR}
2019-12-02 14:56:26 --------Debug2--> ORCHIDEE : SPINUP_PERIOD has already been set in def file.
2019-12-02 14:56:26 --------Debug2--> default value : -1
2019-12-02 14:56:26 --------Debug2--> ORCHIDEE : SPINUP_PERIOD has already been set in run.def file.
2019-12-02 14:56:26 --------Debug2--> default value : -1
2019-12-02 14:56:26 --------Debug2--> script value : 11
2019-12-02 14:56:26 --------Debug2--> USER value : '${TIME_YEAR}'
2019-12-02 14:56:26 --------Debug2--> We will NOT set in again !
What should the value be? Search for the variable in the SPINUP directory.
grep -ir SPINUP_PERIOD ../SPINUP/*
This shows that the value is set in SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.driver. Searching the directories for TIME_YEAR shows that this variable is defined in Job_ENSEMBLE.
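The SPINUP_PERIOD case generalizes: any "${...}" left in the run.def of the run directory is a value that was never substituted. A minimal sketch for spotting such lines, assuming comments start with "#"; the function name find_unexpanded and the demonstration file are hypothetical.

```shell
# Sketch: print "line-number:line" for every uncommented line that still contains "${".
find_unexpanded() {
    # $1: the run.def found in the run directory
    grep -n -F '${' "$1" | grep -v '^[0-9]*:[[:space:]]*#' || true
}

# Demonstration on a stand-in run.def.
demo_run_def=$(mktemp)
cat > "$demo_run_def" <<'EOF'
SPINUP_PERIOD='${TIME_YEAR}'
ATM_CO2=368
EOF
find_unexpanded "$demo_run_def"
```

An empty result means no placeholders were left behind; it does not prove that the substituted values are the ones you intended.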
Debugging
These are some of the errors that I have run into, along with attempts at explaining why and where they may occur, and how to solve them.
Error files can be found in many places, including (assuming a job name of FLUXNET and a site of FI-Hyy):
FLUXNET/out.Job_ENSEMBLE
FLUXNET/FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET
FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001
FLUXNET/FI-HyyFLUXNET/STOI/Debug
In my experience, errors come from the following places:
FLUXNET/out.Job_ENSEMBLE: PARAM/run.def
FLUXNET/FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card
FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card, SPINUP/SUBJOB/OOL_SEC_STO/COMP/*driver, PARAM/run.def, fluxnet.card
FLUXNET/FI-HyyFLUXNET/STOI/Debug: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card, SPINUP/SUBJOB/OOL_SEC_STO/COMP/*driver, PARAM/run.def, or the ORCHIDEE model itself
I would recommend solving the "deepest" error first (e.g., fix an error in the STOI directory before trying to fix an error in out_qsub_FI-HyyFLUXNET).
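To see the errors of one site in that order, the log files can be scanned for IGCM_debug_Exit, deepest file first. This is a minimal sketch: the function name scan_igcm_errors is hypothetical, the file layout follows the list above, and the STOI/Debug location is left out because its contents are not described here.

```shell
# Sketch: print the IGCM_debug_Exit lines found in the log files of one site,
# deepest file first.
scan_igcm_errors() {
    # $1: submission directory, $2: site name, $3: job name
    site_dir="$1/$2$3"
    for f in \
        "$site_dir"/STOI/Script_Output_* \
        "$site_dir/out_qsub_$2$3" \
        "$1/out.Job_ENSEMBLE"
    do
        if [ -f "$f" ]; then
            # the extra /dev/null argument makes grep print the file name
            grep 'IGCM_debug_Exit' "$f" /dev/null || true
        fi
    done
}

# Demonstration in a throwaway tree with one stand-in log file.
demo=$(mktemp -d)
mkdir -p "$demo/FI-HyyFLUXNET/STOI"
echo 'IGCM_debug_Exit : IGCM_comp_modifyDefFile : Error in run.def: Variable=NINPUT_UPDATE is set 2 times' \
    > "$demo/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001"
scan_igcm_errors "$demo" FI-Hyy FLUXNET
```

From the submission directory the call would be scan_igcm_errors . FI-Hyy FLUXNET; the first line printed is the one to fix first.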
Here are some errors:
In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001
IGCM_debug_Exit : IGCM_comp_modifyDefFile : The variable XIOS_ORCHIDEE_OK cannot be modified. It should be set to AUTO.
One solution is to modify the file SPINUP/SUBJOB/OOL_SEC_STO/COMP/sechiba.driver such that the following two lines
IGCM_comp_modifyDefFile blocker run.def XIOS_ORCHIDEE_OK y
...
IGCM_comp_modifyDefFile blocker run.def XIOS_ORCHIDEE_OK n
become
IGCM_comp_modifyDefFile force run.def XIOS_ORCHIDEE_OK y
...
IGCM_comp_modifyDefFile force run.def XIOS_ORCHIDEE_OK n
If you do this, the value of the variable will be overwritten, so you should confirm that all values which trigger this option (in this case, XIOS=y and XIOS_ORCHIDEE_OK=y) are set to match what you want. In this case, the XIOS value was found in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, PARAM/run.def, and fluxnet.card.
Another error that is found:
In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001
IGCM_debug_Exit : IGCM_comp_modifyDefFile : Variable STOMATE_OK_STOMATE is not set in correct file. It should be set in run.def.
This is generally a sign that a variable is in PARAM/orchidee.def and it needs to be in PARAM/run.def because libIGCM is trying to modify it, and libIGCM only knows to modify run.def at the moment. You will need to do the same to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/*def.
Another error:
In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001
IGCM_debug_Exit : IGCM_comp_modifyDefFile : Error in run.def: Variable=NINPUT_UPDATE is set 2 times
This generally means that a value appears in both PARAM/run.def (likely copied there from fluxnet.card) and PARAM/orchidee.def. You need to delete the line in PARAM/orchidee.def, and then copy the whole PARAM directory to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/.
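Such duplicates can be looked for before launching. The sketch below prints every variable that is assigned more than once across a set of .def files; it assumes entries are written as "NAME = value" and that comments start with "#". The function name duplicate_keys and the demonstration files are hypothetical.

```shell
# Sketch: print each variable name that is assigned more than once across the given files.
duplicate_keys() {
    awk '
        /^[ \t]*#/ { next }
        /=/ {
            key = $0
            sub(/=.*/, "", key)      # keep the text before the first "="
            gsub(/[ \t]/, "", key)   # drop surrounding blanks
            if (key != "" && ++seen[key] == 2) print key
        }
    ' "$@"
}

# Demonstration with stand-in files.
demo=$(mktemp -d)
printf 'NINPUT_UPDATE=0Y\nATM_CO2=368\n' > "$demo/run.def"
printf 'NINPUT_UPDATE = _AUTO_\n' > "$demo/orchidee.def"
duplicate_keys "$demo/run.def" "$demo/orchidee.def"
```

In the submission directory this would be duplicate_keys PARAM/run.def PARAM/orchidee.def, optionally followed by the orchidee_pft.def_ file you are using. It cannot see values that the scripts append at run time, so the run.def in the run directory is still worth checking.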
Another error:
In the file FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET
2019-12-02 13:26:41 --Debug1--> Check coherence between SeasonalFrequency and PeriodLength
2019-12-02 13:26:41 --------Debug2--> IGCM_post_CheckModuloFrequency : Master=10Y Slave=11Y
2019-12-02 13:26:41 --Debug1--> config_UserChoices_PeriodLength frequency 11Y not compatbile with
2019-12-02 13:26:41 --Debug1--> config_Post_SeasonalFrequency frequency : 10Y
IGCM_debug_Exit : Check your frequency
The Job_ENSEMBLE script takes information from a variety of sources. In this case, it appears to take information from the SPINUP and SPINUP/SUBJOB/OOL_SEC_STO directories. The script attempts to change the time series write frequency to match the length of the FLUXNET forcing data file (11 years in this case), but we had left the following line in SPINUP/config.card
TimeSeriesFrequency=10Y
which leads to libIGCM getting confused. The solution is to replace "10Y" in the line above with "NONE". You must then go into the config/ORCHIDEE_OL/SPINUP directory, remove the Job_JOBNAME file, and redo the ../../../libIGCM/ins_job command to create a new Job_JOBNAME file.
Another error:
IGCM_debug_Exit : IGCM_comp_modifyDefFile : Error in run.def: Variable=STOMATE_HIST_DT is set 2 times
There are two possible causes. Either you added a variable to FLUXNET/PARAM/run.def and forgot to delete it from FLUXNET/PARAM/orchidee.def or FLUXNET/PARAM/orchidee_pft.def_28pft.1ac, or the variable appears in FLUXNET/PARAM/*def and is also added to the run.def by the scripts. The ensemble script does not appear to be able to do this (even though a line written in the run.def suggests it does, the script crashes if the option doesn't already exist), so the addition most likely comes from the spinup (SubJobParams in ${}/COMP/spinup.card or ${}/COMP/spinup.driver).
Cleaning
If an ENSEMBLE run crashes, it can sometimes be difficult to clean up all the files so that you can easily relaunch the run after figuring out what went wrong. In particular, each site creates a new directory, which can add up to a lot of directories. It's possible that some of your runs overlap, too (i.e., they use the same base directory, but the current run only uses forested sites, while a different run used agricultural sites). There may be a libIGCM tool that does this well, but if you aren't familiar with it, here is a short script that works. Copy it to your submission directory (i.e., where you launch the ./Job_ENSEMBLE script), make it executable (e.g., chmod +x clean.sh), and launch it before re-launching the run (e.g., ./clean.sh).
#!/bin/bash
# Remove the per-site submission and output directories of an ENSEMBLE run.
simulation="FLUXNET"
basedir="/home/scratch01/mmcgrath/IGCM_OUT/OL2/PROD/ensemble/"
sites=( FI-Hyy FI-Sod )
for site in "${sites[@]}"
do
    # submission directory created by Job_ENSEMBLE in the current directory
    rm -fr "${site}${simulation}"
    # output directories (the glob stays outside the quotes so it still expands)
    rm -fr "${basedir}${site}${simulation}"*
    rm -fr "${basedir}${simulation}/${site}${simulation}"*
    echo "$simulation $site"
done
# log file written when launching ./Job_ENSEMBLE
rm -fr out.Job_ENSEMBLE
All you need to do is modify the site list, basedir and simulation variables for your particular run.
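Because the cleaning script deletes without asking, it can be worth listing the targets first. The following is a hedged sketch of such a dry run, not part of the original script: the function name list_cleanup_targets is hypothetical, and it takes the submission directory as an explicit argument instead of relying on the current directory.

```shell
# Sketch: print what a cleanup would remove, without deleting anything.
list_cleanup_targets() {
    # $1: simulation name, $2: submission directory, $3: output base directory,
    # remaining arguments: site names
    simulation=$1
    submit_dir=$2
    basedir=$3
    shift 3
    for site in "$@"; do
        for target in \
            "$submit_dir/$site$simulation" \
            "$basedir/$site$simulation"* \
            "$basedir/$simulation/$site$simulation"*
        do
            # unmatched globs stay literal, so only report paths that exist
            if [ -e "$target" ]; then
                echo "$target"
            fi
        done
    done
}

# Demonstration in a throwaway tree (directory names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/submit/FI-HyyFLUXNET" "$demo/out/FLUXNET/FI-HyyFLUXNETSTOI"
list_cleanup_targets FLUXNET "$demo/submit" "$demo/out" FI-Hyy FI-Sod
```

If the printed list contains only directories belonging to the run you want to discard, the cleaning script can be launched with the same simulation, basedir and site values.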
Speed
Some timing tests were carried out with TAG2.1, TRUNK (r6096), and CAN (r6091) on obelix. This revealed the importance of the NBUFF=0 keyword for running with FLUXNET data for a single site. When running for a single site with forcing that has lower temporal resolution (e.g., CRUNCEP, which has six-hourly resolution instead of the 30 min resolution of FLUXNET), it's much less important. The amount of data output for all runs was adjusted to give approximately the same size of files. The optimized executables were used for all tests (-O3).
I take timings from four locations: CPU Time Global and Real Time Global from out_orchidee, and the real and user times reported by time -p ./orchidee_ol. For the most part, they are similar. For clarity, I only report Real Time Global from out_orchidee below. Error bars are the standard deviation from 5 independent runs.
The TRUNK and TAG21 have 15 PFTs, CAN has 28 PFTs, but they are all set to zero except for NeedleleafEvergreenTemperate (PFT4...4), Deciduous temperate (PFT6...8), C3Grass (PFT10..23), and C3Crop (PFT12...26), which are all set to 0.25. I wanted to simulate a somewhat realistic pixel with a mix of vegetation.
Using NBUFF=1
FLUXNET forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM. The total time is given in seconds, with the standard deviation from five runs on obelix:
TAG21 1270 ± 60
TRUNK 4600 ± 600
CAN 5800 ± 500
CRUNCEP forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM
TAG21 1310 ± 70
TRUNK 1700 ± 100
CAN 1810 ± 90
Using NBUFF=0
FLUXNET forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM. The total time is given in seconds, with the standard deviation from five runs on obelix:
TAG21 1250 ± 140
TRUNK 1480 ± 90
CAN 1700 ± 200
CRUNCEP forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM
TAG21 1310 ± 50
TRUNK 1440 ± 110
CAN 1700 ± 200
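To make the effect of NBUFF easier to read from the FLUXNET-forcing numbers, the ratio of the NBUFF=1 time to the NBUFF=0 time can be computed per model. A minimal sketch: the function name nbuff_speedup is hypothetical, the mean times are copied from the results above, and the standard deviations are ignored.

```shell
# Sketch: ratio of run time with NBUFF=1 to run time with NBUFF=0.
nbuff_speedup() {
    # input lines: MODEL seconds_with_NBUFF=1 seconds_with_NBUFF=0
    awk '{ printf "%s %.1f\n", $1, $2 / $3 }'
}

# Mean times for FLUXNET forcing, taken from the timings on this page.
printf '%s\n' 'TAG21 1270 1250' 'TRUNK 4600 1480' 'CAN 5800 1700' | nbuff_speedup
```

A ratio close to 1 means NBUFF makes little difference for that model; larger ratios show how much time NBUFF=0 saves with half-hourly FLUXNET forcing.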