Version 16 (modified by dgoll, 4 years ago)

Spin up with a Machine Learning approach

What is it about?

Aim: develop a spinup acceleration procedure that is independent of the model version. The idea is to develop a Python tool set that can be applied to the ORCHIDEE family of models.

How can I contribute to this effort?

Please contact D. Goll if you want to join. Some examples of contributions we would benefit from are:

data from conventional spinup simulations
expertise in how to link it to other tools, such as libIGCM, ORCHIDAS, etc.
expertise in how to host/distribute/maintain the software
expertise in machine learning and Python

Task force members

Daniel Goll, Yan Sun, Jinfeng Chang, Yilong Wang, Yuanyuan Huang, Vladislav Bastrikov, Nicolas Viovy, Matt McGrath

Status reports

26/01/2021

DONE: Proof of concept for ORCHIDEE-CNP v1.2
ONGOING: Finding a common setup for pixel selection applicable to all ORCHIDEE versions
ONGOING: Collecting data from other ORCHIDEE versions for testing
ONGOING: Translating matlab into python code
ONGOING: Cleaning the code
ONGOING: Recruiting task force members

16/02/2021

Yan gave a presentation on progress with the Python coding, results for CNP and trunk, and the timeline for the next 2 months.

Input files: restart + climate forcing (not the hist file, as ORCHIDEE might introduce noise)
K-means clustering: add a plot of the total distance vs. k to check whether the number of clusters is well chosen (part of the monitoring information for the user)
Add checks and quality statistics to monitor whether each step performs well, and stop the procedure if the results fail minimum quality criteria (e.g. stop if the machine learning fails to predict the training pixels)
Externalize all parameters of the routines in one file.
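
The "total distance vs. k" monitoring plot mentioned above could be produced along the following lines. This is only a minimal sketch, not the code of the tools: it uses a plain standard-library k-means as a stand-in for whatever clustering library the tools rely on, and the function names (sq_dist, kmeans_total_distance, elbow_curve) and the synthetic "pixels" are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_total_distance(points, k, n_iter=50, seed=0):
    """Run plain Lloyd's k-means and return the total squared distance
    of all points to their nearest cluster center."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_curve(points, k_values, seed=0):
    """Total distance for each candidate number of clusters."""
    return {k: kmeans_total_distance(points, k, seed=seed) for k in k_values}


# Synthetic stand-in for pixel features: three well-separated groups.
rng = random.Random(42)
group_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
pixels = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in group_centers
    for _ in range(30)
]

curve = elbow_curve(pixels, range(1, 7))
for k, total in curve.items():
    print(f"k={k}: total squared distance = {total:.1f}")
```

The printed table (or a line plot of it) lets the user see where adding more clusters stops reducing the total distance noticeably, which is the usual way to judge whether the chosen k is reasonable.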

Work distribution:

Matt: provide trunk v4.0 data (EQ files, plus results from 200 yr after scratch without analytical spinup)
Yilong: refine and extend the coding of tools 1 and 2
Everyone: run tests with the refined tools for other forcings
Yan will focus next month on her PhD defence (20 March)

03/03/2021

A first version of the Python tools is available for testing
Yilong gave an overview

Next steps:

put code and documentation on GitHub (Daniel, Vlad, Yilong)
add documentation on how to run the tools; adapt them to other models (Yan, Yilong)

everyone attempts to run the tools with their own model data (keep a log on GitHub of which model data were used)

Information/suggestions on running the tools:

user specification files: need more information, e.g. which file name corresponds to equilibrium information and which to information from the transient run (Yan)
things to improve: figure labelling, user spec file (simplify)
try to use qsub to avoid blocking nodes on obelix

16/03/2021

GitHub has been set up, and some initial tests and exchanges were done
next: everyone tries and tests the tool on the two available datasets (CNP, trunk); report bugs, improvements, etc. on GitHub
ongoing: acquire data from other models (versions): CABLE, ORCHIDEE-MICT, ORCHIDEE-<any>

next meeting will be scheduled after discussion with Yan after her defence

01/04/2021

GitHub code status: YY could run the code, DG ran some tests modifying some inputs; all detected (minor) problems are listed as issues on GitHub
TODO1 (Yan): provide information in the README on how to insert data from other simulations; separate the user specification files into experiment-specific (e.g. path to model output, forcing period (for tool 3), etc.) and model-version-specific (e.g. CNP, MICT, Trunk, CABLE, etc.).
TODO2 (Yan): provide a tool 2 output which condenses the information currently spread over multiple files into a single file.
TODO3 (Yan): work on the manuscript (incl. results from tests with other model versions (if feasible from TODO4) and CABLE)
TODO4 (YY, DG, all): test tools 1 and 2 when TODO1 and TODO2 are ready.
TODO5 (DG): discuss with the project team about the running scripts.
TODO6 (Yan): code an evaluation tool (tool 3); check criteria are (1) high priority (total land C stock), (2) medium priority (land C stock per pixel), (3) others / drift over the forcing period (i.e. climate loop).
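
The three check criteria listed for the evaluation tool (tool 3) could be expressed roughly as below. This is a sketch under assumptions, not the actual tool: the function names, the input layout (one C stock value and one area per pixel, one total stock per year of the climate loop) and especially the tolerance values are hypothetical placeholders.

```python
def total_stock_error(predicted, reference, areas):
    """Relative error of the area-weighted total land C stock (criterion 1)."""
    total_pred = sum(c * a for c, a in zip(predicted, areas))
    total_ref = sum(c * a for c, a in zip(reference, areas))
    return (total_pred - total_ref) / total_ref


def pixel_stock_errors(predicted, reference):
    """Absolute relative error of the C stock on each pixel (criterion 2).
    Assumes every reference value is greater than zero."""
    return [abs(p - r) / r for p, r in zip(predicted, reference)]


def relative_drift_per_year(total_stock_series):
    """Mean yearly change of the total stock over one climate loop,
    relative to the stock in the first year (criterion 3)."""
    n_years = len(total_stock_series) - 1
    change = total_stock_series[-1] - total_stock_series[0]
    return change / n_years / total_stock_series[0]


def evaluate(predicted, reference, areas, total_stock_series,
             tol_total=0.02, tol_pixel=0.10, tol_drift=0.001):
    """Collect the three checks; the tolerances are hypothetical defaults."""
    total_error = total_stock_error(predicted, reference, areas)
    max_pixel_error = max(pixel_stock_errors(predicted, reference))
    drift = relative_drift_per_year(total_stock_series)
    return {
        "total_error": total_error,
        "total_ok": abs(total_error) <= tol_total,
        "max_pixel_error": max_pixel_error,
        "pixel_ok": max_pixel_error <= tol_pixel,
        "drift": drift,
        "drift_ok": abs(drift) <= tol_drift,
    }


# Made-up numbers, for illustration only.
predicted = [10.2, 5.1, 19.0]    # ML-predicted C stock per pixel
reference = [10.0, 5.0, 20.0]    # conventional spinup C stock per pixel
areas = [1.0, 2.0, 1.0]          # pixel areas
series = [40.0, 40.1, 40.2, 40.3]  # total stock per year of the loop

report = evaluate(predicted, reference, areas, series)
for name, value in report.items():
    print(name, value)
```

Keeping the tolerances as arguments is in line with the earlier decision to externalize all parameters, so that they can be read from the user specification file rather than hard-coded.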

14/04/2021

Progress since last meeting:

update of README
bug detected for biomass pool
evaluation tool for developers

To do

push development updates to GitHub (e.g. README) (Yan)
produce evaluation tool to test if the ML works for training sites (Yan)
send the data location for CNP-MIMICS runs and MICT runs (Yan)
adapt the tool for MIMICS and MICT (Daniel + all) to test the tool structure and documentation
produce restart files for CABLE (Yan)
finalize the paper within 4 weeks
next meeting in 3 weeks due to Yan's move

05/05/2021

TODO:

revise the varlist.json to be more flexible regarding varying variables/dimensions in the restart files of ORCHIDEE versions (Vlad)
MICT restart file: which variables are needed and which are not? What do the dimensions stand for? (Jinfeng)
visualization of the quality of the training (Daniel)
CNP-MIMICS training data (Daniel)

19/05/2021

Progress since last meeting:

MICT: deepC_a, deepC_s, deepC_p are state variables. The variable carbon stores the depth-integrated SOC information and can be derived from the other three.
new JSON syntax proposed for more flexibility. NEXT: Yan and Yilong discuss the feasibility of introducing the concept
evaluation tool: LOOCV (optional for developers), quick check plots (mandatory for users). NEXT: finalize and upload to GitHub
evaluation tools: different statistical variables to be tested; trade-off between user-friendliness and information content
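
For readers unfamiliar with LOOCV (leave-one-out cross-validation), the idea is to hold out each training pixel in turn, train on the remaining pixels, and compare the prediction for the held-out pixel with its known value. The sketch below illustrates this with a simple straight-line fit standing in for the machine learning model; the names fit_line, predict_line, loocv_predictions and rmse, and the data, are assumptions for illustration and do not come from the tools.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line; a stand-in for the real ML model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


def loocv_predictions(xs, ys, fit, predict):
    """Leave-one-out: predict each sample from a model trained on the rest."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, xs[i]))
    return predictions


def rmse(predictions, targets):
    """Root-mean-square error between predictions and known values."""
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)


# Made-up training data: one predictor and one target per pixel.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

held_out = loocv_predictions(xs, ys, fit_line, predict_line)
print("LOOCV RMSE:", rmse(held_out, ys))
```

Because the model is refitted once per pixel, LOOCV becomes expensive for large training sets, which is consistent with keeping it optional for developers while users rely on the quick check plots. RMSE is only one of the possible statistics.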
