Version 26 (modified by ajornet, 8 years ago)
ORCHIDEE performance (trunk revision 3623)
Several choices were made in order to test ORCHIDEE performance. The details are listed below:
- Trunk
- Revision 3623
- In Curie
- Production mode
- XIOS
- IOIPSL
- Orchidee
- Real CPU Time
- XIOS library
Note: the total CPU time can be calculated by multiplying the real CPU time by the number of MPI processes used.
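The note above can be sketched in a few lines of Python. This is only an illustration: it assumes the time notation used on this page (`XhYY` means hours and minutes, `XmYY` means minutes and seconds), and the helper names `parse_duration` and `total_cpu_seconds` are hypothetical, not part of ORCHIDEE or XIOS.

```python
import re


def parse_duration(text):
    """Convert a duration such as '4m24', '1h21' or '29m' to seconds.

    Assumed convention: 'XhYY' is hours and minutes,
    'XmYY' is minutes and seconds.
    """
    match = re.fullmatch(r"(\d+)([hm])(\d*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    major = int(match.group(1))
    unit = match.group(2)
    minor = int(match.group(3) or 0)
    if unit == "h":
        return major * 3600 + minor * 60
    return major * 60 + minor


def total_cpu_seconds(real_time, mpi_processes):
    """Total CPU time = real (wall-clock) time x number of MPI processes."""
    return parse_duration(real_time) * mpi_processes


# 2 deg case on 32 processes: 4m24 of real time.
seconds = total_cpu_seconds("4m24", 32)
print(seconds, "s, i.e. about", seconds // 3600, "h", (seconds % 3600) // 60, "min")
```

The result is in seconds, so it can be converted to whatever unit is needed when comparing against the Total Computing Time column.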
Table 1: Orchidee Scalability
The following tables show:
- Simulation length: 1Y
- FG2.CWRR configuration
- Output level
- monthly
- daily output
- MPI parallelization
- e.g. a 64-core simulation = 1 XIOS core + 63 ORCHIDEE cores
- Forcing resolution vs number of cores
- Cell format: out_orchidee_00XX (in seconds, managed by ORCHIDEE) / orchid_XXXXX.o (managed by Curie)
Note: no XIOS server is used for the 1 MPI case.
Conclusion
Use this table to find the best trade-off between number of processors and computing time (XIOS):
Resolution | Num. Processors | Real Computing Time | Total Computing Time |
---|---|---|---|
2 deg | 32p | 4m24 | 2h20 |
1 deg | 64p | 8m13 | 8h46 |
0.5 deg | 64p | 24m02 | 25h33 |
Table Description:
- Resolution: forcing file grid resolution used in the simulation
- Num. Processors: number of processors
- Real Computing Time: wall-clock duration of the simulation, as measured on the slowest core.
- Total Computing Time: time spent by all processors to complete the simulation. Check here to Request Computing hours.
XIOS (restart)
Reference simulation
Resolution | 1 MPI | 4 MPI | 8 MPI | 16 MPI | 32 MPI | 64 MPI | 128 MPI |
---|---|---|---|---|---|---|---|
twodeg | 1h21 | 29m | 13m45 | 7m06 | 4m24* | 3m11 | 3m20 |
onedeg | Mem limit | 1h40 | 47m03 | 23m | 12m42 | 8m13* | 7m05 |
halfdeg | Not tested | Mem limit | 2h50 | 1h20 | 42m07 | 24m02* | 20m36 |
*: Recommended number of processors
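To make the scaling in the reference table easier to read, the speedup and parallel efficiency can be derived from the wall-clock times. The sketch below uses the twodeg row only, with the times converted to seconds by hand; the function name `scaling` is illustrative. Note that the 1 MPI run has no XIOS server, so the baseline is not strictly comparable to the other columns.

```python
# Wall-clock times for the twodeg row of the "XIOS (restart)" table,
# converted to seconds by hand (e.g. 1h21 -> 4860 s, 4m24 -> 264 s).
TWODEG_SECONDS = {1: 4860, 4: 1740, 8: 825, 16: 426, 32: 264, 64: 191, 128: 200}


def scaling(times):
    """Return {n: (speedup, efficiency)} relative to the smallest run.

    speedup    = baseline time / time on n processes
    efficiency = speedup / (n / baseline process count)
    """
    base_n = min(times)
    base_t = times[base_n]
    result = {}
    for n in sorted(times):
        speedup = base_t / times[n]
        efficiency = speedup * base_n / n
        result[n] = (speedup, efficiency)
    return result


for n, (speedup, efficiency) in scaling(TWODEG_SECONDS).items():
    print(f"{n:>3} MPI: speedup {speedup:5.1f}, efficiency {efficiency:4.0%}")
```

For this row the speedup no longer improves between 64 and 128 MPI (3m11 versus 3m20), which is the kind of plateau these numbers make visible.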
XIOS
XIOS only writes the history output files.
Resolution | 1 MPI | 4 MPI | 8 MPI | 16 MPI | 32 MPI | 64 MPI | 128 MPI |
---|---|---|---|---|---|---|---|
twodeg | 1h25 | 32m18 | 17m1 | 10m23 | 7m33 | 6m12 | 6m30 |
onedeg | Not possible | Mem limit | 57m | 33m40 | 23m07 | 18m32 | 17m50 |
halfdeg | Not possible | Mem limit | 3h33 | 2h03 | 1h25 | 1h05 | 1h02 |
IOIPSL (restart)
Resolution | 1 MPI | 4 MPI | 8 MPI | 16 MPI | 32 MPI | 64 MPI | 128 MPI |
---|---|---|---|---|---|---|---|
twodeg | 1h34 | 26m04 | 13m39 | 7m31 | 4m40 | 3m15 | 2m38 |
onedeg | 5h56 | 1h34 | 52m57 | 28m54 | 17m30 | 12m | 8m53 |
halfdeg | >16h40 (max limit) | 6h23 | 3h50 | 2h08 | 1h17 | 50m34 | 36m45 |
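As a rough way to compare the two restart tables, the ratio of IOIPSL to XIOS wall-clock time can be computed per resolution. The sketch below does this for the 64 MPI column only, with the values copied from the tables and converted to seconds by hand; the function name `slowdown` is illustrative.

```python
# Wall-clock times at 64 MPI, in seconds, copied by hand from the
# "XIOS (restart)" and "IOIPSL (restart)" tables.
XIOS_64 = {"twodeg": 191, "onedeg": 493, "halfdeg": 1442}      # 3m11, 8m13, 24m02
IOIPSL_64 = {"twodeg": 195, "onedeg": 720, "halfdeg": 3034}    # 3m15, 12m, 50m34


def slowdown(reference, other):
    """Return other/reference time ratio for each resolution."""
    return {key: other[key] / reference[key] for key in reference}


ratios = slowdown(XIOS_64, IOIPSL_64)
for resolution, ratio in ratios.items():
    print(f"{resolution}: IOIPSL takes {ratio:.2f}x the XIOS time")
```

The same function can be applied to any other column by swapping in the corresponding values.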
IOIPSL
Only IOIPSL enabled.
Resolution | 1 MPI | 4 MPI | 8 MPI | 16 MPI | 32 MPI | 64 MPI | 128 MPI |
---|---|---|---|---|---|---|---|
twodeg | 1h37 | 29m | 16m36 | 10m20 | 7m30 | 5m59 | 5m21 |
onedeg | 5h39 | 1h48 | 1h03 | 36m44 | 28m10 | 21m47 | 19m33 |
halfdeg | >16h40 (max limit) | 7h21 | 4h32 | 2h49 | 1h58 | 1h32 | 1h17 |
Table 2: Orchidee XIOS scalability (IO) - TODO -
This table uses the FG2.CWRR setup with 63 MPI processes for orchidee_ol and 1 MPI process for the XIOS server. For the IOIPSL case, all 64 MPI processes are used for orchidee_ol. The header row lists the different output levels.
Resolution | yearly with XIOS | monthly with XIOS | monthly and daily with XIOS | daily and 3hour with XIOS | daily and 3hour with IOIPSL |
---|---|---|---|---|---|
twodeg | | | | | |
onedeg | | | | | |
halfdeg | | | | | |
FG2.CWRR
Description of the test case... to come