= Performance =

== Mict R3359 (allinea map) ==

This profile was produced with Allinea MAP on the Curie machine: 16 MPI cores, a 1-month simulation at 0.5 degree resolution, with IOIPSL. All components were compiled in production mode (fast).

{{{
########## ########## ########## ########## ########## ########## ########## ##########
                                Execution Sum Up
########## ########## ########## ########## ########## ########## ########## ##########
Jobid     : 4569560
Jobname   : M65_test
User      : p529jorn
Account   : gen6328@standard
Limits    : time = 4:10:00 , memory/task = 4000 Mo
Date      : submit = 15/04/2016 09:42:37 , start = 15/04/2016 09:52:13
Execution : partition = standard , QoS = normal , Comment = (null)
Resources : ntasks = 16 , cpus/task = 1 , ncpus = 16 , nodes = 1
   Nodes=curie4179 CPU_IDs=0-15 Mem=64000

Memory / step
--------------
                  Resident Size (Mo)                  Virtual Size (Go)
JobID        Max (Node:Task)          AveTask  Max (Node:Task)            AveTask
-----------  ------------------------ -------  -------------------------- -------

Accounting / step
------------------
JobID        JobName      Ntasks Ncpus Nnodes Layout  Elapsed  Ratio CPusage Eff State
------------ ------------ ------ ----- ------ ------- -------- ----- ------- --- -----
4569560      M65_test          -    16      1       - 00:55:53   100       -   -     -
########## ########## ########## ########## ########## ########## ########## ##########
}}}

Screenshots:

Main: [[Image(main.png, 20%, title="main")]]

MPI: [[Image(mpi_mict_map.png, 20%, title="main")]]

Memory: [[Image(mem_mict_map.png, 20%, title="main")]]

IO: [[Image(io_mict_map.png, 20%, title="main")]]

CPU Time: [[Image(cputime_mict_map.png, 20%, title="main")]]

CPI: [[Image(cpi_mict_map.png, 20%, title="main")]]

Click the link below to download the profiling file:

attachment:orchidee_ol_16p_1t_2016-04-15_09-52.map

== Comparison 11/04/2016 ==

 * Date: 11/04/2016
 * ADA machine
 * IOIPSL production mode
 * Orchidee production mode
 * 1 year
 * 16 cores
 * Forcing:
   * 1 degree
   * 3-hourly

Considerations:
 * MICT is at the same level of modifications as Trunk revision 3346
 * MICT uses '''parallel interpolation''' for the aggregate_2D subroutine

=== Overview ===

[[Image(trunk_vs_mict_grouped.png, 20%, title="main")]]

Subroutines are placed in the 4 groups described below:
 * ioipsl: all subroutines related to the IOIPSL library
 * Top orchidee: subroutines that each take more than 1% of the computing time
 * Interpolation: interpolation time spent in the aggregate_2D subroutine
 * other orchidee: remaining subroutines from orchidee

=== Mict R3359 (gprof) ===

This is a profiling test done with the gprof tool:

{{{
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ks/call  Ks/call  name
 25.66   1383.92  1383.92  2245127     0.00     0.00  mathelp_mp_ma_fuscat_r21_
  9.62   1902.84   518.92  3835809     0.00     0.00  mathelp_mp_moycum_index_
  9.18   2398.02   495.18  3835826     0.00     0.00  histcom_mp_histwrite_real_
  5.96   2719.41   321.39    17524     0.00     0.00  thermosoil_mp_thermosoil_cond_pft_
  3.81   2924.90   205.49    17520     0.00     0.00  hydrol_mp_hydrol_soil_
  3.62   3119.87   194.97   420480     0.00     0.00  hydrol_mp_hydrol_soil_coef_
  3.59   3313.39   193.52    17524     0.00     0.00  thermosoil_mp_thermosoil_getdiff_
  3.11   3481.04   167.65      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet2_mp_ch4_wet_flux_density_wet2_
  3.05   3645.33   164.29      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet1_mp_ch4_wet_flux_density_wet1_
  2.92   3803.03   157.70      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet3_mp_ch4_wet_flux_density_wet3_
  2.86   3957.34   154.31      365     0.00     0.00  stomate_wet_ch4_pt_ter_0_mp_ch4_wet_flux_density_0_
  2.74   4105.24   147.90      365     0.00     0.00  stomate_wet_ch4_pt_ter_wet4_mp_ch4_wet_flux_density_wet4_
  2.67   4249.50   144.26    17522     0.00     0.00  thermosoil_mp_thermosoil_coef_
  1.63   4337.37    87.87    17520     0.00     0.00  hydrol_mp_hydrol_diag_soil_
  1.59   4423.39    86.02  2666157     0.00     0.00  mod_orchidee_omp_transfert_mp_gather_omp_r1_
  1.57   4507.82    84.43       55     0.00     0.00  interpol_help_mp_aggregate_2d_
  1.37   4581.90    74.08    17520     0.00     0.00  diffuco_mp_diffuco_trans_co2_
  1.36   4655.06    73.16    17520     0.00     0.00  stomate_mp_stomate_main_
  1.22   4720.59    65.53    17520     0.00     0.00  stomate_permafrost_soilcarbon_mp_microactem_
  1.06   4777.86    57.27    17520     0.00     0.00  hydrol_mp_hydrol_main_
  0.96   4829.85    51.99  1602027     0.00     0.00  mathelp_mp_ma_fuscat_r11_
  0.77   4871.20    41.35    17522     0.00     0.00  thermosoil_mp_thermosoil_readjust_
  0.74   4911.35    40.15  2664512     0.00     0.00  mod_orchidee_omp_transfert_mp_gather_omp_i1_
}}}

Total simulation time: 5358 seconds

IO: mathelp + histcom = 25.66 + 9.62 + 9.18 = 44.46%, roughly 45% (the three largest IO entries in the profile)

=== Trunk R3346 ===

This is a profiling test done with the gprof tool:

{{{
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ks/call  Ks/call  name
 22.26    441.54   441.54        7     0.06     0.06  interpol_help_mp_aggregate_2d_
 14.52    729.66   288.12  2171415     0.00     0.00  histcom_mp_histwrite_real_
 13.26    992.66   263.00    17520     0.00     0.00  hydrol_mp_hydrol_soil_
 10.28   1196.56   203.90   773813     0.00     0.00  mathelp_mp_ma_fuscat_r21_
  5.07   1297.17   100.61  2171397     0.00     0.00  mathelp_mp_moycum_index_
  4.16   1379.77    82.60    17520     0.00     0.00  diffuco_mp_diffuco_trans_co2_
  3.81   1455.34    75.57    17520     0.00     0.00  hydrol_mp_hydrol_diag_soil_
  3.67   1528.21    72.87   157680     0.00     0.00  hydrol_mp_hydrol_soil_coef_
  2.29   1573.69    45.48  1400412     0.00     0.00  mathelp_mp_ma_fuscat_r11_
  2.27   1618.76    45.07    17520     0.00     0.00  hydrol_mp_hydrol_main_
  1.86   1655.66    36.90    17521     0.00     0.00  thermosoil_mp_thermosoil_getdiff_
  1.46   1684.63    28.97    17521     0.00     0.00  thermosoil_mp_thermosoil_humlev_
  0.99   1704.17    19.54   157680     0.00     0.00  hydrol_mp_hydrol_soil_tridiag_
  0.94   1722.82    18.65    17520     0.00     0.00  stomate_litter_mp_littercalc_
  0.92   1740.99    18.17    17520     0.00     0.00  hydrol_mp_hydrol_split_soil_
  0.86   1758.10    17.11    17520     0.00     0.00  stomate_mp_stomate_main_
  0.81   1774.07    15.98  1133588     0.00     0.00  mod_orchidee_omp_transfert_mp_gather_omp_r1_
}}}

Total simulation time: 1956 seconds

IO: mathelp + histcom = 14.52 + 10.28 + 5.07 = 29.87%, roughly 30% (the three largest IO entries in the profile)

== Comparison 18/02/2016 ==

As of 18/02/2016, trunk revision 2916 and MICT revision 3161 were considered equivalent. The same run.def file was used to compare both developments.
The simulations were carried out under the following conditions:
 * 1 year
 * Global
 * CRU-NCEP v5.3.2 (6 hourly)
 * CURIE
 * IO library: IOIPSL
 * Compilation mode IOIPSL: production
 * Compilation mode Orchidee: production

=== Mict R3161 ===

Time table (wall-clock time; values without an "h" are in minutes):
||= N procs =||= 4 =||= 8 =||= 16 =||= 32 =||= 64 =||= 128 =||
||= 0.5 deg =|| Memory error || >16h39 (stopped at day 322) || 13h00 || 8h46 || 6h35 || 5h38 ||
||= 1 deg =|| 6h37 || 4h20 || 2h36 || 1h45 || 1h21 || 1h08 ||
||= 2 deg =|| 1h40 || 56 || 35 || 24 || 19 || 16 ||

Note: the 0.5 deg run on 4 processes did not start because of its memory requirements. The 0.5 deg run on 8 processes could not finish within the maximum wall time allowed on the machine; it stopped at simulation day 322. Both values can be extrapolated.

=== Trunk R2916 ===

The same simulations with the same options were carried out, with the following results:
||= N procs =||= 4 =||= 8 =||= 16 =||= 32 =||= 64 =||= 128 =||
||= 0.5 deg =|| 8h38 || 5h31 || 3h26 || 2h23 || 1h48 || 1h31 ||
||= 1 deg =|| 2h07 || 1h17 || 47 || 32 || 25 || 21 ||
||= 2 deg =|| 38 || 19 || 11 || 8 || 6 || 5 ||
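
The two timing tables are easier to compare once converted to speedup, parallel efficiency and a MICT/trunk ratio. The following Python sketch is not part of the original benchmark; it only re-uses the numbers from the tables. It assumes that entries without an "h" are minutes, that speedup is measured relative to the smallest completed process count of each row, and that the 322-day run can be scaled linearly to a 365-day year. All function names are hypothetical helpers.

```python
import re

PROCS = [4, 8, 16, 32, 64, 128]

# Wall-clock times copied from the tables above. None marks the MICT runs
# that did not complete (memory error, or stopped at simulation day 322).
MICT_R3161 = {
    "0.5 deg": [None, None, "13h00", "8h46", "6h35", "5h38"],
    "1 deg": ["6h37", "4h20", "2h36", "1h45", "1h21", "1h08"],
    "2 deg": ["1h40", "56", "35", "24", "19", "16"],
}
TRUNK_R2916 = {
    "0.5 deg": ["8h38", "5h31", "3h26", "2h23", "1h48", "1h31"],
    "1 deg": ["2h07", "1h17", "47", "32", "25", "21"],
    "2 deg": ["38", "19", "11", "8", "6", "5"],
}


def to_minutes(cell):
    """Convert '6h37' or '56' to minutes (bare numbers assumed to be minutes)."""
    if cell is None:
        return None
    match = re.fullmatch(r"(?:(\d+)h)?(\d+)", cell)
    if match is None:
        raise ValueError(f"unrecognised time: {cell!r}")
    hours, minutes = match.groups()
    return int(hours or 0) * 60 + int(minutes)


def scaling(row):
    """Return (speedup, efficiency) per process count, or None for failed runs.

    The reference is the first completed run of the row, so efficiency is
    relative to that run, not to a serial execution.
    """
    times = [to_minutes(cell) for cell in row]
    base = next(i for i, t in enumerate(times) if t is not None)
    result = []
    for procs, t in zip(PROCS, times):
        if t is None:
            result.append(None)
            continue
        speedup = times[base] / t
        result.append((speedup, speedup * PROCS[base] / procs))
    return result


def mict_over_trunk(resolution):
    """Ratio of MICT to trunk wall-clock time per process count."""
    pairs = zip(MICT_R3161[resolution], TRUNK_R2916[resolution])
    return [
        None if m is None else to_minutes(m) / to_minutes(t)
        for m, t in pairs
    ]


def extrapolate_full_year(minutes, days_done, days_total=365):
    """Linear estimate of a full-year run from a partial run (an assumption)."""
    return minutes * days_total / days_done


if __name__ == "__main__":
    for res in MICT_R3161:
        ratios = mict_over_trunk(res)
        text = ", ".join("n/a" if r is None else f"{r:.1f}x" for r in ratios)
        print(f"{res:8s} MICT/trunk: {text}")
    estimate = extrapolate_full_year(to_minutes("16h39"), 322)
    print(f"0.5 deg on 8 procs, full year: at least {estimate / 60:.1f} h")
```

For instance, the 1 deg run on 4 processes takes 397 minutes with MICT against 127 minutes with trunk, a ratio of about 3.1. Because the table only gives ">16h39" for the interrupted run, the extrapolated value is a lower bound rather than an estimate of the actual run time.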