New URL for NEMO forge!   http://forge.nemo-ocean.eu

Since March 2022, along with the NEMO 4.2 release, code development has moved to a self-hosted GitLab.
This forge is now archived and remains online for historical reference.
2020WP/HPC-09_epico_Loop_fusion (diff) – NEMO

Changes between Version 9 and Version 10 of 2020WP/HPC-09_epico_Loop_fusion


Timestamp:
2020-11-30T20:33:03+01:00
Author:
epico
Comment:

--

  • 2020WP/HPC-09_epico_Loop_fusion

    v9 v10  
    2222=== Description 
    2323 
    24 {{{#!box width=25em help 
    25 Describe the goal of development and the methodology, \\ 
    26 add reference documents or publications if relevant. 
    27 }}} 
    28  
    29 '' 
    3024The computational peak performance of the target parallel architecture can be better exploited by improving the vectorisation of the code. Most compilers can vectorise automatically, but the code must be written in a way that lets the compiler do so. The code will need to be screened in order to limit dependency issues. In addition, directives can be used to increase the use of SIMD instructions and to get closer to the peak performance of modern cores. 
    3125 
     
    3630 
    3731Planned optimisations will be designed so that the scientific quality of the code is not compromised.  
    38 '' 
    3932 
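To illustrate why the screening for dependencies matters, here is a minimal sketch (in Python rather than NEMO's Fortran, with hypothetical function names that do not come from the NEMO sources). It compares a loop whose iterations are independent with a recurrence: evaluating the recurrence "all lanes at once", as a naive SIMD execution would, reads stale values and gives a different result.

```python
# Hypothetical illustration only: none of these names exist in NEMO.

def independent_scalar(b, c):
    """a[i] = b[i] + c[i]: no iteration depends on another."""
    a = [0.0] * len(b)
    for i in range(len(b)):
        a[i] = b[i] + c[i]
    return a


def independent_vector(b, c):
    """Same computation with all elements evaluated together."""
    return [bi + ci for bi, ci in zip(b, c)]


def recurrence_scalar(a0, b):
    """a[i] = a[i-1] + b[i]: iteration i needs the result of iteration i-1."""
    a = list(a0)
    for i in range(1, len(a)):
        a[i] = a[i - 1] + b[i]
    return a


def recurrence_naive_vector(a0, b):
    """Naive SIMD-style evaluation: every lane reads the OLD value of a[i-1]."""
    old = list(a0)
    a = list(a0)
    for i in range(1, len(a)):
        a[i] = old[i - 1] + b[i]
    return a


b = [1.0, 2.0, 3.0, 4.0]
c = [10.0, 20.0, 30.0, 40.0]
a0 = [0.0, 0.0, 0.0, 0.0]

# Independent iterations: both evaluation orders agree.
same = independent_scalar(b, c) == independent_vector(b, c)

# Loop-carried dependency: the naive vector evaluation is wrong.
seq = recurrence_scalar(a0, b)
vec = recurrence_naive_vector(a0, b)
print(same, seq, vec)
```

Loops of the first kind can be vectorised safely; loops of the second kind need restructuring before a compiler (or a directive) can legitimately vectorise them.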
    4033=== Implementation 
    4134 
    42 {{{#!box width=35em help 
    43 Describe flow chart of the changes in the code. \\ 
    44 List the Fortran modules and subroutines to be created/edited/deleted. \\ 
    45 Detailed list of new variables to be defined (including namelists), \\ 
    46 give for each the chosen name and description wrt coding rules. 
    47 }}} 
    48  
    49 '' 
    5035DO-loop fusion can be introduced into the NEMO code gradually, but it requires moving the halo exchanges earlier in the code, which is possible thanks to the extended halo (halo=2) 
    5136 
     
    61462. proceed with the LDF module (next year) 
    62473. complete the remaining routines, from the most to the least computationally intensive (next year) 
    63 '' 
    6448 
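The link between loop fusion and the halo width can be sketched on a 1D toy problem. This is an assumption-laden illustration, not the NEMO implementation: the stencils, the function names and the way the halo exchange is simulated (copying from a periodic global array) are all hypothetical. Two dependent stencil loops need an exchange of the intermediate field when the halo is 1 point wide; with a 2-point halo the loops can be fused, recomputing the intermediate values locally, and only the early exchange of the input field remains.

```python
# Hypothetical 1D sketch: names and stencils are illustrative, not NEMO code.

def wrap(field, i):
    """Periodic access to a global field (stands in for neighbour subdomains)."""
    return field[i % len(field)]


def reference(u):
    """Global result: a = centred difference of u, b = centred difference of a."""
    n = len(u)
    a = [wrap(u, i + 1) - wrap(u, i - 1) for i in range(n)]
    b = [wrap(a, i + 1) - wrap(a, i - 1) for i in range(n)]
    return a, b


def local_with_halo(field, lo, hi, halo):
    """Interior [lo, hi) plus `halo` points per side: a simulated halo exchange."""
    return [wrap(field, i) for i in range(lo - halo, hi + halo)]


def unfused(u_global, a_global, lo, hi):
    """Two loops, halo = 1: the intermediate field a must be exchanged."""
    n = hi - lo
    u = local_with_halo(u_global, lo, hi, 1)      # exchange 1 (on u)
    a = [0.0] * (n + 2)
    for j in range(1, n + 1):                     # loop 1: a on the interior
        a[j] = u[j + 1] - u[j - 1]
    a[0] = wrap(a_global, lo - 1)                 # exchange 2 (on a), simulated
    a[-1] = wrap(a_global, hi)                    # with the neighbours' values
    b = [a[j + 1] - a[j - 1] for j in range(1, n + 1)]  # loop 2
    return b, 2


def fused(u_global, lo, hi):
    """One fused loop, halo = 2: a is recomputed locally, no exchange on a."""
    n = hi - lo
    u = local_with_halo(u_global, lo, hi, 2)      # single, earlier exchange
    b = [0.0] * n
    for k in range(n):
        j = k + 2
        a_left = u[j] - u[j - 2]                  # a at point j-1
        a_right = u[j + 2] - u[j]                 # a at point j+1
        b[k] = a_right - a_left
    return b, 1


u_global = [float((i * i) % 7) for i in range(12)]
a_ref, b_ref = reference(u_global)
lo, hi = 4, 8
b_unfused, n_exch_unfused = unfused(u_global, a_ref, lo, hi)
b_fused, n_exch_fused = fused(u_global, lo, hi)
print(b_unfused == b_fused == b_ref[lo:hi], n_exch_unfused, n_exch_fused)
```

In this sketch the fused variant trades a communication step for a small amount of redundant computation near the subdomain edges, which is the kind of trade-off an extended halo makes possible.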
    6549=== Documentation updates 
    6650 
    67 {{{#!box width=55em help 
    68 Using previous parts, define the main changes to be done in the NEMO literature  
    69 (manuals, guide, web pages, …). 
    70 }}} 
    7151 
    72 '' 
    7352The compilation key key_loop_fusion must be described in the documentation. It activates or deactivates the loop fusion optimisation at compile time 
    74 '' 
    7553 
    7654== Preview  
     
    10987{{{ 
    11088 
    111 details goes here 
     89Current code is : URL: https://forge.ipsl.jussieu.fr/nemo/svn/NEMO/branches/2020/dev_r13898_Tiling_Cleanup_MPI3 @ r13906 
     9013906  ( last change @ r13906 ) 
     91 
     92SETTE validation report generated for : 
     93 
     94       URL: https://forge.ipsl.jussieu.fr/nemo/svn/NEMO/branches/2020/dev_r13898_Tiling_Cleanup_MPI3 @ r13906+ (last changed revision) 
     95 
     96       on ifort_zeus_xios arch file 
     97 
     98 
     99!!---------------1st pass------------------!! 
     100 
     101   !----restart----! 
     102WGYRE_PISCES_ST              run.stat    restartability  passed :  13906+ 
     103WGYRE_PISCES_ST              tracer.stat restartability  passed :  13906+ 
     104WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  13906+ 
     105WORCA2_ICE_PISCES_ST         tracer.stat restartability  passed :  13906+ 
     106WORCA2_OFF_PISCES_ST         tracer.stat restartability  passed :  13906+ 
     107WAMM12_ST                    run.stat    restartability  passed :  13906+ 
     108WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  13906+ 
     109WAGRIF_DEMO_ST               run.stat    restartability  FAILED :  13906+  (results are different after  19     time steps) 
     110WWED025_ST                   run.stat    restartability  passed :  13906+ 
     111WISOMIP+_ST                  run.stat    restartability  passed :  13906+ 
     112WOVERFLOW_ST                 ocean.output               MISSING :  13906+ 
     113WOVERFLOW_ST                 incomplete test 
     114WLOCK_EXCHANGE_ST            ocean.output               MISSING :  13906+ 
     115WLOCK_EXCHANGE_ST            incomplete test 
     116WVORTEX_ST                   run.stat    restartability  passed :  13906+ 
     117WICE_AGRIF_ST                run.stat    restartability  passed :  13906+ 
     118 
     119   !----repro----! 
     120WGYRE_PISCES_ST              run.stat    reproducibility passed :  13906+ 
     121WGYRE_PISCES_ST              tracer.stat reproducibility passed :  13906+ 
     122WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  13906+ 
     123WORCA2_ICE_PISCES_ST         tracer.stat reproducibility passed :  13906+ 
     124WORCA2_OFF_PISCES_ST         tracer.stat reproducibility passed :  13906+ 
     125WAMM12_ST                    run.stat    reproducibility passed :  13906+ 
     126WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  13906+ 
     127WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  13906+ 
     128WAGRIF_DEMO_ST               run.stat    reproducibility FAILED :  13906+  (results are different after  8      time steps) 
     129WWED025_ST                   run.stat    reproducibility passed :  13906+ 
     130WISOMIP+_ST                  run.stat    reproducibility passed :  13906+ 
     131WVORTEX_ST                   run.stat    reproducibility passed :  13906+ 
     132WICE_AGRIF_ST                run.stat    reproducibility passed :  13906+ 
     133 
     134   !----agrif check----! 
     135ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat    unchanged  -    passed :  13906+ 13906+ 
     136 
     137   !----result comparison check----! 
     138 
     139check result differences between : 
     140VALID directory : /work/asc/fm27215/dev_r13898_Tiling_Cleanup_MPI3/NEMO_VALIDATION at rev 13906+ 
     141and 
     142REFERENCE directory : /work/asc/fm27215/trunk@r13787/trunk/NEMO_VALIDATION_H2_BERG at rev 13787 
     143 
     144WGYRE_PISCES_ST       run.stat    files are identical 
     145WGYRE_PISCES_ST       tracer.stat files are identical 
     146WORCA2_ICE_PISCES_ST  run.stat    files are identical  
     147WORCA2_ICE_PISCES_ST  tracer.stat files are identical  
     148WORCA2_OFF_PISCES_ST  tracer.stat files are identical 
     149WAMM12_ST             run.stat    files are identical 
     150WORCA2_SAS_ICE_ST     run.stat    files are identical 
     151WAGRIF_DEMO_ST        run.stat    files are DIFFERENT (results are different after  17  time steps) 
     152WWED025_ST            run.stat    files are identical 
     153WISOMIP+_ST           run.stat    files are identical 
     154WVORTEX_ST            run.stat    files are identical 
     155WICE_AGRIF_ST         run.stat    files are identical 
     156WOVERFLOW_ST          incomplete test 
     157WLOCK_EXCHANGE_ST     incomplete test 
    112158 
    113159}}}
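A report like the one above can be summarised automatically. The following is a minimal sketch, assuming the line layout shown in this report (configuration, file, optional check name, status, colon); the regular expression and the `summarise` helper are hypothetical and are not part of SETTE. The sample lines are copied from the report.

```python
import re
from collections import Counter

# Assumed layout: <config> <file> [restartability|reproducibility] <status> :
LINE = re.compile(
    r"^(\S+)\s+(\S+)\s+(?:(restartability|reproducibility)\s+)?"
    r"(passed|FAILED|MISSING)\s*:"
)


def summarise(report):
    """Return (status counts, list of non-passing entries) for a report text."""
    counts = Counter()
    problems = []
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # headers, blank lines, "incomplete test" lines, etc.
        config, fname, check, status = match.groups()
        counts[status] += 1
        if status != "passed":
            problems.append((config, fname, check, status))
    return counts, problems


sample = """\
WGYRE_PISCES_ST              run.stat    restartability  passed :  13906+
WAGRIF_DEMO_ST               run.stat    restartability  FAILED :  13906+  (results are different after  19     time steps)
WOVERFLOW_ST                 ocean.output               MISSING :  13906+
WOVERFLOW_ST                 incomplete test
WAGRIF_DEMO_ST               run.stat    reproducibility FAILED :  13906+  (results are different after  8      time steps)
"""

counts, problems = summarise(sample)
for entry in problems:
    print(entry)
```

Applied to the full report, such a summary would single out the AGRIF_DEMO failures and the missing OVERFLOW and LOCK_EXCHANGE outputs for follow-up.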