2021WP/HPC-03_Mele_Comm_Cleanup – NEMO

Name and subject of the action

The PI is responsible for closely following the progress of the action, and especially for contacting the NEMO project manager if the delay on the preview (or review) is longer than the expected 2 weeks.

  1. Summary
  2. Preview
  3. Tests
  4. Review

Summary

Action: Communications cleanup
PI(s): Francesca Mele, Italo Epicoco
Digest: This task concerns the removal/shifting of unnecessary communications within routines when using halo 2.
Dependencies:
Branch: source:/NEMO/branches/2021/dev_r14393_HPC-03_Mele_Comm_Cleanup
Previewer(s): TBD
Reviewer(s): TBD
Ticket: #2607

Description

This action aims to complete what was started in 2020. With halo=2, most of the lbc_lnk communications can be removed or moved earlier in the code. A careful analysis of the DO LOOP ranges also leads to the removal of the useless lbc_lnk calls.

Implementation

The implementation (or rather the rationalization) of the lbc_lnk calls will start from the DYN and ZDF modules (the TRA module was completed in 2020). Because we decided to support halo=1 execution, we put the lbc_lnk call inside an IF statement wherever needed.
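The idea of guarding a communication with a test on the halo width can be illustrated with a toy model. The sketch below is not NEMO code: it assumes a periodic 1-D field split into subdomains, a hypothetical exchange function standing in for lbc_lnk, and a simple 3-point smoothing applied twice. With halo=1 an exchange is needed between the two passes; with halo=2 the first pass can also be computed on the inner halo layer, so the exchange is skipped.

```python
def smooth(a, lo, hi):
    """Return a copy of a with a 3-point average applied on indices lo..hi-1."""
    out = list(a)
    for i in range(lo, hi):
        out[i] = 0.25 * a[i - 1] + 0.5 * a[i] + 0.25 * a[i + 1]
    return out


def exchange(subs, halo, n):
    """Toy stand-in for lbc_lnk: refresh the halos of periodic 1-D subdomains."""
    nsub = len(subs)
    # Snapshot the interiors first, because halos are overwritten in place.
    interiors = [s[halo:halo + n] for s in subs]
    for p, s in enumerate(subs):
        left = interiors[(p - 1) % nsub]
        right = interiors[(p + 1) % nsub]
        s[:halo] = left[n - halo:]      # last `halo` points of the left neighbour
        s[halo + n:] = right[:halo]     # first `halo` points of the right neighbour


def two_passes(field, nsub, halo):
    """Apply the smoothing twice on a decomposed field and gather the result."""
    n = len(field) // nsub
    subs = [[0.0] * halo + field[p * n:(p + 1) * n] + [0.0] * halo
            for p in range(nsub)]
    exchange(subs, halo, n)
    extra = halo - 1  # halo layers on which the first pass can also be computed
    subs = [smooth(s, halo - extra, halo + n + extra) for s in subs]
    if extra == 0:
        # Only the halo=1 case needs a communication between the two passes.
        exchange(subs, halo, n)
    subs = [smooth(s, halo, halo + n) for s in subs]
    return [x for s in subs for x in s[halo:halo + n]]


def reference(field):
    """Same two passes on the undecomposed periodic field."""
    n = len(field)
    for _ in range(2):
        field = [0.25 * field[i - 1] + 0.5 * field[i] + 0.25 * field[(i + 1) % n]
                 for i in range(n)]
    return field


field = [i * i / 7.0 for i in range(12)]
result_halo1 = two_passes(field, nsub=3, halo=1)
result_halo2 = two_passes(field, nsub=3, halo=2)
expected = reference(field)
```

In this toy setting there is no north fold, so both halo widths reproduce the undecomposed result exactly; the halo=2 run simply does redundant work on one halo layer in exchange for one fewer communication.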

Moving a communication earlier in the code (in the halo=2 case) often changes the outputs with respect to the halo=1 run. This is due to the north fold algorithm. In the attached document (NorthFold and Halo2.pdf) we analysed the problem and found that, because of the north fold, some expressions are evaluated correctly but with a different order of floating-point operations than in the halo=1 case.

To preserve bit-for-bit agreement between the halo=1 and halo=2 results, we have to "force" the order of the floating-point operations in both cases by introducing round brackets within the affected expressions.
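The effect of operand order, and of fixing it with brackets, can be seen in a minimal sketch. This is an illustration only and not taken from NEMO or from the attached document: it assumes that the fold presents the same four operands in mirrored order, and the values are chosen by hand to make the rounding difference visible.

```python
def unbracketed_sum(a, b, c, d):
    # Evaluated strictly left to right: ((a + b) + c) + d
    return a + b + c + d


def bracketed_sum(a, b, c, d):
    # Brackets pair the operands symmetrically, so mirroring the argument
    # order only swaps operands of commutative additions.
    return (a + b) + (c + d)


# Hand-picked values (an assumption for illustration) and their mirrored order.
values = (1.0e16, 1.0, -1.0e16, 1.0)
mirrored = tuple(reversed(values))

unbracketed_matches = unbracketed_sum(*values) == unbracketed_sum(*mirrored)
bracketed_matches = bracketed_sum(*values) == bracketed_sum(*mirrored)

print("unbracketed results agree:", unbracketed_matches)
print("bracketed results agree:  ", bracketed_matches)
```

Floating-point addition is commutative but not associative, which is why the bracketed form is insensitive to mirroring while the unbracketed form is not. The bracketed result may itself differ from the original unbracketed one, which is the kind of difference described for the trunk.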

We propose to insert the round brackets, where needed, directly into the trunk. This will produce a version of the trunk that is not bit-comparable with the previous one, but the differences will be due only to the changed order of the floating-point operations, so they should be acceptable. The branch ticket2607_r14608_halo1_halo2_compatibility has been created for this purpose.

The branch dev_r14393_HPC-03_Mele_Comm_Cleanup has been revised and merged into dev_r14273_HPC-02_Daley_Tiling. More details can be found at https://forge.ipsl.jussieu.fr/nemo/wiki/2021WP/HPC-02_Daley_Tiling

A bug fix for the neighbour collective exchanges, concerning the suppression of communications involving only land points, has been added to the branch. Moreover, the communication calls have been set to 5-point in the TRA modules where data dependencies are satisfied.

Finally, the loop fusion approach has been extended to other routines of the NEMO code. Implementation details follow.

The CALLs to the tra_adv_*_lf routines have been removed from the ./TOP/TRP/trcadv.F90 and ./OCE/TRA/traadv.F90 files. The corresponding CALLs have been moved into each individual adv routine (tra_adv_fct, tra_adv_qck, tra_adv_cen, tra_adv_ubs) and are activated if key_loop_fusion is defined. For each adv scheme, the corresponding loop-fusion routine has been added in a new module (tra_adv_qck_lf, tra_adv_cen_lf, tra_adv_ubs_lf), except for tra_adv_fct: its tra_adv_fct_lf routine has been added to the same module, because tra_adv_fct_lf needs to call other routines already present in the original module.

The same approach has been applied to the lateral diffusivity trends scheme: the new modules dynldf_iso_lf and dynldf_lap_blp_lf have been added and are called from dyn_ldf_iso and dyn_ldf_lap_blp respectively if key_loop_fusion is defined.

Finally, a couple of loops have been fused in the vor_een routine of the dyn_vor module.
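As a generic illustration of what loop fusion means, the sketch below merges a flux loop and a trend loop into a single sweep. It is not derived from the NEMO *_lf routines, whose structure may differ; the function names, the 1-D layout and the centred flux formula are all assumptions made for the example.

```python
def trend_unfused(u, t):
    """Two separate loops: first all fluxes, then all trends."""
    n = len(t)
    flux = [0.0] * n
    for i in range(n - 1):
        flux[i] = u[i] * 0.5 * (t[i] + t[i + 1])
    trend = [0.0] * n
    for i in range(1, n - 1):
        trend[i] = -(flux[i] - flux[i - 1])
    return trend


def trend_fused(u, t):
    """One loop: each flux is computed once and reused, with no flux array."""
    n = len(t)
    trend = [0.0] * n
    flux_prev = u[0] * 0.5 * (t[0] + t[1])
    for i in range(1, n - 1):
        flux_here = u[i] * 0.5 * (t[i] + t[i + 1])
        trend[i] = -(flux_here - flux_prev)
        flux_prev = flux_here
    return trend


# Hypothetical input data, chosen only to exercise the two versions.
velocity = [0.3, -0.1, 0.25, 0.4, -0.2, 0.15]
tracer = [10.0, 10.5, 9.75, 11.2, 10.1, 9.9]

fused_result = trend_fused(velocity, tracer)
unfused_result = trend_unfused(velocity, tracer)
```

Both versions perform the same arithmetic in the same order for every point, so their results are identical; the fused form avoids a second sweep over memory and the temporary flux array.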

Documentation updates

...

Preview

...

Tests

...

Review

...

Last modified on 2021-05-08T11:30:25+02:00
