Opened 4 years ago
Closed 4 years ago
#2551 closed Bug (fixed)
bug with nn_fwb=1 (mpp_delay_sum): global sum not updated
Reported by: | mathiot | Owned by: | systeam |
---|---|---|---|
Priority: | low | Milestone: | |
Component: | LBC | Version: | trunk |
Severity: | minor | Keywords: | lib_mpp |
Cc: |
Description
Context
results of mpp_delay_sum are not updated between the first time step and the time step when restarts are written. This is similar to #2381. I suspect similar error in mpp_delay_max (not tested). The concern file is LBC/lib_mpp.F90.
Analysis
On UGA cluster, the MPI request handle value after call to mpi_iallreduce (ie, variable: ndelayid(idvar)) is not necessary a positive integer (test on my cluster always output negative integer for the handle), so mpp_delay_rcv is not call and so the value not updated.
Question: MPI3 (which allow non blocking mpi_allreduce) is 8 years old. Maybe the support to mpi2 is no more needed ?
Similar ticket: #2381 (nn_fwb=1 and key_mpi2)
Not tested, but I suspect it also affects NEMO4.
Fix
Suggestion for a fix:
- 1: make sure than after initialisatoin (case when we start from a restart or without a restart), ndelayid(idvar) have MPI_REQUEST_NULL value if a mpi_wait is not needed.
- 2: always call mpp_delay_rcv
- 3: if ndelayid(idvar) .NE. MPI_REQUEST_NULL => wait the request to be completed
- 4: compute todelay(kid)%z1d( : ) if needed (only for the sum)
- 5: in case of mpi2, force correct value for ndelayid(idvar) because not defined by mpi_allreduce (MPI_REQUEST_NULL)
- 6: apply similar change in mpp_delay_max where needed.
Test with mpi (not mpi2) shows a correct behavior:
mpp_delay_sum output at time steep k = glob_sum at time step k-1
Review of the analysis and suggested changes are needed because I am out of my comfort zone here.
--- lib_mpp.F90 (revision 13582) +++ lib_mpp.F90 (working copy) @@ -510,6 +510,7 @@ ELSE ALLOCATE(todelay(idvar)%y1d(isz)) todelay(idvar)%y1d(:) = CMPLX(todelay(idvar)%z1d(:), 0., wp) ! create %y1d, complex variable needed by mpi_sumdd + ndelayid(idvar) = MPI_REQUEST_NULL ! initialised request to a valid value END IF ENDIF @@ -517,10 +518,10 @@ ! -------------------------- ALLOCATE(todelay(idvar)%z1d(isz), todelay(idvar)%y1d(isz)) ! allocate also %z1d as used for the restart CALL mpi_allreduce( y_in(:), todelay(idvar)%y1d(:), isz, MPI_DOUBLE_COMPLEX, mpi_sumdd, ilocalcomm, ierr ) ! get %y1d - todelay(idvar)%z1d(:) = REAL(todelay(idvar)%y1d(:), wp) ! define %z1d from %y1d + ndelayid(idvar) = MPI_REQUEST_NULL ENDIF - IF( ndelayid(idvar) > 0 ) CALL mpp_delay_rcv( idvar ) ! make sure %z1d is received + CALL mpp_delay_rcv( idvar ) ! make sure %z1d is received ! send back pout from todelay(idvar)%z1d defined at previous call pout(:) = todelay(idvar)%z1d(:) @@ -529,7 +530,7 @@ # if defined key_mpi2 IF( ln_timing ) CALL tic_tac( .TRUE., ld_global = .TRUE.) CALL mpi_allreduce( y_in(:), todelay(idvar)%y1d(:), isz, MPI_DOUBLE_COMPLEX, mpi_sumdd, ilocalcomm, ierr ) - ndelayid(idvar) = 1 + ndelayid(idvar) = MPI_REQUEST_NULL IF( ln_timing ) CALL tic_tac(.FALSE., ld_global = .TRUE.) # else CALL mpi_iallreduce( y_in(:), todelay(idvar)%y1d(:), isz, MPI_DOUBLE_COMPLEX, mpi_sumdd, ilocalcomm, ndelayid(idvar), ierr ) @@ -597,9 +598,10 @@ ! -------------------------- ALLOCATE(todelay(idvar)%z1d(isz)) CALL mpi_allreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_DOUBLE_PRECISION, mpi_max, ilocalcomm, ierr ) ! get %z1d + ndelayid(idvar) = MPI_REQUEST_NULL ENDIF - IF( ndelayid(idvar) > 0 ) CALL mpp_delay_rcv( idvar ) ! make sure %z1d is received + CALL mpp_delay_rcv( idvar ) ! make sure %z1d is received ! send back pout from todelay(idvar)%z1d defined at previous call pout(:) = todelay(idvar)%z1d(:) @@ -607,7 +609,7 @@ ! send p_in into todelay(idvar)%z1d with a non-blocking communication # if defined key_mpi2 IF( ln_timing ) CALL tic_tac( .TRUE., ld_global = .TRUE.) - CALL mpi_allreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_TYPE, mpi_max, ilocalcomm, ndelayid(idvar), ierr ) + CALL mpi_allreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_TYPE, mpi_max, ilocalcomm, ierr ) IF( ln_timing ) CALL tic_tac(.FALSE., ld_global = .TRUE.) # else CALL mpi_iallreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_TYPE, mpi_max, ilocalcomm, ndelayid(idvar), ierr ) @@ -628,17 +630,15 @@ !!---------------------------------------------------------------------- INTEGER,INTENT(in ) :: kid INTEGER :: ierr !!---------------------------------------------------------------------- #if defined key_mpp_mpi - IF( ndelayid(kid) /= -2 ) THEN -#if ! defined key_mpi2 + IF( ndelayid(kid) .NE. MPI_REQUEST_NULL ) THEN ! if could probably be removed as it seems mpi_wait can digest MPI_REQUEST_NULL as input (nothing is done, return immediately) IF( ln_timing ) CALL tic_tac( .TRUE., ld_global = .TRUE.) - CALL mpi_wait( ndelayid(kid), MPI_STATUS_IGNORE, ierr ) ! make sure todelay(kid) is received - IF( ln_timing ) CALL tic_tac(.FALSE., ld_global = .TRUE.) -#endif - IF( ASSOCIATED(todelay(kid)%y1d) ) todelay(kid)%z1d(:) = REAL(todelay(kid)%y1d(:), wp) ! define %z1d from %y1d - ndelayid(kid) = -2 ! add flag to know that mpi_wait was already called on kid + CALL mpi_wait( ndelayid(kid), MPI_STATUS_IGNORE, ierr ) ! after this ndelayid(kid) = MPI_REQUEST_NULL + IF( ln_timing ) CALL tic_tac( .FALSE., ld_global = .TRUE.) ENDIF + IF( ASSOCIATED(todelay(kid)%y1d) ) todelay(kid)%z1d(:) = REAL(todelay(kid)%y1d(:), wp) ! define %z1d from %y1d #endif END SUBROUTINE mpp_delay_rcv
Commit History (2)
Changeset | Author | Time | ChangeLog |
---|---|---|---|
13636 | mathiot | 2020-10-19T16:15:38+02:00 | fix ticket #2551 in trunk |
13635 | mathiot | 2020-10-19T16:14:38+02:00 | fix ticket #2551 in NEMO4 |
Change History (4)
comment:1 Changed 4 years ago by mathiot
comment:2 Changed 4 years ago by mathiot
In 13635:
comment:3 Changed 4 years ago by mathiot
In 13636:
comment:4 Changed 4 years ago by mathiot
- Resolution set to fixed
- Status changed from new to closed
After discussion with Seb., we agreed on:
- the suggested solution (the test on the request value has been removed and a comment has been added to explain why)
- support of mpi2 should be removed but it is not the purpose of this ticket. So it is not done here. I will let the ST do it if needed.
wrong highlighting:
lib_mpp.F90
todelay(idvar)%z1d(:) = REAL(todelay(idvar)%y1d(:), wp) ! define %z1d from %y1dIF( ndelayid(idvar) > 0 )CALL mpp_delay_rcv( idvar ) ! make sure %z1d is received1IF( ndelayid(idvar) > 0 )CALL mpp_delay_rcv( idvar ) ! make sure %z1d is receivedndelayid(idvar),ierr )