#354 closed Defect (fixed)
ORCA-LIM3 MPI problem?
Reported by: | MAMM | Owned by: | vancop |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | LIM3 | Version: | v3.0 |
Severity: | | Keywords: | LIM* MPI v3.0
Cc: | | |
Description
Hi,
I am running both ORCA-LIM2 and ORCA-LIM3 on a Beowulf-style cluster. ORCA-LIM2 runs fine. ORCA-LIM3 runs for 8.6 years of integration and then stops abruptly. The model results are fine up to that point. I get the following message:
<NO ERROR MESSAGE> : Pointer conversions exhausted
Too many MPI objects may have been passed to/from Fortran
without being freed
This means nothing to me, and I do not know where the problem first occurs, but I expect it must be a LIM3 issue, as I have no trouble with LIM2. I have, nevertheless, checked that the model does not run into out-of-bounds problems. Before I start searching for a solution, I wanted to check whether other NEMO users/developers have come across this problem.
Thanks,
Miguel Angel
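For context, the message reported above is typical of an MPI handle leak: a communicator is created over and over but never freed, so the library eventually exhausts its Fortran handle table and aborts. A hedged illustration of this failure mode (not NEMO code; the exact abort message depends on the MPI implementation):

```fortran
! Hedged illustration of the failure mode, not NEMO code: every MPI_COMM_DUP
! (or MPI_COMM_CREATE) consumes a Fortran handle; if the communicator is never
! freed, the MPI library eventually runs out of handles and aborts with an
! implementation-dependent message such as "Pointer conversions exhausted".
PROGRAM leak_demo
   USE mpi
   IMPLICIT NONE
   INTEGER :: icomm, istep, ierr
   CALL MPI_INIT( ierr )
   DO istep = 1, 100000                        ! stands in for the ice time-step loop
      CALL MPI_COMM_DUP( MPI_COMM_WORLD, icomm, ierr )
      ! CALL MPI_COMM_FREE( icomm, ierr )      ! the missing free: uncomment to stop the leak
   END DO
   CALL MPI_FINALIZE( ierr )
END PROGRAM leak_demo
```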
Commit History (0)
(No commits)
Change History (9)
comment:1 in reply to: ↑ description Changed 15 years ago by rblod
Replying to MAMM:
Hi Miguel,
I would suspect the routine mpp_ini_ice (in module lib_mpp.F90), in which I created a special communicator for ice processors only, called ncomm_ice. This routine is called at each ice time step, and for each category, before the thermodynamics call. The communicator is always overwritten but never really freed. A cleaner way to do this could be a call to MPI_COMM_FREE(ncomm_ice) at the end of the ice time step or at the beginning of the routine.
I hope it helps,
Rachid
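A minimal sketch of that suggestion, assuming a simplified mpp_ini_ice (the actual lib_mpp.F90 code is not shown in this ticket): release the previous ncomm_ice before recreating it.

```fortran
! Minimal sketch (assumed structure, not the actual lib_mpp.F90): free the
! previous ice communicator before creating a new one each ice time step.
MODULE ice_comm_sketch
   USE mpi
   IMPLICIT NONE
   INTEGER :: ncomm_ice = MPI_COMM_NULL        ! ice-only communicator (name from the ticket)
CONTAINS
   SUBROUTINE mpp_ini_ice( ngrp_ice )
      INTEGER, INTENT(in) :: ngrp_ice          ! group of ice processors, built by the caller
      INTEGER :: ierr
      ! release the communicator left over from the previous call, if any
      IF( ncomm_ice /= MPI_COMM_NULL )   CALL MPI_COMM_FREE( ncomm_ice, ierr )
      ! create the new ice communicator from the ice group
      CALL MPI_COMM_CREATE( MPI_COMM_WORLD, ngrp_ice, ncomm_ice, ierr )
   END SUBROUTINE mpp_ini_ice
END MODULE ice_comm_sketch
```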
comment:2 follow-up: ↓ 3 Changed 15 years ago by MAMM
Rachid,
I am not sure I understand. There are already calls to mpp_comm_free(ncomm_ice) within the loop over categories in limthd.F90. I do not see how adding another call at the end can help. In fact, when I try, I get the following message: MPI_COMM_FREE : Null communicator
Sorry for the trouble. I realise it is probably a silly problem, but I just cannot see how to go about solving it.
Miguel Angel
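The "Null communicator" error is consistent with a double free: MPI_COMM_FREE sets the freed handle to MPI_COMM_NULL, so a second, unguarded free on the same handle fails. A hedged illustration (not the limthd.F90 code):

```fortran
! Hedged illustration (not the limthd.F90 code): MPI_COMM_FREE sets the handle
! to MPI_COMM_NULL, so a second, unguarded free on the same handle fails with
! the "Null communicator" error reported above.
PROGRAM double_free_demo
   USE mpi
   IMPLICIT NONE
   INTEGER :: icomm, ierr
   CALL MPI_INIT( ierr )
   CALL MPI_COMM_DUP( MPI_COMM_WORLD, icomm, ierr )
   CALL MPI_COMM_FREE( icomm, ierr )           ! ok: icomm becomes MPI_COMM_NULL
   IF( icomm /= MPI_COMM_NULL ) THEN           ! guard that avoids the error
      CALL MPI_COMM_FREE( icomm, ierr )
   ENDIF
   CALL MPI_FINALIZE( ierr )
END PROGRAM double_free_demo
```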
comment:3 in reply to: ↑ 2 Changed 15 years ago by rblod
All apologies for the trouble, I forgot I freed the communicator in limthd when I implemented this...
In the same routine mpp_ini_ice, the issue may be ngrp_ice, which is never destroyed; the fix could go just after the creation of the ice communicator (something like call mpi_group_free(ngrp_ice)), but here my knowledge of MPI reaches its limits.
In addition, as far as I know, LIM3 has been run successfully in parallel at NOCS for long integration periods.
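A minimal sketch of this second suggestion, with hypothetical inputs (n_ice, ranks_ice) standing in for however lib_mpp.F90 actually builds the ice group: the group handles can be freed as soon as the communicator has been created from them, since the communicator keeps its own reference.

```fortran
! Minimal sketch (hypothetical inputs, not the actual lib_mpp.F90): release the
! process groups right after the ice communicator has been created from them.
SUBROUTINE ini_ice_comm_sketch( n_ice, ranks_ice, ncomm_ice )
   USE mpi
   IMPLICIT NONE
   INTEGER, INTENT(in)  :: n_ice                ! number of processors holding ice
   INTEGER, INTENT(in)  :: ranks_ice(n_ice)     ! their ranks in MPI_COMM_WORLD
   INTEGER, INTENT(out) :: ncomm_ice            ! resulting ice-only communicator
   INTEGER :: ngrp_world, ngrp_ice, ierr
   CALL MPI_COMM_GROUP ( MPI_COMM_WORLD, ngrp_world, ierr )
   CALL MPI_GROUP_INCL ( ngrp_world, n_ice, ranks_ice, ngrp_ice, ierr )
   CALL MPI_COMM_CREATE( MPI_COMM_WORLD, ngrp_ice, ncomm_ice, ierr )
   ! the groups are no longer needed once the communicator exists
   CALL MPI_GROUP_FREE( ngrp_ice,   ierr )
   CALL MPI_GROUP_FREE( ngrp_world, ierr )
END SUBROUTINE ini_ice_comm_sketch
```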
Replying to MAMM:
Rachid,
I am not sure I understand. There are already calls to mpp_comm_free(ncomm_ice) within the loop over categories in limthd.F90. I do not see how adding another call at the end can help. In fact, when I try, I get the following message: MPI_COMM_FREE : Null communicator
Sorry for the trouble. I realise it is probably a silly problem, but I just cannot see how to go about solving it.
Miguel Angel
comment:4 Changed 12 years ago by clevy
- Owner changed from NEMO team to vancop
comment:5 Changed 10 years ago by clem
- Resolution set to fixed
- Status changed from new to closed
comment:6 Changed 7 years ago by nemo
- Keywords LIM* added
comment:7 Changed 7 years ago by nemo
- Keywords release-3.0 added
comment:8 Changed 2 years ago by nemo
- Keywords r3.0 added; release-3.0 removed
comment:9 Changed 2 years ago by nemo
- Keywords v3.0 added; r3.0 removed