Opened 15 years ago

Closed 10 years ago

Last modified 2 years ago

#354 closed Defect (fixed)

ORCA-LIM3 MPI problem?

Reported by: MAMM
Owned by: vancop
Priority: normal
Milestone:
Component: LIM3
Version: v3.0
Severity:
Keywords: LIM* MPI v3.0
Cc:

Description

Hi,

I am running both ORCA-LIM2 and ORCA-LIM3 on a Beowulf-style cluster. ORCA-LIM2 runs fine. ORCA-LIM3 runs without trouble for 8.6 years of integration and then stops abruptly. The model results are fine up to that point. I get the following message:

<NO ERROR MESSAGE> : Pointer conversions exhausted
Too many MPI objects may have been passed to/from Fortran
without being freed

This means nothing to me, and I do not know where the problem first occurs, but I expect it must be a LIM3 issue, as I have no trouble with LIM2. I have nevertheless checked that the model does not run into out-of-bounds problems. Before I start searching for a solution, I wanted to check whether other NEMO users or developers have come across this problem.

Thanks,

Miguel Angel

Change History (9)

comment:1 in reply to: ↑ description Changed 15 years ago by rblod

Hi Miguel

I would suspect the routine mpp_ini_ice (in module lib_mpp.F90), in which I created a special communicator for ice processors only, called ncomm_ice. This routine is called at each ice time step, and for each category, before the thermodynamics call. The communicator is always overwritten but never really freed. A cleaner way to do this might be a call to MPI_COMM_FREE(ncomm_ice) at the end of the ice time step or at the beginning of the routine.

I hope it helps

Rachid

comment:2 follow-up: Changed 15 years ago by MAMM

Rachid,

I am not sure I understand. There are already calls to mpp_comm_free(ncomm_ice) within the loop over categories in limthd.F90. I do not see how adding another call at the end can help. In fact, when I try, I get the following message: MPI_COMM_FREE : Null communicator

Sorry for the trouble. I realise it is probably a silly problem, but I just cannot see how to go about solving it.

Miguel Angel
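
The "Null communicator" message is consistent with how MPI_COMM_FREE is specified: a successful call resets the handle to MPI_COMM_NULL, and passing a null communicator to it is an error. The snippet below is a toy Python model of that behaviour, not MPI itself. The names ToyComm, comm_free and safe_comm_free and the handle value 42 are hypothetical and exist only for illustration.

```python
COMM_NULL = None  # stand-in for MPI_COMM_NULL in this toy model


class ToyComm:
    """Holds a communicator handle, the way ncomm_ice does in the ticket."""

    def __init__(self, handle):
        self.handle = handle


def comm_free(comm):
    """Mimic MPI_COMM_FREE: reject a null handle, otherwise reset it to null."""
    if comm.handle is COMM_NULL:
        raise RuntimeError("MPI_COMM_FREE : Null communicator")
    comm.handle = COMM_NULL


def safe_comm_free(comm):
    """Free only if the handle is still live; return True if a free happened."""
    if comm.handle is COMM_NULL:
        return False
    comm_free(comm)
    return True


ncomm_ice = ToyComm(handle=42)   # 42 is an arbitrary illustrative handle value
comm_free(ncomm_ice)             # first free, as already done in limthd.F90
try:
    comm_free(ncomm_ice)         # an extra free at the end of the time step
    second_free_error = None
except RuntimeError as exc:
    second_free_error = str(exc)

print(second_free_error)
print(safe_comm_free(ncomm_ice))
```

In real Fortran code the equivalent guard would be a comparison of the handle against MPI_COMM_NULL before calling the free routine.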

comment:3 in reply to: ↑ 2 Changed 15 years ago by rblod

My apologies for the trouble, I forgot that I freed the communicator in limthd when I implemented this.
In the same routine mpp_ini_ice, the problem may instead lie with the group ngrp_ice, which is never freed. It could be freed just after the ice communicator is created (something like call mpi_group_free(ngrp_ice)), but here my knowledge of MPI reaches its limits.
In addition, as far as I know, LIM3 has been run successfully in parallel at NOCS over a long period.
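
The suggestion above is a hypothesis, and the ticket does not record what the eventual fix was. As a rough illustration of why a single unfreed group per call could let a run proceed for years before failing, here is a toy Python model. It assumes the MPI library keeps a fixed-size table of handles; the class HandleTable, the function ice_step and the capacity of 1000 are hypothetical and are not taken from NEMO or from any MPI implementation.

```python
class HandleTable:
    """Fixed-size pool of handle slots (toy stand-in for a library's
    internal handle table; the real size and layout are not known here)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def allocate(self):
        if self.in_use >= self.capacity:
            raise RuntimeError("Pointer conversions exhausted")
        self.in_use += 1

    def free(self):
        self.in_use -= 1


def ice_step(table, free_group):
    """One hypothetical mpp_ini_ice call followed by the thermodynamics."""
    table.allocate()      # group handle (ngrp_ice)
    table.allocate()      # communicator handle (ncomm_ice)
    if free_group:
        table.free()      # like mpi_group_free(ngrp_ice) right after creation
    # ... ice thermodynamics would run here ...
    table.free()          # like mpp_comm_free(ncomm_ice) in limthd.F90


def steps_until_failure(capacity, free_group, max_steps):
    """Return the step at which the table overflows, or None if it never does."""
    table = HandleTable(capacity)
    for step in range(1, max_steps + 1):
        try:
            ice_step(table, free_group)
        except RuntimeError:
            return step
    return None


leaky = steps_until_failure(capacity=1000, free_group=False, max_steps=5000)
clean = steps_until_failure(capacity=1000, free_group=True, max_steps=5000)
print(leaky, clean)
```

In this model the leaking variant loses one slot per call and fails only once the table is full, while the variant that frees the group never fails within the tested number of steps.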

comment:4 Changed 12 years ago by clevy

  • Owner changed from NEMO team to vancop

comment:5 Changed 10 years ago by clem

  • Resolution set to fixed
  • Status changed from new to closed

comment:6 Changed 7 years ago by nemo

  • Keywords LIM* added

comment:7 Changed 7 years ago by nemo

  • Keywords release-3.0 added

comment:8 Changed 2 years ago by nemo

  • Keywords r3.0 added; release-3.0 removed

comment:9 Changed 2 years ago by nemo

  • Keywords v3.0 added; r3.0 removed