% Source: CPL/oasis3-mct/branches/OASIS3-MCT_5.0_branch/doc/UG6_Compilation_running.tex (revision 6331, checked in by aclsce)
\newpage
%

\chapter{Compiling, running, debugging, load balancing}
\label{sec_compilationrunning}

\section{Compiling OASIS3-MCT}
\label{subsec_compile}

OASIS3-MCT is a mixed MPI-OpenMP parallel code. Compiling OASIS3-MCT libraries can be done
from the {\tt oasis3-MCT/util/make\_dir} directory
with the makefile {\tt TopMakefileOasis3}.

{\tt TopMakefileOasis3} includes the header file {\tt make.inc}
which should then point to (include) your own {\tt make.your\_platform}
file.  That file is specific to the hardware and compiling platform used.

Several header files are distributed with the release and can be used as
a template to create a custom file for your machine.
The root of the OASIS3-MCT tree can be anywhere, but it must be defined
by the variable {\tt COUPLE}.  Similarly, the variable {\tt ARCHDIR} defines
the location of the compilation directory.
Finally, the OASIS3-MCT library
should be compiled with the same compilers and system software as any
coupled model component using it.  After successful compilation, the resulting
libraries are found in the directory {\tt \$ARCHDIR/lib} while
object and module files are found in {\tt \$ARCHDIR/build-static} and {\tt \$ARCHDIR/build-shared}.
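
As an illustration, a minimal sketch of the two header files is given below. The paths are placeholders and only {\tt COUPLE} and {\tt ARCHDIR} are shown; a real platform file, such as the templates distributed with the release, also sets the compilers, flags and library paths.

\begin{verbatim}
# make.inc (sketch): include the platform-specific header file
include $(HOME)/oasis3-mct/util/make_dir/make.your_platform

# make.your_platform (sketch, hypothetical paths)
# Root of the OASIS3-MCT tree
COUPLE  = $(HOME)/oasis3-mct
# Compilation directory (libraries end up in $(ARCHDIR)/lib)
ARCHDIR = $(COUPLE)/build_my_platform
\end{verbatim}

Comments are kept on their own lines because, in a makefile, a comment placed after a value leaves trailing blanks in the variable.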

OASIS3-MCT has historically created static libraries for use in
Fortran source codes.  However, C language bindings are now
available, and Python codes are now fully supported. Therefore, the OASIS3-MCT makefile {\tt TopMakefileOasis3}
supports compilation of both static and shared libraries.

{\tt TopMakefileOasis3} has several targets including:

\begin{itemize}
\item  {\tt oasis3-psmile       =} static-libs-fortran (for backwards compatibility)
\item  {\tt static-libs-fortran =} static OASIS3-MCT libraries for Fortran only (default)
\item  {\tt shared-libs-fortran =} shared (dynamic) OASIS3-MCT libraries for Fortran only
\item  {\tt static-libs         =} static OASIS3-MCT libraries including Fortran and C bindings
\item  {\tt shared-libs         =} shared (dynamic) OASIS3-MCT libraries including Fortran and C bindings
\item  {\tt pyoasis             =} builds and installs shared-libs plus higher and intermediate Python classes
\item  {\tt realclean           =} cleans and resets the build
\end{itemize}

The names of the libraries produced
are {\it mct}, {\it mpeu}, {\it scrip}, {\it psmile.MPI1}, and {\it oasis.cbind}
with standard prefixes ({\it lib}) and suffixes ({\it .a} or {\it .so}).

The following commands have been used historically to compile
OASIS3-MCT for Fortran
codes and they are all still supported:

\begin{itemize}
\item {\tt make -f TopMakefileOasis3 help}

  provides a current list of available targets.

\item {\tt make -f TopMakefileOasis3 realclean}

  removes all OASIS3-MCT compiled sources and libraries.

\item {\tt make -f TopMakefileOasis3} or

      {\tt make -f TopMakefileOasis3 oasis3\_psmile}

  compiles static versions of OASIS3-MCT Fortran libraries {\it mct}, {\it mpeu},
  {\it scrip} and {\it psmile}.

\end{itemize}

Log and error messages from compilation are normally saved in the directory
{\tt /util/make\_dir} in the files
{\tt COMP.log} and {\tt COMP.err} or similar.  The {\tt TopMakefileOasis3}
output will direct users to the compile output files.

To interface a component code with OASIS3-MCT and use the module {\tt mod\_oasis} (see section \ref{subsubsec_Use}), the OASIS3-MCT modules in {\tt \$ARCHDIR/include} must be included at compilation and the appropriate libraries in {\tt \$ARCHDIR/lib} must be linked.
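
A possible makefile fragment for a Fortran component is sketched below, using the static library names listed above. The names {\tt mymodel.F90}, {\tt mymodel.exe}, {\tt F90} and {\tt NETCDF\_LIBS} are hypothetical and stand for whatever your build system defines; the library order shown (dependent libraries first) is an assumption that suits most static linkers.

\begin{verbatim}
# Sketch only: F90 and NETCDF_LIBS are assumed to be defined elsewhere
OASIS_INC  = -I$(ARCHDIR)/include
OASIS_LIBS = -L$(ARCHDIR)/lib -lpsmile.MPI1 -lscrip -lmct -lmpeu

# The recipe line below must start with a tab character
mymodel.exe: mymodel.F90
	$(F90) $(OASIS_INC) -o $@ $< $(OASIS_LIBS) $(NETCDF_LIBS)
\end{verbatim}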

Exchange of coupling fields in single and double precision is now supported directly through the interface
(see section \ref{subsubsec_Declaration}).  Single precision fields are converted to double precision fields internally and temporarily.
For double precision coupling fields, there is no need to promote REAL variables to DOUBLE PRECISION at compilation; this is done automatically within the OASIS3-MCT library.

\section{CPP keys}
\label{subsec_cpp}

The following OASIS3-MCT CPP keys can be specified in {\tt CPPDEF} in the {\tt make.{\it your\_platform}} file:
\begin{itemize}
\item {\tt TREAT\_OVERLAY}:  ensures, in {\tt SCRIPR/CONSERV}
  remapping (see section \ref{subsec_interp}), that if two cells of
  the source grid overlap and neither is masked a priori, the one with the greater numerical
  index is not considered (both can also be masked); this is mandatory
  for this remapping. For example, if the grid line with {\it i=1} overlaps
  the grid line with {\it i=imax}, it is the latter that must be masked;
  when this is not the case with the mask defined in {\it masks.nc},
  this CPP key forces these rules to be respected.

\item {\tt \_\_NO\_16BYTE\_REALS}: {\bf must} be specified if you compile
with {\bf PGF90}.
\end{itemize}
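
For instance, assuming the usual {\tt -D} prefix of the C preprocessor, both keys could be activated as sketched below; any keys already set in the template file you started from should be kept on the same line.

\begin{verbatim}
# make.your_platform (sketch): keys passed to the preprocessor
CPPDEF = -DTREAT_OVERLAY -D__NO_16BYTE_REALS
\end{verbatim}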

%\item {\tt balance}: Add a MPI\_Wtime() function before and after
%  mct\_isend (MPI put) and mct\_recv (MPI get) to calculate the time
%  of the send and receive of a coupling field. This option can be used
%  to produce timestamps in OASIS3-MCT debug files. During a post-processing
%  phase, this information can be used to perform an analysis of the
%  coupled components load (un)balance; specific tools have been
%  developed to do this and will be released with a further version of
%  OASIS3-MCT. {\bf This option is temporarily not recommended as it was observed that
%  it was increasing the simulation time of coupled models on
%  the PRACE computer MareNostrum.}

\section{Examples on how to run OASIS3-MCT}
\label{subsec_running}

The following examples of running environments are provided with the
sources in the {\tt oasis3-mct/examples} directory.

\subsection{tutorial\_communication}
\label{subsec_tutorial}

The directory  {\tt oasis3-mct/examples/tutorial\_communication}
contains the files of a tutorial to learn how to instrument codes
with calls to the OASIS3-MCT library in order to couple them
together. The tutorial involves two toy model codes, {\tt ocean.F90}
and {\tt atmos.F90}, to be instrumented with calls
to OASIS3-MCT API (Application Program Interface) routines. Toy models
are skeleton programs that do not contain any real physics or dynamics
but that can reproduce real exchanges of coupling
fields. Instrumenting those toy models gives practical experience of
using the OASIS3-MCT library. All information about this tutorial is
provided in the document {\tt tutorial\_communication.pdf} therein.

This tutorial is extracted from the Short Online Private Course (SPOC)
on ``Code Coupling with OASIS3-MCT'', briefly described in the next section.

\subsection{spoc}
\label{subsec_spoc}

This directory contains the sources used in the Short Online Private
Course (SPOC) on ``Code Coupling with OASIS3-MCT'' developed in the
framework of the ESiWACE Centre of Excellence. This SPOC is composed
of videos, quizzes and hands-on exercises. The goal is to instrument two toy
models to set up a real coupled model exchanging coupling fields (directory
{\tt /spoc\_communication}) and to learn more about OASIS3-MCT regridding
functionality (directory {\tt /spoc\_regridding}). If you are interested in
attending the SPOC, please visit the online training section of the
CERFACS web site at https://cerfacs.fr/online-training/.

Videos and quizzes extracted from the SPOC are also available
as Open Education Resources (OER) material at https://www.oercommons.org/courseware/lesson/85340 .

\subsection{regrid\_environment}
\label{subsec_regrid}

The {\tt regrid\_environment} directory offers a scripting environment to
calculate the regridding weights and the regridding error for specific
pairs of grids and specific regridding algorithms with either the
SCRIP library, ESMF or XIOS. The document {\tt
  regrid\_environment\_documentation.pdf} therein contains all
instructions on how to use this environment.

\subsection{Fortran, C and Python equivalent examples}
\label{subsec_equivalent}

Different examples implementing the different parts of the API with the Fortran, C and Python interfaces are provided as practical illustrations in directory {\tt /pyoasis/examples}:

\begin{itemize}

\item{{\tt 1-serial}}: one coupling exchange between a serial sender and a serial receiver.
\item{{\tt 2-apple}}: one coupling exchange between an Apple-parallel sender and a serial receiver; an additional component, not part of the coupling, is also started and the example shows how to use the {\tt commworld} argument, in Fortran and C, and the communicator optional argument when setting the component in Python.
\item{{\tt 3-box}}: one coupling exchange between a Box-parallel sender and a serial receiver; it also shows how to check if a coupling field declared in the code is activated in the configuration file {\it namcouple}.
\item{{\tt 4-orange}}: one coupling exchange between an Orange-parallel sender and a serial receiver; not all processes of the sender participate in the coupling and this example shows how to use {\tt create\_couplcomm}.
\item{{\tt 5-points}}: one coupling exchange between a Point-parallel sender and a serial receiver.
\item{{\tt 6-apple\_and\_orange}}: one coupling exchange between an Apple-parallel sender and an Orange-parallel receiver; not all processes of the sender participate in the coupling and this example shows how to use {\tt set\_couplcomm}.
\item{{\tt 7-multiple-puts}}: two coupling fields are both sent from a serial sender to two different serial receivers; this example also sets up an intra communicator between the sender and one receiver and an inter communicator between the sender and the other receiver.
\item{{\tt 8-interoperability/fortran\_and\_C}}: implements a coupling of a bundle field, with two bundle elements, between a Fortran Apple-parallel sender and a C component. This C component is Orange-parallel for the reception of the bundle field; it also defines another partition of type Box onto which a second coupling field is defined and sent to a third Fortran serial receiver. The sum of the Box partitions in the C component does not cover the global grid, hence the fourth argument {\tt ig\_size} is used to specify the grid global size. The C component also illustrates that the order of the partition definitions does not need to be the same for the different processes but that, in that case, a meaningful {\tt name} fifth argument must be used.
\item{{\tt 8-interoperability/fortran\_and\_python}}: implements the same coupling exchanges as {{\tt 8-interoperability/fortran\_and\_C}} but with the C component replaced by a Python component.
\item{{\tt 9-python\_fortran\_C-multi\_intracomm}}: illustrates the set-up of an intracommunicator between Fortran, C and Python components using OASIS3-MCT; a broadcast is then performed to share some data. In this example, an additional component is also launched at start but does not participate in the coupling and hence uses the {\tt coupled} third argument of {\tt oasis\_init\_comp}.
\item{{\tt 10-grid}}: a single Box-parallel component defines and writes two grids {\tt pyoa} and {\tt mono}, the first one with distributed calls from all the processes, the second one from the master process only.
\item{{\tt 11-test-interpolation}}: one exchange of a coupling bundle field defined on real grids involving a first-order conservative regridding between an Apple-parallel sender and a serial receiver.  In the Fortran and C examples, the grids are fixed, while in the Python example, the user chooses the source and target grids interactively, among those available in the files of the {\tt common\_data} directory. This example produces graphical output of the received fields if the required packages are installed, for example with the following commands:
\begin{itemize}
\item {\tt pip3 install matplotlib}
\item {\tt pip3 install scipy}
\item {\tt pip3 install cartopy}
\item {\tt pip3 uninstall shapely}
\item {\tt pip3 install shapely --no-binary shapely}
\end{itemize}
\item{\tt 12-grid-functions}: graphical version of {{\tt 10-grid}} (i.e. the {\tt pyoa} grid layout is displayed if the same graphical packages as for {{\tt 11-test-interpolation}} are installed).
\end{itemize}

The different examples can be launched with the {\tt Makefile} from
directory {\tt /pyoasis} using targets {\tt examples}, {\tt
  examples\_f} or
  {\tt examples\_c} to run the Python, Fortran and C examples, respectively.
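
As a sketch, and assuming that the {\tt pyoasis} directory sits directly under the root of the OASIS3-MCT tree and that the libraries have already been built, the three sets of examples could be run as follows:

\begin{verbatim}
cd oasis3-mct/pyoasis
make examples      # Python examples
make examples_f    # Fortran examples
make examples_c    # C examples
\end{verbatim}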

\section{Debugging}
\label{subsec_debug}

\subsection{Debug files}
If you experience problems while running your coupled model with
OASIS3-MCT, you can obtain more information on what is happening by
increasing the {\tt \$NLOGPRT} value in your {\it namcouple}, see section
\ref{subsec_namcouplefirst} for details.

\subsection{Time statistics files}
\label{timestat}

The variable TIMER\_Debug, defined in the {\it namcouple} (second
number on the line below the \$NLOGPRT keyword), is used to obtain time
statistics over all the processors for each routine.

Different outputs are written (in files named {\tt *.timers\_xxxx})
depending on the TIMER\_Debug value:

\begin{itemize}
\item {TIMER\_Debug=0}: nothing is calculated, nothing is written.
\item {TIMER\_Debug=1}: the times are calculated and written in a
  single file by process 0, as well as the min and the max times
  over all the processes.
\item {TIMER\_Debug=2}: the times are calculated and each process
  writes its own file; process 0 also writes the min and the max
  times over all the processes in its file.
\item {TIMER\_Debug=3}: the times are calculated and each process
  writes its own file; process 0 also writes in its file the min
  and the max times over all processes as well as
  all the results for each process.
\end{itemize}
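
For example, the following {\it namcouple} fragment is a sketch of a setting that makes each process write its own timer file (TIMER\_Debug=2). The first number, the debug level, is set to 1 here purely for illustration; see section \ref{subsec_namcouplefirst} for its meaning.

\begin{verbatim}
 $NLOGPRT
   1 2
\end{verbatim}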

The time given for each timer is calculated by the difference between
calls to {\tt oasis\_timer\_start()} and {\tt oasis\_timer\_stop()}
and is the accumulated time over the entire run. Here is an overview
of the meaning of the different timers as implemented by default.\footnote{Many other measures can be obtained by defining the logical
{\tt local\_timers\_on} as {\tt .true.} in different routines or by
implementing other timers. Of course, OASIS3-MCT and the model code then have to be recompiled.}

\begin{itemize}

\item {'total'}: total time of the simulation, implemented
  in {\tt mod\_oasis\_method} (i.e. between the end of {\tt
    oasis\_init\_comp} and the {\tt
    mpi\_finalize} in routine {\tt oasis\_terminate}).

\item {'init\_thru\_enddef'}: time between the end of {\tt
    oasis\_init\_comp} and the end of {\tt oasis\_enddef}, implemented
  in {\tt mod\_oasis\_method}.

\item {'part\_definition'}: time spent in routine {\tt oasis\_def\_partition}.

\item {'oasis\_enddef'}: time spent in
  routine {\tt oasis\_enddef}; this routine performs essentially all the
  important steps to initialize the coupling exchanges, e.g. the
  internal management of the partition and variable definition, the
  definition of the patterns of communication between the source and
  target processes, the reading of the remapping weight-and-address
  file and the initialisation of the sparse matrix vector multiplication.
\item {'grcv\_00x'}: time spent in the reception of field x in {\tt
    mct\_recv} (including communication and possible waiting time
  linked to load imbalance between components).
\item {'wout\_00x'}: time spent in the I/O for field x in routine
  {\tt oasis\_advance\_run}.
\item {'gcpy\_00x'}: time spent in routine {\tt oasis\_advance\_run}
  in copying the field x just received into a local array.
\item {'pcpy\_00x'}: time spent in routine {\tt oasis\_advance\_run}
  in copying the local field x into the array to send (i.e. including the local
  transformation, except the division for averaging).
\item {'pavg\_00x'}: time spent in routine {\tt oasis\_advance\_run}
  to calculate the average of field x (if done).
\item {'pmap\_00x'/'gmap\_00x'}: time spent in routine {\tt
    oasis\_advance\_run} for the matrix vector multiplication for
  field x on the source/target processes.
\item {'psnd\_00x'}: time spent in routine {\tt oasis\_advance\_run}
  for sending field x (i.e. including call to {\tt mct\_waitsend} and
  {\tt mct\_isend}).
\item {'wtrn\_00x'}: time spent in routine {\tt oasis\_advance\_run}
  to write fields associated with non-instant loctrans operations to
  restart files (see section \ref{subsec_restartdata} for details).
\item {'wrst\_00x'}: time spent in routine {\tt oasis\_advance\_run}
  to write fields to
  restart files (see section \ref{subsec_restartdata} for details).
\end{itemize}

\section{Load balancing analysis of coupled model components}
\label{lucia}

  An efficient use of the allocated computing resources in a coupled system requires the harmonisation of the component execution speed. This operation, called load balancing, is often neglected, either because of the apparent resource abundance or practical difficulties.
 To facilitate this work, a load balancing analysis functionality is included in OASIS3-MCT and can be activated by setting the third number under {\tt \$NLOGPRT} in the {\it namcouple} configuration file to 1 (see section \ref{subsec_namcouplefirst}). Some details on this functionality are provided here and more information can be found in the {\tt balancing\_documentation.pdf} file in the {\tt
  oasis3-mct/util/load\_balancing} directory.
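
As a sketch, the load balancing analysis could be switched on with a {\it namcouple} fragment such as the one below, where only the third number matters for this functionality; the first two values (debug level and TIMER\_Debug) are arbitrary illustrative choices.

\begin{verbatim}
 $NLOGPRT
   1 0 1
\end{verbatim}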

  When activated, the load balancing analysis functionality outputs the full timeline of all OASIS3-MCT related events, for any of the allocated resources. This timeline is saved in one NetCDF file per coupled component, {\tt timeline\_XXX\_component.nc} where {\tt XXX} is the component name. It provides the comprehensive sequence of all operations related to the coupling (field send and receive through MPI, field output on disk, field interpolation and mapping, field reading on disk, restart writing, initialisation and termination phase of the OASIS3-MCT setup) so that any simulation slowdown linked to the use of the OASIS3-MCT library can be identified.

 The analysis of the coupling field exchanges, amongst all coupling events, not only identifies the waste of resources by components which are recurrently waiting for their coupling fields but also reveals other bottlenecks such as disk access or model internal load imbalance. The full picture of these events makes optimal load balancing possible, even for the most complex configurations.

 In addition to the detailed timeline saved in the NetCDF file, more general computing information (simulation time, speed, waiting time, etc.) is also provided in a text file {\tt load\_balancing\_info.txt}  for the coupled model and for each component. In simple cases, this global information can help to allocate resources in a balanced way.