Changes between Initial Version and Version 1 of Doc/ComputingCenters/TGCC/IreneAmd


Timestamp:
02/25/20 12:05:20 (4 years ago)
Author:
aclsce
{{{
#!html
<h1>Working on the Irene-amd machine</h1>
}}}
----
[[PageOutline(1-3,Table of contents,,numbered)]]

# Introduction #
 * Online user manual: https://www-tgcc.ccc.cea.fr/docs/irene.info.pdf (you will need a TGCC login and password)
  * Partition name: Rome
  * CPUs: 2x64 AMD Rome@2.6GHz (AVX2)
  * Cores/Node: 128
  * Nodes: 2292
  * Total cores: 293376
  * RAM/Node: 256GB
  * RAM/Core: 2GB
 * When submitting a job through ccc_msub or ccc_mprun, you must specify -m work, -m scratch, -m store, or a combination such as -m work,scratch. The advantage of this constraint is that your jobs will not be suspended if a file system they do not need becomes unavailable. All jobs in libIGCM do this.
 * The default time limit for a job submission is 2 hours (7200 s).
 * Irene post-processing nodes: the xlarge nodes are free and useful for post-processing operations.
  * Fat nodes for computations requiring a lot of shared memory (from irene.info)
  * Partition name: ''xlarge''
  * CPUs: 4x28-core Intel Skylake@2.1GHz
  * GPUs: 1x Nvidia Pascal P100
  * !Cores/Node: 112
  * Nodes: 5
  * Total cores: 560
  * RAM/Node: 3TB
  * RAM/Core: 5.3GB
  * IO: 2 HDDs of 1 TB + 1 SSD of 1600 GB (NVMe)
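
The -m constraint described in the list above can be checked before a job is submitted. The following is a minimal sketch, not part of libIGCM or of the TGCC tools: the function name valid_spaces is hypothetical, and the command is only printed, not executed.

```shell
# Hypothetical helper: succeed only if every comma-separated item
# is one of store, work or scratch.
valid_spaces() {
    # Reject an empty list and leading, trailing or doubled commas.
    case "$1" in
        ""|,*|*,|*,,*) return 1 ;;
    esac
    # Split on commas in a subshell so the IFS change does not leak.
    (
        IFS=','
        set -f
        for item in $1; do
            case "$item" in
                store|work|scratch) ;;
                *) exit 1 ;;
            esac
        done
        exit 0
    )
}

# Print the submission command instead of running it.
spaces="work,scratch"
if valid_spaces "$spaces"; then
    echo "ccc_msub -m $spaces job"
fi
```

A misspelled space name such as work,tmp makes the function fail, so the mistake is caught before the job reaches the scheduler.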

# Job manager commands #
 * {{{ccc_msub job}}} -> submit a job
 * {{{ccc_mdel ID}}} -> kill the job with the specified ID number
 * {{{ccc_mstat -u login}}} -> display all jobs submitted by login; add {{{-f}}} to see the full job names
 * {{{ccc_mpp}}} -> display all jobs submitted on the machine; use {{{ccc_mpp -n}}} to avoid colors
 * {{{ccc_mpp -u $(whoami)}}} -> display your jobs
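
If you use the last command often, you can wrap it in a small shell function. This is only a sketch: my_jobs is a hypothetical name, and the guard merely prints a message when ccc_mpp is not on the PATH (for example when the script is run outside TGCC).

```shell
# Hypothetical wrapper: list your own jobs, or fail cleanly when the
# TGCC tools are not available in the current environment.
my_jobs() {
    if command -v ccc_mpp >/dev/null 2>&1; then
        ccc_mpp -u "$(whoami)"
    else
        echo "ccc_mpp not found: are you logged in on Irene?" >&2
        return 1
    fi
}
```

Calling my_jobs on an Irene login node is then equivalent to typing the full ccc_mpp command.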

# Suggested environment #

## General environment ##

Before working on Irene you need to prepare your environment. Do this before compiling, so that you compile with the same modules as the libIGCM running environment uses. We provide two files that you can copy from the igcmg home. The first one, '''bashrc''', sources the second one, '''bashrc_irene-amd'''. Copy both files to your home and rename them by adding a dot as prefix. You can add personal settings in your .bashrc_irene-amd. Proceed as follows:
{{{
cp ~igcmg/MachineEnvironment/irene/bashrc ~/.bashrc
cp ~igcmg/MachineEnvironment/irene/bashrc_irene-amd ~/.bashrc_irene-amd
}}}
The .bashrc will source your own .bashrc_irene-amd, which must be in your home.
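
The cp commands above overwrite any existing ~/.bashrc. If you already have personal settings there, you may prefer to keep a backup first. The sketch below assumes a hypothetical helper named install_rc; it is not provided by igcmg.

```shell
# Hypothetical helper: copy SRC to DEST, first saving any existing DEST
# as DEST.bak so that personal settings are not lost.
install_rc() {
    src="$1"
    dest="$2"
    if [ ! -f "$src" ]; then
        echo "missing source file: $src" >&2
        return 1
    fi
    if [ -f "$dest" ]; then
        cp "$dest" "$dest.bak"
    fi
    cp "$src" "$dest"
}

# Usage, with the same paths as in the commands above:
#   install_rc ~igcmg/MachineEnvironment/irene/bashrc ~/.bashrc
#   install_rc ~igcmg/MachineEnvironment/irene/bashrc_irene-amd ~/.bashrc_irene-amd
```

Any previous file is then still available as ~/.bashrc.bak, and you can move your personal settings from it into .bashrc_irene-amd.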

After reconnecting or sourcing .bashrc_irene-amd, check that the intel, netcdf, mpi and hdf5 modules needed for compilation are loaded:
{{{
module list
Currently Loaded Modulefiles:
 1) ccc/1.0(default)                6) feature/mkl/lp64         11) c/intel/19.0.5.281(default)        16) feature/openmpi/mpi_compiler/intel(default)  21) sharp/2.0(default)          26) netcdf-c/4.6.0(default)
 2) datadir/drf                     7) feature/mkl/sequential   12) c++/intel/19.0.5.281(default)      17) flavor/openmpi/standard(default)             22) hcoll/4.4.2938(default)     27) netcdf-fortran/4.4.4(default)
 3) datadir/own(default)            8) feature/mkl/single_node  13) fortran/intel/19.0.5.281(default)  18) feature/openmpi/net/auto(default)            23) ucx/1.7.0(default)          28) hdf5/1.8.20(default)
 4) dfldatadir/own(default)         9) mkl/19.0.5.281           14) intel/19.0.5.281(default)          19) .tuning/openmpi/4.0.2                        24) mpi/openmpi/4.0.2(default)  29) feature/bridge/heterogenous_mpmd
 5) flavor/buildcompiler/intel/19  10) licsrv/intel             15) flavor/buildmpi/openmpi/4.0        20) hwloc/2.0.4                                  25) flavor/hdf5/parallel
}}}
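
The check can also be scripted. The sketch below is an illustration only: missing_modules is a hypothetical helper that reads a module listing on standard input and prints each of the four components (intel, netcdf, mpi, hdf5) for which no module name starts with that word. It assumes the listing format shown above.

```shell
# Hypothetical helper: read `module list` output on stdin and print the
# name of each required component that has no matching module entry.
missing_modules() {
    listing=$(cat)
    for name in intel netcdf mpi hdf5; do
        # Match a module whose name starts with the component,
        # e.g. "intel/19...", "netcdf-c/4.6.0" or "mpi/openmpi/4.0.2".
        if ! printf '%s\n' "$listing" | grep -Eq "(^|[[:space:]])${name}[-/]"; then
            echo "$name"
        fi
    done
}

# Usage (on many installations `module list` writes to standard error,
# hence the redirection; this is an assumption about your setup):
#   module list 2>&1 | missing_modules
```

An empty output means that all four components were found.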

The modules are specified in the file ~igcmg/MachineEnvironment/irene-amd/env_irene-amd, which is sourced in bashrc_irene-amd. The same file env_irene-amd is sourced in libIGCM.

--> Note that this environment may be updated in the coming weeks according to TGCC recommendations.


[[NoteBox(note, Create a ~/.forward file in your main home containing a single line with your email address to receive emails from libIGCM. , 600px)]]
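
As a minimal sketch, the forward file can be created with a one-line function. The name write_forward and the address used in the usage comment are placeholders, not values from this page.

```shell
# Hypothetical helper: write a one-line forward file.
# $1 = email address, $2 = target file (defaults to ~/.forward).
write_forward() {
    printf '%s\n' "$1" > "${2:-$HOME/.forward}"
}

# Usage (replace the placeholder address with your own):
#   write_forward first.last@example.org
```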

## Subversion version ##

Since only recent Subversion versions (i.e. > 1.6) are provided on the Irene supercomputer, some familiar functionalities are no longer available (e.g. svn commands on a copy of subdirectories). To keep these functionalities, Subversion 1.6.9 has been installed. To use this version:
{{{
module unload subversion
module use ~igcmg/Modules/tools
module load subversion/1.6.9
irene190 : svn --version

svn, version 1.6.9 (r901367)

  compiled Oct 31 2018, 11:12:49
}}}

This version is loaded in the default environment ~igcmg/MachineEnvironment/irene-amd/bashrc_irene-amd.
Be consistent in the Subversion version you use: run svn commands in the directories of your model/configuration with the same version that you used to extract it.
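
To check which version is active before running svn commands in an existing working copy, you can parse the banner printed by svn --version. The two functions below are hypothetical helpers written for this page; they assume the banner format shown in the listing above.

```shell
# Hypothetical helper: read `svn --version` output on stdin and print
# the version number found in the "svn, version X.Y.Z" banner line.
svn_version_from_banner() {
    sed -n 's/^svn, version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if the given version is a 1.6.x release.
is_svn_1_6() {
    case "$1" in
        1.6|1.6.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_svn_1_6 "$(svn --version | svn_version_from_banner)" || echo "not svn 1.6"
```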

# How the storage project is set by libIGCM #
When you use libIGCM, it is recommended to dedicate each modipsl/libIGCM to a single project allocation. By default, the output folders IGCM_OUT are created in the directories $CCCSCRATCHDIR, $CCCWORKDIR and $CCCSTOREDIR corresponding to the project used in the main job. It is important that the same project is used in the post-processing jobs in libIGCM.

[[NoteBox(warn,For the gencmip6 project\, you have to set a subproject for computing; gencmip6 is forced for all directories., 600px)]]

If you need to use a different project for storage than for computing, you can set the variable '''!DataProject''' in the !UserChoices section of config.card, for example !DataProject=gen6328; read more [wiki:DocEsetup#TheUserChoicessection here]. This project will be used for all output directories of the computing job and the post-processing jobs, even if their headers specify another project for computing. The variable '''!DataProject''' can also be used if you work with different project allocations in the same modipsl. The only (harmless) exception is the first RUN_DIR folder, which is always created in the $CCCSCRATCHDIR corresponding to the dfldatadir loaded when the main job is submitted. When the job resubmits itself, the RUN_DIR will be in the same project space as the rest of the output.
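
To see which storage project a given config.card selects, you can read the variable back from the file. The sketch below is not a libIGCM tool: data_project is a hypothetical name, and it assumes that the section is introduced by a line starting with [UserChoices] and that the variable is written as DataProject=value on its own line.

```shell
# Hypothetical helper: print the DataProject value found in the
# [UserChoices] section of the given config.card (nothing if unset).
data_project() {
    awk '
        # Track whether the current section is [UserChoices].
        /^\[/ { in_section = ($0 ~ /^\[UserChoices\]/) }
        in_section && /^[ \t]*DataProject[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop the name and the equals sign
            sub(/[ \t]+$/, "")         # drop trailing blanks
            print
            exit
        }
    ' "$1"
}

# Usage:
#   data_project config.card
```

An empty result means that DataProject is not set, in which case the default behaviour described above applies.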

# Example of ins_job #
{{{
> ./ins_job
...
Wait for the next question ...
Hit Enter or give project ID (default is gen0000), possible projects are gen1111 gen2222 ... or other xxxcmip6 : aaacmip6  bbbcmip6
gen0000 (RETURN)
ProjectID is gen0000 at Irene
Hit Enter or give TYPE OF NODE required for post-processing (default is "xlarge"), possible types of nodes are "skylake" or "xlarge" :
(RETURN)
ProjectNode for post-processing is xlarge at Irene
Hit Enter or give NUMBER OF CORES required for post-processing (default is "8")
possible numbers of cores are "1" to "112" for xlarge :
(RETURN)
ProjectCore for post-processing is 8
Wait for the next question ...
Hit Enter or give project ID (default is gen0000), possible projects are gen1111 gen2222 ... or other xxxcmip6 : aaacmip6  bbbcmip6
(RETURN)
PostID is gen0000 at Irene on xlarge for post-processing
}}}

# File systems used on Irene #
A figure illustrating the Irene file systems is available [wiki:Doc/ComputingCenters/TGCC#TGCCsmachinesandfilesystems here].

# Irene job headers #
Here is an example of a job header as generated by libIGCM on the Irene-amd machine:

{{{
######################
## IRENE   TGCC/CEA ##
######################
#MSUB -r MY-SIMULATION
#MSUB -o Script_Output_MY-SIMULATION.000001
#MSUB -e Script_Output_MY-SIMULATION.000001
#MSUB -eo
#MSUB -n 976
#MSUB -x
#MSUB -T 86400
#MSUB -A dekcmip6
#MSUB -q rome
#MSUB -m store,work,scratch
}}}

The options are detailed below:

|| '''Control''' || '''Keyword''' || '''Argument''' || '''Example''' || '''Comments''' ||
|| ''Job name'' || {{{-r}}} || string || {{{#MSUB -r MY-SIMULATION}}} || The string should not contain the underscore (_) character. ||
|| ''Standard output file name'' || {{{-o}}} || string || {{{#MSUB -o Script_Output_MY-SIMULATION.000001}}} || ||
|| ''Error output file name'' || {{{-e}}} || string || {{{#MSUB -e Script_Output_MY-SIMULATION.000001}}} || If the {{{-o}}} and {{{-e}}} names are the same, the two outputs are merged. The {{{-eo}}} option in the example is therefore redundant. ||
|| ''Number of MPI tasks allocated'' || {{{-n}}} || integer || {{{#MSUB -n 976}}} || This is for the SPMD case; in the MPMD case, the integer is the number of cores. ||
|| ''Node exclusivity'' || {{{-x}}} || none || {{{#MSUB -x}}} || ||
|| ''Wall-time (maximum time allowed for execution)'' || {{{-T}}} || integer || {{{#MSUB -T 86400}}} || Time is expressed in seconds. ||
|| ''Project allocation'' || {{{-A}}} || string || {{{#MSUB -A dekcmip6}}} || ||
|| ''Partition used'' || {{{-q}}} || string || {{{#MSUB -q rome}}} || Use {{{rome}}} on Irene-amd. ||
|| ''Visible spaces'' || {{{-m}}} || comma-separated string(s) || {{{#MSUB -m store,work,scratch}}} || Specific to TGCC. The job can only "see" the spaces specified on this line. Possible spaces are {{{store}}}, {{{work}}} and {{{scratch}}}, alone or combined. ||
|| ''Job priority'' || {{{-U}}} || string || {{{#MSUB -U high}}} || Possible arguments are {{{high}}}, {{{medium}}} or {{{low}}}. (This option does not appear by default in jobs generated by libIGCM.) ||
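
The values given to -n and -T are easy to get wrong by hand. The sketch below shows the arithmetic behind the example header, assuming one core per MPI task and the 128 cores per node of the Rome partition. The function names are hypothetical and nothing here is generated by libIGCM.

```shell
CORES_PER_NODE=128   # Rome partition, from the introduction

# Number of whole nodes needed for a given number of cores (rounded up).
nodes_for_cores() {
    echo $(( ($1 + CORES_PER_NODE - 1) / CORES_PER_NODE ))
}

# Convert a wall-time in hours to the number of seconds expected by -T.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

nodes_for_cores 976    # prints 8
hours_to_seconds 24    # prints 86400
```

With -x (node exclusivity), the 976 tasks of the example therefore occupy 8 full nodes, and -T 86400 corresponds to 24 hours.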