{{{ #!html

Working on Ada

}}}
----
[[PageOutline(1-3,Table of contents,,numbered)]]

# IDRIS users' manual #
 * See: http://www.idris.fr/eng/ada/ for Ada: computing server
 * See: http://www.idris.fr/eng/adapp/ for Adapp: pre- and post-treatment
 * See: http://www.idris.fr/eng/ergon/ for Ergon: file server

# Commands to manage jobs on ada #
 * A job's time limit is expressed in real (elapsed) time, but usage is counted per processor: for example, 1 hour on 32 processors counts as 32 hours. Be careful not to request too much time on 1 processor.
 * llsubmit --> submit a job
 * llcancel --> cancel a job
 * llq -u ''login'' --> list all queued or running jobs for the login ''login''
 * Trick: parameterize the llq display to see the job names
{{{
llq -u $(whoami) -f %jn %id %st %c %dq %h -W
}}}
 * Post-mortem: idrjar, or idrjar -l -j #jobid#, to obtain detailed information: memory, real time, efficiency, ...
 * Example of idrjar output:
{{{
ada > idrjar
|----------------------------------------------|
|---  IDRIS/CNRS. Version du 18 mars 2015   ---|
|----------------------------------------------|
Sorties concernant l'identifiant rpslxxx pour la période du ==> 01 juin 2013 au 19 juin 2013

Owner   Job Name    JobId           Queue tEse   tCpu  #T     (%) S
------- ----------- --------------- ----- ---- ------ --- ------- -
rpslxxx ADA337      ada338.290170.0 c32t2  133   1232  32   28.95 C
rpslxxx ADA337      ada338.290333.0 c32t2 5425 165141  32   95.13 C
rpslxxx PACKDEBUG   ada338.290610.0 t2      11      2   1   18.18 C
rpslxxx ADA337      ada338.290438.0 c32t2 5471 166878  32   95.32 C
rpslxxx PACKRESTART ada338.290611.0 t2     182     25   1   13.74 C
rpslxxx REBUILDWRK  ada338.290612.0 t2    1577    503   1   31.90 C
rpslxxx PACKOUTPUT  ada338.290730.0 t2     114     43   1   37.72 C
}}}

# Example of a job to start an executable in MPI #
Here is an example of a simple job that starts the executable orchidee_ol (the gcm.e line is commented out). The input files and the executable must be in the job's working directory before the executable is started.
{{{
#!/bin/ksh
# ######################
# ##  ADA     IDRIS   ##
# ######################
# Job name
# @ job_name = test
# Job type
# @ job_type = parallel
# Standard output file
# @ output = Script_Output_test.$(jobid)
# Error output file (the same)
# @ error = Script_Output_test.$(jobid)
# Number of requested processes
# @ total_tasks = 8
# Maximum elapsed (wall-clock) time hh:mm:ss
# @ wall_clock_limit = 1:00:00
# Number of OpenMP/pthreads threads per MPI process (commented out here)
### @ parallel_threads = 4
# End of header
# @ queue

poe ./orchidee_ol
#poe ./gcm.e
}}}

# Information on Ergon files from Adapp #
Ergon files are visible from Adapp. Use $ARCHIVE to reach Ergon files on Adapp. $ARCHIVE is /arch/home/rech/lab/plabxxx on Adapp. All Unix commands are available on Adapp to provide information on Ergon files.

# Job Header for MPI - MPI/OMP with libIGCM #
## Forced model ##
### MPI ###
To launch a job on XXX MPI tasks:
{{{
#!/bin/ksh
# ######################
# ##  ADA     IDRIS   ##
# ######################
# Job name
# @ job_name = MyJob
# Job type
# @ job_type = parallel
# Standard output file name
# @ output = Script_Output_MyJob.000001
# Error output file name
# @ error = Script_Output_MyJob.000001
# Total number of tasks
# @ total_tasks = XXX
# @ environment = "BATCH_NUM_PROC_TOT=XXX"
# Maximum elapsed (wall-clock) time hh:mm:ss
# @ wall_clock_limit = 1:00:00
# End of the header options
# @ queue
}}}

### hybrid MPI-OMP ###
[[NoteBox(note,Hybrid versions are only available with _v6 configurations, 600px)]]
To launch a job on XXX MPI tasks and YYY OpenMP threads on each task:
 * first you need to modify your config.card
{{{
ATM= (gcm.e, lmdz.x, XXXMPI, YYYOMP)
}}}
 * second you need to modify your job header
{{{
#!/bin/ksh
# ######################
# ##  ADA     IDRIS   ##
# ######################
# Job name
# @ job_name = MyJob
# Job type
# @ job_type = parallel
# Standard output file name
# @ output = Script_Output_MyJob.000001
# Error output file name
# @ error = Script_Output_MyJob.000001
# Total number of tasks
# @ total_tasks = XXX
# @ environment = "BATCH_NUM_PROC_TOT=XXX*YYY"
# Maximum elapsed (wall-clock) time hh:mm:ss
# @ wall_clock_limit = 1:00:00
# Specific option for OpenMP parallelization: Number of OpenMP threads per MPI task
# @ parallel_threads = YYY
# End of the header options
# @ queue
}}}

## Coupled model ##
### MPI ###
To launch a job on XXX (32) MPI tasks. By default for IPSLCM5A: 5 MPI tasks for NEMO, 1 for OASIS and 26 for LMDZ.
{{{
#!/bin/ksh
# ######################
# ##  ADA     IDRIS   ##
# ######################
# Job name
# @ job_name = MyCoupledJob
# Job type
# @ job_type = parallel
# Standard output file name
# @ output = Script_Output_MyCoupledJob.000001
# Error output file name
# @ error = Script_Output_MyCoupledJob.000001
# Total number of tasks
# @ total_tasks = 32
# @ environment = "BATCH_NUM_PROC_TOT=32"
# Maximum elapsed (wall-clock) time hh:mm:ss
# @ wall_clock_limit = 1:00:00
# End of the header options
# @ queue
}}}

### hybrid MPI-OMP ###
[[NoteBox(note,Hybrid versions are only available with _v6 configurations, 600px)]]
To launch a job on XXX (24) MPI tasks and YYY (2) OpenMP threads for LMDZ, ZZZ (7) MPI tasks for NEMO and SSS (1) XIOS servers:
 * first you need to modify your config.card.
   On Ada, this works for IPSLCM6_rc0 (IPSLCM6A_VLR):
{{{
ATM= (gcm.e, lmdz.x, 24MPI, 2OMP)
SRF= ("" ,"" )
SBG= ("" ,"" )
OCE= (opa, opa.xx , 7MPI)
ICE= ("" ,"" )
MBG= ("" ,"" )
CPL= ("", "" )
IOS= (xios_server.exe, xios.x, 1MPI)
}}}
 * second you need to modify your job header
{{{
#!/bin/ksh
# ######################
# ##  ADA     IDRIS   ##
# ######################
# Job name
# @ job_name = MyCoupledJob
# Job type
# @ job_type = parallel
# Standard output file name
# @ output = Script_Output_MyCoupledJob.000001
# Error output file name
# @ error = Script_Output_MyCoupledJob.000001
# Total number of tasks
# @ total_tasks = 32
# @ environment = "BATCH_NUM_PROC_TOT=56"
# Maximum elapsed (wall-clock) time hh:mm:ss
# @ wall_clock_limit = 1:00:00
# Specific option for OpenMP parallelization: Number of OpenMP threads per MPI task
# @ parallel_threads = 2
# End of the header options
# @ queue
}}}

# Specificities libIGCM on Ada #
At IDRIS and for Ada, output files are 'packed' using libIGCM_v2, i.e. they are grouped by periods (in general 1 year) using `tar`, or `ncrcat` for NetCDF output files. [[BR]]
This option implies that files must be temporarily stored on the $WORKDIR space, which means that a large amount of storage is needed (at least 20 TB).[[BR]]
The diagram below details all jobs, including `pack_debug`, `pack_restart` and `pack_output`, as well as the directories those jobs use. Note that the files are temporarily stored in the $WORKDIR/IGCM_OUT directories before being grouped and sent to Ergon in the IGCM_OUT directories.[[BR]]

[[Image(libIGCM_pack.jpg, 50%)]]

You will obtain annual output files with 12 monthly values in the Output/MO directory if you set `PeriodLength=1M` and `PackFrequency=1Y` in `config.card`. This is the default grouping period of most configurations, but you can of course change it. [[BR]]

What you must remember:
 * The tool [wiki:DocGmonitor#RunChecker RunChecker.job] is meant to help you monitor your simulations.
   It gives a summary view of the status of the different post-processing jobs.
 * The tool [wiki:DocGmonitor#Unknownerror clean_latestPackperiod.job] is meant to help you clean up back to the last successfully computed pack period.
 * If you detect anomalies and must rerun part of the simulation, you will have to produce new complete pack periods (e.g. filling a gap by running 1 month of simulation is out of the question).
 * The restart files are stored and grouped on Ergon in the directory IGCM_OUT/.../RESTART
 * The different output text files are stored and grouped on Ergon in the directory IGCM_OUT/.../DEBUG
 * The listings of the pack jobs' outputs stay on Ada in the directory $WORKDIR/IGCM_OUT/.../Out
 * If you set the `SpaceName=TEST` parameter in `config.card`, the pack jobs will not be started and your simulation will be stored in the $WORKDIR/IGCM_OUT directory. This can be very useful for short tests.

To learn more about this section, you can read the documentation on [wiki:DocFsimu Simulation and post-processing] and on [wiki:DocGmonitor Monitor, debug and relaunching].[[BR]]
Finally, in case of panic, visit us or send your questions to the platform-users list.

# Specificities for Adapp #
 * Adapp is dedicated to pre- and post-treatment.
 * Note that Ergon files are visible in read-only mode through $ARCHIVE.
 * You can use idrls to check the status of a file stored on Ergon. See {{{idrls -?}}}. In the first column (M), m means the file has been migrated to tape only and - means it is on disk.
{{{
cd $ARCHIVE
idrls IGCM/RESTART/IPSLCM6/DEVT/piControl/O1T03V14/*/Restart/*
M ACCESS     L USER     GROUP SIZE         MOD_DATE   ACC_DATE   EXP_DATE   FILE_NAME
= ========== = ======== ===== ============ ========== ========== ========== =========
- -rwxrwxr-x 1 rpslxxx  psl      218188352 09.06.2015 22.01.2016 22.01.2017 IGCM/RESTART/IPSLCM6/DEVT/piControl/O1T03V14/ICE/Restart/O1T03V14_18891231_restart_icemod.nc
m -rwxrwxr-x 1 rpslxxx  psl     1411362796 09.06.2015 22.01.2016 22.01.2017 IGCM/RESTART/IPSLCM6/DEVT/piControl/O1T03V14/OCE/Restart/O1T03V14_18891231_restart.nc
}}}
 * Make extensive use of Adapp for analyses and interactive work.
 * Adapp is free of charge.

## IDRIS users' manual for adapp ##
 * See: http://www.idris.fr/eng/adapp/ for Adapp: pre- and post-treatment

## Header for adapp job ##
A post-treatment job includes these header lines:
{{{
# @ job_type = serial
# @ requirements = (Feature == "prepost")
}}}
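
As a rough illustration, the two header lines for an Adapp job can be placed in a complete job script. The sketch below is not an official IDRIS or libIGCM template: the job name `MyPostJob`, the output file names, the time limit and the commands after the header are placeholders chosen for this example, and the `${ARCHIVE:-.}` fallback is only there so that the body can be tried outside IDRIS.

```shell
#!/bin/ksh
# Sketch of a serial post-treatment job for Adapp.
# MyPostJob, the output file names and the time limit are placeholders.
# @ job_name = MyPostJob
# @ job_type = serial
# @ requirements = (Feature == "prepost")
# @ output = Script_Output_MyPostJob.$(jobid)
# @ error = Script_Output_MyPostJob.$(jobid)
# Maximum elapsed time hh:mm:ss
# @ wall_clock_limit = 1:00:00
# @ queue

# Ergon files are visible read-only through $ARCHIVE on Adapp.
# Fall back to the current directory when ARCHIVE is not set
# (for example when trying this script outside IDRIS).
archive_root=${ARCHIVE:-.}

# Count the entries at the top of the archive space, as a trivial
# stand-in for real post-treatment commands.
n_entries=$(ls "$archive_root" | wc -l | tr -d ' ')
echo "Entries visible under $archive_root: $n_entries"
```

Such a script would be submitted with llsubmit like any other job. Since $ARCHIVE is read-only, real post-treatment commands should write their results elsewhere.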