Changes between Version 9 and Version 10 of Documentation/UserGuide/DifferencesNetcdf


Timestamp:
2020-02-28T17:33:39+01:00 (4 years ago)
Author:
pmaugis
Comment:

--

  • Documentation/UserGuide/DifferencesNetcdf

    v9 v10  
    1 = How to check whether two (netcdf) files are identical = 
     1== How to check whether two (netcdf) files are identical == 
     2Author: S.Luyssaert and A.S. Lansø 
     3 
     4Last revised: 2020/02/28, P. Maugis 
     5 
    26 
    37=== cdo diffv === 
    48 
    5 Rather than comparing plots, its faster and more precise to compare whether two netcdf files (i.e. a history or restart file between 2 model versions) are numerically identical. The follow command works on asterix and obelix 
    6  
     9Rather than comparing plots, it's faster and more precise to compare whether two netcdf files (i.e. a history or restart file between 2 model versions) are numerically identical.  
     10If available (i.e. on obelix), you can use the following command: 
    711{{{  
    812cdo diffv   path_file_1   path_file_2 > output_file_name.txt 
    913}}} 
     14The attached script [https://forge.ipsl.jussieu.fr/orchidee/attachment/wiki/Documentation/UserGuide/restartability/differr100.sh differ100.sh] by Josefine Ghattas also does this nicely. 
     15 
     16The comparison is easier if the same variables are contained in the two netcdf files in the same order.  
     17However, 5-dimensional variables are ignored by the cdo diffv command, so not all variables in the restart files can be compared by this method. 
     18 
    1019ADVANTAGE: the output file tells you which fields are different. Be aware, though, that this method works best for smaller netCDF files. If your history file is more than a few megabytes, the output text file may be many hundreds of megabytes. In that case, the md5sum command may be a better option. 
    1120 
    12 DISADVANTAGE: only works for netcdf files  
     21DISADVANTAGE: only works for netcdf files, and only for variables of rank lower than 4. 
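
One way to keep the report from growing to hundreds of megabytes is to keep only its first lines. The sketch below is an illustration under assumptions, not part of the original page: the function name `diffv_head` and the default limit of 1000 lines are hypothetical, and it relies on `cdo diffv` writing its report to standard output, as in the command above.

```shell
#!/usr/bin/env bash
# diffv_head FILE1 FILE2 OUT [MAXLINES]
# Hypothetical wrapper: runs `cdo diffv` and keeps only the first MAXLINES
# lines of the report (default 1000, an arbitrary choice).
diffv_head() {
    local max=${4:-1000}
    # cdo is not installed everywhere, so check before calling it.
    if ! command -v cdo >/dev/null 2>&1; then
        echo "cdo not found on this machine" >&2
        return 127
    fi
    # head stops reading after $max lines, which bounds the report size.
    cdo diffv "$1" "$2" | head -n "$max" > "$3"
}

# Usage (placeholders):
# diffv_head path_file_1 path_file_2 output_file_name.txt 500
```

A truncated report only shows the first differing fields, so it is suited to a quick check rather than a complete inventory of differences.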
    1322 
    1423=== md5sum === 
    1524 
    16 If you expect that the files are identical (bit by bit) you can use 
     25If you expect the files to be identical (bit by bit), you can use 
    1726{{{ 
    18 md5sum path_file 
     27#! bash 
     28md5sum < path_file1 > sum1 
     29md5sum < path_file2 > sum2 
     30cmp -s sum1 sum2 
    1931}}} 
    20 as a result you will get a code. Run the same command on the second file and only when the code is identical for both files, the files are exactly the same. 
     32The first two commands write a signature string for each file to the files 'sum1' and 'sum2' (which are created or overwritten). Reading each file through '<' keeps its name out of the signature, so only the checksums are compared. The third command prints nothing; its exit status (shown by 'echo $?') is 0 if the files are identical and 1 otherwise. 
    2133 
    2234ADVANTAGE: works for all files. 
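
The same check can be wrapped in a small function that leaves no intermediate files behind. This is a minimal sketch, assuming GNU `md5sum` is available; the function name `same_md5` is hypothetical and not part of the original page.

```shell
#!/usr/bin/env bash
# same_md5 FILE1 FILE2
# Hypothetical helper: returns 0 when both files have the same MD5 digest,
# 1 when the digests differ, and 2 when a file cannot be read.
same_md5() {
    local sum1 sum2
    # Reading from stdin keeps the file name out of md5sum's output,
    # so only the digests are compared.
    sum1=$(md5sum < "$1") || return 2
    sum2=$(md5sum < "$2") || return 2
    [ "$sum1" = "$sum2" ]
}

# Usage (path_file1 and path_file2 are placeholders):
# if same_md5 path_file1 path_file2; then echo "identical"; else echo "different"; fi
```

The return status can be used directly in an `if` statement, which avoids having to inspect `$?` by hand.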
     
    2638=== Matlab === 
    2739 
    28 The matlab function nccmp are able to compare all variables contained within two netcdf files. The original version can be found here: https://fr.mathworks.com/matlabcentral/fileexchange/47857-comparing-two-netcdf-files. 
    29 I have made some small modifications such that the information produced by the script are put into a file instead of printed to the screen. The update version can be found here on IRENE:/ccc/work/cont003/dofoco/dofoco/SCRIPTS/debug/nccmp.m and here on obelix:/home/data03/dofoco/SCRIPTS_obelix/debug. 
     40The matlab function 'nccmp' is able to compare all variables contained within two netcdf files. The original version can be found here: https://fr.mathworks.com/matlabcentral/fileexchange/47857-comparing-two-netcdf-files. 
     41I have made some small modifications so that the information produced by the script is put into a file instead of printed to the screen. The updated version can be found here on IRENE:/ccc/work/cont003/dofoco/dofoco/SCRIPTS/debug/nccmp.m and here on obelix:/home/data03/dofoco/SCRIPTS_obelix/debug. 
    3042 
    3143Run the function by typing: 
    3244{{{ 
    33 NCCMP(ncfile1,ncfile2,tolerance,forceCompare) 
     45NCCMP(ncfile1, ncfile2, tolerance, forceCompare) 
    3446}}} 
    35 Tolerance is if you allow some variation in the variables between the two files. We want identical files thus put [] here. 
     47Tolerance sets how much variation is allowed in the variables between the two files. We want identical files, so put [] here. 
    3648 
    3749forceCompare can be set to true or false.  
    3850        True - write all occurrences of differences in a variable (specifically gives all the indices) to the file: all_diff.txt.  
    3951 
    40         False - only write if there is differences in a variable and its first occurrence of such differences to the file: first_diff.txt.  
     52        False - for each variable that differs, write only the first occurrence of a difference to the file: first_diff.txt.  
    4153 
    42 For global simulation the True option can produce a large file and the information might be hard to process, if there are many differences between the compared files. In addition, the True option can make the much script slower. However, for small simulation the true option might be very useful.         
     54For global simulations the True option can produce a large file and the information might be hard to process, if there are many differences between the compared files. In addition, the True option can make the script much slower. However, for small simulations the True option might be very useful.       
     55 
     56 
     57 
     58 
     59 
     60 
     61The matlab function nccmp is able to compare all variables contained within two netcdf files. The original version can be found here: https://fr.mathworks.com/matlabcentral/fileexchange/47857-comparing-two-netcdf-files. 
     62I have made some small modifications so that the information produced by the script is put into a file instead of being printed to the screen. The updated version can be found [https://forge.ipsl.jussieu.fr/orchidee/attachment/wiki/Documentation/UserGuide/restartability/nccmp_obelix.m here]. 
     63 
     64Sadly, matlab is not available on obelix, only on IRENE. To open matlab on IRENE, type ''Matlab''; to run it from the terminal, type ''matlab -nodesktop''.  
     65 
     66Next run the function by typing: 
     67 
     68{{{ 
     69NCCMP(ncfile1,ncfile2,tolerance,forceCompare) 
     70}}} 
     71 
     72''Tolerance'' sets how much variation is allowed in the variables between the two files. We want identical files, so put [] here. 
     73 
     74''forceCompare'' can be set to true or false.  
     75 
     76- True - write all occurrences of differences in a variable (specifically gives all the indices) to the file: all_diff.txt.  
     77 
     78- False - if there are differences in a variable, write only the first occurrence of such differences to the file: first_diff.txt.  
     79 
     80For global simulations, the True option can produce a large file, and the information might be hard to process if there are many differences between the compared restart files. In addition, the True option makes the script much slower. However, for small simulations the True option is very useful.  
     81 
     82I recommend that you use the re-ordered files from the differ100.sh script as inputs to nccmp.  
     83