
ncdump not found by short term archiver

l_vankampenhout@uu_nl

Leo van Kampenhout
Member
In the short term archiver log I get the following error:

Code:
Archiving restarts for cam (atm)
WARNING: ncdump -v nhfil /projects/0/uuesm2//b.e21.BSSP126cmip6.f09_g17.CMIP6-SSP1-2.6-sara.01/run/b.e21.BSSP126cmip6.f09_g17.CMIP6-SSP1-2.6-sara.01.cam.r.2020-01-01-00000.nc  failed rc=127
    out=
    err=/bin/sh: ncdump: command not found
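
For context, an exit status of 127 is what a POSIX shell reports when it cannot find a command on PATH, which matches the `err=` line in the log. Here is a minimal sketch that reproduces the same symptom outside of CIME. The command name is a deliberately nonexistent placeholder, and the helper `run_via_shell` is hypothetical, not part of the archiver:

```python
import subprocess

# Hypothetical command name, chosen so that it does not exist on PATH.
MISSING = "no_such_command_for_demo_xyz"


def run_via_shell(cmd):
    """Run cmd through the default shell and return (returncode, stderr)."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()


if __name__ == "__main__":
    rc, err = run_via_shell(MISSING)
    # On a POSIX system the shell reports "command not found" with rc=127.
    print("rc =", rc)
    print("err =", err)
```

If `ncdump` produces the same return code here, the shell that runs it does not have the netcdf tools on its PATH.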

The failed ncdump call causes a problem when trying to continue the run (see this older post from 2018: https://bb.cgd.ucar.edu/cesm/thread...revious-year-missing-from-run-directory.4564/)

My question is: how do I get the short term archiver to load the correct module (netcdf)? I'm obviously not on Cheyenne, but I have "module load netcdf" in my ~/.module/default and ~/.module/bash, so ncdump should be available by default. I also have it in my list of modules in `env_mach_specific.xml`, but that does not seem to be used. I can't find a similar module list for the archiver. Any ideas?

PS. Just thinking ahead, a user-friendly solution would be to test the availability of `ncdump` in one of the submission test scripts when `DOUT_S = TRUE`.
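
A rough sketch of what such a check could look like, assuming it only needs to confirm that the tool is on PATH in the environment where it runs. The function name `missing_tools` and its arguments are hypothetical and not existing CIME code. In a real case the flag would come from the `DOUT_S` setting:

```python
import shutil


def missing_tools(tools, archiving_enabled):
    """Return the entries of `tools` that are not found on PATH.

    If archiving is disabled, nothing is required and the list is empty.
    """
    if not archiving_enabled:
        return []
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumption: archiving_enabled mirrors the case's DOUT_S value.
    missing = missing_tools(["ncdump"], archiving_enabled=True)
    if missing:
        print("Short term archiving needs these tools on PATH:", ", ".join(missing))
    else:
        print("All tools required for short term archiving were found.")
```

Note that this only tests the environment of the process running the check, so it would have to run after the case environment has been loaded to be meaningful.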
 

jedwards

CSEG and Liaisons
Staff member
If you have the netcdf module defined in config_machines.xml, and thus in env_mach_specific.xml, you can test that it is working by going to a case directory and doing 'source .env_mach_specific.sh' (or the .csh version if that's your shell). This loads the environment that the model sees into your shell. Then try ncdump. If it's not there, investigate your module.
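
The same check can be scripted so that it runs in a fresh shell rather than your login shell. This is a minimal sketch, assuming bash is available and that the environment script can be sourced non-interactively. The helper name `tool_after_sourcing` is made up for illustration:

```python
import os
import subprocess


def tool_after_sourcing(env_script, tool):
    """Source env_script in a new bash process and return the path of tool, or None."""
    script = os.path.abspath(env_script)
    proc = subprocess.run(
        # $1 is the script to source, $2 is the tool to look up afterwards.
        ["bash", "-c", 'source "$1" >/dev/null 2>&1 && command -v "$2"', "bash", script, tool],
        capture_output=True,
        text=True,
    )
    out = proc.stdout.strip()
    return out if proc.returncode == 0 and out else None


if __name__ == "__main__":
    # Run from a case directory; the file name is the one mentioned in this thread.
    path = tool_after_sourcing(".env_mach_specific.sh", "ncdump")
    print("ncdump found at:", path if path else "(not found)")
```

The child process still inherits the environment of the caller, so for a clean test it is worth running this after a 'module purge'.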
 

l_vankampenhout@uu_nl

Leo van Kampenhout
Member
Thanks Jim. I've tried 'module purge' and then 'source .env_mach_specific.sh' like you said and ncdump is found. Are you sure that the short term archiver script loads all the same modules that the model itself does?
 

l_vankampenhout@uu_nl

Leo van Kampenhout
Member
As a workaround, I have added this line in the archiving script:

Code:
kampe004@int2:~/cesm/cesm_tags/cesm2.1.3-release-git/cime $ git diff
diff --git a/scripts/lib/CIME/case/case_st_archive.py b/scripts/lib/CIME/case/case_st_archive.py
index cd2b051..52ada93 100644
--- a/scripts/lib/CIME/case/case_st_archive.py
+++ b/scripts/lib/CIME/case/case_st_archive.py
@@ -388,6 +388,7 @@ def _archive_restarts_date_comp(case, casename, rundir, archive, archive_entry,
     history files that are associated with these restart files.)
     """
     datename_str = _datetime_str(datename)
+    case.load_env() #LvK Load modules

     if datename_is_last or case.get_value('DOUT_S_SAVE_INTERIM_RESTART_FILES'):
         if not os.path.exists(archive_restdir):

This seems to work correctly when I execute ./case.st_archive manually (ncdump is found). I will now test it in a batch run, to be continued.
 

jedwards

CSEG and Liaisons
Staff member
Yes, it does. The next possibility is that netcdf is installed on your login nodes but not on your compute nodes.
 

l_vankampenhout@uu_nl

Leo van Kampenhout
Member
jedwards said: "Yes it does, the next possibility is that netcdf is installed on your login nodes but not your compute nodes."
I just checked: sourcing '.env_mach_specific.sh' on a compute node works and gives a working ncdump.

The workaround I described in my previous post seems to work OK: the run successfully finished, archived, and restarted overnight.
 