case.setup cannot find our module command. We use bash primarily on our system. As noted in the title, we are on CentOS 8 using Slurm as our scheduler and Lmod for modules.
When we run create_newcase we do not get any errors. This is the last bit of output:
*********************************************************************************************************************************
This compset and grid combination is not scientifically supported, however it is used in 10 tests.
*********************************************************************************************************************************
Using project from config_machines.xml: none
No charge_account info available, using value from PROJECT
cesm model version found: cesm2.2.0
Batch_system_type is slurm
job is case.run USER_REQUESTED_WALLTIME None USER_REQUESTED_QUEUE None WALLTIME_FORMAT %H:%M:%S
job is case.st_archive USER_REQUESTED_WALLTIME None USER_REQUESTED_QUEUE None WALLTIME_FORMAT %H:%M:%S
Creating Case directory /scratch/wew/cesmtest
When we then run case.setup, it fails because it cannot find the module command:
ERROR: module command None load openmpi/3.1.6 netcdf-c/4.7.4 anaconda2/2019.10 failed with message:
/bin/sh: None: command not found
In config_machines.xml we have an entry for our machine as follows:
<machine MACH="monsoon">
<DESC>
Example port to centos8 linux system with gcc, netcdf, pnetcdf and mpich
using modules from the "Environment Modules – A Great Tool for Clusters" article in ADMIN Magazine
</DESC>
<NODENAME_REGEX>cn*</NODENAME_REGEX>
<OS>LINUX</OS>
<PROXY> </PROXY>
<COMPILERS>gnu</COMPILERS>
<MPILIBS>openmpi</MPILIBS>
<PROJECT>none</PROJECT>
<SAVE_TIMING_DIR> </SAVE_TIMING_DIR>
<CIME_OUTPUT_ROOT>/scratch/$USER/cesm/scratch</CIME_OUTPUT_ROOT>
<DIN_LOC_ROOT>/common/contrib/cesm/inputdata</DIN_LOC_ROOT>
<DIN_LOC_ROOT_CLMFORC>common/contrib/cesm/inputdata/lmwg</DIN_LOC_ROOT_CLMFORC>
<DOUT_S_ROOT>/common/contrib/cesm/archive/$CASE</DOUT_S_ROOT>
<BASELINE_ROOT>/common/contrib/cesm/cesm_baselines</BASELINE_ROOT>
<CCSM_CPRNC>/scratch/$USER/cesm2/tools/cime/tools/cprnc/cprnc</CCSM_CPRNC>
<GMAKE>make</GMAKE>
<GMAKE_J>8</GMAKE_J>
<BATCH_SYSTEM>slurm</BATCH_SYSTEM>
<SUPPORTED_BY>hpcsupport -at- nau.edu</SUPPORTED_BY>
<MAX_TASKS_PER_NODE>8</MAX_TASKS_PER_NODE>
<MAX_MPITASKS_PER_NODE>8</MAX_MPITASKS_PER_NODE>
<PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
<mpirun mpilib="openmpi" compiler="gnu">
<executable>mpirun</executable>
<arguments>
<arg name="ntasks"> -np {{ total_tasks }} </arg>
</arguments>
</mpirun>
<module_system type="module">
<init_path lang="bash">/packages/lmod/lmod/init/bash</init_path>
<cmd_path lang="bash">module</cmd_path>
<modules compiler="gnu">
<command name="load">openmpi/3.1.6</command>
<command name="load">netcdf-c/4.7.4</command>
<command name="load">anaconda2/2019.10</command>
</modules>
</module_system>
<environment_variables>
<env name="OMP_STACKSIZE">256M</env>
<env name="MODULEPATH">/packages/modulefiles</env>
<env name="TMPDIR">/tmp/$SLURM_JOB_USER</env>
<env name="JOBDIR">$ENV{TMPDIR}/$SLURM_JOB_ID</env>
</environment_variables>
<resource_limits>
<resource name="RLIMIT_STACK">-1</resource>
</resource_limits>
</machine>
We also tried adding entries for csh and sh alongside the bash entry, and it made no difference. What do we need to look at to fix the module issue? Thanks.
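For reference, the extra shell entries we tried mirrored the bash one. A sketch of what they looked like (the sh and csh init-script paths are assumptions, inferred by analogy with the bash init_path above; Lmod normally installs its per-shell init scripts side by side):

```xml
<!-- Additional per-shell entries inside <module_system type="module">.
     The init paths below are assumed to sit next to the bash one;
     adjust if your Lmod install lays them out differently. -->
<init_path lang="sh">/packages/lmod/lmod/init/sh</init_path>
<init_path lang="csh">/packages/lmod/lmod/init/csh</init_path>
<cmd_path lang="sh">module</cmd_path>
<cmd_path lang="csh">module</cmd_path>
```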