Hello,
I am porting CESM to the Texas A&M University (TAMU) Grace HPRC cluster and have encountered a persistent error that I believe points to a fundamental toolchain incompatibility.
After successfully creating a case, ./case.setup fails with ERROR: module command None purge failed with message: /bin/sh: None: command not found.
The crucial detail is that this error occurs even when the correct, complete module environment is manually loaded in the shell immediately before running ./case.setup. This suggests the CIME script is not correctly inheriting the environment from the parent shell.
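A quick way to test that hypothesis outside CIME is sketched below. It is a minimal sketch under assumptions, not CIME's code: the function names and the CESM_ENV_PROBE variable are hypothetical. It checks whether a /bin/sh child started from Python sees an environment variable, and whether that child can resolve a given command name. Since module is commonly defined as a shell function rather than an executable, the second check may fail in a child shell even when the modules themselves are loaded.

```python
import os
import subprocess


def child_sees_variable(name, value):
    """Return True if a /bin/sh child started from Python sees the variable.

    `name` and `value` are hypothetical probe values chosen by the caller.
    """
    env = dict(os.environ, **{name: value})
    result = subprocess.run(
        ["/bin/sh", "-c", 'printf %s "$' + name + '"'],
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout == value


def child_can_run(command):
    """Return True if /bin/sh can resolve `command`.

    `command -v` succeeds for executables on PATH, shell builtins,
    and functions that were exported to the child shell.
    """
    result = subprocess.run(
        ["/bin/sh", "-c", "command -v " + command],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("variable inherited:", child_sees_variable("CESM_ENV_PROBE", "ok"))
    print("module resolvable:", child_can_run("module"))
```

Running this in the same shell where load_cesm_env was called would show whether /bin/sh can find module at all, separately from whether environment variables are inherited.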
I am using CESM 2.2.0. The output of ./describe_version is:
cesm2.2.0-0-g332937b
The source code tree at /scratch/user/junliang123/cesm2.2.0 is unmodified. All porting customizations have been made in the $HOME/.cime directory, following the recommended practice.
1. Porting: Created three modular configuration files (config_machines.xml, config_compilers.xml, config_batch.xml) in the $HOME/.cime directory. These are based on a combination of official porting guides and a previously successful port on this machine.
2. Environment Setup: Created a function load_cesm_env in my .bashrc file to load a specific, known-good combination of modules.
3. Case Creation: From the /scratch/user/junliang123/cesm2.2.0/cime/scripts directory, I run:
./create_newcase --case $SCRATCH/cesm_cases/BHIST_v2.2_final_test --compset BHIST --res f19_g17 --machine grace
This step completes successfully and creates the case directory.
4. Case Configuration: I then navigate to the case directory and configure the project:
cd $SCRATCH/cesm_cases/BHIST_v2.2_final_test
./xmlchange PROJECT=***********
5. Pre-loading Environment: I manually load the correct modules into my shell:
load_cesm_env
The module list command confirms all necessary modules are loaded.
6. Running Setup (Failure Point): I immediately run the setup script:
./case.setup
This is where the process fails with the module command None purge error.
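Porting Files and Logs
This is a port to a new machine. Below are the contents of all custom configuration files, the environment function, and the full terminal log showing the final failed attempt.
Compiler Version: intel-compilers/2022.1.0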
1. $HOME/.cime/config_machines.xml
<?xml version="1.0"?>
<config_machines>
  <machine MACH="grace">
    <DESC>Intel Xeon, Slurm batch system on Grace@TAMU</DESC>
    <NODENAME_REGEX>.*grace</NODENAME_REGEX>
    <OS>LINUX</OS>
    <COMPILERS>intel</COMPILERS>
    <MPILIBS>impi</MPILIBS>
    <CIME_OUTPUT_ROOT>$ENV{SCRATCH}</CIME_OUTPUT_ROOT>
    <DIN_LOC_ROOT>/ihesp/obs_root/inputdata</DIN_LOC_ROOT>
    <DOUT_S_ROOT>$ENV{SCRATCH}/archive/$CASE</DOUT_S_ROOT>
    <GMAKE_J>8</GMAKE_J>
    <BATCH_SYSTEM>slurm</BATCH_SYSTEM>
    <SUPPORTED_BY>junliang123</SUPPORTED_BY>
    <MAX_TASKS_PER_NODE>48</MAX_TASKS_PER_NODE>
    <MAX_MPITASKS_PER_NODE>48</MAX_MPITASKS_PER_NODE>
    <mpirun mpilib="impi">
      <executable>mpirun</executable>
      <arguments>
        <arg name="num_tasks"> -np $TOTALPES</arg>
      </arguments>
    </mpirun>
    <module_system type="module">
      <init_path lang="sh">/sw/lmod/lmod/init/sh</init_path>
      <cmd_path lang="sh">module</cmd_path>
      <modules>
        <command name="purge"></command>
        <command name="load">intel-compilers/2022.1.0</command>
        <command name="load">impi/2021.6.0</command>
        <command name="load">imkl/2022.1.0</command>
        <command name="load">Python/3.10.4</command>
        <command name="load">CMake/3.24.3</command>
        <command name="load">netCDF-Fortran/4.6.0</command>
        <command name="load">PnetCDF/1.12.3</command>
        <command name="load">HDF5/1.12.2</command>
      </modules>
    </module_system>
    <environment_variables>
      <env name="OMP_STACKSIZE">256M</env>
    </environment_variables>
  </machine>
</config_machines>
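The literal word None in the failing command may itself be a clue, so a small check of which lang values the module_system block defines could help. The following is a minimal sketch, not CIME's code. It assumes (unconfirmed) that the command in the error message is formatted as "<cmd_path> <action>", with cmd_path looked up by the language of the caller; the helper names and the trimmed SAMPLE document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample mirroring the module_system block above.
SAMPLE = """<config_machines>
  <machine MACH="grace">
    <module_system type="module">
      <init_path lang="sh">/sw/lmod/lmod/init/sh</init_path>
      <cmd_path lang="sh">module</cmd_path>
    </module_system>
  </machine>
</config_machines>"""


def module_system_langs(xml_text, machine):
    """Map init_path and cmd_path to the set of lang values defined for one machine."""
    root = ET.fromstring(xml_text)
    found = {"init_path": set(), "cmd_path": set()}
    for mach in root.iter("machine"):
        if mach.get("MACH") != machine:
            continue
        for tag in found:
            for node in mach.iter(tag):
                found[tag].add(node.get("lang"))
    return found


def build_module_command(xml_text, machine, lang, action):
    """Illustrative only: format "<cmd_path> <action>" for the requested lang.

    If no cmd_path exists for that lang, the lookup stays None and the
    formatted string starts with the literal text "None".
    """
    root = ET.fromstring(xml_text)
    cmd = None
    for mach in root.iter("machine"):
        if mach.get("MACH") != machine:
            continue
        for node in mach.iter("cmd_path"):
            if node.get("lang") == lang:
                cmd = node.text
    return "{} {}".format(cmd, action)


if __name__ == "__main__":
    print(module_system_langs(SAMPLE, "grace"))
    print(build_module_command(SAMPLE, "grace", "sh", "purge"))
    print(build_module_command(SAMPLE, "grace", "python", "purge"))
```

Under that assumption, a lookup for a lang with no cmd_path entry yields a command beginning with None, which resembles the message I am seeing. I have not confirmed that this is what case.setup does internally.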
2. $HOME/.cime/config_compilers.xml
<?xml version="1.0"?>
<config_compilers>
  <compiler COMPILER="intel">
    <ADD_FFLAGS_LEND>-qopenmp -assume realloc_lhs</ADD_FFLAGS_LEND>
  </compiler>
  <mpilib MPILIB="impi">
    <ADD_LDFLAGS>-L$MKLROOT/lib/intel64</ADD_LDFLAGS>
  </mpilib>
</config_compilers>
3. $HOME/.cime/config_batch.xml
<?xml version="1.0"?>
<config_batch>
  <batch_system MACH="grace" type="slurm">
    <queues>
      <queue default="true">short</queue>
    </queues>
  </batch_system>
</config_batch>
</config_batch>
4. load_cesm_env function from .bashrc
load_cesm_env() {
    echo "Loading PROVEN CESM environment..."
    module purge
    module load intel-compilers/2022.1.0
    module load impi/2021.6.0
    module load imkl/2022.1.0
    module load Python/3.10.4
    module load CMake/3.24.3
    module load netCDF-Fortran/4.6.0
    module load PnetCDF/1.12.3
    module load HDF5/1.12.2
    echo " Proven environment loaded."
    module list
}
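Since I describe the manually loaded environment as identical to the one in config_machines.xml, that claim can be checked mechanically rather than by eye. The following is a minimal sketch with hypothetical helper names and shortened sample inputs; it only compares the module load lines of a shell function with the load commands of an XML fragment.

```python
import re
import xml.etree.ElementTree as ET

# Shortened, hypothetical samples standing in for .bashrc and config_machines.xml.
SHELL_SAMPLE = """load_cesm_env() {
    module purge
    module load intel-compilers/2022.1.0
    module load impi/2021.6.0
}"""

XML_SAMPLE = """<modules>
  <command name="purge"></command>
  <command name="load">intel-compilers/2022.1.0</command>
  <command name="load">impi/2021.6.0</command>
</modules>"""


def modules_from_shell(text):
    """Collect module names from `module load NAME` lines, in order."""
    return re.findall(r"^\s*module\s+load\s+(\S+)", text, flags=re.MULTILINE)


def modules_from_xml(xml_text):
    """Collect the text of every <command name="load"> element, in order."""
    root = ET.fromstring(xml_text)
    return [
        node.text.strip()
        for node in root.iter("command")
        if node.get("name") == "load" and node.text
    ]


if __name__ == "__main__":
    shell_mods = modules_from_shell(SHELL_SAMPLE)
    xml_mods = modules_from_xml(XML_SAMPLE)
    print("identical:", shell_mods == xml_mods)
```

If the two lists match, the module versions can be ruled out as the difference between my interactive shell and what case.setup tries to load.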
5. Full Terminal Log of Final Attempt
[junliang123@grace5 scripts]$ ./create_newcase --case $SCRATCH/cesm_cases/BHIST_v2.2_final_test --compset BHIST --res f19_g17 --machine grace
Compset longname is HIST_CAM60_CLM50%BGC-CROP_CICE_POP2%ECO_MOSART_CISM2%NOEVOLVE_WW3_BGC%BDRD
... (rest of successful create_newcase output) ...
Creating Case directory /scratch/user/junliang123/cesm_cases/BHIST_v2.2_final_test
[junliang123@grace5 scripts]$ cd $SCRATCH/cesm_cases/BHIST_v2.2_final_test
[junliang123@grace5 BHIST_v2.2_final_test]$ ./xmlchange PROJECT=**********
[junliang123@grace5 BHIST_v2.2_final_test]$ load_cesm_env
Loading PROVEN CESM environment...
 Proven environment loaded.
Currently Loaded Modules:
  1) GCCcore/11.3.0             5) numactl/2.0.14   9) bzip2/1.0.8        13) SQLite/3.38.3  17) OpenSSL/1.1       21) CMake/3.24.3  25) zstd/1.5.2       29) PnetCDF/1.12.3
  2) zlib/1.2.12                6) UCX/1.12.1      10) ncurses/6.3        14) XZ/5.2.5       18) Python/3.10.4     22) Szip/2.1.1    26) libxml2/2.9.13   30) HDF5/1.12.2
  3) binutils/2.38              7) impi/2021.6.0   11) libreadline/8.1.2  15) GMP/6.2.1      19) cURL/7.83.0       23) gzip/1.12     27) netCDF/4.9.0
  4) intel-compilers/2022.1.0   8) imkl/2022.1.0   12) Tcl/8.6.12         16) libffi/3.4.2   20) libarchive/3.6.1  24) lz4/1.9.3     28) netCDF-Fortran/4.6.0
[junliang123@grace5 BHIST_v2.2_final_test]$ ./case.setup
ERROR: module command None purge failed with message:
/bin/sh: None: command not found
My question is:
As detailed above, case.setup fails while initializing the module system. create_newcase works correctly (so my custom XML files are being read), and the error persists even after I manually load the identical module environment. My working hypothesis is a toolchain incompatibility: the CIME Python scripts appear unable to start a sub-shell that inherits the environment needed to find and execute the module command on the TAMU Grace cluster. What would cause the module command to be resolved as None, and how should the module system be configured for this machine?
Any suggestions or insights into this behavior would be greatly appreciated.
Thank you so much!