Hi @jedwards,
I am porting cesm2_2_0 to our university machine. The job scheduler is Slurm.
We added the following block to config_machines.xml:
<machine MACH="pace-hive">
  <DESC>Georgia Tech PACE cluster, Linux RHEL7</DESC>
  <NODENAME_REGEX>.*.pace.gatech.edu</NODENAME_REGEX>
  <OS>LINUX</OS>
  <COMPILERS>gcc</COMPILERS>
  <MPILIBS>mvapich2</MPILIBS>
  <CIME_OUTPUT_ROOT>/scratch/CESM</CIME_OUTPUT_ROOT>
  <DIN_LOC_ROOT>/xxxx/.../xxxx/scratch/CESM_INPUTS/2.2.0</DIN_LOC_ROOT>
  <DIN_LOC_ROOT_CLMFORC>/xxxx/.../xxxx/scratch/CESM_INPUTS/2.2.0/lmwg</DIN_LOC_ROOT_CLMFORC>
  <DOUT_S_ROOT>/scratch/CESM/archive/$CASE</DOUT_S_ROOT>
  <BASELINE_ROOT>/xxxx/.../xxxx/scratch/CESM_INPUTS/2.2.0/ccsm_baselines</BASELINE_ROOT>
  <CCSM_CPRNC></CCSM_CPRNC>
  <GMAKE_J>4</GMAKE_J>
  <BATCH_SYSTEM>slurm</BATCH_SYSTEM>
  <SUPPORTED_BY>pace-support - at - oit.gatech.edu</SUPPORTED_BY>
  <MAX_TASKS_PER_NODE>24</MAX_TASKS_PER_NODE>
  <MAX_MPITASKS_PER_NODE>24</MAX_MPITASKS_PER_NODE>
  <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
  <mpirun mpilib="mvapich2">
    <executable>mpirun</executable>
    <arguments>
      <arg name="mpi"></arg>
      <arg name="num_tasks">-n {{ total_tasks }}</arg>
    </arguments>
  </mpirun>
  <module_system type="module">
    <init_path lang="perl">/usr/local/pace-apps/lmod/lmod/init/perl</init_path>
    <init_path lang="python">/usr/local/pace-apps/lmod/lmod/init/env_modules_python.py</init_path>
    <init_path lang="csh">/usr/local/pace-apps/lmod/lmod/init/csh</init_path>
    <init_path lang="sh">/usr/local/pace-apps/lmod/lmod/init/sh</init_path>
    <cmd_path lang="perl">/usr/local/pace-apps/lmod/lmod/libexec/lmod perl</cmd_path>
    <cmd_path lang="python">/usr/local/pace-apps/lmod/lmod/libexec/lmod python</cmd_path>
    <cmd_path lang="sh">module</cmd_path>
    <cmd_path lang="csh">module</cmd_path>
    <modules>
      <command name="purge"/>
    </modules>
    <modules>
      <command name="load">perl/5.34.1</command>
    </modules>
    <modules compiler="gcc">
      <command name="load">gcc/10.3.0</command>
      <command name="load">mvapich2/2.3.6</command>
      <command name="load">hdf5/1.10.8</command>
      <command name="load">netcdf-c/4.8.1</command>
      <command name="load">mkl/20.0.4</command>
    </modules>
    <modules mpilib="mvapich2">
      <command name="load">mvapich2/2.3.6</command>
    </modules>
  </module_system>
  <environment_variables>
    <env name="OMP_STACKSIZE">64M</env>
  </environment_variables>
  <environment_variables compiler="gcc">
    <env name="NETCDF_PATH">/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-10.3.0/netcdf-c-4.8.1-qbpmsrxilalurws7acutvesy4h5yyzxy</env>
    <env name="HDF5">/usr/local/pace-apps/spack/packages/linux-rhel7-x86_64/gcc-10.3.0/hdf5-1.10.8-jzwozkvmnzdcjeqp2gmf6hpwi5jqz7if</env>
  </environment_variables>
</machine>
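Since we pass --mach explicitly, machine auto-detection may not matter here, but we did sanity-check the NODENAME_REGEX against a login node first. Here is the minimal sketch we used (it assumes CIME matches the regex against the fully qualified hostname when auto-detecting the machine):

import re
import socket

# Sanity check of NODENAME_REGEX (assumption: CIME auto-detects the
# machine by matching this pattern against the fully qualified hostname).
pattern = re.compile(r".*.pace.gatech.edu")
hostname = socket.getfqdn()
print(hostname, "matches" if pattern.match(hostname) else "does not match")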
We added the following block to config_batch.xml:
<batch_system MACH="pace-hive" type="slurm">
  <batch_submit>sbatch</batch_submit>
  <directives queue="hive">
    <directive default="/bin/bash" > -S {{ shell }} </directive>
    <directive> --partition=hive</directive>
    <directive> --account={{ project }}</directive>
    <directive> --nodes={{ num_nodes }}</directive>
    <directive> --ntasks-per-node={{ tasks_per_node }}</directive>
  </directives>
  <queues>
    <queue walltimemax="12:00:00" nodemin="1" nodemax="24" default="true">hive</queue>
  </queues>
</batch_system>
We successfully created the case with the command
./create_newcase --case test-hive --res f09_g17 --compset B1850 --mach pace-hive --walltime 02:00:00 -q hive --project XXX
but ./case.setup then failed with the following messages:
###################
Creating batch scripts
Writing case.run script from input template /xxx/cesm2_2_0/cime/config/cesm/machines/template.case.run
Traceback (most recent call last):
  File "/xxx/cesm2_2_0/cime/scripts/test1/./case.setup", line 67, in <module>
    _main_func(__doc__)
  File "/xxx/cesm2_2_0/cime/scripts/test1/./case.setup", line 64, in _main_func
    case.case_setup(clean=clean, test_mode=test_mode, reset=reset, keep=keep)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/case/case_setup.py", line 270, in case_setup
    run_and_log_case_status(functor, phase, caseroot=caseroot)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/utils.py", line 1768, in run_and_log_case_status
    rv = func()
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/case/case_setup.py", line 254, in <lambda>
    functor = lambda: _case_setup_impl(self, caseroot, clean=clean, test_mode=test_mode, reset=reset, keep=keep)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/case/case_setup.py", line 203, in _case_setup_impl
    env_batch.make_all_batch_files(case)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/XML/env_batch.py", line 940, in make_all_batch_files
    self.make_batch_script(input_batch_script, job, case)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/XML/env_batch.py", line 194, in make_batch_script
    overrides = self.get_job_overrides(job, case)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/XML/env_batch.py", line 189, in get_job_overrides
    overrides["mpirun"] = case.get_mpirun_cmd(job=job, overrides=overrides)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/case/case.py", line 1437, in get_mpirun_cmd
    executable, mpi_arg_list, custom_run_exe, custom_run_misc_suffix = env_mach_specific.get_mpirun(self, mpi_attribs, job)
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/XML/env_mach_specific.py", line 504, in get_mpirun
    arg_value = transform_vars(self.text(arg_node),
  File "/xxx/cesm2_2_0/cime/scripts/Tools/../../scripts/lib/CIME/utils.py", line 1509, in transform_vars
    while directive_re.search(text):
TypeError: expected string or bytes-like object
######################################
It seems like the node information is not being read in correctly, but I don't know how to modify the config_batch.xml file.
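Digging into the traceback a bit more, the failure happens inside transform_vars when it receives something that is not a string. My guess (and it is only a guess) is that the empty <arg name="mpi"></arg> entry in our mpirun block is the culprit: if CIME reads the element text the way ElementTree does, an empty element yields None, and the regex search then raises exactly this TypeError. A minimal sketch, assuming ElementTree-like behavior for self.text(arg_node):

import re
import xml.etree.ElementTree as ET

# An empty element has no text content, so .text is None
# (assumption: CIME's self.text(arg_node) behaves the same way).
empty_arg = ET.fromstring('<arg name="mpi"></arg>')
print(empty_arg.text)  # prints: None

# transform_vars scans the text for {{ placeholder }} patterns;
# calling re.search() on None reproduces the error in the traceback.
directive_re = re.compile(r"{{\s*(\w+)\s*}}")
directive_re.search(empty_arg.text)
# TypeError: expected string or bytes-like object

If that is what is going on, the problem would be in our <mpirun> block rather than in config_batch.xml, but I am not sure what the correct way to write that <arg> entry is.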
Thanks,
Melody