
Issues porting new machine to CIME

emily57

Emily
New Member
Good afternoon everyone,

I am a researcher at Purdue University currently trying to run CESM on our supercomputer, Bell. I'm having issues following the "Porting and Validating CIME on a new platform" guide for this machine. From talking to others at the university, this appears to be a consistent issue on our machines, and the steps in that guide do not seem to work for anyone here.

The issue happens at step 7.5, 'Validating a CESM port with prognostic components'. The command below leads to the errors shown in the attached picture.

./create_test --xml-category prealpha --xml-machine cheyenne --xml-compiler intel --machine bell --compiler intel

I've also included the config_machines.xml file that we've created for Bell. Is anyone aware of what these errors could mean? Please feel free to let me know if you need any more information.

What version of the code are you using?
CESM2.2.2


Have you made any changes to files in the source tree?
config_machines.xml :

<machine MACH="bell">
  <DESC>
    Purdue cluster, OS is Linux, batch system is SLURM
  </DESC>
  <NODENAME_REGEX>bell-*</NODENAME_REGEX>
  <OS>LINUX</OS>
  <COMPILERS>intel,gnu</COMPILERS>
  <MPILIBS>openmpi,impi</MPILIBS>
  <PROJECT>none</PROJECT>
  <CIME_OUTPUT_ROOT>$ENV{RCAC_SCRATCH}/cesm_output</CIME_OUTPUT_ROOT>
  <DIN_LOC_ROOT>/home/barber57/CESM2025/</DIN_LOC_ROOT>
  <DIN_LOC_ROOT_CLMFORC>/home/barber57/CESM2025/</DIN_LOC_ROOT_CLMFORC>
  <DOUT_S_ROOT>$ENV{RCAC_SCRATCH}/cesm_output/${CASE}</DOUT_S_ROOT>
  <BASELINE_ROOT>$ENV{RCAC_SCRATCH}/cesm_output</BASELINE_ROOT>
  <CCSM_CPRNC>/apps/external/conda/2025.02/bin/python</CCSM_CPRNC>
  <GMAKE>make</GMAKE>
  <GMAKE_J>8</GMAKE_J>
  <BATCH_SYSTEM>slurm</BATCH_SYSTEM>
  <SUPPORTED_BY>purdue</SUPPORTED_BY>
  <MAX_TASKS_PER_NODE>128</MAX_TASKS_PER_NODE>
  <MAX_MPITASKS_PER_NODE>128</MAX_MPITASKS_PER_NODE>
  <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
  <mpirun mpilib="impi">
    <executable>mpirun</executable>
  </mpirun>
  <mpirun mpilib="openmpi">
    <executable>mpirun</executable>
  </mpirun>
  <module_system type="module" allow_error="true">
    <init_path lang="perl">/opt/lmod/8.4.4/init/perl</init_path>
    <init_path lang="python">/opt/lmod/8.4.4/init/env_modules_python.py</init_path>
    <init_path lang="sh">/opt/lmod/8.4.4/init/sh</init_path>
    <init_path lang="csh">/opt/lmod/8.4.4/init/csh</init_path>
    <cmd_path lang="perl">/opt/lmod/8.4.4/libexec/lmod perl</cmd_path>
    <cmd_path lang="python">/opt/lmod/8.4.4/libexec/lmod python</cmd_path>
    <cmd_path lang="sh">module</cmd_path>
    <cmd_path lang="csh">module</cmd_path>
    <modules>
      <command name="purge"/>
    </modules>
    <modules compiler="intel">
      <command name="load">intel/19.0.5.281</command>
    </modules>
    <modules compiler="gnu">
      <command name="load">gcc/9.3.0</command>
    </modules>
    <modules mpilib="impi">
      <command name="load">impi/2019.5.281</command>
    </modules>
    <modules mpilib="openmpi">
      <command name="load">openmpi/4.0.5</command>
    </modules>
    <modules>
      <command name="load">netcdf-fortran/4.4.4</command>
      <command name="load">netlib-lapack/3.8.0</command>
      <command name="load">openblas/0.3.8</command>
      <command name="load">cmake/3.18.2</command>
    </modules>
    <modules mpilib="mpi-serial">
      <command name="load">netcdf-fortran/4.4.4</command>
    </modules>
  </module_system>
  <environment_variables>
    <env name="OMP_STACKSIZE">256M</env>
  </environment_variables>
  <resource_limits>
    <resource name="RLIMIT_STACK">-1</resource>
  </resource_limits>
</machine>
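
For anyone reading along, here is a rough sketch of how the <modules> blocks in an entry like the one above combine for one compiler and MPI library, which can help when checking what a test should be loading. It is a simplified assumption, not CIME's own logic: attributes are compared by exact string only, and modules_for and SAMPLE are hypothetical names. SAMPLE is a trimmed copy of the entry above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <machine> entry posted above (illustration only).
SAMPLE = """
<machine MACH="bell">
  <module_system type="module" allow_error="true">
    <modules>
      <command name="purge"/>
    </modules>
    <modules compiler="intel">
      <command name="load">intel/19.0.5.281</command>
    </modules>
    <modules compiler="gnu">
      <command name="load">gcc/9.3.0</command>
    </modules>
    <modules mpilib="impi">
      <command name="load">impi/2019.5.281</command>
    </modules>
    <modules mpilib="openmpi">
      <command name="load">openmpi/4.0.5</command>
    </modules>
    <modules>
      <command name="load">netcdf-fortran/4.4.4</command>
    </modules>
  </module_system>
</machine>
"""


def modules_for(machine_xml, compiler, mpilib):
    """List module commands, in file order, that apply to compiler/mpilib.

    Assumption: a <modules> block applies when each of its compiler/mpilib
    attributes is either absent or exactly equal to the requested value.
    """
    machine = ET.fromstring(machine_xml)
    commands = []
    for block in machine.iterfind("module_system/modules"):
        if block.get("compiler") not in (None, compiler):
            continue
        if block.get("mpilib") not in (None, mpilib):
            continue
        for cmd in block.iterfind("command"):
            argument = (cmd.text or "").strip()
            # "purge" has no argument, so drop the empty string.
            commands.append(" ".join(filter(None, [cmd.get("name"), argument])))
    return commands


for line in modules_for(SAMPLE, "intel", "impi"):
    print("module", line)
```

Each printed line can be tried by hand in a fresh login shell on the machine; a module name or version that does not exist there should fail at that step.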




[Attached image: 1744654773409.png]

Describe every step you took leading up to the problem:
More or less just cd commands, and editing the .xml file above.


If this is a port to a new machine: Please attach any files you added or changed for the machine port (e.g., config_compilers.xml, config_machines.xml, and config_batch.xml) and tell us the compiler version you are using on this machine.
Please attach any log files showing error messages or other useful information.

Attached. Working on getting the compiler version, will attach that as soon as I know.
 

jedwards

CSEG and Liaisons
Staff member
Thank you for the report. First, I do not recommend using cesm2.2.2. If you need support for CMIP6, use cesm2.1.5; if you want something from our development branch, I recommend cesm3.0-alpha06d. If you would like to continue debugging this problem with cesm2.2.2, I would suggest first looking at a single test. It looks like everything is failing in the setup phase, which usually indicates an error loading the module environment. I suggest going to one of the case directories, for example:
Code:
cd /scratch/bell/barber57/cesm_output/SMS_Lm13.f10_f10_musgs.I1850clm50SpG.bell_intel.20250414_141100_cpjhwh
Then try sourcing the file .env_mach_specific.sh (or .env_mach_specific.csh if your shell is csh):
Code:
source .env_mach_specific.sh
If you get errors when sourcing this file, you can make adjustments in your config_machines.xml file and then recreate the case:
Code:
./create_test SMS_Lm13.f10_f10_musgs.I1850clm50SpG.bell_intel

I'm not sure given what you have provided, but it looks like all of the tests are failing with the same issue.
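
The single-test check described above can also be scripted. This is a minimal sketch and not part of CIME: check_case_env is a hypothetical helper, and it assumes bash is on the PATH and that the case directory holds a shell environment file named .env_mach_specific.sh (pass a different name as env_file if yours differs).

```python
import shlex
import subprocess
from pathlib import Path


def check_case_env(case_dir, env_file=".env_mach_specific.sh"):
    """Source a case's environment file in a throwaway bash process.

    Returns (ok, stderr_text). ok is False when the file is missing or
    when sourcing it ends with a nonzero exit status.
    """
    script = Path(case_dir) / env_file
    if not script.is_file():
        return False, f"{script} not found"
    result = subprocess.run(
        ["bash", "-c", f"source {shlex.quote(str(script))}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


# Example use, with the case directory taken from your own test output:
#   ok, err = check_case_env("/path/to/case_directory")
#   print("environment loads cleanly" if ok else f"problem: {err}")
```

Because the file is sourced in a child process, the calling shell's environment is left untouched. Anything reported in stderr is worth reading even when ok is True, since some module errors may print a message without changing the exit status.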
 