Issues with scripts_regression_tests.py

P Banerjee

Priyanka Banerjee
New Member
Hello,

I am trying to port CESM2 to the Discovery HPC at Dartmouth College. I have successfully run a few short test simulations (about 1-2 years each) with B, C and G compsets on about 240-300 processors. However, I am having issues with scripts_regression_tests.py. I have gone through the forum but have been unable to resolve them, and I would appreciate any advice.

I would like to carry out the prealpha and ensemble tests after this. Given that I have been able to run a few compsets, I was wondering which of the porting tests are absolutely important to pass and which can be ignored due to machine-specific configuration requirements.

What version of the code are you using?
release-cesm2.1.5

Have you made any changes to files in the source tree?
I have made changes to config_machines.xml, config_compilers.xml, config_batch.xml and template.case.run for machine "Homebrew". The blocks showing the changes are in the attached file config_cesm.txt.
I have built zlib, hdf5, netcdf-c and netcdf-fortran using mpicc and mpifort from mpich/4.2.3-intel23. The configuration details (along with my .bashrc) are provided in cesm_compilers.txt.

Describe your problem or question:
I am getting errors from scripts_regression_tests.py in A_RunUnitTests, H_TestMakeMacros and a few tests in Z_FullSystemTest (ERIO.f09_g16.X.homebrew_intel, IRT_N2.f19_g16_rx1.A.homebrew_intel, LDSTA.f45_g37_rx1.A.homebrew_intel, NCK_Ld3.f45_g37_rx1.A.homebrew_intel and PRE.f19_f19.ADESP_TEST.homebrew_intel).
I am running the individual tests within Z_FullSystemTest one by one. The tests NCK_Ld3.f45_g37_rx1.A.homebrew_intel and PRE.f19_f19.ADESP_TEST.homebrew_intel run to completion, but cprnc reports differences in their results.

I am providing error messages for A_RunUnitTests and H_TestMakeMacros below and attaching the logs for the relevant Z_FullSystemTest tests.

(cesm-env) [f0080v3@discovery-01 tests]$ ./scripts_regression_tests.py A_RunUnitTests --machine homebrew --compiler intel

Testing commit 16b5f7de570d2454af282c1bc57f80c8f211293b
Using cime_model = cesm
Testing machine = homebrew
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_011258

test_CIMEXML_doctests (__main__.A_RunUnitTests) ... FAIL
test_CIME_doctests (__main__.A_RunUnitTests) ... ok
test_resolve_variable_name (__main__.A_RunUnitTests) ... ok
test_unittests (__main__.A_RunUnitTests) ... .........................
----------------------------------------------------------------------
Ran 25 tests in 2.736s

OK
ok

======================================================================
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
File "./scripts_regression_tests.py", line 76, in run_cmd_assert_result
test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
COMMAND: PYTHONPATH=/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
FROM_DIR: /dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML
SHOULD HAVE WORKED, INSTEAD GOT STAT 1
OUTPUT: **********************************************************************
File "/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML/machines.py", line 258, in machines.Machines.is_valid_compiler
Failed example:
machobj.is_valid_compiler("gnu")
Expected:
True
Got:
False
**********************************************************************
1 items had failures:
1 of 4 in machines.Machines.is_valid_compiler
***Test Failed*** 1 failures.
ERRPUT:


----------------------------------------------------------------------
Ran 4 tests in 6.183s

FAILED (failures=1)
Detected failures, leaving directory: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_011258

(cesm-env) [f0080v3@discovery-01 tests]$ ./scripts_regression_tests.py H_TestMakeMacros --machine homebrew --compiler intel

Testing commit 16b5f7de570d2454af282c1bc57f80c8f211293b
Using cime_model = cesm
Testing machine = homebrew
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_014959

test_append_flags (__main__.H_TestMakeMacros)
Test appending flags to a list. ... ok
test_append_flags_without_base (__main__.H_TestMakeMacros)
Test appending flags to a value set before Macros is included. ... ok
test_base_flags (__main__.H_TestMakeMacros)
Test that we get "base" compiler flags. ... ok
test_build_time_append_flags (__main__.H_TestMakeMacros)
Test build_time selection of compiler flags. ... ok
test_build_time_attribute (__main__.H_TestMakeMacros)
The macro writer writes conditionals for build-time choices. ... ok
test_build_time_base_flags (__main__.H_TestMakeMacros)
Test selection of base flags based on build-time attributes. ... ok
test_build_time_base_flags_same_parent (__main__.H_TestMakeMacros)
Test selection of base flags in the same parent element. ... ok
test_compiler_changeable_at_build_time (__main__.H_TestMakeMacros)
The macro writer writes information for multiple compilers. ... ok
test_config_reject_cyclical_references (__main__.H_TestMakeMacros)
Test that cyclical <var> references are rejected. ... ok
test_config_reject_self_references (__main__.H_TestMakeMacros)
Test that <var> self-references are rejected. ... ok
test_config_variable_insertion (__main__.H_TestMakeMacros)
Test that <var> elements insert variables from config_build. ... ok
test_env_and_shell_command (__main__.H_TestMakeMacros)
Test that <env> elements work inside <shell> elements. ... ok
test_environment_variable_insertion (__main__.H_TestMakeMacros)
Test that <env> elements insert environment variables. ... FAIL
test_generic_item (__main__.H_TestMakeMacros)
The macro writer can write out a single generic item. ... ok
test_ignore_non_match (__main__.H_TestMakeMacros)
The macro writer ignores an entry with the wrong machine name. ... ok
test_mach_and_os_beats_mach (__main__.H_TestMakeMacros)
The macro writer chooses the most-specific match possible. ... ok
test_mach_beats_os (__main__.H_TestMakeMacros)
The macro writer chooses machine-specific over os-specific matches. ... ok
test_machine_specific_append_flags (__main__.H_TestMakeMacros)
Test appending flags that are either more or less machine-specific. ... ok
test_machine_specific_base_and_append_flags (__main__.H_TestMakeMacros)
Test that machine-specific base flags coexist with machine-specific append flags. ... ok
test_machine_specific_base_flags (__main__.H_TestMakeMacros)
Test selection among base compiler flag sets based on machine. ... ok
test_machine_specific_base_over_append_flags (__main__.H_TestMakeMacros)
Test that machine-specific base flags override default append flags. ... ok
test_machine_specific_item (__main__.H_TestMakeMacros)
The macro writer can pick out a machine-specific item. ... ok
test_multiple_shell_commands (__main__.H_TestMakeMacros)
Test that more than one <shell> element can be used. ... ok
test_os_specific_item (__main__.H_TestMakeMacros)
The macro writer can pick out an OS-specific item. ... ok
test_reject_ambiguous (__main__.H_TestMakeMacros)
The macro writer dies if given an ambiguous set of matches. ... ok
test_reject_duplicate_defaults (__main__.H_TestMakeMacros)
The macro writer dies if given many defaults. ... ok
test_reject_duplicates (__main__.H_TestMakeMacros)
The macro writer dies if given many matches for a given configuration. ... ok
test_shell_command_insertion (__main__.H_TestMakeMacros)
Test that <shell> elements insert shell command output. ... ok
test_variable_insertion_with_machine_specific_setting (__main__.H_TestMakeMacros)
Test that machine-specific <var> dependencies are correct. ... ok

======================================================================
FAIL: test_environment_variable_insertion (__main__.H_TestMakeMacros)
Test that <env> elements insert environment variables.
----------------------------------------------------------------------
Traceback (most recent call last):
File "./scripts_regression_tests.py", line 2591, in test_environment_variable_insertion
env={"NETCDF": "/path/to/netcdf"})
File "./scripts_regression_tests.py", line 2234, in assert_variable_equals
self.parent.assertEqual(self.query_var(var_name, env, var), value)
AssertionError: '-L/dartfs-hpc/rc/home/3/f0080v3/software/zli[182 chars]tcdf' != '-L/path/to/netcdf -lnetcdf'
- -L/dartfs-hpc/rc/home/3/f0080v3/software/zlib/lib -L/dartfs-hpc/rc/home/3/f0080v3/software/hdf5/lib -L/dartfs-hpc/rc/home/3/f0080v3/software/netcdf/lib -L/dartfs-hpc/rc/home/3/f0080v3/software/netcdf/lib -L/path/to/netcdf -lnetcdf
+ -L/path/to/netcdf -lnetcdf


----------------------------------------------------------------------
Ran 29 tests in 3.532s

FAILED (failures=1)
Detected failures, leaving directory: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_014959
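
The extra -L entries that appear ahead of -L/path/to/netcdf in the mismatch above look like the contents of an LDFLAGS variable inherited from the shell. That is only an assumption here, since the .bashrc is in the attachment and not shown. A minimal Python sketch for listing build-related variables that are set in the current environment follows; the variable list and the function name exported_build_vars are illustrative and not part of CIME.

```python
import os

# Variables that make-based builds commonly read from the environment.
# This list is an assumption for illustration, not something CIME defines.
BUILD_VARS = ("LDFLAGS", "CPPFLAGS", "CFLAGS", "FFLAGS", "SLIBS", "NETCDF")


def exported_build_vars(env=None, names=BUILD_VARS):
    """Return {name: value} for each listed variable that is set and non-empty."""
    env = os.environ if env is None else env
    return {name: env[name] for name in names if env.get(name)}


if __name__ == "__main__":
    found = exported_build_vars()
    if not found:
        print("No build-related variables are set in this environment.")
    for name, value in found.items():
        print(f"{name}={value}")
```

If LDFLAGS turns out to hold the zlib/hdf5/netcdf paths, rerunning H_TestMakeMacros from a shell where it is unset would show whether that is the cause.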
 


jedwards

CSEG and Liaisons
Staff member
You should not overwrite the definition of the machine homebrew; instead, make a copy and give it a unique name. I think the easiest way to begin debugging is then to pick one of the system tests, say SMS.f19_g16.A, and get that working before attempting the entire test suite again.
 

P Banerjee

Priyanka Banerjee
New Member
Thank you for your reply. I have now given a unique machine name and the SMS.f19_g16.A and SMS.f19_g16.X tests are working. However, I keep on getting the same error messages with scripts_regression_tests.py. What am I missing here?
 

jedwards

CSEG and Liaisons
Staff member
Now that you have solved that problem, let's focus on one more single issue (not the entire scripts_regression_tests.py). Pick one that you want to solve and post all of the relevant logs and settings.
 

P Banerjee

Priyanka Banerjee
New Member
Alright, I am starting with the first failed test, A_RunUnitTests. All the relevant changes made in config_*.xml are shown in the attached config_cesm_discovery.txt, and the netcdf/hdf5 configuration details are shown in Discovery_hpc_settings.txt. The code version is in git_describe.txt.

I am getting the following error message. It seems that cesm is detecting machine="derecho" from machines.py.
#####################################################################################################################

(cesm-env) [f0080v3@discovery-01 tests]$ ./scripts_regression_tests.py A_RunUnitTests --machine discovery --compiler intel

Testing commit 16b5f7de570d2454af282c1bc57f80c8f211293b
Using cime_model = cesm
Testing machine = discovery
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251124_105013

test_CIMEXML_doctests (__main__.A_RunUnitTests) ... FAIL
test_CIME_doctests (__main__.A_RunUnitTests) ... ok
test_resolve_variable_name (__main__.A_RunUnitTests) ... ok
test_unittests (__main__.A_RunUnitTests) ... .........................
----------------------------------------------------------------------
Ran 25 tests in 2.348s

OK
ok

======================================================================
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
File "./scripts_regression_tests.py", line 76, in run_cmd_assert_result
test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
COMMAND: PYTHONPATH=/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
FROM_DIR: /dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML
SHOULD HAVE WORKED, INSTEAD GOT STAT 1
OUTPUT: **********************************************************************
File "/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML/machines.py", line 258, in machines.Machines.is_valid_compiler
Failed example:
machobj.is_valid_compiler("gnu")
Expected:
True
Got:
False
**********************************************************************
1 items had failures:
1 of 4 in machines.Machines.is_valid_compiler
***Test Failed*** 1 failures.
ERRPUT:


----------------------------------------------------------------------
Ran 4 tests in 5.627s

FAILED (failures=1)
Detected failures, leaving directory: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251124_105013

##########################################################################################################################
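
The failing doctest appears to compare a hard-coded answer against the compilers declared for whichever machine it instantiates, so one thing worth checking is which compilers each machine in the edited config_machines.xml lists. A minimal sketch, assuming the usual layout of machine elements with a MACH attribute and a comma-separated COMPILERS child; the function name and the sample XML are illustrative only.

```python
import xml.etree.ElementTree as ET


def compilers_by_machine(xml_text):
    """Map each machine name (MACH attribute) to its list of declared compilers."""
    root = ET.fromstring(xml_text)
    result = {}
    for machine in root.iter("machine"):
        name = machine.get("MACH")
        text = machine.findtext("COMPILERS", default="")
        result[name] = [c.strip() for c in text.split(",") if c.strip()]
    return result


# Illustrative sample only; not the contents of any real config_machines.xml.
SAMPLE = """
<config_machines>
  <machine MACH="machine_a">
    <COMPILERS>intel,gnu</COMPILERS>
  </machine>
  <machine MACH="machine_b">
    <COMPILERS>intel</COMPILERS>
  </machine>
</config_machines>
"""

if __name__ == "__main__":
    for mach, compilers in compilers_by_machine(SAMPLE).items():
        print(mach, compilers, "gnu" in compilers)
```

To inspect the real file, read config/cesm/machines/config_machines.xml under the cime directory into a string and pass it to the function.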
 


jedwards

CSEG and Liaisons
Staff member
The failures in A_RunUnitTests are on us: the machine cori-haswell was removed from the machine definition without updating the cesm2.1.x tag. Go to the cime directory in your source and run
Code:
git checkout maint-5.6
This will give you the latest cime compatible with cesm2.1.5.
Then try running scripts_regression_tests.py again.
 

P Banerjee

Priyanka Banerjee
New Member
Thanks for your suggestion. However, I am getting the same error message as before with A_RunUnitTests. This is what I did:

(cesm-env) [f0080v3@discovery-01 cime]$ git checkout maint-5.6

M config/cesm/config_inputdata.xml
M config/cesm/machines/config_batch.xml
M config/cesm/machines/config_compilers.xml
M config/cesm/machines/config_machines.xml
M config/cesm/machines/template.case.run
branch 'maint-5.6' set up to track 'origin/maint-5.6'.
Switched to a new branch 'maint-5.6'

(cesm-env) [f0080v3@discovery-01 cime]$ git status
On branch maint-5.6
Your branch is up to date with 'origin/maint-5.6'.

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: config/cesm/config_inputdata.xml
modified: config/cesm/machines/config_batch.xml
modified: config/cesm/machines/config_compilers.xml
modified: config/cesm/machines/config_machines.xml
modified: config/cesm/machines/template.case.run

Untracked files:
(use "git add <file>..." to include in what will be committed)
config/cesm/machines/config_batch.xml.new
config/cesm/machines/config_compilers.xml.new
config/cesm/machines/config_machines.xml.new
config/cesm/machines/template.case.run.new
config/cesm/machines/template.case.run.old
scripts/jraeco/
scripts/prealpha_list

no changes added to commit (use "git add" and/or "git commit -a")
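
The M lines printed by the checkout indicate that the local edits were carried over onto maint-5.6. A small sketch for splitting `git status --porcelain` output into modified and untracked paths is shown here; the function name is illustrative and the sample text is abbreviated from the listing above.

```python
def summarize_porcelain(text):
    """Split `git status --porcelain` output into modified and untracked paths."""
    modified, untracked = [], []
    for line in text.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        status, path = line[:2], line[3:]
        if status == "??":
            untracked.append(path)
        elif "M" in status:
            modified.append(path)
    return modified, untracked


# Abbreviated sample in porcelain format (two status characters, a space, the path).
SAMPLE = (
    " M config/cesm/machines/config_machines.xml\n"
    " M config/cesm/machines/config_compilers.xml\n"
    "?? config/cesm/machines/config_machines.xml.new\n"
)

if __name__ == "__main__":
    mod, new = summarize_porcelain(SAMPLE)
    print("modified:", mod)
    print("untracked:", new)
```

Real input could come from `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout`, run inside the cime directory.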
 

jedwards

CSEG and Liaisons
Staff member
Just to confirm - you are getting this same error after the update?

Code:
======================================================================
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "./scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
   run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
 File "./scripts_regression_tests.py", line 76, in run_cmd_assert_result
   test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
COMMAND: PYTHONPATH=/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
FROM_DIR: /dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML
SHOULD HAVE WORKED, INSTEAD GOT STAT 1
OUTPUT: **********************************************************************
File "/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML/machines.py", line 258, in machines.Machines.is_valid_compiler
Failed example:
   machobj.is_valid_compiler("gnu")
Expected:
   True
Got:
   False

Try again after setting env variable CIME_MODEL=cesm.
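
A variable only reaches a program started from the shell if it is exported into that program's environment. As a minimal sketch, the following checks what a child Python process reads when a variable is passed explicitly; the helper name child_sees is illustrative and not part of CIME.

```python
import os
import subprocess
import sys


def child_sees(name, value):
    """Run a child Python process with name=value set and return what it reads."""
    env = dict(os.environ, **{name: value})
    out = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ.get({name!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


if __name__ == "__main__":
    print(child_sees("CIME_MODEL", "cesm"))
```

The same `env=` pattern could be used to launch a test script with CIME_MODEL set for a single run, without editing ~/.bashrc.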
 

P Banerjee

Priyanka Banerjee
New Member
Yes, I am getting the same error after git checkout maint-5.6. The error persists even after adding CIME_MODEL=cesm to ~/.bashrc, although scripts_regression_tests.py reports that cime_model = cesm is being used.

Using cime_model = cesm
Testing machine = discovery
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251126_011332

I have been trying to solve this for quite a few days now and am really confused about what I am missing.
 