Issues with scripts_regression_tests.py

P Banerjee

Priyanka Banerjee
New Member
Hello,

I am trying to port CESM2 to the Discovery HPC at Dartmouth College. I have been able to successfully run a few short test simulations with B, C and G compsets for about 1-2 years using about 240-300 processors. However, I am having issues with scripts_regression_tests.py. I have gone through the forum, but am unable to resolve these issues and appreciate any advice on this.

I would like to carry out the prealpha and ensemble tests after this. Given that I have been able to run a few compsets, I was wondering which of the porting tests are absolutely important to pass and which can be ignored due to machine-specific configuration requirements.

What version of the code are you using?
release-cesm2.1.5

Have you made any changes to files in the source tree?
I have made changes to config_machines.xml, config_compilers.xml, config_batch.xml and template.case.run for machine "Homebrew". The blocks showing the changes are in the attached file config_cesm.txt.
I have built zlib, hdf5, netcdf-c and netcdf-fortran using mpicc and mpifort from mpich/4.2.3-intel23. The configuration details (along with my .bashrc) are provided in cesm_compilers.txt

Describe your problem or question:
I am getting errors with scripts_regression_tests.py in A_RunUnitTests, H_TestMakeMacros and a few of the tests in Z_FullSystemTest (ERIO.f09_g16.X.homebrew_intel, IRT_N2.f19_g16_rx1.A.homebrew_intel, LDSTA.f45_g37_rx1.A.homebrew_intel, NCK_Ld3.f45_g37_rx1.A.homebrew_intel and PRE.f19_f19.ADESP_TEST.homebrew_intel).
I am running the individual tests within Z_FullSystemTest one by one. The tests NCK_Ld3.f45_g37_rx1.A.homebrew_intel and PRE.f19_f19.ADESP_TEST.homebrew_intel are running successfully, but results are showing differences with cprnc.

I am providing error messages for A_RunUnitTests and H_TestMakeMacros below and attaching the logs for the relevant Z_FullSystemTest tests.

(cesm-env) [f0080v3@discovery-01 tests]$ ./scripts_regression_tests.py A_RunUnitTests --machine homebrew --compiler intel

Testing commit 16b5f7de570d2454af282c1bc57f80c8f211293b
Using cime_model = cesm
Testing machine = homebrew
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_011258

test_CIMEXML_doctests (__main__.A_RunUnitTests) ... FAIL
test_CIME_doctests (__main__.A_RunUnitTests) ... ok
test_resolve_variable_name (__main__.A_RunUnitTests) ... ok
test_unittests (__main__.A_RunUnitTests) ... .........................
----------------------------------------------------------------------
Ran 25 tests in 2.736s

OK
ok

======================================================================
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
File "./scripts_regression_tests.py", line 76, in run_cmd_assert_result
test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
COMMAND: PYTHONPATH=/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
FROM_DIR: /dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML
SHOULD HAVE WORKED, INSTEAD GOT STAT 1
OUTPUT: **********************************************************************
File "/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML/machines.py", line 258, in machines.Machines.is_valid_compiler
Failed example:
machobj.is_valid_compiler("gnu")
Expected:
True
Got:
False
**********************************************************************
1 items had failures:
1 of 4 in machines.Machines.is_valid_compiler
***Test Failed*** 1 failures.
ERRPUT:


----------------------------------------------------------------------
Ran 4 tests in 6.183s

FAILED (failures=1)
Detected failures, leaving directory: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_011258

(cesm-env) [f0080v3@discovery-01 tests]$ ./scripts_regression_tests.py H_TestMakeMacros --machine homebrew --compiler intel

Testing commit 16b5f7de570d2454af282c1bc57f80c8f211293b
Using cime_model = cesm
Testing machine = homebrew
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_014959

test_append_flags (__main__.H_TestMakeMacros)
Test appending flags to a list. ... ok
test_append_flags_without_base (__main__.H_TestMakeMacros)
Test appending flags to a value set before Macros is included. ... ok
test_base_flags (__main__.H_TestMakeMacros)
Test that we get "base" compiler flags. ... ok
test_build_time_append_flags (__main__.H_TestMakeMacros)
Test build_time selection of compiler flags. ... ok
test_build_time_attribute (__main__.H_TestMakeMacros)
The macro writer writes conditionals for build-time choices. ... ok
test_build_time_base_flags (__main__.H_TestMakeMacros)
Test selection of base flags based on build-time attributes. ... ok
test_build_time_base_flags_same_parent (__main__.H_TestMakeMacros)
Test selection of base flags in the same parent element. ... ok
test_compiler_changeable_at_build_time (__main__.H_TestMakeMacros)
The macro writer writes information for multiple compilers. ... ok
test_config_reject_cyclical_references (__main__.H_TestMakeMacros)
Test that cyclical <var> references are rejected. ... ok
test_config_reject_self_references (__main__.H_TestMakeMacros)
Test that <var> self-references are rejected. ... ok
test_config_variable_insertion (__main__.H_TestMakeMacros)
Test that <var> elements insert variables from config_build. ... ok
test_env_and_shell_command (__main__.H_TestMakeMacros)
Test that <env> elements work inside <shell> elements. ... ok
test_environment_variable_insertion (__main__.H_TestMakeMacros)
Test that <env> elements insert environment variables. ... FAIL
test_generic_item (__main__.H_TestMakeMacros)
The macro writer can write out a single generic item. ... ok
test_ignore_non_match (__main__.H_TestMakeMacros)
The macro writer ignores an entry with the wrong machine name. ... ok
test_mach_and_os_beats_mach (__main__.H_TestMakeMacros)
The macro writer chooses the most-specific match possible. ... ok
test_mach_beats_os (__main__.H_TestMakeMacros)
The macro writer chooses machine-specific over os-specific matches. ... ok
test_machine_specific_append_flags (__main__.H_TestMakeMacros)
Test appending flags that are either more or less machine-specific. ... ok
test_machine_specific_base_and_append_flags (__main__.H_TestMakeMacros)
Test that machine-specific base flags coexist with machine-specific append flags. ... ok
test_machine_specific_base_flags (__main__.H_TestMakeMacros)
Test selection among base compiler flag sets based on machine. ... ok
test_machine_specific_base_over_append_flags (__main__.H_TestMakeMacros)
Test that machine-specific base flags override default append flags. ... ok
test_machine_specific_item (__main__.H_TestMakeMacros)
The macro writer can pick out a machine-specific item. ... ok
test_multiple_shell_commands (__main__.H_TestMakeMacros)
Test that more than one <shell> element can be used. ... ok
test_os_specific_item (__main__.H_TestMakeMacros)
The macro writer can pick out an OS-specific item. ... ok
test_reject_ambiguous (__main__.H_TestMakeMacros)
The macro writer dies if given an ambiguous set of matches. ... ok
test_reject_duplicate_defaults (__main__.H_TestMakeMacros)
The macro writer dies if given many defaults. ... ok
test_reject_duplicates (__main__.H_TestMakeMacros)
The macro writer dies if given many matches for a given configuration. ... ok
test_shell_command_insertion (__main__.H_TestMakeMacros)
Test that <shell> elements insert shell command output. ... ok
test_variable_insertion_with_machine_specific_setting (__main__.H_TestMakeMacros)
Test that machine-specific <var> dependencies are correct. ... ok

======================================================================
FAIL: test_environment_variable_insertion (__main__.H_TestMakeMacros)
Test that <env> elements insert environment variables.
----------------------------------------------------------------------
Traceback (most recent call last):
File "./scripts_regression_tests.py", line 2591, in test_environment_variable_insertion
env={"NETCDF": "/path/to/netcdf"})
File "./scripts_regression_tests.py", line 2234, in assert_variable_equals
self.parent.assertEqual(self.query_var(var_name, env, var), value)
AssertionError: '-L/dartfs-hpc/rc/home/3/f0080v3/software/zli[182 chars]tcdf' != '-L/path/to/netcdf -lnetcdf'
- -L/dartfs-hpc/rc/home/3/f0080v3/software/zlib/lib -L/dartfs-hpc/rc/home/3/f0080v3/software/hdf5/lib -L/dartfs-hpc/rc/home/3/f0080v3/software/netcdf/lib -L/dartfs-hpc/rc/home/3/f0080v3/software/netcdf/lib -L/path/to/netcdf -lnetcdf
+ -L/path/to/netcdf -lnetcdf


----------------------------------------------------------------------
Ran 29 tests in 3.532s

FAILED (failures=1)
Detected failures, leaving directory: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251120_014959
 

Attachments

  • config_cesm.txt
    4.2 KB · Views: 1
  • PRE.f19_f19.ADESP_TEST.homebrew_intel.cpl.hi.0001-01-01-18000.nc.base.cprnc.out.txt
    108.2 KB · Views: 0
  • NCK_Ld3.f45_g37_rx1.A.homebrew_intel.cpl.hi.0001-01-04-00000.nc.base.cprnc.out.txt
    58.6 KB · Views: 0
  • LDSTA.f45_g37_rx1.A.homebrew_intel.log.txt
    1.2 KB · Views: 0
  • IRT_N2.f19_g16_rx1.A.homebrew_intel.log.txt
    15.8 KB · Views: 0
  • ERIO.f09_g16.X.homebrew_intel.log.txt
    8.5 KB · Views: 0
  • cesm_compilers.txt
    8.6 KB · Views: 1

jedwards

CSEG and Liaisons
Staff member
You should not overwrite the definition of the machine homebrew - instead make a copy and give it a unique name. Then I think that the easiest way to begin debugging is to pick one of the system tests - say SMS.f19_g16.A and get that working before attempting the entire test suite again.
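A sketch of running that single test, assuming it is launched from cime/scripts and that the copied machine entry was named `mymachine` (a hypothetical name; substitute whatever unique name you chose). The compiler is taken from the original post.

```shell
# Hypothetical name for the copied machine entry in config_machines.xml.
MACHINE=mymachine
COMPILER=intel

# Run just one smoke test instead of the whole regression suite.
# The guard keeps the snippet harmless when run outside cime/scripts.
if [ -x ./create_test ]; then
    ./create_test SMS.f19_g16.A --machine "$MACHINE" --compiler "$COMPILER"
else
    echo "create_test not found; run this from cime/scripts" >&2
fi
```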
 

P Banerjee

Priyanka Banerjee
New Member
Thank you for your reply. I have now given a unique machine name and the SMS.f19_g16.A and SMS.f19_g16.X tests are working. However, I keep on getting the same error messages with scripts_regression_tests.py. What am I missing here?
 

jedwards

CSEG and Liaisons
Staff member
Now that you have solved that problem, let's focus on another single issue (not the entire scripts_regression_tests.py). Pick one that you want to solve and post all of the relevant logs and settings.
 

P Banerjee

Priyanka Banerjee
New Member
Alright, I am starting with the first failed test, A_RunUnitTests. All the relevant changes made in config_*.xml are shown in the attached config_cesm_discovery.txt, and the netcdf/hdf5 configuration details are shown in Discovery_hpc_settings.txt. The code version is in git_describe.txt.

I am getting the following error message. It seems that cesm is detecting machine="derecho" from machines.py
#####################################################################################################################

(cesm-env) [f0080v3@discovery-01 tests]$ ./scripts_regression_tests.py A_RunUnitTests --machine discovery --compiler intel

Testing commit 16b5f7de570d2454af282c1bc57f80c8f211293b
Using cime_model = cesm
Testing machine = discovery
Testing compiler = intel
Test root: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251124_105013

test_CIMEXML_doctests (__main__.A_RunUnitTests) ... FAIL
test_CIME_doctests (__main__.A_RunUnitTests) ... ok
test_resolve_variable_name (__main__.A_RunUnitTests) ... ok
test_unittests (__main__.A_RunUnitTests) ... .........................
----------------------------------------------------------------------
Ran 25 tests in 2.348s

OK
ok

======================================================================
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
File "./scripts_regression_tests.py", line 76, in run_cmd_assert_result
test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
COMMAND: PYTHONPATH=/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
FROM_DIR: /dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML
SHOULD HAVE WORKED, INSTEAD GOT STAT 1
OUTPUT: **********************************************************************
File "/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML/machines.py", line 258, in machines.Machines.is_valid_compiler
Failed example:
machobj.is_valid_compiler("gnu")
Expected:
True
Got:
False
**********************************************************************
1 items had failures:
1 of 4 in machines.Machines.is_valid_compiler
***Test Failed*** 1 failures.
ERRPUT:


----------------------------------------------------------------------
Ran 4 tests in 5.627s

FAILED (failures=1)
Detected failures, leaving directory: /dartfs/rc/lab/C/ClayCarbon/cesm/cases/scripts_regression_test.20251124_105013

##########################################################################################################################
 

Attachments

  • config_cesm_discovery.txt
    5.4 KB · Views: 0
  • Discovery_hpc_settings.txt
    8.6 KB · Views: 0
  • git_describe.txt
    6.2 KB · Views: 0

jedwards

CSEG and Liaisons
Staff member
The failures in A_RunUnitTests are on us: the machine cori-haswell was removed from the machine definition without updating the cesm2.1.x tag. Go to the cime directory in your source and run
Code:
git checkout maint-5.6
This will give you the latest cime compatible with cesm2.1.5.
Then try running scripts_regression_tests.py again.
 

P Banerjee

Priyanka Banerjee
New Member
Thanks for your suggestion. However, I am getting the same error message as before with A_RunUnitTests. This is what I did:

(cesm-env) [f0080v3@discovery-01 cime]$ git checkout maint-5.6

M config/cesm/config_inputdata.xml
M config/cesm/machines/config_batch.xml
M config/cesm/machines/config_compilers.xml
M config/cesm/machines/config_machines.xml
M config/cesm/machines/template.case.run
branch 'maint-5.6' set up to track 'origin/maint-5.6'.
Switched to a new branch 'maint-5.6'

(cesm-env) [f0080v3@discovery-01 cime]$ git status
On branch maint-5.6
Your branch is up to date with 'origin/maint-5.6'.

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: config/cesm/config_inputdata.xml
modified: config/cesm/machines/config_batch.xml
modified: config/cesm/machines/config_compilers.xml
modified: config/cesm/machines/config_machines.xml
modified: config/cesm/machines/template.case.run

Untracked files:
(use "git add <file>..." to include in what will be committed)
config/cesm/machines/config_batch.xml.new
config/cesm/machines/config_compilers.xml.new
config/cesm/machines/config_machines.xml.new
config/cesm/machines/template.case.run.new
config/cesm/machines/template.case.run.old
scripts/jraeco/
scripts/prealpha_list

no changes added to commit (use "git add" and/or "git commit -a")
 

jedwards

CSEG and Liaisons
Staff member
Just to confirm - you are getting this same error after the update?

Code:
======================================================================
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "./scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
   run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
 File "./scripts_regression_tests.py", line 76, in run_cmd_assert_result
   test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
COMMAND: PYTHONPATH=/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
FROM_DIR: /dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML
SHOULD HAVE WORKED, INSTEAD GOT STAT 1
OUTPUT: **********************************************************************
File "/dartfs-hpc/rc/home/3/f0080v3/my_cesm_sandbox/cime/scripts/lib/CIME/XML/machines.py", line 258, in machines.Machines.is_valid_compiler
Failed example:
   machobj.is_valid_compiler("gnu")
Expected:
   True
Got:
   False

Try again after setting env variable CIME_MODEL=cesm.
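
A minimal sketch of that suggestion, assuming a bash-like shell and that the command is run from the same cime/scripts/tests directory as the earlier attempts. The `-x` guard is an addition so that the snippet does nothing outside that directory; the test command itself is copied from the earlier post.

```shell
# Set the model for the current shell session.
export CIME_MODEL=cesm

# Re-run only the failing test class, with the same machine and compiler as before.
if [ -x ./scripts_regression_tests.py ]; then
    ./scripts_regression_tests.py A_RunUnitTests --machine discovery --compiler intel
fi
```

The variable only lasts for the current session unless it is also added to a shell startup file such as .bashrc.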
 