
Verifying port with prealpha test and scripts_regression_tests.py

jonwells04

Jon Wells
New Member
CESM 2.2.0 installed on Linux 8.4 (Ootpa), Slurm version 20.11.7

The scripts_regression_tests.py results show 5 failures (output attached):
  1. test_cime_case_test_custom_project (__main__.K_TestCimeCase): the test asks for --machine mappy, which doesn't exist here, so it fails
  2. test_bless_test_results (__main__.Q_TestBlessTestResults): finishes with status 'DIFF'
  3. test_run_restart (__main__.T_TestRunRestart): finishes with status 'DIFF'
  4. test_run_restart_too_many_fails (__main__.T_TestRunRestart): no clear information in the failure message
  5. test_full_system (__main__.Z_FullSystemTest): everything builds and submits, and the jobs generally show as completed on the batch system, but the test result is a failure
Results from ./create_test --xml-category prealpha --xml-machine cheyenne --xml-compiler intel --machine wind --compiler gnu:
  1. Everything seems to have been built and submitted to the Slurm scheduler.
  2. Our cluster is too small to run all the tests in a timely manner, so I've had to cancel many of them in the queue.
  3. The small jobs passed; some of the larger tests failed for several reasons:
    1. missing input data
    2. run timeouts
    3. not enough resources
  4. Overall, the port seems to be working for what we will run.
Is it essential that the scripts_regression_tests.py tests pass? Guidance on where to start troubleshooting would be greatly appreciated!

Thanks!
 

Attachments

  • scripts_regression_test_results_64cpu-per-node.txt
    82.2 KB
  • config_machines.zip
    27 KB

jedwards

CSEG and Liaisons
Staff member
If you are satisfied that the port is complete for the problem you will run, you're probably okay.

From the tests that fail, it looks like you should at least make sure a restart test is working correctly.
For example, try running create_test ERS.f19_g17.A
 

jonwells04

Jon Wells
New Member
Thanks for the quick reply Jim!

I ran the ERS.f19_g17.A test; results are attached. It seems that STOP_N is causing a failure in MODEL_BUILD.

How do I ensure the original and new STOP_N values line up for the test suite? We normally edit this value for each case, and I know how to use ./xmlchange, but I don't know where to standardize the value for the test suite.

Thanks!
 

Attachments

  • TestStatus.zip
    2.4 KB

jedwards

CSEG and Liaisons
Staff member
You have a compile error:
ERROR: BUILD FAIL: buildexe failed, cat /scratch/jw2636/cesmoutput/ERS.f19_g17.A.wind_gnu.20220217_104227_21ppl6/bld/cesm.bldlog.220217-104308
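
For anyone reading along, here is a minimal Python sketch of how one might find the first non-passing phase in a TestStatus file. It assumes each line has the form `STATUS test_name PHASE [comment]`, which is an assumption about the file layout and not something confirmed in this thread. `first_failure` is a hypothetical helper, and the sample lines are illustrative only, not copied from the attached TestStatus.

```python
def first_failure(text):
    """Return (phase, status, comment) for the first phase that is not PASS, or None."""
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        status, _test, phase = parts[:3]
        comment = parts[3].strip() if len(parts) > 3 else ""
        if status != "PASS":
            return phase, status, comment
    return None


# Illustrative content only (assumed format, not the attached file)
sample = """PASS ERS.f19_g17.A.wind_gnu CREATE_NEWCASE
PASS ERS.f19_g17.A.wind_gnu XML
PASS ERS.f19_g17.A.wind_gnu SETUP
FAIL ERS.f19_g17.A.wind_gnu MODEL_BUILD time=41
"""

print(first_failure(sample))
```

If the first non-passing phase is a build phase, the build log named in the error message is the next place to look, before any run-time settings such as STOP_N.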
 

jonwells04

Jon Wells
New Member
The build log is attached. Thank you for spotting these errors!
 

Attachments

  • cesm.bldlog.zip
    2.4 KB

jedwards

CSEG and Liaisons
Staff member
Can you not read the file and find these errors?

/usr/bin/ld: cannot find -llapack
/usr/bin/ld: cannot find -lblas
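
A small Python sketch of how these lines could be picked out of a long build log automatically. It only assumes the linker message contains the text `cannot find -l<name>`, as in the two lines above; `missing_libraries` is a hypothetical helper, not part of CIME.

```python
import re

# Matches GNU ld messages such as "/usr/bin/ld: cannot find -llapack".
# The character class stops at whitespace or ':' in case a suffix follows the name.
MISSING_LIB = re.compile(r"cannot find -l([^\s:]+)")


def missing_libraries(log_text):
    """Return the library names the linker could not find, in order, without duplicates."""
    seen = []
    for match in MISSING_LIB.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


log = """/usr/bin/ld: cannot find -llapack
/usr/bin/ld: cannot find -lblas
"""

print(missing_libraries(log))
```

Here the result names lapack and blas, meaning the linker could not locate LAPACK and BLAS libraries on its search path for this machine and compiler.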
 