
FAIL: test_full_system

ntandon

Neil Tandon
Member
I am attempting to run scripts_regression_test.py, and test_full_system is failing with the output in test_full_system.txt (attached). The error is "Wrong type for entry id 'NTASKS'", but I don't know why I am getting it. Do I need to set --ntasks in config_batch.xml, and if so, how? The slurm block in config_batch.xml appears to set --ntasks-per-node automatically, which I thought was sufficient.
 

Attachments

  • config_batch.xml.txt
    564 bytes · Views: 1
  • config_compilers.xml.txt
    652 bytes · Views: 0
  • config_machines.xml.txt
    7 KB · Views: 0
  • describe_version.txt
    4.7 KB · Views: 0
  • test_full_system.txt
    26.2 KB · Views: 0

ntandon

Neil Tandon
Member
For what it's worth, I get the same error when running the standalone test ./create_test SEQ_Ln9.f19_g16_rx1.A.cedar_intel. The output of TestStatus.log is below. There may be an issue with the creation of files in the case2 subfolder, but they all get wiped out after the build fails, so I can't tell what is going on.

---------------------------------------------------
2020-12-02 17:08:20: SHAREDLIB_BUILD FAILED for test 'SEQ_Ln9.f19_g16_rx1.A.cedar_intel'.
Command: ./case.build --sharedlib-only
Output: b"WARNING: Found difference in test STOP_OPTION: case: nsteps original value ndays\n Successfully created new case SEQ_
Ln9.f19_g16_rx1.A.cedar_intel.20201202_170755_mcicpa from clone case SEQ_Ln9.f19_g16_rx1.A.cedar_intel.20201202_170755_mcicpa \
nSetting resource.RLIMIT_STACK to 16384000 from (8388608, 2147483648)\njob is case.test USER_REQUESTED_WALLTIME None USER_REQUE
STED_QUEUE None WALLTIME_FORMAT %H:%M:%S\nCreating batch scripts\nWriting case.test script from input template /home/ntandon/my
_cesm_sandbox/cime/config/cesm/machines/template.case.test\nCreating file .case.test\nWriting case.st_archive script from input
template /home/ntandon/my_cesm_sandbox/cime/config/cesm/machines/template.st_archive\nCreating file case.st_archive\nIf an old
case build already exists, might want to run 'case.build --clean' before building\nYou can now run './preview_run' to get more
info on how your case will be run\nSetting resource.RLIMIT_STACK to 16384000 from (16384000, 2147483648)\nSuccessfully cleaned
batch script .case.test\njob is case.test USER_REQUESTED_WALLTIME None USER_REQUESTED_QUEUE None WALLTIME_FORMAT %H:%M:%S\nCre
ating batch scripts\nWriting case.test script from input template /home/ntandon/my_cesm_sandbox/cime/config/cesm/machines/templ
ate.case.test\nCreating file .case.test\nWriting case.st_archive script from input template /home/ntandon/my_cesm_sandbox/cime/
config/cesm/machines/template.st_archive\nCreating file case.st_archive\nIf an old case build already exists, might want to run
'case.build --clean' before building\nYou can now run './preview_run' to get more info on how your case will be run\nSetting r
esource.RLIMIT_STACK to 16384000 from (16384000, 2147483648)\nSetting resource.RLIMIT_STACK to 16384000 from (16384000, 2147483
648)\nBuilding test for SEQ in directory /scratch/ntandon/cesm2_1_3/SEQ_Ln9.f19_g16_rx1.A.cedar_intel.20201202_170755_mcicpa\n/
scratch/ntandon/cesm2_1_3/SEQ_Ln9.f19_g16_rx1.A.cedar_intel.20201202_170755_mcicpa/case2/SEQ_Ln9.f19_g16_rx1.A.cedar_intel.2020
1202_170755_mcicpa/env_mach_specific.xml already exists, delete to replace\nWARNING: Test case setup failed. Case2 has been rem
oved, but the main case may be in an inconsistent state. If you want to rerun this test, you should create a new test rather th
an trying to rerun this one.\nERROR: Wrong type for entry id 'NTASKS'"
 

ntandon

Neil Tandon
Member
I finally got the test to pass, but it involved a bit of hacking. I first had to run case.build under python2, which failed, then run it again under python3, which completed fine, and then run case.submit. If I ran case.build entirely under python3 (without the python2 attempt first), it failed because the xml files generated under the case2 subdirectory had problems. So it appears that the code that generates the case2 clone needs to be updated to be python3-friendly.
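For anyone hitting the same message: I have not confirmed the root cause, but one common python2-to-python3 difference that can trigger this kind of type error is division. Under python3, "/" on two integers returns a float, so a task count computed that way would no longer be an integer. The sketch below is only an illustration of that mechanism. The set_value function is a hypothetical stand-in for a type-checked setter, not CIME's actual code, and the value 8 is made up.

```python
def set_value(entry_id, value):
    """Hypothetical type-checked setter (an assumption, not CIME's real API)."""
    # Accept only plain integers; a float such as 4.0 is rejected.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("Wrong type for entry id '{}'".format(entry_id))
    return value


ntasks = 8  # made-up task count for illustration

# Under python3, "/" is true division and yields a float (4.0).
# Under python2, the same expression on two ints yields an int (4).
halved_true_div = ntasks / 2

# "//" is floor division and yields an int under both versions.
halved_floor_div = ntasks // 2

try:
    set_value("NTASKS", halved_true_div)
    true_div_accepted = True
except TypeError:
    # Reached under python3, because halved_true_div is a float there.
    true_div_accepted = False

# The floor-division result passes the integer check.
floor_div_result = set_value("NTASKS", halved_floor_div)
```

If something like this is what happens when the case2 clone is set up, switching to "//" or wrapping the result in int() would be the usual fix, but that is a guess on my part.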
 