test_full_system error when running scripts_regression_tests.py

huazhen · Jun 25, 2019

I am trying to run CESM2 on a supercomputer, using the intel/2017.u2 compiler. I get an error when I run scripts_regression_tests.py Z_FullSystemTest. The output points to the following errors:

test_full_system (__main__.Z_FullSystemTest) ... FAIL
Errors were:
[b'Building test for SEQ in directory /data/cephfs/punim0769/scripts_regression_test.20190624_154119/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.fake_testing_only_20190624_154218',
 b'/data/cephfs/punim0769/scripts_regression_test.20190624_154119/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.fake_testing_only_20190624_154218/case2/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.fake_testing_only_20190624_154218/env_mach_specific.xml already exists, delete to replace',
 b'WARNING: Test case setup failed. Case2 has been removed, but the main case may be in an inconsistent state. If you want to rerun this test, you should create a new test rather than trying to rerun this one.',
 b"ERROR: Wrong type for entry id 'NTASKS'"]

I guess the problem may be caused by the baselines setting in $HOME/.cime/config_machines.xml. Is cesm_baselines created automatically? I set the path manually because I did not know how it should be obtained. I will attach my $HOME/.cime/config_compiler.xml, config_machines.xml, config_batch.xml and the full output. Any help is much appreciated. Thanks a lot.

jedwards · Jun 25, 2019

BASELINE_ROOT is an output directory for tests; you just need to create a directory and make it writable.
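As a minimal sketch of "create a directory and make it writable", the following Python snippet creates a directory and confirms that a file can actually be written into it. The function name `ensure_writable_dir` and the example path are hypothetical and chosen only for illustration; the real location would be whatever BASELINE_ROOT is set to in config_machines.xml.

```python
import os
import tempfile


def ensure_writable_dir(path):
    """Create path (and any missing parents); return True if a file can be created in it."""
    os.makedirs(path, exist_ok=True)
    try:
        # Creating a temporary file is a more reliable check than inspecting
        # permission bits, because it also covers quotas and read-only mounts.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical stand-in for the directory named by BASELINE_ROOT.
    baseline_root = os.path.join(tempfile.gettempdir(), "cesm_baselines_example")
    print(ensure_writable_dir(baseline_root))
```

Running this as the same user that runs the tests matters, since the check only reflects the permissions of the calling account.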

huazhen · Jun 25, 2019

Thanks a lot for your reply. I had already created this directory before I ran the test, so the issue is not caused by the baseline setting. Do you have any suggestions about the failed test I mentioned above?

jedwards · Jun 25, 2019

Perhaps try this test in a standalone manner to better see the error. To do this, go to the scripts directory and run ./create_test SEQ_Ln9.f19_g16_rx1.A.spartan_intel
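For readers unfamiliar with the test name passed to create_test, here is a rough Python sketch that splits it into its dot-separated parts. It assumes the usual CIME layout of TESTCASE[_MODIFIERS].GRID.COMPSET[.MACHINE_COMPILER] and that the machine and compiler are separated by the last underscore; `split_test_name` is a hypothetical helper written for this illustration, not a CIME function.

```python
def split_test_name(name):
    """Split a CIME-style test name into its parts (assumed layout, see lead-in)."""
    parts = name.split(".")
    if len(parts) < 3:
        raise ValueError("expected at least TESTCASE.GRID.COMPSET, got %r" % name)

    # The first field holds the test type plus optional modifiers, e.g. SEQ_Ln9.
    testcase, *modifiers = parts[0].split("_")
    result = {
        "testcase": testcase,
        "modifiers": modifiers,
        "grid": parts[1],
        "compset": parts[2],
        "machine": None,
        "compiler": None,
    }

    if len(parts) > 3:
        # Assumption: the compiler is whatever follows the last underscore.
        machine, sep, compiler = parts[3].rpartition("_")
        if sep:
            result["machine"], result["compiler"] = machine, compiler
        else:
            result["machine"] = parts[3]
    return result


if __name__ == "__main__":
    print(split_test_name("SEQ_Ln9.f19_g16_rx1.A.spartan_intel"))
```

For the name used in this thread, this gives test type SEQ with modifier Ln9, grid f19_g16_rx1, compset A, machine spartan and compiler intel.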

huazhen · Jun 26, 2019

Thanks for your reply. When I run ./create_test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in the scripts directory, I get the messages below. It seems that no error occurs when the test is run in a standalone manner. But I get the same error messages as mentioned above when I run scripts_regression_tests.py Z_FullSystemTest again. I have no idea how to fix this issue. Do you have any suggestions? Thanks a lot.

Testnames: ['SEQ_Ln9.f19_g16_rx1.A.spartan_intel']
No project info available
Creating test directory /data/cephfs/punim0769/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.20190626_134757_d534jk
RUNNING TESTS:
  SEQ_Ln9.f19_g16_rx1.A.spartan_intel
Starting CREATE_NEWCASE for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel with 1 procs
Finished CREATE_NEWCASE for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in 4.632281 seconds (PASS)
Starting XML for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel with 1 procs
Finished XML for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in 33.951805 seconds (PASS)
Starting SETUP for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel with 1 procs
Finished SETUP for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in 16.785348 seconds (PASS)
Starting SHAREDLIB_BUILD for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel with 1 procs
Finished SHAREDLIB_BUILD for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in 451.944163 seconds (PASS)
Starting MODEL_BUILD for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel with 4 procs
Finished MODEL_BUILD for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in 227.837094 seconds (PASS)
Starting RUN for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel with 1 proc on interactive node and 32 procs on compute nodes
Finished RUN for test SEQ_Ln9.f19_g16_rx1.A.spartan_intel in 15.414153 seconds (PEND). [COMPLETED 1 of 1]
Due to presence of batch system, create_test will exit before tests are complete.
To force create_test to wait for full completion, use --wait
At test-scheduler close, state is:
PEND SEQ_Ln9.f19_g16_rx1.A.spartan_intel RUN
    Case dir: /data/cephfs/punim0769/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.20190626_134757_d534jk
test-scheduler took 753.832948923 seconds

jedwards · Jun 26, 2019

Check the files /data/cephfs/punim0769/scripts_regression_test.20190624_154119/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.fake_testing_only_20190624_154218/TestStatus and TestStatus.log for errors during the run phase. I would not expect this test to behave differently in scripts_regression_tests than in standalone.
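To compare a failing and a working case quickly, a small Python sketch like the one below can list the phases recorded in a TestStatus file and report the first one that is not PASS. It assumes each line has the form "STATUS test_name PHASE [comment]", which matches the "PEND SEQ_Ln9.f19_g16_rx1.A.spartan_intel RUN" line printed by create_test in this thread. The sample text is made up from the phases shown in that output and is not the content of an actual TestStatus file; `parse_test_status` and `first_non_pass` are hypothetical helper names.

```python
# Illustrative sample only, modelled on the create_test output in this thread.
SAMPLE = """\
PASS SEQ_Ln9.f19_g16_rx1.A.spartan_intel CREATE_NEWCASE
PASS SEQ_Ln9.f19_g16_rx1.A.spartan_intel XML
PASS SEQ_Ln9.f19_g16_rx1.A.spartan_intel SETUP
PASS SEQ_Ln9.f19_g16_rx1.A.spartan_intel SHAREDLIB_BUILD
PASS SEQ_Ln9.f19_g16_rx1.A.spartan_intel MODEL_BUILD
PEND SEQ_Ln9.f19_g16_rx1.A.spartan_intel RUN
"""


def parse_test_status(text):
    """Return a list of (phase, status, comment) tuples, one per well-formed line."""
    phases = []
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, _test_name, phase = fields[:3]
        comment = fields[3] if len(fields) > 3 else ""
        phases.append((phase, status, comment))
    return phases


def first_non_pass(phases):
    """Return (phase, status) for the first phase that is not PASS, or None."""
    for phase, status, _comment in phases:
        if status != "PASS":
            return phase, status
    return None


if __name__ == "__main__":
    print(first_non_pass(parse_test_status(SAMPLE)))  # prints ('RUN', 'PEND')
```

To use it on a real case, read the TestStatus file from the case directory and pass its contents to `parse_test_status`; TestStatus.log still needs to be read by hand for the detailed error text.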

huazhen · Jun 26, 2019

Thanks for your advice. I still can't figure out the cause of the problem after checking the contents of TestStatus and TestStatus.log. The attachments are the contents of TestStatus and TestStatus.log in the directories /data/cephfs/punim0769/scripts_regression_test.20190626_110136/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.fake_testing_only_20190626_110258 (not working) and /data/cephfs/punim0769/SEQ_Ln9.f19_g16_rx1.A.spartan_intel.20190626_134757_d534jk (working well). Do you have any suggestions? Thanks a lot.

jedwards · Jun 26, 2019

I don't know - but if the standalone test passes I think you can consider the port passing. What version of python are you using?

huazhen · Jun 26, 2019

Thanks for all your help. The Python version I am using is Python/3.6.4-spartan_intel-2017.u2.
