
resubmit issue

minimax
New Member
Dear colleagues,

I am facing the following problem with CESM2.2.2 (compset F2000climo, with a modified sea-ice file):

./xmlchange SSTICE_DATA_FILENAME="$CASEDIR/sst_HadOIBl_bc_1.9x2.5_2000climo_noice.nc"
./xmlchange STOP_N=10,STOP_OPTION=nyears
./xmlchange RESUBMIT=2

When I run the model without the modified sea-ice file, it runs fully (both resubmissions) and finishes correctly.
When I run it with the modified sea-ice file, it completes only the first 10 years and stops without creating the archive folder for the generated files.
I didn't find any errors in the logs, and all the restart files are created. Afterwards I can restart the run without errors. Strangely, when I run the model for 5 days, it also works correctly and creates all the files (./xmlchange STOP_N=1,STOP_OPTION=days and ./xmlchange RESUBMIT=4).

How can I solve this issue?

Thank you!
 
minimax
New Member
CaseStatus file:
2025-03-15 18:48:20: xmlchange success <command> ./xmlchange SSTICE_DATA_FILENAME=/home/luna/cesm/cases/no_ice/sst_HadOIBl_bc_1.9x2.5_2000climo_noice.nc </command>
---------------------------------------------------
2025-03-15 18:48:20: xmlchange success <command> ./xmlchange NTASKS=378 </command>
---------------------------------------------------
2025-03-15 18:48:21: xmlchange success <command> ./xmlchange PIO_TYPENAME=netcdf </command>
---------------------------------------------------
2025-03-15 18:48:21: xmlchange success <command> ./xmlchange DOUT_S=TRUE </command>
---------------------------------------------------
2025-03-15 18:48:21: xmlchange success <command> ./xmlchange STOP_N=10,STOP_OPTION=nyears </command>
---------------------------------------------------
2025-03-15 18:48:21: xmlchange success <command> ./xmlchange RESUBMIT=2 </command>
---------------------------------------------------
2025-03-15 18:48:21: case.setup starting
---------------------------------------------------
2025-03-15 18:48:25: case.setup success
---------------------------------------------------
2025-03-15 18:48:28: case.build starting
---------------------------------------------------
2025-03-15 18:53:01: case.build success
---------------------------------------------------
2025-03-15 18:53:06: case.submit starting
---------------------------------------------------
2025-03-15 18:53:18: case.submit success case.run:12873, case.st_archive:12874
---------------------------------------------------
2025-03-15 18:53:20: case.run starting
---------------------------------------------------
2025-03-15 18:53:35: model execution starting
---------------------------------------------------
2025-03-16 14:06:38: model execution success
---------------------------------------------------
2025-03-16 14:06:38: case.run success
---------------------------------------------------
2025-03-16 21:45:56: case.submit starting
---------------------------------------------------
2025-03-16 21:46:08: case.submit success case.run:12894, case.st_archive:12895
---------------------------------------------------
2025-03-16 21:46:09: case.run starting
---------------------------------------------------
2025-03-16 21:46:20: model execution starting
---------------------------------------------------
 

katec

CSEG and Liaisons
Staff member
Hi there,

Usually, when the resubmit fails to happen, it's because the model didn't actually finish successfully. Looking at the CaseStatus file here, you never got a "model execution success" message for the last run, which supports this theory. The model can fail without leaving errors in the logs, and the most common cause is that the run took more wall-clock time than was requested, so the machine killed the process. In your case directory, there should be a "run.[casename].o[projectnumber]" file with the output from the supercomputer. Check it to see whether the simulation finished successfully or was stopped because it ran out of wall-clock time.
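
In case it helps, the check described above could be scripted roughly as follows. This is only a sketch: the function name last_run_finished is made up for illustration, and it assumes CaseStatus lines look like the ones posted in this thread ("model execution starting" / "model execution success").

```shell
#!/bin/sh
# Hypothetical helper (not part of CIME): succeeds only if the most recent
# "model execution ..." entry in a CaseStatus file is a success message.
last_run_finished() {
    status_file="$1"
    # Keep only the last line that mentions model execution, if any.
    last=$(grep 'model execution' "$status_file" | tail -n 1)
    case "$last" in
        *"model execution success"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use from the case directory (commented out so the helper can be
# sourced on its own). The search strings are assumptions: the wording of a
# time-limit message depends on the batch system.
# if ! last_run_finished CaseStatus; then
#     grep -i -e 'walltime' -e 'time limit' run.*.o*
# fi
```

If the batch output does show that the job hit its time limit, one possible remedy is to request more wall-clock time for the run job (for example ./xmlchange JOB_WALLCLOCK_TIME=HH:MM:SS --subgroup case.run, within whatever your queue allows), or to use a shorter STOP_N with a larger RESUBMIT.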
 