Spinup site simulation early exit_CLM 4.5

Hi!

I am trying to make a spinup run for a site (single-point simulation) using input data created with the CLM tools. I used an I compset (I_1850_CLM45_CN_4Me) in clm4_0_60 and set the model up to run for 600 years. It ran fine but crashed after 383 years without any specific error message, and I could not figure out why it exited so early. I would be very grateful for any advice you may think of.

Here are the details of my case:

./create_newcase -case 1x1pt_MinnesotaWetlandUS_add-Spinup_600yrs -res CLM_USRDAT -compset I_1850_CLM45_CN_4Me -mach yellowstone
cd 1x1pt_MinnesotaWetlandUS_add-Spinup_600yrs
./xmlchange -append CLM_CONFIG_OPTS="-spinup AD"
./xmlchange STOP_OPTION="nyears"
./xmlchange STOP_N="600"
./xmlchange CLM_USRDAT_NAME="1x1_MinnesotaWetlandUS"
./xmlchange ATM_DOMAIN_FILE="domain.lnd.1x1_MinnesotaWetlandUS_noocean.nc"
./xmlchange LND_DOMAIN_FILE="domain.lnd.1x1_MinnesotaWetlandUS_noocean.nc"
./xmlchange DIN_LOC_ROOT="$MYCSMDATA"
./xmlchange CLM_FORCE_COLDSTART="on"
./xmlchange RESUBMIT="2"

I resubmitted the job twice with BSUB -W 8:00 (8 hours), which seems reasonable.

My case directory: /glade/u/home/rpaudel/point_simulations/const_clm4060/scripts/1x1pt_MinnesotaWetlandUS_add-Spinup_600yrs
Output: /glade/scratch/rpaudel/1x1pt_MinnesotaWetlandUS_add-Spinup_600yrs/run
 

slevis

Moderator
Staff member
I also do not see anything wrong from a quick look, so my next guess is that you've run out of space somewhere. Type myquota to check whether this is the case. I'm not sure what else to suggest at this point...
Sam
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
I don't see anything obviously wrong in what you have posted here. Your directory on yellowstone is gone, but I see you've set up other cases.
The main advice I have is to make sure you look in ALL of the different log files as well as the batch log file. Errors could show up in any one of them. You should especially look at the cesm.log file.
The CLM User's Guide has a chapter on troubleshooting with some suggestions you might find useful: http://www.cesm.ucar.edu/models/cesm1.2/clm/models/lnd/clm/doc/UsersGuide/x13571.html
For single point you are already running with a single processor, but look at the section about running in DEBUG mode and/or using a debugger. If the run saved any core files, you can use them to see how it terminated. Check with CISL staff for help with debuggers.
Erik
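As a rough illustration of the "look in ALL of the log files" advice, here is a minimal shell sketch that searches every log file in a run directory for lines that look like failures. The function name scan_logs and the keyword list are assumptions made for this example, not an official list of CESM or batch-system messages, so a clean result does not prove that nothing went wrong.

```shell
# scan_logs DIR: print lines in DIR's log files that look like failures.
# Hypothetical helper; the keyword list is a guess and should be adapted.
scan_logs() {
    dir="${1:-.}"
    # -i ignore case, -n show line numbers, -H always show the file name,
    # -E extended regex. Only files directly inside DIR are searched.
    find "$dir" -maxdepth 1 -type f -name '*.log*' \
        -exec grep -inHE 'error|abort|killed|runlimit' {} +
}
```

For the case in this thread it could be called as scan_logs /glade/scratch/rpaudel/1x1pt_MinnesotaWetlandUS_add-Spinup_600yrs/run, and again on the case directory where the batch output is written. Compressed logs (for example .gz files) would need to be unpacked first.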
 

Hi Erik,
Thank you very much. It seems like the job was not resubmitted. When I increased the wall clock to 12 hours and ran for 500 years, it ran perfectly.
-Rajendra
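The numbers in this thread allow a back-of-the-envelope wall-clock estimate. The sketch below assumes, and this is only an assumption, that the first job stopped at 383 years because it used up the whole 8-hour limit, and that throughput stays roughly constant. The function names are made up for this example.

```shell
# years_per_hour YEARS HOURS: simulated years completed per wall-clock hour.
years_per_hour() {
    awk -v y="$1" -v h="$2" 'BEGIN { printf "%.1f\n", y / h }'
}

# hours_needed YEARS RATE: wall-clock hours needed for YEARS at RATE years/hour.
hours_needed() {
    awk -v y="$1" -v r="$2" 'BEGIN { printf "%.1f\n", y / r }'
}

# Assumption: the 383 simulated years filled the full 8-hour limit.
rate=$(years_per_hour 383 8)
echo "throughput: $rate years/hour"
echo "600 years:  $(hours_needed 600 "$rate") hours"
echo "500 years:  $(hours_needed 500 "$rate") hours"
```

Under that assumption the throughput is about 48 years per hour, so 600 years would need roughly 12.5 hours and 500 years roughly 10.4 hours. That is consistent with the 500-year run fitting inside a 12-hour limit, but it is an estimate and not a measurement.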
 