
CESM simulation crashing upon resubmit

lgray

Laura Gray
New Member
Hi everyone,

I've been attempting to run CESM2.2.2 with the F2000Climo compset and the ne0CONUSne30x8_ne0CONUSne30x8_mt12 variable-resolution grid. I successfully ran a 15-month startup simulation from specified initial conditions, but upon resubmission, with no xml changes, the simulation crashes immediately. I suspect the crash is related to the restart files, but I can't identify what exactly is causing it. I found this message in the CESM log file:

dec1238.hsn.de.hpc.ucar.edu 1882: MCT::m_AttrVect::indexRA_:: FATAL--attribute not found: "Flrl_rofsur" Traceback:

But according to the coupler log, it may already have processed something with this variable?

(prep_rof_merge) Summary:
(prep_rof_merge) x2r%Flrl_rofsur = = lfrac*l2x%Flrl_rofsur
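Both excerpts above can be located with a plain text search of the run directory. The following is a minimal sketch, not an official CESM tool: the function name `find_field_in_logs` is hypothetical, and it assumes the logs are uncompressed files named `cesm.log.*` and `cpl.log.*` sitting directly in the run directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every log line that mentions a coupler field.
# Assumes uncompressed cesm.log.* and cpl.log.* files directly under the run directory.
find_field_in_logs() {
  local rundir="$1" field="$2"
  # -H prefixes each hit with the file name, -n with the line number.
  # Errors from log patterns that match no file are suppressed.
  grep -Hn -- "$field" "$rundir"/cesm.log.* "$rundir"/cpl.log.* 2>/dev/null
}
```

For example, `find_field_in_logs /path/to/run Flrl_rofsur` would list each matching line together with the log file it came from.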

The case run directory can be found here: /glade/derecho/scratch/leizhao/coupled_ctrl_nudge/run, and the case directory itself is located here: /glade/work/leizhao/lgray/coupled_ctrl_nudge

Changes to the xml files for this case's startup run include: ./xmlchange PIO_VERSION=1,STOP_OPTION=nmonths,STOP_N=15,RESUBMIT=7,RUN_STARTDATE=2006-01-01,CONTINUE_RUN=FALSE,ATM_NCPL=288. For this particular run, no changes have been made to the code of CESM itself. Any help on why this simulation doesn't want to restart would be greatly appreciated!
 

slevis

Moderator
Staff member
Since "rofsur" seems to be runoff- and possibly river-related, my first thought is to disable MOSART in your simulation. To do that:
- you could start over with a new compset in which you replace MOSART with SROF
- I think you can also accomplish this in your existing case by changing MOSART_MODE in env_build.xml from ACTIVE to NULL
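The second option could be sketched as follows. This assumes the case directory provides the usual `./xmlquery` and `./xmlchange` scripts and that MOSART_MODE accepts NULL as suggested above; the wrapper name `set_mosart_null` is hypothetical, and whether a rebuild is needed afterwards is not confirmed here.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: switch MOSART_MODE to NULL in an existing case.
# Assumes xmlquery and xmlchange are present in the case directory.
set_mosart_null() {
  local casedir="$1"
  (
    cd "$casedir" || exit 1
    # Show the current value first.
    ./xmlquery MOSART_MODE &&
    # Apply the change suggested above.
    ./xmlchange MOSART_MODE=NULL &&
    # Confirm that the new value was recorded.
    ./xmlquery MOSART_MODE
  )
}
```

It would be called with the case directory as its only argument, e.g. `set_mosart_null /path/to/case`.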

I'm curious if this works...
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
The suggestion by @slevis is a good idea.
Also, I can report that I was able to run a 1-month simulation and then restart and run another month successfully with this compset. This was out of the box. See: /glade/work/oleson/release-cesm2.2.2/cime/scripts/cesm222_coupled_test

I think you could also try starting with a short out-of-the-box run, as I did above, and see if you can run that successfully. Then add back the namelist changes you've implemented, one step at a time (e.g., the nudging changes, the history output changes, etc.).
Also, I'm wondering about your sandbox. When I run ./manage_externals/checkout_externals -S in your sandbox, I get:

? ./cime
e-o ./cime/src/drivers/nuopc/
? ./components/cam
./components/cam/chem_proc
? ./components/cam/src/atmos_phys
? ./components/cam/src/dynamics/fv3/atmos_cubed_sphere
./components/cam/src/physics/carma/base
? ./components/cam/src/physics/clubb
? ./components/cam/src/physics/cosp2/src
? ./components/cam/src/physics/pumas
? ./components/cam/src/physics/silhs
./components/cdeps
...............

I'm not sure what all of the question marks are. I don't get that in my own sandbox. Maybe it's just a function of me running it in your sandbox, but what do you get if you run it...
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Apparently, question marks mean:

* ? : unknown : directory exists but .git or .svn directories are missing

I'm not sure if it's a problem or not though, particularly since you've completed a startup run successfully...
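To list only the externals flagged as unknown, the status output can be filtered. This is a minimal sketch that assumes the two-column layout seen in the listing earlier in this thread (status flag, then path); the function name `unknown_externals` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read checkout_externals -S output on stdin and
# print the paths whose status column is exactly "?".
# Assumes the status flag is the first whitespace-separated field.
unknown_externals() {
  awk '$1 == "?" { print $2 }'
}
```

Usage would look like `./manage_externals/checkout_externals -S | unknown_externals`, run from the top of the sandbox.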
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I see you are requesting TWS at the landunit level in your h2 file. But this is a gridcell level variable. Can you try your run without requesting TWS in h2?
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I'm pretty sure that's the problem. My own run had the same restart problem as yours. When I removed TWS in h2, the model restarted properly.
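A quick way to check whether a case requests TWS on the h2 tape is to search its CLM namelist file. The sketch below assumes the h2 file is configured through `hist_fincl3` in `user_nl_clm` and that the field list sits on the same line as the variable name; the function name `tws_in_h2` is hypothetical, and multi-line namelist entries would need a more careful parser.

```shell
#!/usr/bin/env bash
# Hypothetical check: report hist_fincl3 lines that request TWS.
# Assumes hist_fincl3 controls the h2 history file and that its
# field list is written on a single line.
tws_in_h2() {
  local nlfile="$1"
  # First keep lines that set hist_fincl3, then keep those naming TWS
  # as a whole word. Exit status is 0 only if a match is found.
  grep -n 'hist_fincl3' "$nlfile" | grep -w 'TWS'
}
```

For example, `tws_in_h2 user_nl_clm` prints the offending line and its line number, or prints nothing if TWS is not requested there.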
 

lgray

Laura Gray
New Member
Thank you all so much for the suggestions and troubleshooting! Going to attempt to remove TWS and see if that allows for restart. If that doesn't work, I will turn MOSART off and see how that goes. Fingers crossed!
 