
Restart Crashed Runs Options

Joas Müller

New Member
What version of the code are you using?
git describe:
cesm3_0_alpha03d-0-g60b024c

I am running CESM simulations using the above CESM source code.
Each member starts in mid-November and runs until the end of March (multiple members).
The simulations are split into 3 x 45 days.
Some of my members now crash due to instabilities.
Is there general documentation on how to try to save them? Especially the ones that have already completed 90 days.
I would like to use pertlim and have already tried to "pertlim" my way out of the crash (using the most recent restart files), but the runs seem to "not care" (which makes me wonder whether my current approach is changing anything at all). Is there general documentation, or are there other approaches, for saving such a run?

One run I am currently trying to save is the following:
/glade/work/jmueller/VRM_CESM/cases/NAO_minus_SSTICE_HiRes.NATLx8/f.e22.F2000.NATLx8.NAO_minus_SSTICE_HiRes.1.001

Sorry if this question is too general and I just did not spot the corresponding document so far!
I appreciate any help or advice.

Thanks in advance!
Cheers,
Joas
 

nusbaume

Jesse Nusbaumer
CSEG and Liaisons
Staff member
Hi Joas,

Thanks for your post! Sadly I am not entirely sure what you are trying to ask. Are you trying to ask how to avoid the instabilities that are occurring in your runs? One common method is to try and decrease the model timestep. You can find instructions for how to do so for each model component here:
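
For the atmosphere, the timestep is normally set indirectly through the XML variable ATM_NCPL, the number of coupling intervals per base period. The sketch below only does that arithmetic, assuming the base period is one day (86400 s). The 1800 s starting value and both function names are illustrative and are not taken from the case above, and for a spectral-element grid the dynamics sub-stepping settings may also need attention.

```python
SECONDS_PER_DAY = 86400  # assumes the coupling base period is one day


def atm_ncpl_for_timestep(dt_seconds: int) -> int:
    """Return the ATM_NCPL value that corresponds to a timestep in seconds."""
    if dt_seconds <= 0 or SECONDS_PER_DAY % dt_seconds != 0:
        raise ValueError("timestep must be positive and divide 86400 s evenly")
    return SECONDS_PER_DAY // dt_seconds


def halve_timestep_command(current_ncpl: int) -> str:
    """Build the xmlchange call that doubles ATM_NCPL (halves the timestep)."""
    return f"./xmlchange ATM_NCPL={2 * current_ncpl}"


if __name__ == "__main__":
    ncpl = atm_ncpl_for_timestep(1800)   # hypothetical current timestep
    print(ncpl)
    print(halve_timestep_command(ncpl))  # command to run in the case directory
```

The printed command is only a string. It would still have to be run by hand in the case directory, after checking the current value with ./xmlquery ATM_NCPL.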


Alternatively, if you are asking how to save output from those simulations, then, assuming they haven't output the data you need already, you'll need to re-run them so that they either stop at the correct time, output restart files at the correct time, and/or output the variables you need at the correct frequency. You can find instructions for controlling the model run length here:


If you want to then output restart files at a higher frequency, then you can find a description of the relevant XML variables here:
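
As a rough illustration of the run-length and restart settings mentioned above, the sketch below assembles the xmlchange calls for one run segment. STOP_OPTION, STOP_N, REST_OPTION, REST_N and CONTINUE_RUN are standard case XML variables. The function name and the 45-day / 15-day values are assumptions chosen to match the 3 x 45 day setup described in this thread.

```python
from typing import List


def segment_settings(stop_days: int, restart_every_days: int,
                     continue_run: bool) -> List[str]:
    """Return xmlchange commands for one run segment (strings only, nothing is executed)."""
    if stop_days <= 0 or restart_every_days <= 0:
        raise ValueError("day counts must be positive")
    flag = "TRUE" if continue_run else "FALSE"
    return [
        # length of this segment
        f"./xmlchange STOP_OPTION=ndays,STOP_N={stop_days}",
        # how often restart files are written within the segment
        f"./xmlchange REST_OPTION=ndays,REST_N={restart_every_days}",
        # TRUE resumes from the latest restart set instead of starting over
        f"./xmlchange CONTINUE_RUN={flag}",
    ]


if __name__ == "__main__":
    for command in segment_settings(45, 15, continue_run=True):
        print(command)
```

Writing restarts every 15 days instead of only at the end of a 45-day segment means a crashed member can be resumed closer to the failure point, at the cost of extra disk space.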


Also, instructions for how to customize the model output, including increasing the frequency of output (e.g. at a higher frequency than monthly), can be found here (including the relevant subsections):


Finally, I should note that the version of CESM you are using is a developmental version that very likely hasn't been fully vetted. This means that it is possible that there is a bug or badly tuned parameter somewhere in the model that could be causing the model to crash. Thus if this continues to be a problem then I would recommend either using a later CESM3 development tag, or a release version of the model (e.g. CESM2.1.5).

Anyways, I hope that helps, and have a great day!

Jesse
 

Joas Müller

New Member
Hi Jesse,

Thanks a lot for all the input. Although not exactly what I needed, the links are very helpful and answered some other questions I had.

My main question would be a bit different though:
I have an ensemble of 20 simulations, and 19 of the members successfully ran through from November to March (as intended).
But one simulation only ran for 90 instead of 135 days. It has restart files for day 90, so it looks like an instability is causing this simulation to crash.
Hence, I would like to "rescue" this simulation so that the resources already spent are not wasted. I would like to slightly perturb the state of the model at day 90, restart it to run the remaining 45 days, and hope that the small perturbation keeps it from crashing.

Similar to how I used "pertlim" to create the ensemble, I was hoping for a solution that does something like pertlim, but at day 90!

So I was wondering if there is a collection of potential solutions and methods to use, and how to implement them with XML changes?

Thanks again in advance,
Cheers!
Joas
 