
Error during restart

jagdishprajapati

Jagdish Prajapati
New Member
Hi there all,
Can anyone help me fix the error shown in the attached file? I want to use a daily snapshot file to restart the model for my work. My daily snapshot file contains the following variables, as listed in the diag table:

#"ocean_static", -1, "days", 1, "hours", "time" # ocean_static is a protected name. Do not change this line.
"ocean_daily_snapshot", 1, "days", 1, "hours", "time",
#"ocean_daily_mean", 1, "days", 1, "hours", "time",

# To generate daily_snapshot file
"ocean_model", "ssh", "sfc", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "thetao", "Temp", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "so", "Salt", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "uo", "u", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "vo", "v", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "rhopot0", "rhopot0", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "h", "h", "ocean_daily_snapshot", "all", "none", "none",2
 

Attachments

  • Screenshot from 2024-05-02 18-43-17.png

marshallward

Marshall Ward
New Member
Very hard to tell from the provided information here. The efp reproducing sum is tuned to work over physically realistic values (as in "small as an atom" to "large as the universe"). It could just be responding to a numerically unstable value. Another possibility is that your sum has incorrectly included the masking values (land mask, or a processor mask), which are often extremely large.

The reproducible sum threshold is also somewhat dependent on the number of processor domains, so perhaps you simply need to use more CPUs.

Maybe the first place to start is to determine which summation is reporting an overflow. You might need a debugger.
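Before reaching for a debugger, a quick scan of the input fields for fill values or non-finite numbers can narrow things down. The following is only a minimal sketch in plain Python: the helper name `find_suspect_values` and the `1.0e10` limit are illustrative assumptions, not anything taken from MOM6, and the limit is not the actual overflow threshold of the reproducing sum.

```python
import math


def find_suspect_values(values, limit=1.0e10):
    """Return (index, value) pairs that are non-finite or exceed `limit` in magnitude.

    `limit` is an arbitrary illustrative cutoff, not the model's real threshold.
    """
    suspects = []
    for i, v in enumerate(values):
        if not math.isfinite(v) or abs(v) > limit:
            suspects.append((i, v))
    return suspects


# Hypothetical temperature column in which one land point kept a 1e20 fill value
# and another point is NaN.
column = [12.5, 11.9, 1.0e20, 10.2, float("nan")]
suspects = find_suspect_values(column)
for index, value in suspects:
    print(f"index {index}: {value}")
```

If a scan like this flags points in the snapshot fields, a leaked mask or fill value becomes a more likely culprit than a genuine numerical instability.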
 
I see you are saving on the ocean_model_z grid. Does this output have the same vertical structure as your model? For instance, I run on 75 z* levels, but output onto 35 z* levels with "ocean_model_z". With "ocean_model" instead, it would save on the native grid. For a cold start (such as you want to do from the daily snapshot), you have to tell the model which T, S, u, v, and SSH fields to read, either on the model grid, or it can remap from, say, World Ocean Atlas and assume zero velocity and ssh by default.

For example:

Code:
TS_CONFIG = "file"
TS_FILE = "glorys_ic_75z_1993.nc"
TEMP_IC_VAR = "temp"
SALT_IC_VAR = "salt"
DEPRESS_INITIAL_SURFACE = True
SURFACE_HEIGHT_IC_FILE = "glorys_ic_75z_1993.nc"
SURFACE_HEIGHT_IC_VAR = "ssh"
VELOCITY_CONFIG = "file"
VELOCITY_FILE = "glorys_ic_75z_1993.nc"
U_IC_VAR = "u"
V_IC_VAR = "v"
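To adapt settings like these to the daily snapshot from the first post, the variable names have to match the output names in the diag table ("Temp", "Salt", "u", "v", "sfc"). Here is a rough Python sketch that reads those names from the diag table lines and prints the matching parameters. The file name `ocean_daily_snapshot.nc` and both helper functions are assumptions for illustration, and nothing here checks that the snapshot is actually on a grid the model can initialize from.

```python
import csv

# Field lines copied from the diag table in the first post.
DIAG_TABLE_FIELDS = """\
"ocean_model", "ssh", "sfc", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "thetao", "Temp", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "so", "Salt", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "uo", "u", "ocean_daily_snapshot", "all", "none", "none",2
"ocean_model_z", "vo", "v", "ocean_daily_snapshot", "all", "none", "none",2
"""


def output_names(diag_text):
    """Map each diagnostic name (2nd column) to its output name (3rd column)."""
    lines = [ln for ln in diag_text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    names = {}
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) == 8:  # field lines have 8 columns; file lines have fewer
            names[row[1]] = row[2]
    return names


def render_ic_params(names, ic_file):
    """Render parameter lines using the parameter names from the example above."""
    return "\n".join([
        'TS_CONFIG = "file"',
        f'TS_FILE = "{ic_file}"',
        f'TEMP_IC_VAR = "{names["thetao"]}"',
        f'SALT_IC_VAR = "{names["so"]}"',
        'DEPRESS_INITIAL_SURFACE = True',
        f'SURFACE_HEIGHT_IC_FILE = "{ic_file}"',
        f'SURFACE_HEIGHT_IC_VAR = "{names["ssh"]}"',
        'VELOCITY_CONFIG = "file"',
        f'VELOCITY_FILE = "{ic_file}"',
        f'U_IC_VAR = "{names["uo"]}"',
        f'V_IC_VAR = "{names["vo"]}"',
    ])


names = output_names(DIAG_TABLE_FIELDS)
params = render_ic_params(names, "ocean_daily_snapshot.nc")  # hypothetical file name
print(params)
```

Treat the printed lines as a starting point to compare against your own parameter file, not as a verified configuration.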
 

jagdishprajapati

Jagdish Prajapati
New Member
Hi Marshall,
Thanks for the quick response. Today I reran the experiment with more CPUs but saw the same errors. I will proceed with a debugger.
 

jagdishprajapati

Jagdish Prajapati
New Member
Thanks Kshedstorm@..
The output has the same dimensions as the model. I am not saving these variables to the daily snapshot files from a cold start; I am saving them from a hot run (restart). I could use the restart file for my work, but the point of using the snapshot instead is to save disk space: the restart file and the daily snapshot file take around 1000 MB and 150 MB, respectively.
 
What I would call a hot start is using the restart file, with all its fields, so that you get perfect restarts, i.e., the same run as if there had been no pause at all. That is the purpose of all those extra fields. Otherwise, it is effectively a cold start from just u, v, T, S, and ssh. For instance, some open boundary conditions apply a running mean of some field and save that on restart. The restratification schemes also do time filtering and save things on restart.
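To see why those extra fields matter, here is a toy Python sketch. It is not MOM6 code: the `step` update, the `alpha` weight, and the forcing values are all made up for illustration. A field is nudged toward a running mean, and the run is split in half, then resumed either with the full state or with only the field.

```python
def step(state, forcing, alpha=0.2):
    """Update the running mean of the forcing, then nudge the field toward it."""
    mean = (1 - alpha) * state["mean"] + alpha * forcing
    field = state["field"] + 0.1 * (mean - state["field"])
    return {"field": field, "mean": mean}


def run(state, forcings):
    """Advance the toy model through a sequence of forcing values."""
    for f in forcings:
        state = step(state, f)
    return state


forcings = [1.0, 2.0, 0.5, 3.0, 1.5, 2.5, 0.0, 1.0]
start = {"field": 0.0, "mean": 0.0}

# Reference: one uninterrupted run.
continuous = run(start, forcings)

# Stop halfway, then resume two different ways.
halfway = run(start, forcings[:4])
hot = run(dict(halfway), forcings[4:])  # full state restored
cold = run({"field": halfway["field"], "mean": 0.0}, forcings[4:])  # filter state lost

print("hot restart matches continuous run:", hot == continuous)
print("cold restart field difference:", cold["field"] - continuous["field"])
```

The resumed run with the full state reproduces the uninterrupted run, while the one that dropped the running mean drifts away from it. A snapshot holding only u, v, T, S, and ssh is in the second situation.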
 