Hi all,
When I run an FHIST simulation with RESUBMIT > 0, it crashes with the following error:
Reading restart dataset
(GETFIL): attempting to find local file
f.e22.geotrace.F1850.ne0np4.SAm.VR28.ne30x4.midHolo.006.mosart.rh0.0502-05-01-0
0000.ncA ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢
A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0
¢A ^RüNÉ0¢A ^RüNÉ0
(GETFIL): failed getting file from full path:
./f.e22.geotrace.F1850.ne0np4.SAm.VR28.ne30x4.midHolo.006.mosart.rh0.0502-05-01
-00000.ncA ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ
0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüNÉ0¢A ^RüN
É0¢A ^RüNÉ0¢A ^RüNÉ0
The full log file is here: /glade/derecho/scratch/asiyab/f.e22.geotrace.F1850.ne0np4.SAm.VR28.ne30x4.midHolo.006/run/rof.log.7795009.desched1.250219-022049
The simulation runs fine when Resubmit = 0 or for the first month (n_stop=1), but it crashes in the second month of a branch/hybrid simulation when CONTINUE_RUN = TRUE. The mosart.rh0.* file exists in the run directory and is not corrupted. If I run a new case using the same file, it runs successfully for that month but crashes in the next.
I found this thread suggesting this might be a library issue. As a test, I ran a single month after changing PIO_TYPENAME for compclass="ROF" in env_run.xml to netcdf, and it roughly doubled the model cost: for my variable-resolution simulation with water tags (/glade/u/home/asiyab/cesm2.2.0.geotrace), the computational expense rose from ~83,978.82 to 164,422.25 pe-hrs/simulated year.
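For scale, here is the arithmetic behind "doubled", as a minimal Python sketch. The only inputs are the two cost figures quoted above; the variable names are just illustrative and do not come from CESM.

```python
# Costs quoted above, in pe-hrs per simulated year.
cost_default = 83_978.82   # with the original PIO_TYPENAME setting for ROF
cost_netcdf = 164_422.25   # with PIO_TYPENAME for ROF changed to netcdf

# Relative and absolute increase in cost.
ratio = cost_netcdf / cost_default
extra = cost_netcdf - cost_default

print(f"cost ratio: {ratio:.2f}x")
print(f"extra cost: {extra:,.2f} pe-hrs/simulated year")
```

The ratio comes out just under 2, so the netcdf setting costs almost one additional full simulation per simulated year.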
I’d greatly appreciate any suggestions on resolving this issue without a substantial increase in model cost.
Thanks!