
Cannot find some checkpoint files

Hi there,

I'm running CESM2.1.0 and ran into a problem. The error log is as follows:

"calcsize j,iq,jac, lsfrm,lstoo  2  4  2  20  25
calcsize j,iq,jac, lsfrm,lstoo  2  5  1  19  23
calcsize j,iq,jac, lsfrm,lstoo  2  5  2  19  23
  max rss=452.8 MB
forrtl: No such file or directory
forrtl: severe (29): file not found, unit 10, file /BIGDATA1/iapcas_mhzhang_fkc/cesm2_1_0/output/fhist/run/./timing/checkpoints/model_timing_19790102_00000_stats
Image              PC                Routine            Line        Source
cesm.exe           00000000029202A9  Unknown               Unknown  Unknown
cesm.exe           000000000293F12F  Unknown               Unknown  Unknown
cesm.exe           000000000285A95E  perf_mod_mp_t_prf        1326  perf_mod.F90
cesm.exe           000000000041739F  cime_comp_mod_mp_        3104  cime_comp_mod.F90
cesm.exe           00000000004309E7  MAIN__                    133  cime_driver.F90
cesm.exe           000000000041344E  Unknown               Unknown  Unknown
libc-2.17.so       00002B4313B8CB35  __libc_start_main     Unknown  Unknown
cesm.exe           0000000000413369  Unknown               Unknown  Unknown
Fatal error in MPI_Recv: Unknown error class, error stack:
MPI_Recv(224).........................: MPI_Recv(buf=0x7ffda8b53b68, count=1, MPI_INTEGER, src=0, tag=99, comm=0x84000006, status=0x1) failed"

The compset is FHIST and the resolution is f09_f09_mg16. I guess the "model_timing_19790102_00000_stats" file in timing/checkpoints is generated automatically after the first model day, but for some reason it was not generated and so could not be found. By the way, the timing/checkpoints directory was not created in the run directory either. Should I create the directories manually?

The log file is attached below. Thank you in advance!
 

jedwards

CSEG and Liaisons
Staff member
This error usually occurs when one or more compute nodes cannot access the output directory /BIGDATA1/iapcas_mhzhang_fkc/cesm2_1_0/output/fhist/run/ or do not have the correct permissions on it.
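
One way to test for this is to run a small check on every compute node allocated to the job. The script below is only a sketch and is not part of CESM or CIME: the function name check_run_dir and the report fields are made up for illustration. It reports whether the host it runs on can see the run directory and can actually create a file in it.

```python
import os
import socket
import sys
import tempfile


def check_run_dir(run_dir):
    """Report whether the current host can see and write to run_dir.

    Hypothetical helper for illustration; not part of CESM or CIME.
    """
    report = {
        "host": socket.gethostname(),
        "exists": os.path.isdir(run_dir),
        "writable": False,
        "error": None,
    }
    if not report["exists"]:
        report["error"] = "directory not visible from this host"
        return report
    try:
        # Create and delete a scratch file to prove write access.
        # os.access() alone can be misleading on network file systems.
        with tempfile.NamedTemporaryFile(dir=run_dir, prefix=".write_test_"):
            pass
        report["writable"] = True
    except OSError as exc:
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    # Pass the run directory as the first argument; defaults to the
    # current working directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(check_run_dir(target))
```

Launch it once per compute node through your site's job launcher, passing the run directory as the argument. Any host that reports exists or writable as False is a likely cause of the "file not found" error above.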
 