
cesm2 case.run error

huazhen

Member
Hi there,

I am trying to run CESM2 on our supercomputer with the intel/2017.u2 compiler. I am validating a CESM port with prognostic components, following http://esmci.github.io/cime/users_guide/porting-cime.html, by running "./create_case SMS_D_Ld1.f09_g17.B1850cmip6.spartan_intel.allactive-defaultio_min". CaseStatus shows the following job messages:

---------------------------------------------------
2019-07-19 22:12:27: case.run starting
---------------------------------------------------
2019-07-19 22:14:04: model execution starting
---------------------------------------------------
2019-07-19 22:14:20: model execution success
---------------------------------------------------
2019-07-19 22:14:20: case.run error
ERROR: RUN FAIL: Command 'mpirun  -n 90  /data/cephfs/punim0769/cesm/scratch/SMS_D_Ld1.f09_g17.B1850cmip6.spartan_intel.allactive-defaultio_min.20190719_213423_3p62od/bld/cesm.exe  >> cesm.log.$LID 2>&1 ' failed
See log file for details: /data/cephfs/punim0769/cesm/scratch/SMS_D_Ld1.f09_g17.B1850cmip6.spartan_intel.allactive-defaultio_min.20190719_213423_3p62od/run/cesm.log.10252393.190719-221227

I have not been able to find a solution from the details in the log file (the full log is attached below). Based on the messages at line 332 of the log file, quoted below, I think the problem may be connected to the "openmpi" or "mpirun" settings, but I don't know how to fix it. Do you have any suggestions? Any help is much appreciated.
Thanks a lot.

I will attach the following five files:
/data/cephfs/punim0769/cesm/scratch/SMS_D_Ld1.f09_g17.B1850cmip6.spartan_intel.allactive-defaultio_min.20190719_213423_3p62od/CaseStatus
/data/cephfs/punim0769/cesm/scratch/SMS_D_Ld1.f09_g17.B1850cmip6.spartan_intel.allactive-defaultio_min.20190719_213423_3p62od/run/cesm.log.10252393.190719-221227
/home/huazhenl/.cime/config_machines.xml
/home/huazhenl/.cime/config_compilers.xml
/home/huazhenl/.cime/config_batch.xml

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
 

jedwards

CSEG and Liaisons
Staff member
The model is starting correctly but is having trouble with this file: /data/cephfs/punim0769/cesm/inputdata/atm/cam/physprops/volc_camRRTMG_byradius_sigma1.2_mode3_c170214.nc

Check that the file is valid with ncdump -k and md5sum.
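
Where ncdump is not on the path, roughly the same two checks can be sketched in Python. This is only an illustration, not part of CIME or CESM: the helper names netcdf_kind and md5_of are made up here, and the format check looks only at the first four bytes of the file, so it cannot tell "netCDF-4" from "netCDF-4 classic model" as ncdump -k does. The checksum is only meaningful when compared against a copy of the file known to be good.

```python
import hashlib
import sys

# Leading bytes of the netCDF on-disk formats (labels are approximate
# equivalents of what `ncdump -k` reports).
NETCDF_MAGIC = {
    b"CDF\x01": "classic",
    b"CDF\x02": "64-bit offset",
    b"CDF\x05": "cdf5",
    b"\x89HDF": "HDF5-based (netCDF-4)",
}


def netcdf_kind(path):
    """Return a format label from the first four bytes, or None if unrecognized."""
    with open(path, "rb") as f:
        return NETCDF_MAGIC.get(f.read(4))


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage: python check_input.py <file.nc> [<file.nc> ...]
    for name in sys.argv[1:]:
        print(name, netcdf_kind(name), md5_of(name))
```

A result of None from netcdf_kind, or a digest that differs from the one computed on a known-good copy, would suggest a truncated or corrupted download.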
 

huazhen

Member
Hi jedwards,

Thanks for your reply. I got the following output when I ran the checks you suggested. It seems the file is valid.

[huazhenl@spartan-login2 physprops]$ ncdump -k volc_camRRTMG_byradius_sigma1.2_mode3_c170214.nc
classic
[huazhenl@spartan-login2 physprops]$ md5sum volc_camRRTMG_byradius_sigma1.2_mode3_c170214.nc
e495e9f0f1094ec9af3070bcfe6f6cf9  volc_camRRTMG_byradius_sigma1.2_mode3_c170214.nc

Kind regards,
Huazhen
 

jedwards

CSEG and Liaisons
Staff member
Maybe it's the next file that it can't find or open - check the atm.log for an error message at the end.
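
A small Python sketch of that check, for anyone following along. It assumes the atmosphere log sits in the case run directory under a name starting with "atm.log" (the same directory as the cesm.log above); the helpers newest_log and tail are hypothetical names, not CIME tools.

```python
from collections import deque
from pathlib import Path


def newest_log(run_dir, prefix="atm.log"):
    """Return the most recently modified file in run_dir whose name starts with prefix."""
    logs = sorted(Path(run_dir).glob(prefix + "*"), key=lambda p: p.stat().st_mtime)
    return logs[-1] if logs else None


def tail(path, n=20):
    """Return the last n lines of a text file without loading it all into memory."""
    with open(path, errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


if __name__ == "__main__":
    import sys

    # Usage: python atm_tail.py <case run directory>
    for run_dir in sys.argv[1:]:
        log = newest_log(run_dir)
        if log is None:
            print("no atm.log* file found in", run_dir)
        else:
            print("\n".join(tail(log)))
```

The last few lines printed are where a message naming the file that could not be found or opened would be expected to appear.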
 