FAIL status in the ERI functionality tests

Hello, I'm porting CESM 1.2.2 to a new machine (name: Mahone) with intel OS. I'm running the functionality tests specified in the guide and I'm finding a variety of errors. In particular, when testing ERI.ne30_g16.X, ERI.T31_g37.A and ERI.f19_g16.B1850CN, I get a FAIL status for all of them. TestStatus.out shows the same thing for all of them; for instance, for the first one it says:
ref1: doing a 3 ndays initial startup from 0001-01-01
Checking successful completion of init cpl log file
FAIL  ERI.ne30_g16.X.mahone_intel.1
ref1: doing a 3 ndays initial startup from 0001-01-01
Checking successful completion of init cpl log file
ref1: doing a 3 ndays initial startup from 0001-01-01
Checking successful completion of init cpl log file
FAIL  ERI.ne30_g16.X.mahone_intel.1
When checking the RUNDIR directory ERI.ne30_g16.X.mahone_intel.t04/run, I see that there are indeed no log files. Instead, when I submit the case, two directories, ERI.ne30_g16.X.mahone_intel.t04.ref1 and ERI.ne30_g16.X.mahone_intel.t04.ref2, are created in the same directory where ERI.ne30_g16.X.mahone_intel.t04 is, and also in the /scripts/ directory where I created the case. The logs are created inside the ref1/run directory. I don't know if the creation of these .ref directories is normal, but apparently the check for the log files fails. In any case, cesm.log contains a segmentation fault message, forrtl: severe (174): SIGSEGV, segmentation fault occurred, followed by several forrtl: error (78): process killed (SIGTERM) messages, and the end of the log says:
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[22374,1],3]
  Exit code:    174
--------------------------------------------------------------------------
I also tried to run ERI.ne30_g16.X.mahone_intel.t04.test directly (without submission), and the same forrtl segmentation fault occurred. Could someone have a look at the logs and tell me what the problem is, and whether these .ref directories are supposed to be created? I would appreciate your help a lot; I've been stuck on this problem for a whole week.
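In case it helps anyone reading the logs, here is a minimal Python sketch for finding the first forrtl: severe message in a log and counting the process killed (SIGTERM) lines that follow. The patterns are taken only from the messages quoted above; the function names summarize_log and summarize_file are hypothetical helpers and not part of CESM, and the sample lines are a stand-in for a real cesm.log.

```python
import re

# Patterns based only on the messages quoted in this post.
SEVERE = re.compile(r"forrtl: severe \((\d+)\)")
KILLED = re.compile(r"forrtl: error \(78\)")


def summarize_log(lines):
    """Return (first_severe, killed_count) for an iterable of log lines.

    first_severe is (line_number, text) for the first 'forrtl: severe'
    message, or None if no such line exists. Line numbers start at 1.
    """
    first_severe = None
    killed_count = 0
    for number, line in enumerate(lines, start=1):
        if first_severe is None and SEVERE.search(line):
            first_severe = (number, line.strip())
        if KILLED.search(line):
            killed_count += 1
    return first_severe, killed_count


def summarize_file(path):
    """Hypothetical convenience wrapper: summarize a log file on disk."""
    # errors="replace" so that stray non-text bytes do not abort the scan.
    with open(path, errors="replace") as handle:
        return summarize_log(handle)


# Stand-in lines using the messages quoted above, not a real log.
sample = [
    "some model output",
    "forrtl: severe (174): SIGSEGV, segmentation fault occurred",
    "forrtl: error (78): process killed (SIGTERM)",
    "forrtl: error (78): process killed (SIGTERM)",
]
first, killed = summarize_log(sample)
print(first, killed)
```

On a real run this would be called as summarize_file on the cesm.log inside ref1/run (the exact file name carries a timestamp that is not shown here). The idea, which is an assumption on my part, is that the SIGTERM lines come from the other ranks being killed after the first one crashed, so the first severe line and whatever follows it directly is the part worth posting.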
  
 