f_garcia@ncsu_edu
New Member
Hi,
I am trying to run CESM 1.2 on the supported machine evergreen.
I am using the following options to create a new case:
-mach evergreen
-compset FSDCHM
-res f19_f19
However, each time I attempt to run the default case (cam4_chem_radpsv_geos5), the run is terminated with the following message:
Mon Dec 9 18:14:36 EST 2013 -- CSM EXECUTION BEGINS HERE
=>> PBS: job killed: node 2 (compute-2-36.evergreen.umd.edu) requested job terminate, 'EOF' (code 1099) - received SISTER_EOF attempting to communicate with sister MOM's
Terminated
The model does appear to begin running, but the job is killed shortly afterwards (about 20 timesteps into the simulation). Log files are generated for each component, although I cannot identify the error within them. Towards the end of the cesm.log file, the following is written out before the process is killed:
dpcoup cant adjust 89 458 3 2.899606928142302E-039
3.031820011645065E-040 1.007020596805401E-036
dpcoup cant adjust 66 142 4 1.541481934267371E-040
0.000000000000000E+000 1.015449337949024E-036
QNEG3 from convtran2/ISOP:m= 53 lat/lchnk= 803 Min. mixing ratio violated at 1 points. Reset to 1.0E-36 Worst =-1.2E-12 at i,k= 14 54
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 10782 on node compute-6-10.evergreen.umd.edu exited on signal 9 (Killed).
--------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)
Would anyone know if the error is machine-related, or what might be causing it? Any suggestions on how to address it?
Your help is greatly appreciated,
Fernando