ivanacv@nbi_ku_dk
New Member
Hi,
I was hoping somebody could help with the following problem. We have installed CESM1.0.4 on a new machine, and certain compsets do not run. F and some of the B compsets run fine; however, none of the E ones do (with CAM4 or CAM5). The resolution is 1.9x2.5_1.9x2.5. The error messages look much the same across the failed cases (E and B). Perhaps somebody else has encountered this before? I would be very grateful for any advice.
Our machine is a Linux cluster with 16 PEs per node and a PBS batch system, and we use openmpi-1.6.2_intel13.
E1850_CAM5CN.o917:
CCSM BUILDNML SCRIPT STARTING
- Create modelio namelist input files
CCSM BUILDNML SCRIPT HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
CCSM PRESTAGE SCRIPT STARTING
- CCSM input data directory, DIN_LOC_ROOT_CSMDATA, is /data/ra/cesm-inputdata
- Case input data directory, DIN_LOC_ROOT, is /data/ra/cesm-inputdata
- Checking the existence of input datasets in DIN_LOC_ROOT
CCSM PRESTAGE SCRIPT HAS FINISHED SUCCESSFULLY
Fri Nov 9 15:39:19 PST 2012 -- CSM EXECUTION BEGINS HERE
Fri Nov 9 15:41:49 PST 2012 -- CSM EXECUTION HAS FINISHED
Model did not complete - see /data/ra/icv/E1850_CAM5CN/run/cpl.log.121109-153853
-----------------------------------------------------
cpl.log.* says "Model initialization complete"
-----------------------------------------------------
ccsm.log.* :
...
Reading setup_nml
Reading grid_nml
Reading ice_nml
Reading tracer_nml
CalcWorkPerBlock: Total blocks: 128 Ice blocks: 128 IceFree blocks: 0 Land blocks: 0
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
(seq_frac_check) [lnd init] afrac min/max = 1.000000000000000000 1.000000000000000000
(seq_frac_check) [lnd init] lfrac min/max = 0.573489812125174137E-01 1.000000000000000000
(seq_frac_check) [lnd init] lfrin min/max = 0.573489812125174137E-01 1.000000000000000000
(seq_frac_check) [ice init] afrac min/max = 1.000000000000000000 1.000000000000000000
(seq_frac_check) [ice init] ofrac min/max = 0.00000000000000000 1.000000000000000000
(seq_frac_check) [ice init] ifrac min/max = 0.00000000000000000 0.0000000000000000
...
calcsize j,iq,jac, lsfrm,lstoo 2 3 2 13 19
calcsize j,iq,jac, lsfrm,lstoo 2 4 1 16 20
calcsize j,iq,jac, lsfrm,lstoo 2 4 2 16 20
forrtl: error (73): floating divide by zero
Image PC Routine Line Source
ccsm.exe 0000000002D57302 ice_dyn_evp_mp_st 1205 ice_dyn_evp.F90
ccsm.exe 0000000002D46247 ice_dyn_evp_mp_ev 380 ice_dyn_evp.F90
ccsm.exe 0000000002ECF0B3 ice_step_mod_mp_s 664 ice_step_mod.F90
ccsm.exe 0000000002D06EB4 ice_comp_mct_mp_i 554 ice_comp_mct.F90
ccsm.exe 00000000004E18CA ccsm_comp_mod_mp_ 1759 ccsm_comp_mod.F90
ccsm.exe 00000000004EB6AE MAIN__ 91 ccsm_driver.F90
ccsm.exe 00000000004CC20C Unknown Unknown Unknown
libc.so.6 000000347A61ECDD Unknown Unknown Unknown
ccsm.exe 00000000004CC109 Unknown Unknown Unknown
forrtl: error (65): floating invalid
Image PC Routine Line Source
ccsm.exe 0000000002D57302 ice_dyn_evp_mp_st 1205 ice_dyn_evp.F90
ccsm.exe 0000000002D46247 ice_dyn_evp_mp_ev 380 ice_dyn_evp.F90
ccsm.exe 0000000002ECF0B3 ice_step_mod_mp_s 664 ice_step_mod.F90
ccsm.exe 0000000002D06EB4 ice_comp_mct_mp_i 554 ice_comp_mct.F90
ccsm.exe 00000000004E18CA ccsm_comp_mod_mp_ 1759 ccsm_comp_mod.F90
ccsm.exe 00000000004EB6AE MAIN__ 91 ccsm_driver.F90
ccsm.exe 00000000004CC20C Unknown Unknown Unknown
libc.so.6 000000347A61ECDD Unknown Unknown Unknown
ccsm.exe 00000000004CC109 Unknown Unknown Unknown
...
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libpthread.so.0 00000032EE40C173 Unknown Unknown Unknown
libmlx4-rdmav2.so 00007F846EC7BA7B Unknown Unknown Unknown
mca_btl_openib.so 00007F84713904EF Unknown Unknown Unknown
libmpi.so.1 00007F84764AFC69 Unknown Unknown Unknown
libmpi.so.1 00007F84763E24F2 Unknown Unknown Unknown
libmpi.so.1 00007F8476409F04 Unknown Unknown Unknown
libmpi_f77.so.1 00007F8475F6EC45 Unknown Unknown Unknown
ccsm.exe 0000000003039D44 ice_boundary_mp_i 2747 ice_boundary.F90
ccsm.exe 0000000002D4719A ice_dyn_evp_mp_ev 434 ice_dyn_evp.F90
ccsm.exe 0000000002ECF0B3 ice_step_mod_mp_s 664 ice_step_mod.F90
ccsm.exe 0000000002D06EB4 ice_comp_mct_mp_i 554 ice_comp_mct.F90
ccsm.exe 00000000004E18CA ccsm_comp_mod_mp_ 1759 ccsm_comp_mod.F90
ccsm.exe 00000000004EB6AE MAIN__ 91 ccsm_driver.F90
ccsm.exe 00000000004CC20C Unknown Unknown Unknown
libc.so.6 00000032EDC1ECDD Unknown Unknown Unknown
ccsm.exe 00000000004CC109 Unknown Unknown Unknown
------------------------------------------
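In case it helps anyone compare the failed cases, here is a rough sketch of how the tracebacks above could be condensed: for each `forrtl` error it keeps only the first frame that has a known source file. The function name `summarize_forrtl` and the regular expressions are my own assumptions about the layout shown above, not part of CESM, and real logs with output interleaved from many MPI tasks may need more care.

```python
import re
from collections import Counter

# Patterns assume the Intel runtime layout seen in the log excerpt above.
ERROR_RE = re.compile(r"^forrtl: error \((\d+)\): (.+)$")
FRAME_RE = re.compile(r"^(\S+)\s+([0-9A-Fa-f]{8,})\s+(\S+)\s+(\S+)\s+(\S+)$")


def summarize_forrtl(log_text):
    """Count (code, message, routine, line, source) for each traceback,
    using the first frame whose source file is not 'Unknown'."""
    counts = Counter()
    pending = None  # (code, message) of the error still waiting for a frame
    for raw in log_text.splitlines():
        line = raw.strip()
        err = ERROR_RE.match(line)
        if err:
            pending = (int(err.group(1)), err.group(2))
            continue
        if pending is None:
            continue
        frame = FRAME_RE.match(line)
        if frame and frame.group(5) != "Unknown":
            counts[pending + (frame.group(3), frame.group(4), frame.group(5))] += 1
            pending = None
    return counts


# A few lines copied from the ccsm.log excerpt above.
SAMPLE = """\
forrtl: error (73): floating divide by zero
Image PC Routine Line Source
ccsm.exe 0000000002D57302 ice_dyn_evp_mp_st 1205 ice_dyn_evp.F90
ccsm.exe 0000000002D46247 ice_dyn_evp_mp_ev 380 ice_dyn_evp.F90
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libpthread.so.0 00000032EE40C173 Unknown Unknown Unknown
libmpi.so.1 00007F84764AFC69 Unknown Unknown Unknown
ccsm.exe 0000000003039D44 ice_boundary_mp_i 2747 ice_boundary.F90
"""

summary = summarize_forrtl(SAMPLE)
for key, n in summary.items():
    print(n, key)
```

To run it on a full log, read the file's text and pass it in place of `SAMPLE`. In the excerpt above, the floating-point errors all point at line 1205 of ice_dyn_evp.F90, while the SIGTERM traceback is from a task that was waiting in MPI (ice_boundary.F90) when the job was killed.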