

CESM 1.0.5 porting runtime issues

ebuzan@...

I am trying to port CESM 1.0.5 to one of my university's computing clusters. I can build the model with no errors, but I get several errors at runtime depending on the compset I use.

With compset X, the model appears to run to completion but crashes at the following line in cpl.log:
Write restart file at 10106 0
(seq_rest_write) write rpointer file rpointer.drv

With compset A, the model crashes while initializing the atm component at the following line in atm.log:
(shr_strdata_readnml) reading input namelist file: datm_atm_in
(shr_stream_init) Reading file nyf.giss.T62.stream.txt

With compset FW, the model also crashes while initializing atm at the following point:
(GETFIL): attempting to find local file
f40.2000.track1.4deg.001.cam2.i.0013-01-01-00000.nc
(GETFIL): using
/user/temp/c/ebuza001/inputdata/atm/waccm/ic/f40.2000.track1.4deg.001.cam2.i.0013-01-01-00000.nc

santos

Check the end of ccsm.log and see if there is any other information there. In particular, look for anything containing the string "ERROR" or "ENDRUN", or messages printed by the compiler.
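A minimal sketch of that search in Python, assuming the log is available as plain text. The function name `find_error_lines`, the `tail` parameter, and the extra `forrtl` marker (the prefix the Intel Fortran runtime puts on its messages) are illustrative choices, not part of CESM.

```python
import re

# "ERROR" and "ENDRUN" are the strings named above; "forrtl" is an added
# assumption to catch Intel Fortran runtime messages.
PATTERN = re.compile(r"ERROR|ENDRUN|forrtl", re.IGNORECASE)

def find_error_lines(text, tail=200):
    """Return (line_number, line) pairs among the last `tail` lines that match PATTERN."""
    lines = text.splitlines()
    start = max(0, len(lines) - tail)
    return [
        (index + 1, line)
        for index, line in enumerate(lines[start:], start=start)
        if PATTERN.search(line)
    ]

# Sample text for illustration only; the ENDRUN line is made up.
sample_log = (
    "step 1 ok\n"
    "forrtl: severe (168): Program Exception - illegal instruction\n"
    "step 2 ok\n"
    "ENDRUN: example message\n"
)
for number, line in find_error_lines(sample_log):
    print(f"{number}: {line}")
```

To use it on a real run, read the contents of ccsm.log into a string and pass that string to `find_error_lines`.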

Sean Patrick Santos

CESM Software Engineering Group

ebuzan@...

I don't see any useful error information in ccsm.log, just the following block at the end of the log:


forrtl: severe (168): Program Exception - illegal instruction
Image              PC                Routine            Line        Source             
ccsm.exe           00000000012C62C8  Unknown               Unknown  Unknown
libnetcdff.so.5    00002AAAAAED8DF8  Unknown               Unknown  Unknown
libnetcdff.so.5    00002AAAAAEE11B7  Unknown               Unknown  Unknown
ccsm.exe           00000000011D8A8D  Unknown               Unknown  Unknown
ccsm.exe           00000000010F5469  Unknown               Unknown  Unknown
ccsm.exe           00000000004B0DB7  cam_pio_utils_mp_         613  cam_pio_utils.F90
ccsm.exe           00000000005C3F37  startup_initialco          65  startup_initialconds.F90
ccsm.exe           00000000004D718F  inital_mp_cam_ini          84  inital.F90
ccsm.exe           00000000004796D9  cam_comp_mp_cam_i         147  cam_comp.F90
ccsm.exe           000000000047590E  atm_comp_mct_mp_a         272  atm_comp_mct.F90
ccsm.exe           000000000041DC57  ccsm_comp_mod_mp_         684  ccsm_comp_mod.F90
ccsm.exe           0000000000420956  MAIN__                     90  ccsm_driver.F90
ccsm.exe           000000000040ED8C  Unknown               Unknown  Unknown
libc.so.6          00002AAAAC2DE994  Unknown               Unknown  Unknown
ccsm.exe           000000000040EC99  Unknown               Unknown  Unknown
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)

jedwards

> forrtl: severe (168): Program Exception - illegal instruction


This would indicate that your compiler is producing instructions incompatible with your CPU. There are two ways I know of that this can happen:

  1. You are compiling on a front-end or login node which has a different chipset than the node you are running on.
  2. You are explicitly setting a compiler flag indicating a chip different from the one you are running on.
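
One way to check the first possibility is to compare the CPU feature flags of the build node and the run node. The sketch below assumes Linux nodes whose `/proc/cpuinfo` has a `flags` line (as on x86); the function names and the sample text are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Extract the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

def missing_on_compute(login_cpuinfo, compute_cpuinfo):
    """Flags present on the build (login) node but absent on the run (compute) node."""
    return sorted(cpu_flags(login_cpuinfo) - cpu_flags(compute_cpuinfo))

# Made-up sample text; real input would be the contents of /proc/cpuinfo
# captured on each node.
login = "model name : example\nflags : fpu sse sse2 sse4_2 avx\n"
compute = "model name : example\nflags : fpu sse sse2\n"
print(missing_on_compute(login, compute))  # ['avx', 'sse4_2']
```

A non-empty result means the build node supports instruction sets the run node lacks, so a binary or library tuned for the build host could hit an illegal instruction at runtime.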

 


CESM Software Engineer

jedwards

Looking at that traceback a little closer, I see that the crash is actually occurring in

libnetcdff.so.5 

Did you build the netcdf library yourself?   Could it be that this library isn't compatible with your system?
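
To see at a glance which shared libraries appear in a traceback like the one above, the frame lines can be parsed in a few lines of Python. This is only a sketch: it assumes the column layout shown in this thread (image name, 16-digit hexadecimal PC, routine), and `frames_by_image` is a hypothetical helper name.

```python
def frames_by_image(traceback_text):
    """Return the ordered (image, routine) pairs from a forrtl-style traceback."""
    frames = []
    for line in traceback_text.splitlines():
        parts = line.split()
        # A frame line has an image name followed by a 16-digit hex PC.
        if len(parts) >= 3 and len(parts[1]) == 16:
            try:
                int(parts[1], 16)
            except ValueError:
                continue
            frames.append((parts[0], parts[2]))
    return frames

# Three frame lines copied from the traceback posted in this thread.
sample = """\
Image              PC                Routine            Line        Source
ccsm.exe           00000000012C62C8  Unknown               Unknown  Unknown
libnetcdff.so.5    00002AAAAAED8DF8  Unknown               Unknown  Unknown
ccsm.exe           00000000004B0DB7  cam_pio_utils_mp_         613  cam_pio_utils.F90
"""
libraries = [image for image, _ in frames_by_image(sample) if ".so" in image]
print(libraries)  # ['libnetcdff.so.5']
```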

CESM Software Engineer

ebuzan@...

I did build the netcdf library from source, both because I was having trouble getting the cluster's netcdf module to run and to make sure that netcdf and the model were built with the same compiler, per the user guide. I also ran "make check" on both the Fortran and C libraries, and all the tests passed.
