Porting CESM1.3 to Cheyenne: PIO error after submission

yefee90@...

Hi all,

 

We are porting CESM 1.3 to Cheyenne. We followed the guidelines (https://docs.google.com/document/d/1V5_oIA_ZPmLsMKp0rZlQ99CqQshx2pqcZsVQ...) provided by the development team and successfully compiled the model on Cheyenne. However, after one month of simulation, the model crashes and returns a PIO error. Any suggestions are appreciated.

We customized the PE layout ourselves:

./xmlchange NTASKS_ATM=468,NTHRDS_ATM=2,ROOTPE_ATM=0

./xmlchange NTASKS_CPL=468,NTHRDS_CPL=2,ROOTPE_CPL=0

./xmlchange NTASKS_ICE=324,NTHRDS_ICE=1,ROOTPE_ICE=144

./xmlchange NTASKS_LND=144,NTHRDS_LND=1,ROOTPE_LND=0

./xmlchange NTASKS_ROF=144,NTHRDS_ROF=1,ROOTPE_ROF=0

./xmlchange NTASKS_OCN=128,NTHRDS_OCN=2,ROOTPE_OCN=468
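
For readers checking this layout, here is a minimal Python sketch of the arithmetic behind it. The (NTASKS, NTHRDS, ROOTPE) values are copied from the commands above; the helper names `rank_range` and `total_tasks` are hypothetical, and the sketch assumes each component occupies consecutive MPI ranks starting at its ROOTPE (a stride of 1), which the post does not state.

```python
# Layout values copied from the xmlchange commands above:
# component -> (NTASKS, NTHRDS, ROOTPE)
LAYOUT = {
    "ATM": (468, 2, 0),
    "CPL": (468, 2, 0),
    "ICE": (324, 1, 144),
    "LND": (144, 1, 0),
    "ROF": (144, 1, 0),
    "OCN": (128, 2, 468),
}


def rank_range(ntasks, rootpe, stride=1):
    """Return (first, last) MPI rank of a component.

    Assumption: ranks are spaced by a fixed stride, 1 by default.
    """
    return rootpe, rootpe + (ntasks - 1) * stride


def total_tasks(layout):
    """Total MPI tasks needed: highest rank used by any component, plus one."""
    return max(rank_range(n, root)[1] for n, _, root in layout.values()) + 1


if __name__ == "__main__":
    for name, (ntasks, nthrds, rootpe) in LAYOUT.items():
        first, last = rank_range(ntasks, rootpe)
        print(f"{name}: ranks {first}-{last}, {nthrds} thread(s) per task")
    print("total MPI tasks:", total_tasks(LAYOUT))
```

Under that assumption, the ocean runs on its own ranks after the atmosphere, while land, river runoff, and sea ice share the atmosphere's rank range.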

 

Case directory:

/glade/u/home/che43/cases/cheyenne.20ka.itrace.ice_ghg_orb_mwtr.01

Log file:

/glade/p/cwis0001/iTRACE/cheyenne.20ka.itrace.ice_ghg_orb_mwtr.01/run/cesm.log.171010-091845

 

Error message:

89:Process ID: 67315, Host: r12i4n19, Program: /glade/p/cwis0001/iTRACE/cheyenne.20ka.itrace.ice_ghg_orb_mwtr.01/bld/cesm.exe

89:MPT Version: SGI MPT 2.15  09/03/16 04:15:54

89:

89:MPT: --------stack traceback-------

1:Image              PC                Routine            Line        Source             

1:cesm.exe           000000000309B32D  Unknown               Unknown  Unknown

1:cesm.exe           0000000002AA29A1  pio_support_mp_pi         120  pio_support.F90

1:cesm.exe           0000000002AA0F7E  pio_utils_mp_chec          74  pio_utils.F90

1:cesm.exe           0000000002BAA2C7  pionfwrite_mod_mp         249  pionfwrite_mod.F90.in

1:cesm.exe           0000000002B79F2F  piodarray_mp_writ         643  piodarray.F90.in

1:cesm.exe           0000000002B7CB41  piodarray_mp_writ         221  piodarray.F90.in

1:cesm.exe           000000000221BFCE  ncdio_pio_mp_ncd_        1482  ncdio_pio.F90.in

1:cesm.exe           000000000218AA46  histfilemod_mp_hf        2443  histFileMod.F90

1:cesm.exe           0000000002182B9A  histfilemod_mp_hi        2922  histFileMod.F90

1:cesm.exe           00000000020E19D1  clm_driver_mp_clm         852  clm_driver.F90

1:cesm.exe           00000000020A476C  lnd_comp_mct_mp_l         449  lnd_comp_mct.F90

1:cesm.exe           000000000041F932  component_mod_mp_        1022  component_mod.F90

1:cesm.exe           000000000040B2E4  cesm_comp_mod_mp_        2345  cesm_comp_mod.F90

1:cesm.exe           000000000041D57B  MAIN__                     93  cesm_driver.F90

1:cesm.exe           000000000040915E  Unknown               Unknown  Unknown

1:libc-2.19.so       00002AAAB04C7B25  __libc_start_main     Unknown  Unknown

1:cesm.exe           0000000000409069  Unknown               Unknown  Unknown

 

 

jedwards

1: NetCDF: Numeric conversion not representable

1: pio_support::pio_die:: myrank=          -1 : ERROR: 

1: pionfwrite_mod::write_nfdarray_double:         249 : 

1: NetCDF: Numeric conversion not representable


This error indicates that you are trying to write a value that cannot be represented by the data type specified. Often this is because you are trying to write a NaN, or a value too big for a 4-byte real, into the file.
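
As a rough illustration of the two cases named above, the following Python sketch tests whether a double-precision value would be a problem for a 4-byte real. The function name `is_safe_for_real4` is hypothetical and is not part of PIO or netCDF. It flags NaN and infinity to match the description in this reply, even though IEEE single precision can encode them, and the library's own range check may differ in detail.

```python
import math

# Largest finite IEEE 754 single-precision value, about 3.4028235e38.
REAL4_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def is_safe_for_real4(value):
    """Return True if a double is finite and within the 4-byte real range.

    Assumption: NaN and infinity count as problem values, following the
    description in the thread.
    """
    return math.isfinite(value) and abs(value) <= REAL4_MAX


if __name__ == "__main__":
    for v in (1.0e36, 1.0e39, float("nan"), float("inf")):
        print(v, "ok" if is_safe_for_real4(v) else "out of range or not finite")
```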

CESM Software Engineer

yefee90@...

Thanks, jedwards

This error occurred occasionally on Yellowstone when we ran simulations, but a resubmission or two would usually resolve it. On Cheyenne, we see it every time. If the value cannot be represented by its data type, how could a resubmission possibly fix it?

jedwards

Resubmission is not a solution; it's a band-aid. You need to go into the model and figure out which field is out of spec and how to correct the problem.

Look at your stack trace, figure out what the problem variable is, and print out its values.
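
The trace above passes through clm_driver.F90, histFileMod.F90, and ncdio_pio.F90.in, so the offending field is being written by the CLM history code. One way to narrow it down offline is to scan candidate fields and print the bad entries. The Python sketch below is only an illustration of that idea: `find_bad_fields` and the field names are hypothetical, and in practice the check would go into the model itself or be run on dumped values.

```python
import math

# Largest finite IEEE 754 single-precision value.
REAL4_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def find_bad_fields(fields):
    """Map each field name to the (index, value) pairs that look unsafe to write.

    A value is reported if it is NaN, infinite, or larger in magnitude than
    a 4-byte real can hold. Fields with no such values are omitted.
    """
    report = {}
    for name, values in fields.items():
        bad = [
            (i, v)
            for i, v in enumerate(values)
            if not math.isfinite(v) or abs(v) > REAL4_MAX
        ]
        if bad:
            report[name] = bad
    return report


if __name__ == "__main__":
    # Hypothetical sample data standing in for history fields.
    sample = {
        "field_a": [273.15, 280.0, 1.0e36],
        "field_b": [0.1, float("nan"), 0.3],
        "field_c": [1.0, 2.0, 5.0e40],
    }
    for name, bad in find_bad_fields(sample).items():
        print(name, bad)
```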

CESM Software Engineer
