ne120_t12 cice initialization error (allocation error?)

seungbu@...

Hello, I've recently installed CESM1.2.2 on an HPC server at the University of Southern California.

We're using Intel compilers, and our MPICH and NetCDF installations work fine.

Aquaplanet, ne30_g16 & F1850C5, and T31_g37 & B1850C5 simulations all run without problems.

But the ne120_t12 simulation stops during initialization.

We're using 63 nodes with 16 cores each (1008 cores total, no hyperthreading), and each node has about 64 GB of RAM.

 

cpl.log ends with:

(seq_mct_drv) : Initialize each component: atm, lnd, rof, ocn, ice, glc, wav

(seq_mct_drv) : Initialize atm component ATM

(seq_mct_drv) : Initialize lnd component LND

(seq_mct_drv) : Initialize rof component ROF

(seq_mct_drv) : Initialize ocn component OCN

(seq_mct_drv) : Initialize ice component ICE

 

In cesm.log:

forrtl: severe (174): SIGSEGV, segmentation fault occurred

Image              PC                Routine            Line        Source

cesm.exe           000000000244B511  Unknown               Unknown  Unknown

cesm.exe           000000000244964B  Unknown               Unknown  Unknown

cesm.exe           00000000023E02F4  Unknown               Unknown  Unknown

cesm.exe           00000000023E0106  Unknown               Unknown  Unknown

cesm.exe           000000000235C439  Unknown               Unknown  Unknown

cesm.exe           0000000002366EC6  Unknown               Unknown  Unknown

libpthread-2.17.s  00007F23781CE370  Unknown               Unknown  Unknown

cesm.exe           0000000001648250  ice_probability_m         259  ice_probability.F90

cesm.exe           000000000160ABBD  ice_grid_mp_init_         248  ice_grid.F90

cesm.exe           00000000016C29A4  cice_initmod_mp_c         106  CICE_InitMod.F90

cesm.exe           00000000015C777B  ice_comp_mct_mp_i         254  ice_comp_mct.F90

cesm.exe           000000000043EC0B  ccsm_comp_mod_mp_        1155  ccsm_comp_mod.F90

cesm.exe           00000000004426ED  MAIN__                     90  ccsm_driver.F90

cesm.exe           0000000000412A5E  Unknown               Unknown  Unknown

libc-2.17.so       00007F2377B1DB35  __libc_start_main     Unknown  Unknown

cesm.exe           0000000000412969  Unknown               Unknown  Unknown

Line 259 of cesm1_2_2/models/ice/cice/src/source/ice_probability.F90 just allocates the work_gr array:

195 subroutine CalcWorkPerBlock(distribution_wght, KMTG,ULATG,work_per_block, prob_per_block,blockType,bStats)

... 

259    allocate(work_gr(nx_global,ny_global))

260    allocate(prob(nblocks_tot),work(nblocks_tot))

261    allocate(nocn(nblocks_tot))

262    allocate(nice005(nblocks_tot),nice010(nblocks_tot),nice050(nblocks_tot), &

263             nice100(nblocks_tot),nice250(nblocks_tot),nice500(nblocks_tot))
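
For the tx0.1v2 grid, that first allocation is a full global array on every ICE task: 3600 x 2400 x 8 bytes is roughly 66 MB, assuming work_gr is double precision (dbl_kind). As a rough sanity check, here is a minimal standalone sketch (not CESM code; it just repeats the same allocation with the ne120_t12 dimensions from ice.log):

   program check_work_gr
     ! Minimal sketch: repeat the global work_gr allocation with the
     ! tx0.1v2 dimensions reported in ice.log; assumes double precision.
     implicit none
     integer, parameter :: nx_global = 3600, ny_global = 2400
     real(kind=8), allocatable :: work_gr(:,:)
     integer :: ierr

     allocate(work_gr(nx_global,ny_global), stat=ierr)
     if (ierr /= 0) then
        print *, 'allocate(work_gr) failed, stat =', ierr
     else
        print '(a,f6.1,a)', 'allocated work_gr: ', &
             real(nx_global)*real(ny_global)*8.0/1024.0**2, ' MB per task'
     end if
     if (allocated(work_gr)) deallocate(work_gr)
   end program check_work_gr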

And ice.log ends with the following:

Domain Information

  Horizontal domain: nx =   3600

                     ny =   2400

  No. of categories: nc =      5

  No. of ice layers: ni =      4

  No. of snow layers:ns =      1

  Processors:  total    =    448

  Processor shape:             null

  Distribution type:       blkrobin

  Distribution weight:     latitude

  {min,max}Blocks =            1     8

  Number of ghost cells:       1

 

  read_global           99           1  -1.36924091988070        1.57079616109396

  read_global           98           1  0.000000000000000E+000  42.0000000000000
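
(The two read_global lines above presumably correspond to the ULATG and KMTG fields passed into CalcWorkPerBlock: one ranges up to about 1.5708, i.e. pi/2 radians for latitude, and the other runs from 0 to 42, which looks like ocean level counts.)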

Is this a problem with the tripole grid (tx0.1v2)?

We already add "limit stacksize unlimited" in env_mach_specific.USC.

I'd appreciate a solution or any hint on this. Thanks, Seungbu.

Our compiler options are as follows:

<compiler MACH="USC">

  <MPIFC> mpiifort </MPIFC>

  <MPICC> mpiicc </MPICC>

  <MPICXX> mpiicpc </MPICXX>

  <ADD_FFLAGS> -heap-arrays </ADD_FFLAGS>

  <ADD_CFLAGS> -heap-arrays </ADD_CFLAGS>

  <ADD_FFLAGS> -mcmodel=medium </ADD_FFLAGS>

  <ADD_FFLAGS DEBUG="FALSE"> -O2  </ADD_FFLAGS>

  <ADD_CFLAGS DEBUG="FALSE"> -O2  </ADD_CFLAGS>

  <NETCDF_PATH>/home/rcf-proj/lds1/sp_582/ncdf-f-4.4.4</NETCDF_PATH>

  <MPI_LIB_NAME> mpi </MPI_LIB_NAME>

  <MPI_PATH>/usr/usc/intel/17.2/compilers_and_libraries_2017.2.174/linux/mpi/intel64</MPI_PATH>

  <ADD_SLIBS> $(shell $(NETCDF_PATH)/bin/nf-config --flibs) </ADD_SLIBS>

</compiler>

And here is our pes layout:

<pes GRID="a%ne120np4" MACH="USC">

    <NTASKS_ATM>896</NTASKS_ATM><NTHRDS_ATM>1</NTHRDS_ATM>   <ROOTPE_ATM>0</ROOTPE_ATM>      <NINST_ATM>1</NINST_ATM>

    <NTASKS_LND>448</NTASKS_LND><NTHRDS_LND>$NTHRDS_ATM</NTHRDS_LND><ROOTPE_LND>$ROOTPE_ATM</ROOTPE_LND><NINST_LND>1</NINST_LND>

    <NTASKS_ICE>448</NTASKS_ICE><NTHRDS_ICE>$NTHRDS_ATM</NTHRDS_ICE><ROOTPE_ICE>$NTASKS_LND</ROOTPE_ICE><NINST_ICE>1</NINST_ICE>

    <NTASKS_OCN>112</NTASKS_OCN><NTHRDS_OCN>$NTHRDS_ATM</NTHRDS_OCN><ROOTPE_OCN>$NTASKS_ATM</ROOTPE_OCN><NINST_OCN>1</NINST_OCN>

    <NTASKS_CPL>$NTASKS_ATM</NTASKS_CPL><NTHRDS_CPL>$NTHRDS_ATM</NTHRDS_CPL><ROOTPE_CPL>$ROOTPE_ATM</ROOTPE_CPL>

    <NTASKS_GLC>$NTASKS_ATM</NTASKS_GLC><NTHRDS_GLC>$NTHRDS_ATM</NTHRDS_GLC><ROOTPE_GLC>$ROOTPE_ATM</ROOTPE_GLC><NINST_GLC>1</NINST_GLC>

 

    <PIO_NUMTASKS>-1</PIO_NUMTASKS><PIO_STRIDE>-1</PIO_STRIDE><PIO_TYPENAME>netcdf</PIO_TYPENAME><PIO_ROOT>1</PIO_ROOT>

    <PES_LEVEL>0</PES_LEVEL>

</pes>
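
For reference, assuming the usual NTASKS/ROOTPE semantics, this layout places ATM, CPL, and GLC on tasks 0-895, LND on 0-447, ICE on 448-895, and OCN on 896-1007, so all 1008 cores are used and ICE gets the 448 processors reported in ice.log.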
