MPT ERROR: Rank 510 with cesm2.0 compset B1850

Hi, to whom it may concern,

I encountered a problem during a branch run with CESM2.0, compset B1850, at resolution f09_g17. The case ran successfully for 18 months before it aborted. I have run similar simulations with exactly the same scripts, except that they branched off from different years, and those did not have any problems.

Thanks.

The error messages are below:

510: ice: Vertical thermo error
510: ERROR: ice: Vertical thermo error
510:Image              PC                Routine            Line        Source
510:cesm.exe           00000000034DBF9D  Unknown               Unknown  Unknown
510:cesm.exe           0000000002C47B22  shr_abort_mod_mp_         114  shr_abort_mod.F90
510:cesm.exe           0000000001726F74  ice_exit_mp_abort          46  ice_exit.F90
510:cesm.exe           0000000001944E6C  ice_step_mod_mp_s         569  ice_step_mod.F90
510:cesm.exe           00000000018147F8  cice_runmod_mp_ci         186  CICE_RunMod.F90
510:cesm.exe           000000000171878A  ice_comp_mct_mp_i         563  ice_comp_mct.F90
510:cesm.exe           0000000000425874  component_mod_mp_         728  component_mod.F90
510:cesm.exe           000000000040AECB  cime_comp_mod_mp_        2649  cime_comp_mod.F90
510:cesm.exe           00000000004255A2  MAIN__                    103  cime_driver.F90
510:cesm.exe           0000000000408FDE  Unknown               Unknown  Unknown
510:libc-2.19.so       00002AAAB08F9B25  __libc_start_main     Unknown  Unknown
510:cesm.exe           0000000000408EE9  Unknown               Unknown  Unknown
510:MPT ERROR: Rank 510(g:510) is aborting with error code 1001
510:MPT: --------stack traceback-------
510:MPT: Attaching to program: /proc/14888/exe, process 14888
510:MPT: done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=3d290be00d48b823d3b71df2249e80d881bc473d"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=0ea764119690f32c98faae9a63a73f35ed8b1099"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=5409c48fdb15e90649c1407e444fbe31d6dc8ec1"
510:MPT: (no debugging symbols found)...done.
510:MPT: [Thread debugging using libthread_db enabled]
510:MPT: Using host libthread_db library "/glade/u/apps/ch/os/lib64/libthread_db.so.1".
510:MPT: Try: zypper install -C "debuginfo(build-id)=79264652a62453da222372a430cd9351d4bbcbde"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=e97cfdb062d6f0c41073f2109a7605d0ae991c03"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=f43d7754940a14ffe3d9bd8fc9472ffbbfead544"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=15916519d9dbaea26ec88427460b4cedb9c0a6ab"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=4c08f43bb18e99a7df4bad5c4a52bac67ddf9b8d"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=3ae04b58bd81ea7745dba789d89937e719309568"
510:MPT: (no debugging symbols found)...done.
510:MPT: 0x00002aaaafcc541c in waitpid () from /glade/u/apps/ch/os/lib64/libpthread.so.0
510:MPT: Missing separate debuginfos, use: zypper install glibc-debuginfo-2.19-35.1.x86_64
510:MPT: (gdb) #0  0x00002aaaafcc541c in waitpid ()
510:MPT:    from /glade/u/apps/ch/os/lib64/libpthread.so.0
510:MPT: #1  0x00002aaab060de66 in mpi_sgi_system (
510:MPT: #2  MPI_SGI_stacktraceback (
510:MPT:     header=header@entry=0x7ffffffd6ce0 "MPT ERROR: Rank 510(g:510) is aborting with error code 1001.\n\tProcess ID: 14888, Host: r9i2n21, Program: /gpfs/fs1/scratch/kezhou/b.e20.B1850.f09_g17.4xCO2.24_mon08/bld/cesm.exe\n\tMPT Version: HPE MPT "...) at sig.c:340
510:MPT: #3  0x00002aaab05564c9 in print_traceback (ecode=ecode@entry=1001)
510:MPT:     at abort.c:246
510:MPT: #4  0x00002aaab055679a in PMPI_Abort (comm=, errorcode=1001)
510:MPT:     at abort.c:68
510:MPT: #5  0x00002aaab0556a7c in pmpi_abort__ ()
510:MPT:    from /glade/u/apps/ch/opt/mpt/2.19/opt/hpe/hpc/mpt/mpt-2.19/lib/libmpi.so
510:MPT: #6  0x0000000002d395a9 in shr_mpi_mod_mp_shr_mpi_abort_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/share/util/shr_mpi_mod.F90:2127
510:MPT: #7  0x0000000002c47bc8 in shr_abort_mod_mp_shr_abort_abort_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/share/util/shr_abort_mod.F90:69
510:MPT: #8  0x0000000001726f74 in ice_exit_mp_abort_ice_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/mpi/ice_exit.F90:46
510:MPT: #9  0x0000000001944e6c in ice_step_mod_mp_step_therm1_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/source/ice_step_mod.F90:569
510:MPT: #10 0x00000000018147f8 in cice_runmod_mp_cice_run_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/drivers/cesm/CICE_RunMod.F90:186
510:MPT: #11 0x000000000171878a in ice_comp_mct_mp_ice_run_mct_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/drivers/cesm/ice_comp_mct.F90:563
510:MPT: #12 0x0000000000425874 in component_mod_mp_component_run_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/drivers/mct/main/component_mod.F90:728
510:MPT: #13 0x000000000040aecb in cime_comp_mod_mp_cime_run_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/drivers/mct/main/cime_comp_mod.F90:2649
510:MPT: #14 0x00000000004255a2 in MAIN__ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/drivers/mct/main/cime_driver.F90:103
510:MPT: #15 0x0000000000408fde in main ()
510:MPT: (gdb) A debugging session is active.
510:MPT:
510:MPT:        Inferior 1 [process 14888] will be detached.
510:MPT:
510:MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
510:MPT: Detaching from program: /proc/14888/exe, process 14888
510:
510:MPT: -----stack traceback ends-----
-1:MPT ERROR: MPI_COMM_WORLD rank 510 has terminated without calling MPI_Finalize()
-1:     aborting job
 
 

jedwards

CSEG and Liaisons
Staff member
> 510: ice: Vertical thermo error
> 510: ERROR: ice: Vertical thermo error

This message indicates a science issue; it has nothing to do with the MPT errors on Cheyenne discussed elsewhere on the forum. You may want to look for unusual or unexpected conditions in the model's CICE fields.
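
For anyone reading a log like the one in this thread, the model's own messages are easy to miss among the MPT traceback lines. Below is a minimal Python sketch of one way to filter them out. It assumes every log line starts with a "<rank>:" prefix as in the post above; the function name `model_messages` and the filtering rules are illustrative assumptions, not part of CESM or CIME.

```python
import re

# Matches the "<rank>:" prefix seen in the log above, e.g.
# "510: ice: Vertical thermo error" or "-1:MPT ERROR: ...".
RANK_PREFIX = re.compile(r"^\s*(-?\d+):\s*(.*)$")

# Matches a row of the Fortran traceback table: image name, then a
# 16-digit hexadecimal program counter.
FRAME_ROW = re.compile(r"^\S+\s+[0-9A-Fa-f]{16}\s")


def model_messages(log_lines):
    """Return (rank, message) pairs for lines that look like model output.

    Hypothetical helper: it drops lines printed by MPT and the rows of
    the traceback table, and keeps everything else that has a rank prefix.
    """
    kept = []
    for line in log_lines:
        match = RANK_PREFIX.match(line)
        if not match:
            continue  # no rank prefix, not a per-rank log line
        rank = int(match.group(1))
        text = match.group(2).strip()
        if not text or text.startswith("MPT"):
            continue  # empty line or MPT-generated output
        if text.startswith("Image") or FRAME_ROW.match(text):
            continue  # traceback table header or frame row
        kept.append((rank, text))
    return kept


if __name__ == "__main__":
    sample = [
        "510: ice: Vertical thermo error",
        "510: ERROR: ice: Vertical thermo error",
        "510:cesm.exe           0000000001726F74  ice_exit_mp_abort          46  ice_exit.F90",
        "510:MPT ERROR: Rank 510(g:510) is aborting with error code 1001",
    ]
    for rank, text in model_messages(sample):
        print(rank, text)
```

On the sample lines, only the two "Vertical thermo error" messages from rank 510 remain. Continuation lines that MPT prints without the "MPT" tag would still pass through, so treat the result as a first pass rather than a complete filter.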
 