
MPT ERROR: Rank 510 with cesm2.0 compset B1850

Hi, to whom it may concern,

I encountered a problem when running a branch run with cesm2.0, compset B1850, and resolution f09_g17. My case ran successfully for 18 months before it aborted. I ran similar simulations with exactly the same scripts, except that they branched off from different years, and did not have any problems.

Thanks.

The error messages are attached below:

510: ice: Vertical thermo error
510: ERROR: ice: Vertical thermo error
510:Image              PC                Routine            Line        Source
510:cesm.exe           00000000034DBF9D  Unknown               Unknown  Unknown
510:cesm.exe           0000000002C47B22  shr_abort_mod_mp_         114  shr_abort_mod.F90
510:cesm.exe           0000000001726F74  ice_exit_mp_abort          46  ice_exit.F90
510:cesm.exe           0000000001944E6C  ice_step_mod_mp_s         569  ice_step_mod.F90
510:cesm.exe           00000000018147F8  cice_runmod_mp_ci         186  CICE_RunMod.F90
510:cesm.exe           000000000171878A  ice_comp_mct_mp_i         563  ice_comp_mct.F90
510:cesm.exe           0000000000425874  component_mod_mp_         728  component_mod.F90
510:cesm.exe           000000000040AECB  cime_comp_mod_mp_        2649  cime_comp_mod.F90
510:cesm.exe           00000000004255A2  MAIN__                    103  cime_driver.F90
510:cesm.exe           0000000000408FDE  Unknown               Unknown  Unknown
510:libc-2.19.so       00002AAAB08F9B25  __libc_start_main     Unknown  Unknown
510:cesm.exe           0000000000408EE9  Unknown               Unknown  Unknown
510:MPT ERROR: Rank 510(g:510) is aborting with error code 1001
510:MPT: --------stack traceback-------
510:MPT: Attaching to program: /proc/14888/exe, process 14888
510:MPT: done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=3d290be00d48b823d3b71df2249e80d881bc473d"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=0ea764119690f32c98faae9a63a73f35ed8b1099"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=5409c48fdb15e90649c1407e444fbe31d6dc8ec1"
510:MPT: (no debugging symbols found)...done.
510:MPT: [Thread debugging using libthread_db enabled]
510:MPT: Using host libthread_db library "/glade/u/apps/ch/os/lib64/libthread_db.so.1".
510:MPT: Try: zypper install -C "debuginfo(build-id)=79264652a62453da222372a430cd9351d4bbcbde"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=e97cfdb062d6f0c41073f2109a7605d0ae991c03"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=f43d7754940a14ffe3d9bd8fc9472ffbbfead544"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=15916519d9dbaea26ec88427460b4cedb9c0a6ab"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=4c08f43bb18e99a7df4bad5c4a52bac67ddf9b8d"
510:MPT: (no debugging symbols found)...done.
510:MPT: Try: zypper install -C "debuginfo(build-id)=3ae04b58bd81ea7745dba789d89937e719309568"
510:MPT: (no debugging symbols found)...done.
510:MPT: 0x00002aaaafcc541c in waitpid () from /glade/u/apps/ch/os/lib64/libpthread.so.0
510:MPT: Missing separate debuginfos, use: zypper install glibc-debuginfo-2.19-35.1.x86_64
510:MPT: (gdb) #0  0x00002aaaafcc541c in waitpid ()
510:MPT:    from /glade/u/apps/ch/os/lib64/libpthread.so.0
510:MPT: #1  0x00002aaab060de66 in mpi_sgi_system (
510:MPT: #2  MPI_SGI_stacktraceback (
510:MPT:     header=header@entry=0x7ffffffd6ce0 "MPT ERROR: Rank 510(g:510) is aborting with error code 1001.\n\tProcess ID: 14888, Host: r9i2n21, Program: /gpfs/fs1/scratch/kezhou/b.e20.B1850.f09_g17.4xCO2.24_mon08/bld/cesm.exe\n\tMPT Version: HPE MPT "...) at sig.c:340
510:MPT: #3  0x00002aaab05564c9 in print_traceback (ecode=ecode@entry=1001)
510:MPT:     at abort.c:246
510:MPT: #4  0x00002aaab055679a in PMPI_Abort (comm=, errorcode=1001)
510:MPT:     at abort.c:68
510:MPT: #5  0x00002aaab0556a7c in pmpi_abort__ ()
510:MPT:    from /glade/u/apps/ch/opt/mpt/2.19/opt/hpe/hpc/mpt/mpt-2.19/lib/libmpi.so
510:MPT: #6  0x0000000002d395a9 in shr_mpi_mod_mp_shr_mpi_abort_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/share/util/shr_mpi_mod.F90:2127
510:MPT: #7  0x0000000002c47bc8 in shr_abort_mod_mp_shr_abort_abort_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/share/util/shr_abort_mod.F90:69
510:MPT: #8  0x0000000001726f74 in ice_exit_mp_abort_ice_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/mpi/ice_exit.F90:46
510:MPT: #9  0x0000000001944e6c in ice_step_mod_mp_step_therm1_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/source/ice_step_mod.F90:569
510:MPT: #10 0x00000000018147f8 in cice_runmod_mp_cice_run_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/drivers/cesm/CICE_RunMod.F90:186
510:MPT: #11 0x000000000171878a in ice_comp_mct_mp_ice_run_mct_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/components/cice/src/drivers/cesm/ice_comp_mct.F90:563
510:MPT: #12 0x0000000000425874 in component_mod_mp_component_run_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/drivers/mct/main/component_mod.F90:728
510:MPT: #13 0x000000000040aecb in cime_comp_mod_mp_cime_run_ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/drivers/mct/main/cime_comp_mod.F90:2649
510:MPT: #14 0x00000000004255a2 in MAIN__ ()
510:MPT:     at /gpfs/u/home/kezhou/my_cesm_sandbox/cime/src/drivers/mct/main/cime_driver.F90:103
510:MPT: #15 0x0000000000408fde in main ()
510:MPT: (gdb) A debugging session is active.
510:MPT:
510:MPT:        Inferior 1 [process 14888] will be detached.
510:MPT:
510:MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
510:MPT: Detaching from program: /proc/14888/exe, process 14888
510:
510:MPT: -----stack traceback ends-----
-1:MPT ERROR: MPI_COMM_WORLD rank 510 has terminated without calling MPI_Finalize()
-1:     aborting job
 
 

jedwards

CSEG and Liaisons
Staff member
> 510: ice: Vertical thermo error
> 510: ERROR: ice: Vertical thermo error

This message indicates a science issue. It has nothing to do with the MPT errors on Cheyenne discussed elsewhere on the forum. You may want to look for unusual or unexpected conditions in the model's CICE fields.
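As an illustration of the distinction above, here is a minimal Python sketch that pulls the first model-level error for each rank out of a log and skips the MPT traceback lines that follow it. The function name `first_model_errors` and the assumed line shape (`<rank>:<message>`, as in the log posted above) are hypothetical conveniences for this example, not part of CESM or CIME.

```python
import re

# Assumed line shape: "<rank>:<message>", e.g. "510: ERROR: ice: Vertical thermo error".
# This pattern is an assumption based on the log excerpt in this thread.
LINE_RE = re.compile(r"^\s*(-?\d+):\s*(.*)$")


def first_model_errors(log_text):
    """Return {rank: first line mentioning an error that is not MPT output}."""
    errors = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # line carries no rank prefix
        rank = int(match.group(1))
        message = match.group(2).strip()
        if message.startswith("MPT"):
            continue  # MPT abort/traceback output, a consequence rather than the cause
        if "error" in message.lower() and rank not in errors:
            errors[rank] = message
    return errors


sample = """510: ice: Vertical thermo error
510: ERROR: ice: Vertical thermo error
510:MPT ERROR: Rank 510(g:510) is aborting with error code 1001
-1:MPT ERROR: MPI_COMM_WORLD rank 510 has terminated without calling MPI_Finalize()
"""

print(first_model_errors(sample))
```

Run on the sample, this reports only the `ice: Vertical thermo error` line for rank 510, which is the message to investigate; the MPT lines merely record that the rank aborted afterwards.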
 