Hello. I have a failed CESM run that is possibly related to "Impossible case2 in instratus_condensate" or "NaN" values. I have successfully run similar simulations, but this run always fails at exactly the same time step.
Any suggestions would be greatly appreciated!
Log file: /gpfs/fs1/scratch/shandong/cnh_neon_15_to_11/run/cesm.log.190731-172520
15: pLCL iteration is negative and set to psmin in uwshcu.F90
15: 4.264785978263775E-003 -96.4895378322424 78463.5688130133
15: Impossible case2 in instratus_condensate
15: 1.00000000000000 0.000000000000000E+000 0.000000000000000E+000
15: NaN NaN
15: ERROR: Unknown error submitted to shr_sys_abort.
7: pLCL iteration is negative and set to psmin in uwshcu.F90
7: 2.491301086566585E-003 -496.588632379369 61559.5066002056
15:Image PC Routine Line Source
15:cesm.exe 00000000024699BD Unknown Unknown Unknown
15:cesm.exe 0000000001D52C82 shr_sys_mod_mp_sh 282 shr_sys_mod.F90
15:cesm.exe 00000000012CC1F9 cldwat2m_macro_mp 1482 cldwat2m_macro.F90
15:cesm.exe 00000000012C0F9F cldwat2m_macro_mp 1003 cldwat2m_macro.F90
15:cesm.exe 000000000102ACD2 macrop_driver_mp_ 843 macrop_driver.F90
15:cesm.exe 0000000000572F69 physpkg_mp_tphysb 2043 physpkg.F90
15:cesm.exe 000000000056B163 physpkg_mp_phys_r 979 physpkg.F90
15:cesm.exe 00000000004B12C9 cam_comp_mp_cam_r 247 cam_comp.F90
15:cesm.exe 00000000004A28BE atm_comp_mct_mp_a 522 atm_comp_mct.F90
15:cesm.exe 000000000041E873 component_mod_mp_ 1022 component_mod.F90
15:cesm.exe 000000000040B106 ccsm_comp_mod_mp_ 3039 ccsm_comp_mod.F90
15:cesm.exe 000000000041C4CD MAIN__ 93 ccsm_driver.F90
15:cesm.exe 000000000040825E Unknown Unknown Unknown
15:libc.so.6 00002B47F06F36E5 __libc_start_main Unknown Unknown
15:cesm.exe 0000000000408169 Unknown Unknown Unknown
15:MPT ERROR: Rank 15(g:15) is aborting with error code 1001.
15: Process ID: 23189, Host: r9i7n34, Program: /glade/scratch/shandong/cnh_neon_15_to_11/bld/cesm.exe
15: MPT Version: HPE MPT 2.19 02/23/19 05:30:09
15:
15:MPT: --------stack traceback-------
15:MPT: Attaching to program: /proc/23189/exe, process 23189
15:MPT: Try: zypper install -C "debuginfo(build-id)=4e96cf37d52b9c2f3648e691878b682da5abfa42"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=5eb2f40ad3b0125943aba8f08dd08609351a2967"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=4f3d05f200db29c6835a48e466e0378a8541fd36"
15:MPT: (no debugging symbols found)...done.
15:MPT: [Thread debugging using libthread_db enabled]
15:MPT: Using host libthread_db library "/glade/u/apps/ch/os/lib64/libthread_db.so.1".
15:MPT: Try: zypper install -C "debuginfo(build-id)=b115bb26e97505a5bd3b56d70d20459aa1206ac9"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=93c4deac1088eb84fbd01cf2a2c54399f516e9a7"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=5f9ec139af58fa59c33f72d1b3e56f083f1613ae"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=bc347d1c2dd56b51057fbac71e84906135d02da5"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=4c08f43bb18e99a7df4bad5c4a52bac67ddf9b8d"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=3ae04b58bd81ea7745dba789d89937e719309568"
15:MPT: (no debugging symbols found)...done.
15:MPT: done.
15:MPT: 0x00002b47efac26da in waitpid ()
15:MPT: from /glade/u/apps/ch/os/lib64/libpthread.so.0
15:MPT: Missing separate debuginfos, use: zypper install glibc-debuginfo-2.22-49.16.x86_64
15:MPT: (gdb) #0 0x00002b47efac26da in waitpid ()
15:MPT: from /glade/u/apps/ch/os/lib64/libpthread.so.0
15:MPT: #1 0x00002b47f0409db6 in mpi_sgi_system (
15:MPT: #2 MPI_SGI_stacktraceback (
15:MPT: header=header@entry=0x7ffcf074e440 "MPT ERROR: Rank 15(g:15) is aborting with error code 1001.\n\tProcess ID: 23189, Host: r9i7n34, Program: /glade/scratch/shandong/cnh_neon_15_to_11/bld/cesm.exe\n\tMPT Version: HPE MPT 2.19 02/23/19 05:30"...) at sig.c:340
15:MPT: #3 0x00002b47f0352419 in print_traceback (ecode=ecode@entry=1001)
15:MPT: at abort.c:246
15:MPT: #4 0x00002b47f03526ea in PMPI_Abort (comm=<optimized out>, errorcode=1001)
15:MPT: at abort.c:68
15:MPT: #5 0x00002b47f03529cc in pmpi_abort__ ()
15:MPT: from /glade/u/apps/ch/opt/mpt/2.19/lib/libmpi.so
15:MPT: #6 0x0000000001d034c9 in shr_mpi_mod_mp_shr_mpi_abort_ ()
15:MPT: #7 0x0000000001d52d28 in shr_sys_mod_mp_shr_sys_abort_ ()
15:MPT: #8 0x00000000012cc1f9 in cldwat2m_macro_mp_instratus_condensate_ ()
15:MPT: #9 0x00000000012c0f9f in cldwat2m_macro_mp_mmacro_pcond_ ()
15:MPT: #10 0x000000000102acd2 in macrop_driver_mp_macrop_driver_tend_ ()
15:MPT: #11 0x0000000000572f69 in physpkg_mp_tphysbc_ ()
15:MPT: #12 0x000000000056b163 in physpkg_mp_phys_run1_ ()
15:MPT: #13 0x00000000004b12c9 in cam_comp_mp_cam_run1_ ()
15:MPT: #14 0x00000000004a28be in atm_comp_mct_mp_atm_run_mct_ ()
15:MPT: #15 0x000000000041e873 in component_mod_mp_component_run_ ()
15:MPT: #16 0x000000000040b106 in ccsm_comp_mod_mp_ccsm_run_ ()
15:MPT: #17 0x000000000041c4cd in MAIN__ ()
15:MPT: #18 0x000000000040825e in main ()
15:MPT: (gdb) A debugging session is active.
15:MPT:
15:MPT: Inferior 1 [process 23189] will be detached.
15:MPT:
15:MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
15:MPT: Detaching from program: /proc/23189/exe, process 23189
15:
15:MPT: -----stack traceback ends-----
-1:MPT ERROR: MPI_COMM_WORLD rank 15 has terminated without calling MPI_Finalize()
-1: aborting job
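
For anyone triaging a log like the one above, here is a minimal sketch of how the flagged messages could be grouped by MPI rank so the failing rank stands out. It is not part of CESM. The function name `flagged_lines_by_rank`, the keyword list, and the assumption that every line starts with a `rank:` prefix are mine, based only on the lines shown in this post.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the log above: "<rank>:<optional space><message>"
RANK_LINE = re.compile(r"^(-?\d+):\s?(.*)$")

# Hypothetical keyword list; adjust to whatever messages matter for your run.
KEYWORDS = ("ERROR", "NaN", "Impossible case")


def flagged_lines_by_rank(lines):
    """Return {rank: [messages]} for lines whose message contains a keyword."""
    hits = defaultdict(list)
    for line in lines:
        match = RANK_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # skip lines without a rank prefix
        rank, text = int(match.group(1)), match.group(2)
        if any(keyword in text for keyword in KEYWORDS):
            hits[rank].append(text.strip())
    return dict(hits)


# A few lines copied from the log in this post.
sample = [
    "15: pLCL iteration is negative and set to psmin in uwshcu.F90",
    "15: Impossible case2 in instratus_condensate",
    "15: NaN NaN",
    "15: ERROR: Unknown error submitted to shr_sys_abort.",
    "7: pLCL iteration is negative and set to psmin in uwshcu.F90",
]

result = flagged_lines_by_rank(sample)
for rank, messages in sorted(result.items()):
    print(rank, messages)
```

To run it on a full log, pass an open file instead of `sample`, for example `flagged_lines_by_rank(open(path))` with `path` pointing at the cesm.log file.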