
Error related to "Impossible case2 in instratus_condensate" / "NaN"

shandong

Xiao
New Member
Hello. I have a failed CESM run that may be related to "Impossible case2 in instratus_condensate" or "NaN" values. I have successfully run similar simulations, but this run always fails at exactly the same time step.

Any suggestions would be greatly appreciated!


/gpfs/fs1/scratch/shandong/cnh_neon_15_to_11/run/cesm.log.190731-172520

15: pLCL iteration is negative and set to psmin in uwshcu.F90
15: 4.264785978263775E-003 -96.4895378322424 78463.5688130133
15: Impossible case2 in instratus_condensate
15: 1.00000000000000 0.000000000000000E+000 0.000000000000000E+000
15: NaN NaN
15: ERROR: Unknown error submitted to shr_sys_abort.
7: pLCL iteration is negative and set to psmin in uwshcu.F90
7: 2.491301086566585E-003 -496.588632379369 61559.5066002056
15:Image PC Routine Line Source
15:cesm.exe 00000000024699BD Unknown Unknown Unknown
15:cesm.exe 0000000001D52C82 shr_sys_mod_mp_sh 282 shr_sys_mod.F90
15:cesm.exe 00000000012CC1F9 cldwat2m_macro_mp 1482 cldwat2m_macro.F90
15:cesm.exe 00000000012C0F9F cldwat2m_macro_mp 1003 cldwat2m_macro.F90
15:cesm.exe 000000000102ACD2 macrop_driver_mp_ 843 macrop_driver.F90
15:cesm.exe 0000000000572F69 physpkg_mp_tphysb 2043 physpkg.F90
15:cesm.exe 000000000056B163 physpkg_mp_phys_r 979 physpkg.F90
15:cesm.exe 00000000004B12C9 cam_comp_mp_cam_r 247 cam_comp.F90
15:cesm.exe 00000000004A28BE atm_comp_mct_mp_a 522 atm_comp_mct.F90
15:cesm.exe 000000000041E873 component_mod_mp_ 1022 component_mod.F90
15:cesm.exe 000000000040B106 ccsm_comp_mod_mp_ 3039 ccsm_comp_mod.F90
15:cesm.exe 000000000041C4CD MAIN__ 93 ccsm_driver.F90
15:cesm.exe 000000000040825E Unknown Unknown Unknown
15:libc.so.6 00002B47F06F36E5 __libc_start_main Unknown Unknown
15:cesm.exe 0000000000408169 Unknown Unknown Unknown
15:MPT ERROR: Rank 15(g:15) is aborting with error code 1001.
15: Process ID: 23189, Host: r9i7n34, Program: /glade/scratch/shandong/cnh_neon_15_to_11/bld/cesm.exe
15: MPT Version: HPE MPT 2.19 02/23/19 05:30:09
15:
15:MPT: --------stack traceback-------
15:MPT: Attaching to program: /proc/23189/exe, process 23189
15:MPT: Try: zypper install -C "debuginfo(build-id)=4e96cf37d52b9c2f3648e691878b682da5abfa42"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=5eb2f40ad3b0125943aba8f08dd08609351a2967"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=4f3d05f200db29c6835a48e466e0378a8541fd36"
15:MPT: (no debugging symbols found)...done.
15:MPT: [Thread debugging using libthread_db enabled]
15:MPT: Using host libthread_db library "/glade/u/apps/ch/os/lib64/libthread_db.so.1".
15:MPT: Try: zypper install -C "debuginfo(build-id)=b115bb26e97505a5bd3b56d70d20459aa1206ac9"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=93c4deac1088eb84fbd01cf2a2c54399f516e9a7"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=5f9ec139af58fa59c33f72d1b3e56f083f1613ae"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=bc347d1c2dd56b51057fbac71e84906135d02da5"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=4c08f43bb18e99a7df4bad5c4a52bac67ddf9b8d"
15:MPT: (no debugging symbols found)...done.
15:MPT: Try: zypper install -C "debuginfo(build-id)=3ae04b58bd81ea7745dba789d89937e719309568"
15:MPT: (no debugging symbols found)...done.
15:MPT: done.
15:MPT: 0x00002b47efac26da in waitpid ()
15:MPT: from /glade/u/apps/ch/os/lib64/libpthread.so.0
15:MPT: Missing separate debuginfos, use: zypper install glibc-debuginfo-2.22-49.16.x86_64
15:MPT: (gdb) #0 0x00002b47efac26da in waitpid ()
15:MPT: from /glade/u/apps/ch/os/lib64/libpthread.so.0
15:MPT: #1 0x00002b47f0409db6 in mpi_sgi_system (
15:MPT: #2 MPI_SGI_stacktraceback (
15:MPT: header=header@entry=0x7ffcf074e440 "MPT ERROR: Rank 15(g:15) is aborting with error code 1001.\n\tProcess ID: 23189, Host: r9i7n34, Program: /glade/scratch/shandong/cnh_neon_15_to_11/bld/cesm.exe\n\tMPT Version: HPE MPT 2.19 02/23/19 05:30"...) at sig.c:340
15:MPT: #3 0x00002b47f0352419 in print_traceback (ecode=ecode@entry=1001)
15:MPT: at abort.c:246
15:MPT: #4 0x00002b47f03526ea in PMPI_Abort (comm=<optimized out>, errorcode=1001)
15:MPT: at abort.c:68
15:MPT: #5 0x00002b47f03529cc in pmpi_abort__ ()
15:MPT: from /glade/u/apps/ch/opt/mpt/2.19/lib/libmpi.so
15:MPT: #6 0x0000000001d034c9 in shr_mpi_mod_mp_shr_mpi_abort_ ()
15:MPT: #7 0x0000000001d52d28 in shr_sys_mod_mp_shr_sys_abort_ ()
15:MPT: #8 0x00000000012cc1f9 in cldwat2m_macro_mp_instratus_condensate_ ()
15:MPT: #9 0x00000000012c0f9f in cldwat2m_macro_mp_mmacro_pcond_ ()
15:MPT: #10 0x000000000102acd2 in macrop_driver_mp_macrop_driver_tend_ ()
15:MPT: #11 0x0000000000572f69 in physpkg_mp_tphysbc_ ()
15:MPT: #12 0x000000000056b163 in physpkg_mp_phys_run1_ ()
15:MPT: #13 0x00000000004b12c9 in cam_comp_mp_cam_run1_ ()
15:MPT: #14 0x00000000004a28be in atm_comp_mct_mp_atm_run_mct_ ()
15:MPT: #15 0x000000000041e873 in component_mod_mp_component_run_ ()
15:MPT: #16 0x000000000040b106 in ccsm_comp_mod_mp_ccsm_run_ ()
15:MPT: #17 0x000000000041c4cd in MAIN__ ()
15:MPT: #18 0x000000000040825e in main ()
15:MPT: (gdb) A debugging session is active.
15:MPT:
15:MPT: Inferior 1 [process 23189] will be detached.
15:MPT:
15:MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
15:MPT: Detaching from program: /proc/23189/exe, process 23189
15:
15:MPT: -----stack traceback ends-----
-1:MPT ERROR: MPI_COMM_WORLD rank 15 has terminated without calling MPI_Finalize()
-1: aborting job
 

fischer

CSEG and Liaisons
Staff member
Hi, which version of CESM are you using, and with what configuration? I tried looking at your run on cheyenne, but
it looks like the scrubber has removed several of the files, so could you rerun it? You can also try doing
your run with DEBUG turned on to catch errors. You can turn on debugging by running
./xmlchange DEBUG=TRUE
and then rebuilding.
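
When reading a cesm.log like the one posted above, it can help to separate the output by MPI rank, since messages from different ranks are interleaved. The following is only a minimal sketch, assuming every log line starts with a "<rank>:" prefix as in the excerpt; the function names and the shortened sample text are hypothetical and not part of CESM.

```python
import re
from collections import defaultdict

# Assumption: each log line starts with "<rank>:" as in the excerpt above
# (rank -1 appears to be used for messages from the MPI launcher).
RANK_PREFIX = re.compile(r"^\s*(-?\d+):\s?(.*)$")
# Matches lines such as "MPT ERROR: Rank 15(g:15) is aborting ..."
ABORT = re.compile(r"MPT ERROR: Rank (\d+)\(")


def split_by_rank(log_text):
    """Group log messages by the rank number that prefixes each line."""
    by_rank = defaultdict(list)
    for line in log_text.splitlines():
        match = RANK_PREFIX.match(line)
        if match:
            by_rank[int(match.group(1))].append(match.group(2))
    return dict(by_rank)


def aborting_ranks(log_text):
    """Return the sorted ranks that MPT reports as aborting."""
    return sorted({int(rank) for rank in ABORT.findall(log_text)})


def messages_before_traceback(log_text, rank):
    """Return one rank's messages up to the traceback table or MPT output."""
    messages = []
    for msg in split_by_rank(log_text).get(rank, []):
        if msg.startswith("Image") or msg.startswith("MPT"):
            break
        messages.append(msg)
    return messages


# Shortened sample built from lines in the posted log.
sample = """\
15: pLCL iteration is negative and set to psmin in uwshcu.F90
15: Impossible case2 in instratus_condensate
15: NaN NaN
15: ERROR: Unknown error submitted to shr_sys_abort.
7: pLCL iteration is negative and set to psmin in uwshcu.F90
15:Image PC Routine Line Source
15:MPT ERROR: Rank 15(g:15) is aborting with error code 1001.
-1:MPT ERROR: MPI_COMM_WORLD rank 15 has terminated without calling MPI_Finalize()
"""

for rank in aborting_ranks(sample):
    print(f"rank {rank} aborted; messages before the traceback:")
    for msg in messages_before_traceback(sample, rank):
        print("   ", msg)
```

To use it on a real log, read the file into a string and pass it in place of `sample`.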
 

shandong

Xiao
New Member

I just finished rerunning the case with DEBUG on. What do you think? Thank you very much!

The configuration is as follows:
./create_newcase -case /glade/scratch/shandong/cesm_cases/*** -res f19_g16 -user_compset 2000_CAM5_CLM45%BGC_CICE_DOCN%SOM_RTM_SGLC_SWAV -mach cheyenne


log:
/gpfs/fs1/scratch/shandong/cnh_neon_15_to_11/run/cesm.log.200205-213951

case folder:
/gpfs/fs1/scratch/shandong/cesm_cases/cnh_neon_15_to_11
 

fischer

CSEG and Liaisons
Staff member
You exceeded the wallclock limit on the run. It looks like the model was running just fine, but more slowly because of
debugging. Try running again as you normally would. It looks like your problems were caused by system issues on cheyenne.
 