Hi all,
I am running CESM 2.1.5 with the F2000climo compset, built with the Intel 2021 compiler. After running for a few days, the model stops, and the following messages appear in the cesm.log file:
[b3203r3n6:4519 :0:4519] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n7:119552:0:119552] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n8:48120:0:48120] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r4n1:10843:0:10843] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n4:128749:0:128749] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n7:119536:0:119536] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b2102r4n4:38143:0:38143] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b2102r4n4:38127:0:38127] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n7:119544:0:119544] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b2102r4n5:98690:0:98690] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b2102r4n4:38135:0:38135] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n8:48111:0:48111] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n6:4526 :0:4526] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n6:4534 :0:4534] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n6:4518 :0:4518] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b2102r4n5:98674:0:98674] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r4n1:10835:0:10835] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n4:128756:0:128756] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b2102r4n5:98698:0:98698] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n6:4549 :0:4549] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n6:4557 :0:4557] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n4:128748:0:128748] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n5:25038:0:25038] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
[b3203r3n4:128764:0:128764] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc09c4f360)
/data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/rrtmg/aer_src/rrtmg_lw_rtrnmc.f90: [ rrtmg_lw_rtrnmc_mp_rtrnmc_() ]
...
372 else
373 tblind = odepth/(bpade+odepth)
374 itr = tblint*tblind+0.5_r8
==> 375 transc = exp_tbl(itr)
376 atrans(lev) = 1._r8-transc
377 tausfac = tfn_tbl(itr)
378 bbd = plfrac*(blay+tausfac*dplankdn)
==== backtrace (tid: 10843) ====
0 0x0000000000b14eda rrtmg_lw_rtrnmc_mp_rtrnmc_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/rrtmg/aer_src/rrtmg_lw_rtrnmc.f90:375
1 0x0000000000b10afc rrtmg_lw_rad_mp_rrtmg_lw_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/rrtmg/aer_src/rrtmg_lw_rad.f90:386
2 0x000000000072d368 radlw_mp_rad_rrtmg_lw_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/rrtmg/radlw.F90:191
3 0x0000000000714902 radiation_mp_radiation_tend_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/rrtmg/radiation.F90:1153
4 0x00000000006d04a9 physpkg_mp_tphysbc_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/cam/physpkg.F90:2272
5 0x00000000006c96f8 physpkg_mp_phys_run1_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/physics/cam/physpkg.F90:1057
6 0x0000000000506cac cam_comp_mp_cam_run1_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/control/cam_comp.F90:258
7 0x00000000004f8762 atm_comp_mct_mp_atm_run_mct_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/cpl/atm_comp_mct.F90:454
8 0x000000000043d0a3 component_mod_mp_component_run_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/cime/src/drivers/mct/main/component_mod.F90:728
9 0x0000000000422e9e cime_comp_mod_mp_cime_run_() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/cime/src/drivers/mct/main/cime_comp_mod.F90:3465
10 0x000000000043ccf7 MAIN__() /data1/elpt_2023_000223/jzheng/cesm-2.1.5/cime/src/drivers/mct/main/cime_driver.F90:125
11 0x000000000041f222 main() ???:0
12 0x00000000000223d5 __libc_start_main() ???:0
13 0x000000000041f129 _start() ???:0
=================================
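As a rough sanity check on the trace above (a sketch under stated assumptions, not a diagnosis): every rank faults at the same address, 0xfffffffc09c4f360, on the `exp_tbl(itr)` lookup. Assuming `exp_tbl` holds 8-byte reals with a lower bound of 0, and that `itr` is a 32-bit integer, the snippet below works out where the table would have to start if `itr` had been the most negative 32-bit integer, which is the value x86 conversion instructions return for a NaN or out-of-range real. The function name `implied_base` and the constants are hypothetical, introduced only for this illustration.

```python
# Hypothetical arithmetic check: which array base address would be implied
# if the lookup index were INT32_MIN (what x86 real-to-integer conversion
# yields for NaN or an out-of-range value)?

FAULT_ADDR = 0xFFFFFFFC09C4F360  # faulting address reported in cesm.log
ELEM_BYTES = 8                   # assumption: exp_tbl elements are 8-byte reals
INT32_MIN = -2**31               # assumption: itr is a 32-bit integer


def implied_base(fault_addr, index, elem_bytes, lower_bound=0):
    """Return the base address implied by a faulting element access.

    Uses address = base + (index - lower_bound) * elem_bytes,
    with 64-bit wraparound.
    """
    return (fault_addr - (index - lower_bound) * elem_bytes) % 2**64


base = implied_base(FAULT_ADDR, INT32_MIN, ELEM_BYTES)
print(hex(base))  # prints 0x9c4f360
```

Under those assumptions the implied base is a small positive address of the kind where static data normally lives, so the fault address looks consistent with `itr` being computed from a NaN or wildly out-of-range `odepth`. I have not confirmed this against the actual address of `exp_tbl` in the executable.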
In another run, the error messages point to a different F90 file:
/data1/elpt_2023_000223/jzheng/cesm-2.1.5/components/cam/src/dynamics/fv/tp_core.F90: [ tp_core_mp_xtpv_() ]
...
454 do i=1,im
455 iu = real(i,r8) - cv(i,j)
456 fxv(i,j) = mfxv(i,j)*qtmpv(iu,j)
==> 457 enddo
458 else
459
460 qtmpv(-1,j) = qv(im-1,j)
Does anyone have suggestions on how to diagnose or resolve this issue? Thank you!
Jian