Dear colleagues and community members,
We are trying to run a startup run with CESM 1.3, tag 'cesm-ihesp-hires1.0.44', using compset 1850_CAM50_CLM40%SP_CICE_POP2_RTM_SGLC_SWAV at resolution ne30_g16. However, the run fails with the errors below, which seem to be related to POP initialization and the overflows code.
We also tried compset BRCP85C5, a hybrid run, and CESM tags cesm-ihesp-hires1.0.45 and cesm-ihesp-hires1.0.46, but the errors persist. The full CESM logs are attached, with more details on the errors.
We would appreciate any help or suggestions. Many thanks!
Sincerely,
Dunyu
-- Excerpt of the error from the cesm log:
[186] ==== backtrace (tid: 36330) ====
[186] 0 0x00000000001349c1 tbb::internal::atomic_backoff::atomic_backoff() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../include/tbb/tbb_machine.h:355
[186] 1 0x00000000001349c1 __TBB_TryLockByte() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../include/tbb/tbb_machine.h:914
[186] 2 0x00000000001349c1 __TBB_LockByte() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../include/tbb/tbb_machine.h:921
[186] 3 0x00000000001349c1 MallocMutex::scoped_lock::scoped_lock() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../src/tbbmalloc/Synchronize.h:39
[186] 4 0x00000000001349c1 rml::internal::Bin::addPublicFreeListBlock() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../src/tbbmalloc/frontend.cpp:1283
[186] 5 0x00000000001368d4 rml::internal::freeSmallObject() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../src/tbbmalloc/frontend.cpp:2522
[186] 6 0x00000000001368d4 rml::internal::internalPoolFree() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../src/tbbmalloc/frontend.cpp:2621
[186] 7 0x00000000001368d4 rml::internal::internalFree() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../src/tbbmalloc/frontend.cpp:2644
[186] 8 0x00000000001368d4 scalable_free() /nfs/site/proj/openmp/promo/20190814/cet-enabled/tbbmalloc/build/linux_intel64_icc_cc4.4.6_libc2.12_kernel2.6.32_release/../../src/tbbmalloc/frontend.cpp:2932
[186] 9 0x00000000025cd27d for_dealloc_allocatable() ???:0
[186] 10 0x0000000001b65104 overflows_mp_ovf_solvers_9pt_() /scratch1/07931/dunyuliu/b.e13.B1850.ne30_g16.cesm-ihesp-hires1.0.45.test.startup/bld/ocn/source/overflows.F90:5722
[186] 11 0x0000000001d5a715 initial_mp_pop_init_phase1_() /scratch1/07931/dunyuliu/b.e13.B1850.ne30_g16.cesm-ihesp-hires1.0.45.test.startup/bld/ocn/source/initial.F90:347
[186] 12 0x0000000001c19177 pop_initmod_mp_pop_initialize1_() /scratch1/07931/dunyuliu/b.e13.B1850.ne30_g16.cesm-ihesp-hires1.0.45.test.startup/bld/ocn/source/POP_InitMod.F90:102
[186] 13 0x0000000001b43d1f ocn_comp_mct_mp_ocn_init_mct_() /scratch1/07931/dunyuliu/b.e13.B1850.ne30_g16.cesm-ihesp-hires1.0.45.test.startup/bld/ocn/source/ocn_comp_mct.F90:255
[186] 14 0x0000000000435049 component_mod_mp_component_init_cc_() /work2/07931/dunyuliu/frontera/software/cesm-ihesp-hires1.0.45/cime/src/drivers/mct/main/component_mod.F90:235
[186] 15 0x0000000000425a30 cesm_comp_mod_mp_cesm_init_() /work2/07931/dunyuliu/frontera/software/cesm-ihesp-hires1.0.45/cime/src/drivers/mct/main/cesm_comp_mod.F90:1040
[186] 16 0x00000000004307da MAIN__() /work2/07931/dunyuliu/frontera/software/cesm-ihesp-hires1.0.45/cime/src/drivers/mct/main/cesm_driver.F90:92
[186] 17 0x000000000041a892 main() ???:0
[186] 18 0x0000000000022555 __libc_start_main() ???:0
[186] 19 0x000000000041a7a9 _start() ???:0
[186] =================================
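To make the trace easier to read, here is a minimal Python sketch that keeps only the frames located in the ocean model source, which is how we concluded that the failure occurs in overflows.F90 when called from POP initialization. The function name model_frames, the regular expression, and the "/bld/ocn/source/" path marker are our own assumptions for illustration; they are not part of CESM or its tools, and the pattern only covers frame lines formatted like the ones above.

```python
import re

# Assumed frame layout, taken from the excerpt above:
#   [rank] index 0xADDRESS symbol() /path/to/file:line
FRAME = re.compile(
    r"^\[(?P<rank>\d+)\]\s+(?P<idx>\d+)\s+0x[0-9a-f]+\s+"
    r"(?P<symbol>\S+)\(\)\s+(?P<path>\S+):(?P<line>\d+)$"
)


def model_frames(log_text, marker="/bld/ocn/source/"):
    """Return (index, symbol, file name, line) for frames whose path contains marker."""
    frames = []
    for raw in log_text.splitlines():
        match = FRAME.match(raw.strip())
        if match and marker in match.group("path"):
            file_name = match.group("path").rsplit("/", 1)[-1]
            frames.append(
                (int(match.group("idx")), match.group("symbol"),
                 file_name, int(match.group("line")))
            )
    return frames


# Three frames copied from the excerpt above (frames 9 to 11).
sample = """\
[186] 9 0x00000000025cd27d for_dealloc_allocatable() ???:0
[186] 10 0x0000000001b65104 overflows_mp_ovf_solvers_9pt_() /scratch1/07931/dunyuliu/b.e13.B1850.ne30_g16.cesm-ihesp-hires1.0.45.test.startup/bld/ocn/source/overflows.F90:5722
[186] 11 0x0000000001d5a715 initial_mp_pop_init_phase1_() /scratch1/07931/dunyuliu/b.e13.B1850.ne30_g16.cesm-ihesp-hires1.0.45.test.startup/bld/ocn/source/initial.F90:347
"""

frames = model_frames(sample)
for idx, symbol, file_name, line in frames:
    print(f"frame {idx}: {symbol} at {file_name}:{line}")
```

Applied to the full excerpt, the first frame it reports is the innermost one inside the ocean source; the frames above it belong to the Intel runtime and the TBB allocator, so the crash happens while an allocatable array is being deallocated in that routine.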