qiong_yang@noaa_gov
New Member
I'm running CESM1.2.2 with compset BRCP85C5CN at resolution f09_g16 on a local cluster. The run completes successfully on 6 nodes (12 cores per node). However, when I increase the number of nodes (e.g., to 12), the run fails with the following error message:

/gscratch/coenv/qiongy/inputdata/atm/cam/chem/trop_mozart/dvel/regrid_vegetation.nc 1441792
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line     Source
cesm.exe           00000000023B5031  Unknown            Unknown  Unknown
cesm.exe           00000000023B3787  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACE0BE8F912  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACE0BE8F766  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACE0BE7630C  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACE0BE7A343  Unknown            Unknown  Unknown
libpthread.so.0    00002ACE105437E0  Unknown            Unknown  Unknown
cesm.exe           0000000000591CCD  mo_drydep_mp_inte  2219     mo_drydep.F90
cesm.exe           000000000058D9F3  mo_drydep_mp_dvel  1879     mo_drydep.F90
cesm.exe           000000000055AC2A  mo_chemini_mp_che  215      mo_chemini.F90
cesm.exe           000000000051A121  chemistry_mp_chem  1010     chemistry.F90
cesm.exe           000000000073DC59  physpkg_mp_phys_i  745      physpkg.F90
cesm.exe           00000000004C66AE  cam_comp_mp_cam_i  181      cam_comp.F90
cesm.exe           00000000004C253F  atm_comp_mct_mp_a  276      atm_comp_mct.F90
cesm.exe           000000000042B625  ccsm_comp_mod_mp_  1058     ccsm_comp_mod.F90
cesm.exe           000000000042DCB3  MAIN__             90       ccsm_driver.F90
cesm.exe           000000000040B10E  Unknown            Unknown  Unknown
libc.so.6          00002ACE1076FD5D  Unknown            Unknown  Unknown
cesm.exe           000000000040B019  Unknown            Unknown  Unknown
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line     Source
cesm.exe           00000000023B5031  Unknown            Unknown  Unknown
cesm.exe           00000000023B3787  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACA1040F912  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACA1040F766  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACA103F630C  Unknown            Unknown  Unknown
libnetcdff.so.6    00002ACA103FA343  Unknown            Unknown  Unknown
libpthread.so.0    00002ACA14AC37E0  Unknown            Unknown  Unknown
libc.so.6          00002ACA14DB0113  Unknown            Unknown  Unknown
libopen-pal.so.6   00002ACA17FDB95A  Unknown            Unknown  Unknown
libopen-pal.so.6   00002ACA17FD163B  Unknown            Unknown  Unknown
libopen-pal.so.6   00002ACA17F88C3D  Unknown            Unknown  Unknown
mca_pml_ob1.so     00002ACA1FDBBE9E  Unknown            Unknown  Unknown
libmpi.so.1        00002ACA13DF7531  Unknown            Unknown  Unknown
libmpi_mpifh.so.2  00002ACA146025F2  Unknown            Unknown  Unknown
cesm.exe           000000000104019F  mpiscatterv_       976      wrap_mpi.F90
cesm.exe           00000000006C897D  phys_grid_mp_scat  2076     phys_grid.F90
cesm.exe           0000000000592763  mo_drydep_mp_inte  2286     mo_drydep.F90
cesm.exe           000000000058D9F3  mo_drydep_mp_dvel  1879     mo_drydep.F90
cesm.exe           000000000055AC2A  mo_chemini_mp_che  215      mo_chemini.F90
cesm.exe           000000000051A121  chemistry_mp_chem  1010     chemistry.F90
cesm.exe           000000000073DC59  physpkg_mp_phys_i  745      physpkg.F90
cesm.exe           00000000004C66AE  cam_comp_mp_cam_i  181      cam_comp.F90
cesm.exe           00000000004C253F  atm_comp_mct_mp_a  276      atm_comp_mct.F90
cesm.exe           000000000042B625  ccsm_comp_mod_mp_  1058     ccsm_comp_mod.F90
cesm.exe           000000000042DCB3  MAIN__             90       ccsm_driver.F90
cesm.exe           000000000040B10E  Unknown            Unknown  Unknown
libc.so.6          00002ACA14CEFD5D  Unknown            Unknown  Unknown
cesm.exe           000000000040B019  Unknown            Unknown  Unknown

I have also attached the full ccsm.log, atm.log and cpl.log files. Does anyone have any idea what is going wrong? The compiler I used is icc_15.0.2-ompi_1.8.4. Thanks!