
KILLED BY SIGNAL: 9 (Killed)

lixiao

李霄
New Member
Dear all,

I was trying to run a regional case with CLM5.0, but the run failed with the following error:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 92 PID 11662 RUNNING AT whuxj-PowerEdge-R7525
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================
I checked the log files but I can't find the problem.

Here are my commands:

./create_newcase --case CLM50Bgc_GSWP_EX3 --res f09_g17 --compset 2010_DATM%GSWP3v1_CLM50%BGC_SICE_SOCN_SROF_SGLC_SWAV --compiler intel --mach myintel --run-unsupported

./xmlchange DATM_CLMNCEP_YR_START=2014
./xmlchange DATM_CLMNCEP_YR_END=2014
./xmlchange RUN_STARTDATE=2014-01-01

./xmlchange STOP_OPTION=nyears
./xmlchange STOP_N=1
./xmlchange REST_OPTION=nyears # time unit for restart output; set to years
./xmlchange REST_N=1
./xmlchange DEBUG=TRUE
./xmlchange LND_DOMAIN_FILE=domain.lnd.fv0.9x1.25_gx1v7.151020_EX3.nc
./xmlchange ATM_DOMAIN_FILE=domain.lnd.fv0.9x1.25_gx1v7.151020_EX3.nc

./case.setup
./preview_run
./preview_namelists
./check_input_data
./case.build --skip-provenance-check
./case.submit
 

Attachments

  • cesm.log.220926-153140.txt (59.9 KB)
  • atm.log.220926-153140.txt (13.1 KB)
  • lnd.log.220926-153140.txt (149.5 KB)
  • cpl.log.220926-153140.txt (53.3 KB)
  • env_mach_pes.xml.txt (6.9 KB)

oleson

Keith Oleson
CSEG and Liaisons
Staff member
There is a traceback in your cesm log:

forrtl: error (73): floating divide by zero
Image PC Routine Line Source
cesm.exe 0000000003A8441B Unknown Unknown Unknown
libpthread-2.27.s 000014BC92DAE980 Unknown Unknown Unknown
cesm.exe 0000000003B5914F Unknown Unknown Unknown
cesm.exe 00000000017CF001 humanindexmod_mp_ 896 HumanIndexMod.F90
cesm.exe 00000000027617D2 urbanfluxesmod_mp 875 UrbanFluxesMod.F90
cesm.exe 0000000000899AD8 clm_driver_mp_clm 561 clm_driver.F90
cesm.exe 0000000000858624 lnd_comp_mct_mp_l 456 lnd_comp_mct.F90
cesm.exe 00000000004743AC component_mod_mp_ 728 component_mod.F90
cesm.exe 0000000000440B36 cime_comp_mod_mp_ 2662 cime_comp_mod.F90
cesm.exe 000000000045BEF8 MAIN__ 103 cime_driver.F90
cesm.exe 0000000000417502 Unknown Unknown Unknown
libc-2.27.so 000014BC9262EC87 __libc_start_main Unknown Unknown
cesm.exe 00000000004173EA Unknown Unknown Unknown

I would start by looking at line 896 of HumanIndexMod.F90:

tl = (1._r8/((1._r8/((T1 - 55._r8))) - (log(relhum/100._r8)/2840._r8))) + 55._r8

and working backward from there to troubleshoot.
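
As a starting point for working backward, the frames that carry a source file and line number can be pulled out of the traceback automatically. The following is a minimal Python sketch, not part of CESM or CIME; the function name `resolved_frames` and the regular expression are assumptions based only on the traceback layout quoted above.

```python
import re

# One traceback frame looks like:
#   cesm.exe 00000000017CF001 humanindexmod_mp_ 896 HumanIndexMod.F90
# Frames whose line number is "Unknown" do not match and are skipped.
FRAME = re.compile(r"^(\S+)\s+([0-9A-Fa-f]{16})\s+(\S+)\s+(\d+)\s+(\S+)$")


def resolved_frames(log_text):
    """Return (source, line, routine) for every frame with a known line number."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            _image, _pc, routine, lineno, source = match.groups()
            frames.append((source, int(lineno), routine))
    return frames


# A few lines copied from the traceback above, used here as sample input.
SAMPLE = """\
Image PC Routine Line Source
cesm.exe 0000000003B5914F Unknown Unknown Unknown
cesm.exe 00000000017CF001 humanindexmod_mp_ 896 HumanIndexMod.F90
cesm.exe 00000000027617D2 urbanfluxesmod_mp 875 UrbanFluxesMod.F90
libc-2.27.so 000014BC9262EC87 __libc_start_main Unknown Unknown
"""

for source, lineno, routine in resolved_frames(SAMPLE):
    print(f"{source}:{lineno} ({routine})")
```

The first frame printed is the innermost one with source information, which is where the exception was raised; the frames after it are its callers.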
 

lixiao

李霄
New Member
Sorry, I don't understand how to solve the problem. Does "floating divide by zero" refer to a fault in my surfdata?
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
It means that there is a division by zero in the equation represented by this line of code:

tl = (1._r8/((1._r8/((T1 - 55._r8))) - (log(relhum/100._r8)/2840._r8))) + 55._r8

So, possibly, T1 has a value of 55, which makes T1 - 55 = 0.
T1 is an input to the subroutine Wet_Bulb. The subroutine Wet_Bulb is called by various modules, e.g., CanopyFluxesMod.F90:

call Wet_Bulb(t_ref2m(p), vap_ref2m(p), forc_pbot(c), rh_ref2m(p), q_ref2m(p), &
teq_ref2m(p), ept_ref2m(p), wb_ref2m(p))

t_ref2m corresponds to T1. So t_ref2m (the 2-m reference height temperature) in the calling module would have to be 55 Kelvin; in your traceback the call comes from line 875 of UrbanFluxesMod.F90. That is far outside the normal range of t_ref2m under normal atmospheric forcing conditions.
Another possibility is that relhum (relative humidity) is zero, which again shouldn't happen under normal atmospheric forcing conditions, although I'm not sure that would manifest itself as a divide by zero error.
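
To see which inputs make this expression fail, here is a minimal Python transcription of the quoted line. It is only an illustration: the function names are hypothetical, and Python reports log(0) as a ValueError, whereas IEEE 754 arithmetic signals a divide-by-zero exception for log(0). A build that traps floating-point exceptions could therefore plausibly report a zero relhum as "floating divide by zero" as well. Whether this particular build traps that case is an assumption, not something the logs confirm.

```python
import math


def lifting_temperature(t1, relhum):
    """Transcription of the quoted line 896 of HumanIndexMod.F90.

    t1 is a temperature in Kelvin, relhum a relative humidity in percent.
    """
    return 1.0 / (1.0 / (t1 - 55.0) - math.log(relhum / 100.0) / 2840.0) + 55.0


def diagnose(t1, relhum):
    """Return (status, value), where value is None if the expression fails."""
    try:
        return "ok", lifting_temperature(t1, relhum)
    except ZeroDivisionError:
        # Raised when t1 == 55 (inner division) or the outer denominator is 0.
        return "divide by zero", None
    except ValueError:
        # Python's math.log rejects zero and negative arguments.
        return "log of non-positive value", None


print(diagnose(290.0, 60.0))  # typical near-surface values
print(diagnose(55.0, 60.0))   # T1 = 55 K, so T1 - 55 = 0
print(diagnose(290.0, 0.0))   # relhum = 0, so log(0)
```

Printing t_ref2m and rh_ref2m just before the call to Wet_Bulb in the calling module would show which of these cases, if either, is occurring in the failing run.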
There is a known problem with the GSWP3 atmospheric forcing for year 2014, in which the forcing specific humidity is zero:


That hasn't been known to crash the model, just give bad results, but you could try a different year of atmospheric forcing.
 