
error (65): floating invalid for CTSM run with customized forcing

Status
Not open for further replies.

yifanc17

Yifan Cheng
New Member
Hi all,

I have run into a problem running ctsm5.2.005 with customized forcing (ERA5-Land 9 km hourly data), using --compset 2000_DATM%NLDAS2_CLM50%NWP-SP_SICE_SOCN_MOSART_SGLC_SWAV --res nldas2_rnldas2_mnldas. There are two cases, ctrl and test, which share exactly the same configuration except for fsurdat. I made the following changes to the XML variables for both cases:
Code:
RUN_STARTDATE=2005-01-01,STOP_N=3,STOP_OPTION=nyears,RESUBMIT=3,DATM_YR_START=2005,DATM_YR_ALIGN=2005,DATM_YR_END=2014

For the first round of simulation (2005-01-01 to 2007-12-31), both cases ran fine. After resubmitting once, the ctrl case continued running smoothly, while the test case stopped at 2008-07-11 with the following error in cesm.log:

dec0475.hsn.de.hpc.ucar.edu 1658: forrtl: error (65): floating invalid
dec0475.hsn.de.hpc.ucar.edu 1658: Image PC Routine Line Source
dec0475.hsn.de.hpc.ucar.edu 1658: libpthread-2.31.s 000014B92E5D08C0 Unknown Unknown Unknown
dec0475.hsn.de.hpc.ucar.edu 1658: cesm.exe 000000000489AD72 Unknown Unknown Unknown
dec0475.hsn.de.hpc.ucar.edu 1658: cesm.exe 000000000489FF71 Unknown Unknown Unknown
dec0475.hsn.de.hpc.ucar.edu 1658: cesm.exe 000000000211AD82 humanindexmod_mp_ 1377 HumanIndexMod.F90
dec0475.hsn.de.hpc.ucar.edu 1658: cesm.exe 00000000021192DC humanindexmod_mp_ 904 HumanIndexMod.F90
dec0475.hsn.de.hpc.ucar.edu 1658: cesm.exe 00000000032DFF7C urbanfluxesmod_mp 906 UrbanFluxesMod.F90
dec0475.hsn.de.hpc.ucar.edu 1658: cesm.exe 0000000000A27309 clm_driver_mp_clm 737 clm_driver.F90

Tracing back to line 1377 of HumanIndexMod.F90, the failure appears to occur inside subroutine QSat_2, which computes the saturation mixing ratio and its change with respect to temperature. The call chain leads back to line 906 of UrbanFluxesMod.F90, where the wet-bulb temperature is calculated:

1376 ! Calculations for used to calculate f(T,ndimpress)
1377 foftk = ((Cf/T_k)**lambd_a)*(1._r8 - es_mb/p0ndplam)**(vkp*lambd_a)* &
1378 exp(-lambd_a*goftk)

906 call Wet_Bulb(t_ref2m(p), vap_ref2m(p), forc_pbot(g), rh_ref2m(p), q_ref2m(p), &
907 teq_ref2m(p), ept_ref2m(p), wb_ref2m(p))
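One way an expression shaped like line 1377 can raise "floating invalid" is when the base of the real-valued power, (1 - es_mb/p0ndplam), goes negative, because a negative number raised to a non-integer exponent is undefined in real arithmetic. The Python sketch below only illustrates that condition and does not claim to be the confirmed cause of this crash. The Bolton-style formula for es_mb and its constants are assumptions (they are not shown in the excerpt above), and p0ndplam is passed in by the caller because its definition is also not shown.

```python
import math

# ASSUMED constants for illustration only; verify against HumanIndexMod.F90.
ES_C = 6.112    # assumed saturation vapour pressure at 0 degC (mb)
ALPHA = 17.67   # assumed coefficient in the vapour pressure formula
BETA = 243.5    # assumed coefficient in the vapour pressure formula (degC)
T_FRZ = 273.15  # freezing point (K)


def saturation_vapour_pressure_mb(t_k):
    """Assumed Bolton-style saturation vapour pressure (mb) at temperature t_k (K)."""
    t_c = t_k - T_FRZ
    return ES_C * math.exp(ALPHA * t_c / (BETA + t_c))


def power_term_is_valid(t_k, p0ndplam_mb):
    """Return True if (1 - es_mb/p0ndplam) is finite and non-negative.

    A negative base raised to a non-integer power, as in
    (1 - es_mb/p0ndplam)**(vkp*lambd_a), has no real result.
    """
    try:
        base = 1.0 - saturation_vapour_pressure_mb(t_k) / p0ndplam_mb
    except (OverflowError, ZeroDivisionError):
        # Wildly out-of-range temperature or zero pressure term.
        return False
    return math.isfinite(base) and base >= 0.0


# Hypothetical usage: an ordinary state passes; a very hot or very
# low-pressure state does not.
print(power_term_is_valid(300.0, 1000.0))
print(power_term_is_valid(400.0, 1000.0))
print(power_term_is_valid(300.0, 20.0))
```

Under these assumptions, the check fails whenever the saturation vapour pressure exceeds the pressure term, which would point to an unphysical temperature or pressure reaching the routine rather than to the power expression itself.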

I saw the post "KILLED BY SIGNAL: 9 (Killed)", which suggests the forcing could be the cause. However, my ctrl and test cases share exactly the same forcing, so if it works for ctrl, I'm not sure why it fails for test. I have attached lnd.log and atm.log below. The cesm.log is too large to upload; it is stored in `/glade/derecho/scratch/yifanc17/i2000Clm50Sp.CTSM5.2.ERA5Land.0.125nldas2.TEST.c240603/run/`. The ctrl and test case directories are `/glade/work/yifanc17/cases/i2000Clm50Sp.CTSM5.2.ERA5Land.0.125nldas2.CTRL.c240602/` and `/glade/work/yifanc17/cases/i2000Clm50Sp.CTSM5.2.ERA5Land.0.125nldas2.TEST.c240603/`. Any suggestions would be appreciated!

Thanks,
Yifan
 

Attachments

  • lnd.log.4693219.desched1.240604-043609.txt (293.8 KB)
  • atm.log.4693219.desched1.240604-043609.txt.zip (83.8 KB)

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I agree that the forcing shouldn't be a problem as long as you aren't changing the landmask in the two runs and not accounting for that in the mesh file. In looking through your TEST surface dataset, I see that one thing you have changed is HT_ROOF. However, the WIND_HGT_CANYON appears to be specified as the default settings. The WIND_HGT_CANYON should be 1/2 * HT_ROOF (something that is not well documented). I wonder if this inconsistency is causing a problem with the canyon air temperature/humidity calculations which could feed into the wet bulb calculation.
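The consistency rule mentioned above (WIND_HGT_CANYON equal to half of HT_ROOF) can be checked before a run. The following is a minimal Python sketch, assuming the two fields have already been read from the surface dataset into flat sequences of equal length. The function name, the tolerance, and the choice to skip entries with a non-positive roof height are all hypothetical choices for illustration.

```python
import math


def canyon_height_mismatches(ht_roof, wind_hgt_canyon, rel_tol=1e-6):
    """Return indices where wind_hgt_canyon differs from 0.5 * ht_roof.

    Entries with ht_roof <= 0 are skipped on the ASSUMPTION that they
    represent non-urban or fill values.
    """
    mismatches = []
    for i, (roof, canyon) in enumerate(zip(ht_roof, wind_hgt_canyon)):
        if roof <= 0.0:
            continue
        if not math.isclose(canyon, 0.5 * roof, rel_tol=rel_tol):
            mismatches.append(i)
    return mismatches


# Hypothetical values: the second entry keeps a default canyon height
# after the roof height was edited, so it is flagged.
print(canyon_height_mismatches([10.0, 20.0, 0.0], [5.0, 8.0, 0.0]))
```

An empty result would mean the two fields are consistent everywhere a roof height is defined.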
 

yifanc17

Yifan Cheng
New Member
Oh right, I forgot about that! Thank you for the heads-up! I'll correct the surface dataset and test the case again.
 

yifanc17

Yifan Cheng
New Member
I tried running with the corrected surface dataset and the ERA5 forcing starting from 2008-07-01, and it ran through that month without any error. However, when starting from 2005-01-01, it stopped again at 2008-07-11 with the same floating-point error as above. Is that because the fluxes (and thus the calculated temperature) depend on the previous time step? In other words, for a land-only run spanning 2005-01-01 to 2008-01-01 and another land-only run spanning 2007-12-01 to 2008-01-01, would we expect different results at the same time step?
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
If the 2007-12-01 to 2008-01-01 run begins as a restart using the restarts from 2005-01-01 to 2008-01-01, then I would expect the runs to be the same if nothing else was changed. But if you are using different surface datasets then I wouldn't expect them to be the same.
It doesn't sound like the problem was fixed by setting WIND_HGT_CANYON appropriately since your complete rerun (2005-01-01 to ...) crashed in the same place?
 

yifanc17

Yifan Cheng
New Member
I see. Fixing WIND_HGT_CANYON didn't solve the problem. I'm still curious why the test run from 2008-07-01 to 2008-08-01 using the fixed dataset works while the complete run fails. Is it because these two simulations (both using a cold start) have different initial conditions?
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
If you are saying that both simulations (complete and partial) started from cold initial conditions, then yes they would be different because of the evolving state variables and one could fail while the other does not.
If you point me to the case of your complete run that failed, I can look at it further when I get a chance. There must be another problem with the surface data or elsewhere. You could also try setting calc_human_stress_indices = 'NONE' or 'FAST'. It looks like it is failing in the QSat_2 routine, which is used when calc_human_stress_indices = 'ALL'.
 

yifanc17

Yifan Cheng
New Member
Thank you Keith! Yes, earlier today I tried setting calc_human_stress_indices='NONE' and removing all relevant variables from user_nl_clm, and the case ran through 2008 successfully. I'll look into it further to see why the WBT can't be calculated properly. The case directory is `/glade/work/yifanc17/cases/i2000Clm50Sp.CTSM5.2.ERA5Land.0.125nldas2.TEST.c240603` if you want to take a look some time.
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
We have had problems with that calculation previously although it seemed to have been fixed:


It could be there is still a problem with that calculation or with the inputs to that calculation (2-m temperature, specific humidity, etc.).
In either case, it would be good to know why it's failing. Probably have to insert some write statements to see what the inputs are.
If you want heat stress indices from your simulation, you can still output some of them if you set calc_human_stress_indices = 'FAST'.
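The suggestion above to inspect the inputs could be prototyped offline before adding write statements to the Fortran source. The Python sketch below flags non-finite or implausible values for the quantities passed to Wet_Bulb. The bounds and assumed units are illustrative guesses, not values taken from CTSM, and the function name is hypothetical.

```python
import math

# ASSUMED plausibility bounds and units, chosen only for illustration.
BOUNDS = {
    "t_ref2m": (150.0, 350.0),         # assumed K
    "forc_pbot": (30000.0, 110000.0),  # assumed Pa
    "rh_ref2m": (0.0, 100.0),          # assumed percent
    "q_ref2m": (0.0, 0.1),             # assumed kg/kg
}


def suspicious_inputs(values):
    """Return a list of messages for inputs that are non-finite or out of bounds."""
    problems = []
    for name, (low, high) in BOUNDS.items():
        v = values.get(name)
        if v is None:
            continue  # input not supplied; nothing to check
        if not math.isfinite(v):
            problems.append(f"{name} is not finite: {v}")
        elif not low <= v <= high:
            problems.append(f"{name}={v} outside [{low}, {high}]")
    return problems


# Hypothetical values for one patch at the failing time step.
sample = {"t_ref2m": float("nan"), "forc_pbot": 101325.0,
          "rh_ref2m": 50.0, "q_ref2m": 0.01}
for message in suspicious_inputs(sample):
    print(message)
```

The same comparisons could be written as a guarded write statement just before the Wet_Bulb call, so that only the offending patch is reported.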
 

yifanc17

Yifan Cheng
New Member
Thank you Keith! I'll look into it and update here once I find out the issue!
 