
Output from CLM5 to the coupler are NaN

johanna_teresa

Johanna Malle
New Member
Hi,
I am having problems with a regional run; more specifically, I am trying to run a fairly high-resolution (~1 km) domain with my own meteorological forcing data. I have created a surface dataset and a domain file (used for ATM_DOMAIN_FILE and LND_DOMAIN_FILE, see attached), which I am also using as the domain in the datm streams files. Lat and lon in my meteorological forcing files are identical to lat and lon in the surface dataset. The created surface dataset, in combination with the lnd and atm domains, runs without problems with the CRUNCEP v7 meteorological input data.
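For what it's worth, the "forcing grid identical to domain grid" claim can be checked offline with a small script (a sketch; the coordinate variable names in your files are an assumption, adjust to whatever your forcing and domain files actually use):

```python
import numpy as np

def coords_match(forcing_lat, domain_lat, tol=1e-6):
    """True if the forcing-grid coordinates match the domain file's
    within tol. datm interpolates whenever grids differ, so even a tiny
    offset silently switches a 1:1 copy into a remap."""
    a = np.asarray(forcing_lat, dtype=float)
    b = np.asarray(domain_lat, dtype=float)
    return a.shape == b.shape and bool(np.allclose(a, b, atol=tol))

print(coords_match([46.00, 46.01], [46.00, 46.01]))  # True
print(coords_match([46.00, 46.01], [46.00, 46.02]))  # False
```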
The model usually runs fine for 10-1000 time steps before it crashes with "ERROR: One or more of the output from CLM to the coupler are NaN". Sl_t, the radiative temperature (lnd2atm_inst%t_rad_grc(g) = sqrt(sqrt(lnd2atm_inst%eflx_lwrad_out_grc(g)/sb))), seems to be the problem. After writing out all sorts of diagnostic variables, I could confirm that the outgoing longwave radiation (eflx_lwrad_out_grc(g)) turns negative, which then produces a NaN when the fourth root is taken.
It always seems to crash at a slightly different grid cell; I have double-checked my input data and it all seems sensible. When I attempted to mask out certain grid cells via the mask variable in the domain file (set to 0), the model still seemed to run over the entire domain. I have also tried with and without longwave radiation as a direct input variable, which doesn't make a difference. If I run in DEBUG mode, it crashes in clm/src/main/subgridAveMod.F90 at line 1067 in the c2g_2d subroutine. Any help/suggestions would be greatly appreciated.
Thanks!
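For anyone hitting the same error: the NaN itself is easy to reproduce outside the model. A minimal sketch of the t_rad computation quoted above (assuming the usual Stefan-Boltzmann constant; the negative flux is what poisons Sl_t):

```python
import numpy as np

SB = 5.67e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def t_rad(eflx_lwrad_out):
    """Radiative temperature as in lnd2atm: T = (L_up / sb)^(1/4).

    np.sqrt of a negative number returns NaN (with a warning) rather
    than raising, which is exactly how a negative eflx_lwrad_out
    propagates into the Sl_t field sent to the coupler.
    """
    return np.sqrt(np.sqrt(np.asarray(eflx_lwrad_out, dtype=float) / SB))

print(t_rad(390.0))      # sensible outgoing longwave -> roughly 288 K
with np.errstate(invalid="ignore"):
    print(t_rad(-5.0))   # negative flux -> nan, the value the coupler rejects
```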
 

Attachments

  • version.txt
    7.4 KB · Views: 2
  • cesm.log.3738657.txt
    11.1 KB · Views: 7
  • atm.log.3738657.txt
    555.5 KB · Views: 4
  • datm_in.txt
    1.7 KB · Views: 3
  • drv_in.txt
    6.4 KB · Views: 2
  • lnd_in.txt
    7.8 KB · Views: 2
  • lnd.log.3738657.txt.zip
    54.9 KB · Views: 6
  • domain.lnd.CH_1km_navy.210407.nc.zip
    220.7 KB · Views: 3

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Well, you've already tried a number of things I would have tried.
Have you tried running with mapalgo set to 'nn' instead of 'bilinear'? Although I guess if the forcing and surface grids are identical, it shouldn't matter.
You could also try running with a cold start (CLM_FORCE_COLDSTART = on in env_run.xml); maybe there is something in how your initial file is being mapped to the new domain.
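Forcing a cold start is a one-liner from the case directory (a sketch using the standard CIME xmlchange/xmlquery case tools):

```shell
# From the case directory: force a cold start so the run ignores the
# initial-condition file and starts from arbitrary initial conditions.
./xmlchange CLM_FORCE_COLDSTART=on

# Verify the setting took effect in env_run.xml
./xmlquery CLM_FORCE_COLDSTART
```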
Otherwise, since you know the gridcell id (41238), I'd suggest looking at the longwave in the code to see which pft(s)/column(s) associated with that gridcell might be causing the problem. Maybe here in SoilFluxesMod.F90:

eflx_lwrad_out(p) = ulrad(p)                              & ! upward longwave from the vegetated fraction
+ (1-frac_veg_nosno(p))*(1.-emg(c))*forc_lwrad(c)         & ! atmospheric longwave reflected by bare ground
+ (1-frac_veg_nosno(p))*emg(c)*sb*lw_grnd                 & ! emission from the bare-ground fraction
+ 4._r8*emg(c)*sb*t_grnd0(c)**3*tinc(c)                     ! correction for the ground temperature update (tinc)

In particular, you could verify there that the incoming longwave (forc_lwrad) is reasonable and see which term is driving eflx_lwrad_out negative.
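A quick offline range check on the forcing longwave can be sketched like this (the plausible range is an assumption; downwelling longwave over land is typically a few hundred W/m^2, so values far outside that, or NaNs, point at a bad forcing cell or a units/ordering problem in the stream files):

```python
import numpy as np

def check_lwrad(lwrad, lo=80.0, hi=600.0):
    """Return flat indices of longwave values that are NaN/Inf or
    outside a plausible physical range [lo, hi] in W/m^2."""
    arr = np.asarray(lwrad, dtype=float)
    bad = ~np.isfinite(arr) | (arr < lo) | (arr > hi)
    return np.flatnonzero(bad)

sample = np.array([310.0, 275.0, -40.0, np.nan, 420.0])
print(check_lwrad(sample))  # flags the -40.0 and the NaN
```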
 