
During branch runs: NaN found in field Fall_sen

mariuslam

Marius Lambert
New Member
Hello,

I recently updated to newer CTSM and FATES versions: ctsm5.1.dev092 and fates-sci.1.57.0_api.23.

When I start a branch run from a restart file (after a successful spin-up), I get the following error:
clm: leaving fates model 1 1
# of NaNs = 1
Which are NaNs = T
NaN found in field Fall_sen at gridcell index 1
ERROR: ERROR: One or more of the output from CLM to the coupler are NaN


Has anyone experienced this before, or does anyone have ideas on how to tackle it?

Best regards,

Marius
 

glemieux

New Member
Hi Marius, I'm seeing something similar on my local workstation:
Code:
clm: leaving fates model           1           1
 # of NaNs =            1
 Which are NaNs =  T
 NaN found in field Faxa_lwdn at gridcell index            1
 ERROR:  ERROR: One or more of the output from CLM to the coupler are NaN

I'm seeing this failure 7 years into a 10-year run of a simple 1x1_brazil case using the nuopc driver. I'm reverting to the mct driver to see whether the same issue shows up there.
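Since the two reports name different fields (Fall_sen versus Faxa_lwdn), it can help to pull the field names and gridcell indices out of the log for comparison. Below is a minimal Python sketch, assuming the log lines follow the "NaN found in field ... at gridcell index ..." pattern shown above. The names `NAN_LINE` and `find_nan_fields` are made up for illustration and are not part of CTSM or CIME.

```python
import re

# Matches lines such as:
#   NaN found in field Faxa_lwdn at gridcell index            1
# The pattern is an assumption based on the log excerpts in this thread.
NAN_LINE = re.compile(r"NaN found in field\s+(\S+)\s+at gridcell index\s+(\d+)")


def find_nan_fields(log_text):
    """Return (field_name, gridcell_index) pairs reported in a land log."""
    return [(m.group(1), int(m.group(2))) for m in NAN_LINE.finditer(log_text)]


sample = """\
clm: leaving fates model           1           1
 # of NaNs =            1
 Which are NaNs =  T
 NaN found in field Faxa_lwdn at gridcell index            1
 ERROR:  ERROR: One or more of the output from CLM to the coupler are NaN
"""

print(find_nan_fields(sample))  # [('Faxa_lwdn', 1)]
```

To use it on a real log, read the file contents into a string and pass that to `find_nan_fields`.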
 

glemieux

New Member
I can confirm that this completed successfully on the mct driver. I'm going to try and replicate this on a different machine next (Cheyenne).
 

mariuslam

Marius Lambert
New Member
Hi @glemieux,

I have tried two things:
1) Switching to the mct driver, which did not succeed:
--------lnd.log------------------
clm: leaving fates model 1 1
# of NaNs = 2
Which are NaNs = F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
F F F F F F T F F F F F F F T F F F F F F F F F F F F F F F
NaN found in field ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ at gridcell index 38
NaN found in field ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ at gridcell index 46
ERROR: ERROR: One or more of the output from CLM to the coupler are NaN
-----------------------------------------------
2) Running with DEBUG=true, which did not succeed either:
-------------cesm.log-------------------------
Caught signal 8 (Floating point exception: floating-point invalid operation)

/cluster/home/marlam/CTSM_single_site_hardening_pv_cosmo_28_04/src/fates/biogeophys/FatesPlantHydraulicsMod.F90: [ fatesplanthydraulicsmod_mp_fusecohorthydraulics_() ]
...
1303 do k=1,n_hypool_ag
1304 vol_c1 = currentCohort%n*ccohort_hydr%th_ag(k)*ccohort_hydr%v_ag_init(k)
1305 vol_c2 = nextCohort%n*ncohort_hydr%th_ag(k)*ncohort_hydr%v_ag(k)
==> 1306 ccohort_hydr%th_ag(k) = (vol_c1+vol_c2)/(ccohort_hydr%v_ag(k)*newn)
1307 end do
1308
1309 vol_c1 = currentCohort%n*ccohort_hydr%th_troot*ccohort_hydr%v_troot_init

==== backtrace (tid: 40087) ====
-----------------------------------------------
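The flagged line 1306 is a volume-weighted average: the summed water volumes of the two cohorts divided by `v_ag(k)*newn`. A floating-point invalid operation at such a line can arise, for example, when an operand is already NaN or when both numerator and denominator are zero. Below is a minimal Python sketch of that arithmetic with explicit input checks. It only illustrates the failure modes and does not diagnose the actual cause. The function and argument names are hypothetical and do not come from FATES.

```python
import math


def check_fusion_inputs(n1, th1, v1, n2, th2, v2, v_new, newn):
    """Return a list of reasons the fused water content would be non-finite.

    Argument names are illustrative stand-ins for the cohort number densities,
    water contents, and compartment volumes in the flagged Fortran line.
    """
    operands = {
        "n1": n1, "th1": th1, "v1": v1,
        "n2": n2, "th2": th2, "v2": v2,
        "v_new": v_new, "newn": newn,
    }
    problems = [
        f"{name} is not finite ({value})"
        for name, value in operands.items()
        if not math.isfinite(value)
    ]
    if v_new * newn == 0.0:
        problems.append("denominator v_new * newn is zero")
    return problems


def fused_water_content(n1, th1, v1, n2, th2, v2, v_new, newn):
    """Volume-weighted water content, th = (vol_c1 + vol_c2) / (v_new * newn)."""
    problems = check_fusion_inputs(n1, th1, v1, n2, th2, v2, v_new, newn)
    if problems:
        raise ValueError("; ".join(problems))
    vol_c1 = n1 * th1 * v1
    vol_c2 = n2 * th2 * v2
    return (vol_c1 + vol_c2) / (v_new * newn)
```

Printing the equivalent quantities just before the flagged line in a DEBUG build would show which of these cases, if any, applies.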

So it might not be the same issue? It seems like plant hydraulics (hydro) might be involved.
Cheers,
Marius
 