
ERROR: One or more of the CTSM cap export_1D fields are NaN

Status
Not open for further replies.

Yuan Sun

Member
Hi all,

I am running IHIST at 0.05° (lnd) and 0.5° (datm) over a UK domain. I have hit an error like the following several times.
# of NaNs = 1
Which are NaNs = F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F T F F F F F F F F F F F F F F F F F F
NaN found in field Sl_lfrin at gridcell index/lon/lat: 113 354.82499999999999 59.575000000000003
ERROR: ERROR: One or more of the CTSM cap export_1D fields are NaN

The gridcell where the NaN is found varies from run to run. For example,
# of NaNs = 1
Which are NaNs = F F F F F F F F F F F F F F F F F F F T F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
NaN found in field Sl_lfrin at gridcell index/lon/lat: 20 0.42500000000000010 49.974999999999994
ERROR: ERROR: One or more of the CTSM cap export_1D fields are NaN

# of NaNs = 1
Which are NaNs = F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F T F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F F
NaN found in field Sl_lfrin at gridcell index/lon/lat: 172 354.27499999999998 58.875000000000000
ERROR: ERROR: One or more of the CTSM cap export_1D fields are NaN
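
To compare the failing gridcells across runs, the "NaN found" lines can be collected from the logs. The following is a minimal sketch, assuming the line format is exactly as quoted above; the function name `parse_nan_reports` is made up for illustration. It also converts the 0..360 longitudes to -180..180, which makes it easier to see where the cells fall within the UK domain.

```python
import re

# Pattern for the report line as it is quoted in this post (an assumption:
# other model versions may format the line differently).
NAN_LINE = re.compile(
    r"NaN found in field (?P<field>\S+) at gridcell index/lon/lat:\s*"
    r"(?P<index>\d+)\s+(?P<lon>[-+0-9.eE]+)\s+(?P<lat>[-+0-9.eE]+)"
)


def parse_nan_reports(log_text):
    """Return one dict per 'NaN found' line, adding longitude in -180..180."""
    reports = []
    for match in NAN_LINE.finditer(log_text):
        lon = float(match["lon"])
        reports.append({
            "field": match["field"],
            "index": int(match["index"]),
            "lon_0_360": lon,
            "lon_180": lon - 360.0 if lon > 180.0 else lon,
            "lat": float(match["lat"]),
        })
    return reports


# The three report lines quoted in this post.
sample = """\
NaN found in field Sl_lfrin at gridcell index/lon/lat: 113 354.82499999999999 59.575000000000003
NaN found in field Sl_lfrin at gridcell index/lon/lat: 20 0.42500000000000010 49.974999999999994
NaN found in field Sl_lfrin at gridcell index/lon/lat: 172 354.27499999999998 58.875000000000000
"""

for report in parse_nan_reports(sample):
    print(report["field"], report["index"],
          round(report["lon_180"], 3), round(report["lat"], 3))
```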

It seems that the mediator received NaN from the land model. I suspected that the error came from the PE layout, with bad datm values being sent to CLM, so I tried many different PE layouts. After I added ./xmlchange NTHRDS_ATM=4, the error disappeared.

The simulation works with:
NTASKS: ['CPL:8', 'ATM:8', 'LND:8', 'ICE:8', 'OCN:8', 'ROF:8', 'GLC:8', 'WAV:8', 'ESP:8']
ROOTPE: ['CPL:0', 'ATM:0', 'LND:0', 'ICE:0', 'OCN:0', 'ROF:0', 'GLC:0', 'WAV:0', 'ESP:0']
NTHRDS: ['CPL:1', 'ATM:4', 'LND:1', 'ICE:1', 'OCN:1', 'ROF:1', 'GLC:1', 'WAV:1', 'ESP:1']
nodes: 2
total tasks: 8
tasks per node: 4
thread count: 4
ngpus per node: 0
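
For reference, the summary numbers in this listing can be reproduced from the three per-component lists. This is only a sketch under two assumptions that the listing does not state: the stride between MPI tasks is 1, and every task reserves as many cores as the largest thread count. The function name `summarize_pe_layout` is made up for illustration.

```python
COMPONENTS = ["CPL", "ATM", "LND", "ICE", "OCN", "ROF", "GLC", "WAV", "ESP"]


def summarize_pe_layout(ntasks, rootpe, nthrds):
    """Derive total tasks, thread count, and cores from per-component settings.

    Assumes a task stride of 1 and that each MPI task reserves
    max(NTHRDS) cores (both are assumptions, not taken from the post).
    """
    total_tasks = max(rootpe[c] + ntasks[c] for c in ntasks)
    thread_count = max(nthrds.values())
    return {
        "total_tasks": total_tasks,
        "thread_count": thread_count,
        "cores": total_tasks * thread_count,
    }


# The layout listed above: 8 tasks everywhere, all starting at rank 0,
# and 4 threads for ATM only.
ntasks = {c: 8 for c in COMPONENTS}
rootpe = {c: 0 for c in COMPONENTS}
nthrds = {c: 1 for c in COMPONENTS}
nthrds["ATM"] = 4

summary = summarize_pe_layout(ntasks, rootpe, nthrds)
print(summary)
```

Under these assumptions the result matches "total tasks: 8" and "thread count: 4" above, and the 32 cores are consistent with 2 nodes running 4 tasks each at 4 threads per task.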

But I am not sure of the principle behind it. The PE layout is a mystery to me. Maybe DATM needs more threads to interpolate the 0.5° forcing to the 0.05° land grid cells?

Thanks for any comments.

Best,
Yuan
 

Yuan Sun

Member
I checked the output and found that the spin-up outputs (I ran from a cold start for the UK region) contain NaN values. How can I solve this?

Best,
Yuan
 

Attachments

  • 截屏2024-06-10 11.02.37.png (screenshot, 212.8 KB)

slevis

Moderator
Staff member
Did you generate the input files (surface and datm) for the UK domain? If so, I recommend looking for issues with your input files. It could be useful to compare your input files with the model's default input files that work. This may give you insight into the problem.
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
To add to this, I see this in your output:

NaN found in field Sl_lfrin at gridcell index/lon/lat

The lfrin field is land model fraction so maybe there is a problem with how that is being computed/specified.
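
One way to follow up on this is to scan the land fraction values for entries that are NaN or fall outside [0, 1]. The sketch below is only illustrative: the values are made up, the function name `find_bad_fractions` is hypothetical, and in practice the numbers would first be read from the user-generated input files with a netCDF reader (the variable names are not given in this thread).

```python
import math


def find_bad_fractions(values):
    """Return (1-based index, value) pairs that are not valid fractions.

    A value is flagged if it is NaN, infinite, or outside [0, 1].
    """
    bad = []
    for i, value in enumerate(values, start=1):
        if not math.isfinite(value) or value < 0.0 or value > 1.0:
            bad.append((i, value))
    return bad


# Made-up example values, NOT taken from the poster's files.
landfrac = [1.0, 0.35, float("nan"), 0.0, 1.2]

for index, value in find_bad_fractions(landfrac):
    print(f"gridcell {index}: suspicious land fraction {value}")
```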
 

Yuan Sun

Member
Hi Keith and Sam,

Thanks for your insight. I tried several PE layouts on another machine, and the following layout works on 1 node (128 cores):


NTASKS: ['CPL:48', 'ATM:16', 'LND:48', 'ICE:1', 'OCN:1', 'ROF:1', 'GLC:1', 'WAV:1', 'ESP:1']
ROOTPE: ['CPL:16', 'ATM:0', 'LND:16', 'ICE:0', 'OCN:0', 'ROF:0', 'GLC:0', 'WAV:0', 'ESP:0']
NTHRDS: ['CPL:1', 'ATM:2', 'LND:1', 'ICE:1', 'OCN:1', 'ROF:1', 'GLC:1', 'WAV:1', 'ESP:1']
nodes: 1
total tasks: 64
tasks per node: 64
thread count: 2
ngpus per node: 0
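
To make this layout easier to read, the ROOTPE and NTASKS values can be turned into MPI rank ranges. This is a sketch that assumes a task stride of 1 (the stride is not shown in the listing); the helper names `rank_range` and `shares_ranks` are made up for illustration.

```python
def rank_range(rootpe, ntasks):
    """MPI ranks used by a component, assuming a task stride of 1."""
    return range(rootpe, rootpe + ntasks)


def shares_ranks(a, b):
    """True if two rank ranges have at least one MPI rank in common."""
    return max(a.start, b.start) < min(a.stop, b.stop)


# (ROOTPE, NTASKS) for the three largest components in the listing above.
layout = {"CPL": (16, 48), "ATM": (0, 16), "LND": (16, 48)}
ranks = {name: rank_range(*settings) for name, settings in layout.items()}

for name, r in ranks.items():
    print(f"{name}: ranks {r.start}-{r.stop - 1}")

print("ATM/LND share ranks:", shares_ranks(ranks["ATM"], ranks["LND"]))
print("CPL/LND share ranks:", shares_ranks(ranks["CPL"], ranks["LND"]))
```

Under that assumption, ATM occupies ranks 0-15 while LND and CPL both occupy ranks 16-63, so DATM and the land model do not share any MPI tasks here, unlike the earlier layout in this thread where every component started at rank 0. The 64 tasks at 2 threads each are also consistent with the 128 cores mentioned above.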

Best,
Yuan
 

Attachments

  • 截屏2024-06-13 09.32.59.png (screenshot, 120.9 KB)