
ERROR: component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global

Amit K Sharma

New Member
Hi,
I am trying to run a simulation using a user-defined compset with CAM6.3 (cesm2.2.0) and nudged meteorology. I am getting the following error in the cesm log file.

"""
Reading setup_nml
Reading grid_nml
Reading tracer_nml
Reading thermo_nml
Reading dynamics_nml
Reading shortwave_nml
Reading ponds_nml
Reading forcing_nml
Reading zbgc_nml
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
calcsize j,iq,jac, lsfrm,lstoo 1 1 1 26 21
calcsize j,iq,jac, lsfrm,lstoo 1 1 2 26 21
calcsize j,iq,jac, lsfrm,lstoo 1 2 1 22 15
calcsize j,iq,jac, lsfrm,lstoo 1 2 2 22 15
calcsize j,iq,jac, lsfrm,lstoo 1 3 1 24 17
calcsize j,iq,jac, lsfrm,lstoo 1 3 2 24 17
calcsize j,iq,jac, lsfrm,lstoo 1 4 1 25 20
calcsize j,iq,jac, lsfrm,lstoo 1 4 2 25 20
calcsize j,iq,jac, lsfrm,lstoo 1 5 1 23 19
calcsize j,iq,jac, lsfrm,lstoo 1 5 2 23 19
calcsize j,iq,jac, lsfrm,lstoo 2 1 1 21 26
calcsize j,iq,jac, lsfrm,lstoo 2 1 2 21 26
calcsize j,iq,jac, lsfrm,lstoo 2 2 1 15 22
calcsize j,iq,jac, lsfrm,lstoo 2 2 2 15 22
calcsize j,iq,jac, lsfrm,lstoo 2 3 1 17 24
calcsize j,iq,jac, lsfrm,lstoo 2 3 2 17 24
calcsize j,iq,jac, lsfrm,lstoo 2 4 1 20 25
calcsize j,iq,jac, lsfrm,lstoo 2 4 2 20 25
calcsize j,iq,jac, lsfrm,lstoo 2 5 1 19 23
calcsize j,iq,jac, lsfrm,lstoo 2 5 2 19 23

ccm kohlerc - no real(r8) solution found (quartic)
roots = (-4.602846360405086E-003,3.469704834913952E-003)
(-1.867239287602387E-002,-7.669720073640333E-006)
(7.361637797088324E-003,1.613662217523574E-006)
(-4.600839267865783E-003,-3.463648777057835E-003)
p0-p3 = -4.562937802347023E-009 -8.896751268877116E-007 0.000000000000000E+000
2.051444070720641E-002
rh= 0.942839343260192
setting radius to dry radius= 6.058916340292734E-003
ccm kohlerc - no real(r8) solution found (quartic)
roots = (-4.602846360405086E-003,3.469704834913952E-003)
(-1.867239287602387E-002,-7.669720073640333E-006)
(7.361637797088324E-003,1.613662217523574E-006)
(-4.600839267865783E-003,-3.463648777057835E-003)
p0-p3 = -4.562937802347023E-009 -8.896751268877116E-007 0.000000000000000E+000
2.051444070720641E-002
rh= 0.942839343260192
setting radius to dry radius= 6.058916340292734E-003
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 27886
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 27595
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 27885
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 28460
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 27312
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 27013
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 27014
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 28171
Image PC Routine Line Source
libnetcdf.so.19.0 00002AF8BD8CF0CA tracebackqq_ Unknown Unknown
cesm.exe 0000000003695A8D shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000003695B4D shr_abort_mod_mp_ 61 shr_abort_mod.F90
cesm.exe 000000000044A949 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000445C42 component_mod_mp_ 740 component_mod.F90
cesm.exe 0000000000427579 cime_comp_mod_mp_ 2823 cime_comp_mod.F90
cesm.exe 000000000044537B MAIN__ 133 cime_driver.F90
cesm.exe 0000000000424C32 Unknown Unknown Unknown
libc-2.17.so 00002AF8CC388495 __libc_start_main Unknown Unknown
cesm.exe 0000000000424B29 Unknown Unknown Unknown
Abort(1001) on node 308 (rank 308 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 308

"""
Can anyone suggest what might be the issue with the setup?

Thank you
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
I see a few other recent posts about NaN values in Sa_z. You can search the forums for "Sa_z" to see some of these recent posts and some suggested ideas – for example "Startup run error in B1850", "ERROR: component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global", and "Error:component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global".

The next step I would suggest would be starting with a simpler, supported configuration – a predefined compset without nudged meteorology. If you get past the error there, then make one change at a time until you hit the error, to narrow down which change is responsible for this problem.

I am transferring this to the atmosphere forums, where someone may have more insight. However, if you would like further support, please include all of the information requested here: Information to include in help requests
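
When comparing runs while narrowing down the responsible change, it can help to collect the check_fields messages from the cesm log in one place. The following is a minimal sketch, assuming the two-line message format shown in the log excerpt above; the function names are illustrative and not part of CESM or CIME.

```python
import re
from collections import Counter

# Matches the message format seen in the cesm log excerpt, e.g.
#   component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
#   index: 27886
PATTERN = re.compile(
    r"check_fields NaN found in (\w+) instance:\s*(\d+) field (\S+) 1d global\s*"
    r"index:\s*(\d+)"
)


def find_nan_reports(log_text):
    """Return a (component, field, index) tuple for every NaN report in the log."""
    return [
        (comp, field, int(idx))
        for comp, _instance, field, idx in PATTERN.findall(log_text)
    ]


def summarize(log_text):
    """Count reports per (component, field) and list the distinct 1d indices."""
    reports = find_nan_reports(log_text)
    fields = Counter((comp, field) for comp, field, _ in reports)
    indices = sorted({idx for _, _, idx in reports})
    return fields, indices
```

Calling summarize on the text of a cesm log file shows at a glance whether the NaN values are confined to one field and how many distinct indices are affected, which is easier to compare across test cases than the raw log.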
 

Amit K Sharma

New Member
Sorry for the delayed response and thank you very much for your suggestion.

Actually, my simulations require a change in the refractive index of the anthropogenic aerosols. The error above also came from the same experiment.

As you suggested, I tried different sets of simulations with a predefined compset (FHIST) without nudged meteorology. But now I am facing a different error, shown below:

"""
ERROR: shr_assert_in_domain: state%t has invalid value NaN
at location: 15 1
Expected value to be a number.
ERROR: NaN produced in physics_state by package radheat.
ERROR: shr_assert_in_domain: state%t has invalid value NaN
at location: 15 1
Expected value to be a number.
ERROR: NaN produced in physics_state by package radheat.
Image PC Routine Line Source
libnetcdf.so.19.0 00002B1720BC60CA tracebackqq_ Unknown Unknown
cesm.exe 0000000003648C8D shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000003648D4D shr_abort_mod_mp_ 61 shr_abort_mod.F90
cesm.exe 00000000036521C1 shr_assert_mod_mp 325 shr_assert_mod.F90.in
cesm.exe 0000000000720F5B physics_types_mp_ 543 physics_types.F90
cesm.exe 000000000071E390 physics_types_mp_ 433 physics_types.F90
cesm.exe 00000000007442A4 physpkg_mp_tphysb 2626 physpkg.F90
cesm.exe 00000000007360C4 physpkg_mp_phys_r 1078 physpkg.F90
cesm.exe 000000000051DB4F cam_comp_mp_cam_r 259 cam_comp.F90
cesm.exe 000000000050F1FD atm_comp_mct_mp_a 521 atm_comp_mct.F90
cesm.exe 000000000044537F component_mod_mp_ 737 component_mod.F90
cesm.exe 0000000000427139 cime_comp_mod_mp_ 2823 cime_comp_mod.F90
cesm.exe 0000000000444F3B MAIN__ 133 cime_driver.F90
cesm.exe 00000000004247F2 Unknown Unknown Unknown
libc-2.17.so 00002B172F67F3D5 __libc_start_main Unknown Unknown
cesm.exe 00000000004246E9 Unknown Unknown Unknown
Abort(1001) on node 188 (rank 188 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 188
"""
Further, I tried simulations that change the refractive index of a single aerosol species (black carbon, organics, or sulfate) one at a time. The simulations with BC and organics proceeded without any error, but the one with the changed refractive index of sulfate aerosols crashed with the same error shown above.

It would be a great help if you could suggest how I can get these simulations to run.

Thank you

-Amit
 

Amit K Sharma

New Member
Also, the ./describe_version command gives the following output.

------------------------------------------------------------------------
git describe:
cesm2.2.0-0-g332937b
------------------------------------------------------------------------

------------------------------------------------------------------------
git status:
# Not currently on any branch.
nothing to commit, working directory clean
------------------------------------------------------------------------

WARNING:root:WARNING: Ignoring unknown branch property, in /home/soft/centOS/apps/cesm/2.2.0/my_cesm_sandbox/cime/src/drivers/nuopc/.gitmodules
WARNING:root:WARNING: Ignoring unknown branch property, in /home/soft/centOS/apps/cesm/2.2.0/my_cesm_sandbox/cime/src/drivers/nuopc/.gitmodules
ERROR:root:SVN returned invalid XML message
Traceback (most recent call last):
File "./describe_version", line 71, in <module>
main()
File "./describe_version", line 67, in main
universal_newlines=True)
File "/home/soft/intel2020u4/intelpython3/lib/python3.7/subprocess.py", line 411, in check_output
**kwargs).stdout
File "/home/soft/intel2020u4/intelpython3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['manage_externals/checkout_externals', '--status', '--verbose', '--verbose']' returned non-zero exit status 1.
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
I'm not sure what's wrong with the describe_version command on your machine, but the information it was able to provide is a helpful start; the missing information would tell us whether you made any changes to the CAM or other model code; please tell us if you did.
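
The traceback above only shows that the checkout_externals status command returned a non-zero exit status, not why. One way to see the underlying message is to run the same command without raising on failure and print what it wrote to stderr. This is a minimal sketch using only the Python standard library; the helper name run_and_report is illustrative, and the command is the one quoted in the traceback.

```python
import subprocess


def run_and_report(cmd, cwd=None):
    """Run cmd without raising on failure; return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        cmd,
        cwd=cwd,              # directory to run in, e.g. the CESM sandbox root
        capture_output=True,  # keep stdout and stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=False,          # do not raise CalledProcessError on non-zero exit
    )
    return result.returncode, result.stdout, result.stderr
```

For example, calling run_and_report(["manage_externals/checkout_externals", "--status", "--verbose", "--verbose"], cwd=...) from the sandbox directory and printing the third element of the result should show the message that describe_version hides behind the CalledProcessError.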

My understanding from your last post is that you are able to run successfully if you don't make any changes, but if you change the refractive index of certain species, then CAM's radiation code produces a NaN (not-a-number) value for temperature. Often getting a NaN value is an indication that a parameterization is being run with inputs significantly outside the range for which it has been developed. In your case, the first thing I would try would be making smaller changes to the refractive indices. However, I don't have any experience with this part of the code myself, so I'll see if anyone else has anything else to add.
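
To make the "smaller changes" idea concrete, here is a minimal sketch of stepping through progressively larger reductions and flagging the first one that leaves a range considered safe. Everything specific in it is an assumption for illustration: the starting index, the safe range for the real part, and the choice to scale the real and imaginary parts by the same factor are hypothetical and are not taken from CAM or from any input file.

```python
def scaled_index(n, reduction):
    """Return complex refractive index n with both parts reduced by `reduction`.

    reduction=0.5 means a 50% reduction. Scaling both parts equally is an
    assumption; the actual experiment may change them differently.
    """
    factor = 1.0 - reduction
    return complex(n.real * factor, n.imag * factor)


def first_out_of_range(n, reductions, real_range):
    """Return the smallest reduction whose real part falls outside real_range.

    real_range is a hypothetical (low, high) interval chosen by the user,
    not a documented limit of the radiation code. Returns None if every
    reduction stays inside the interval.
    """
    low, high = real_range
    for reduction in sorted(reductions):
        if not (low <= scaled_index(n, reduction).real <= high):
            return reduction
    return None


# Illustrative values only (not read from any CAM physprops file).
n_example = complex(1.43, 1.0e-8)
print(first_out_of_range(n_example, [0.05, 0.1, 0.2, 0.5], (1.0, 2.0)))  # prints 0.5
```

With these illustrative numbers, reductions up to 20% keep the real part above 1, while a 50% reduction does not. Whether that is the relevant threshold for the model is not something this sketch can say; it only shows a way to order the test cases from mild to severe.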
 

Amit K Sharma

New Member
Thank you for the response.

The error with the describe_version command may be because, rather than creating a Subversion repository, I provided the inputdata files manually by downloading them.

For the error related to the refractive indices, I will follow your recommendation and first try simulations with small changes in the refractive indices, in order to understand the issue more thoroughly.

I will post any further updates here.

Thank you

-Amit
 

Amit K Sharma

New Member
Dear Bill,

I wasn’t able to respond earlier due to some technical problems. As you suggested, I performed a series of simulations while reducing the refractive indices of “soa” and “so4” aerosols individually. Separate simulations were performed with the refractive indices (RI) reduced by 50%, 60%, 70%, 75% and 90%.

Out of these simulations:

so4 RI reduced by 50%: ran for 1 month, then crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”
soa RI reduced by 50%: ran successfully for 1 year

soa RI reduced by 60%: ran successfully for 4 months
so4 RI reduced by 60%: ran for 20 days, then crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”

soa RI reduced by 70%: ran successfully for 4 months
so4 RI reduced by 70%: ran for 1 month 20 days, then crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”

soa RI reduced by 75%: ran for 4 days, then crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”
so4 RI reduced by 75%: ran for 20 days, then crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”

soa RI reduced by 90%: crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”
so4 RI reduced by 90%: crashed with “ERROR: shr_assert_in_domain: state%t has invalid value NaN”

Also, the atm log from the "so4 RI 60% reduced" simulation is attached for reference. Please let me know if there is something different I can try to get these simulations to run.
 

Attachments

  • atm.log.2502020.pbshpc.211001-150803.txt
    918.5 KB · Views: 3

CallanGwd

Wendong Ge
New Member
Dear Bill,

Dear Amit,
Recently I performed some simulations that were very similar to yours, and I hit the same error: "component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global". The version of CESM I used is 2.1.3.

I tried to increase the wet deposition of SO4 in wetdep.F90. At first, I successfully increased the wet deposition of SO4 to 2, 4, and 8 times the original, and each case ran for two whole years. However, when I went further (x9, x10, x12, x15, and even just x8.5), all of the cases failed after different numbers of simulated days, from a month and a half down to just a few days depending on the multiplier: the more I increased it, the sooner the run failed. The only difference in the code between the successful cases and the failed ones is the wet deposition multiplier.

I tried to adjust and optimize my modification in different ways. I also tried other compsets that I had used successfully before, and tried changing RUN_STARTDATE. Things got a little better and the cases ran for longer, but they still stopped sooner or later and could not complete even a single year.

By the way, I also noticed that there are several posts about the error "component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global", and all of them were found in CESM2.1.3. So I wonder whether this error is related to the model version?

Have you found any ideas or suggestions since your last post?
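
Since x8 ran and x8.5 failed, the boundary between stable and unstable multipliers can be narrowed by bisection. The following is a minimal sketch under strong assumptions: runs_ok is a hypothetical callback that would build and run a case with the given multiplier for a fixed simulated length and report whether it finished, and the crash is assumed to depend monotonically on the multiplier, which the varying failure times described above do not guarantee.

```python
def bracket_stable_multiplier(runs_ok, low, high, tol=0.25):
    """Bisect between a multiplier known to run (low) and one known to crash (high).

    runs_ok(multiplier) -> bool is supplied by the user (hypothetical here).
    Returns (low, high) with high - low <= tol, where low ran and high crashed.
    """
    if not runs_ok(low) or runs_ok(high):
        raise ValueError("need a passing lower bound and a failing upper bound")
    while high - low > tol:
        mid = (low + high) / 2.0
        if runs_ok(mid):
            low = mid   # mid still runs: the boundary is above it
        else:
            high = mid  # mid crashes: the boundary is at or below it
    return low, high
```

Each call to runs_ok stands for a full model run, so a coarse tolerance keeps the number of cases small; halving the interval from 8 to 8.5 down to a width of about 0.06 takes three runs.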

Thank you!

(The original error is:
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 34735
)
 

Amit K Sharma

New Member
Dear Wendong Ge,

I would like to help, but I haven't found any lead on this error so far. I am also still looking for a way to run these simulations successfully. I tried the solutions that were relevant to my case from the threads Bill linked above, but without success.

Thanks
 

CallanGwd

Wendong Ge
New Member
A few days ago, I tried another, very old compset, FMOZ (which I had used before), and both the so4_wetdep_x10 and x12 cases ran successfully! So maybe you could consider trying some different compsets.
 

CESM researcher

HW doctor
New Member
Hi Amit

I had a similar problem recently. I want to try different simulations by changing the refractive index of a single aerosol species (black carbon, organics, or sulfate) one at a time (cesm1.2.2-cam5).
It would be a great help if you could give me some guidance. How do I set the parameters?
 