
Model stopped due to methane flux errors

KeerZ

Member
Hi All,

I am running a simulation with the I2000Clm50BgcCropGs compset in cesm2.1.2. The simulation is driven by ERA5 atmospheric forcing and uses a modified irrigation scheme.

I made the following changes to the irrigation scheme to imitate flooding irrigation of rice paddies (Trying to modify crop parameters to imitate flooding irrigation):
1. I changed all rainfed rice in the surface data into irrigated rice (because both rainfed and irrigated rice are flooded).
2. The irrigated rice is always irrigated throughout the year, no matter how small its LAI is.
3. I changed the target soil moisture of the irrigated rice to be the saturated soil moisture of that soil layer (set the relsat_target_col=1.0). Also, the irrig_length for rice is always 86400s (one day).

Basically, these modifications successfully increased the irrigation flux and available soil moisture of rice paddies.

After I had spun up the model for 36 years, the simulation crashed with a methane flux message and a soil balance error warning like this:

489: Negative conc. in ch4tran. c,j,deficit (mol): 1239857 1
489: 1.031848063130897E-003

9: WARNING: BalanceCheck: soil balance error (W/m2)
9: nstep = 1261573
9: errsoi_col = -1.000239310278150E-005
0: size=17231646 rss=149139 share=11153 text=6093 datastack=0
0: size=17231646 rss=149143 share=11153 text=6093 datastack=0
0: size=17231646 rss=149191 share=11201 text=6093 datastack=0
0: size=17231646 rss=149191 share=11201 text=6093 datastack=0
0: size=17231646 rss=149191 share=11201 text=6093 datastack=0
0: size=17231646 rss=149191 share=11201 text=6093 datastack=0
0: size=17231646 rss=149191 share=11201 text=6093 datastack=0
0: size=17231646 rss=149205 share=11215 text=6093 datastack=0


The log files are attached. Why would intensified irrigation of rice paddies cause an error in the methane flux? If my research focuses only on biophysical variables, can I simply turn off the methane model by setting use_lch4 = .false. and ignore this error? As for the soil balance warning, I noticed that it appears at only a few time steps and that errsoi_col is just slightly larger than the threshold of 1e-5. Is it OK to continue running the model despite these warnings?
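To put "slightly larger" into numbers, here is a minimal Python sketch that pulls errsoi_col out of a log line and compares it with the 1e-5 W/m2 threshold quoted above. The function name errsoi_exceedance and the regular expression are illustrative assumptions, not part of CLM.

```python
import re

THRESHOLD = 1.0e-5  # W/m2, the warning threshold quoted in this post


def errsoi_exceedance(log_line, threshold=THRESHOLD):
    """Return (errsoi_col, fractional exceedance of |errsoi_col| over threshold)."""
    match = re.search(
        r"errsoi_col\s*=\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)", log_line
    )
    if match is None:
        raise ValueError("no errsoi_col value found in line")
    value = float(match.group(1))
    # A positive result means the magnitude is above the threshold.
    return value, abs(value) / threshold - 1.0


value, excess = errsoi_exceedance("9: errsoi_col = -1.000239310278150E-005")
print(f"errsoi_col = {value:.6e} W/m2, {excess:.4%} above the threshold")
```

For the value in the log above, the magnitude exceeds the threshold by roughly 0.024%.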

Any help is appreciated! Thank you very much!
 

Attachments

  • atm.log.2752114.chadmin1.ib0.cheyenne.ucar.edu.220204-222840.zip
    76.9 KB · Views: 5
  • cesm.log.2752114.chadmin1.ib0.cheyenne.ucar.edu.220204-222840.zip
    16.7 KB · Views: 4
  • cpl.log.2752114.chadmin1.ib0.cheyenne.ucar.edu.220204-222840.zip
    9.1 KB · Views: 2
  • lnd.log.2752114.chadmin1.ib0.cheyenne.ucar.edu.220204-222840.zip
    57.4 KB · Views: 5

KeerZ

Member
Also, I don't think the ERA5 atmospheric forcing data I created is the problem, because a test simulation with the default irrigation scheme ran successfully with the new forcing.
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
The methane message is not really an error message and isn't necessarily causing the model to crash. The message is informative in that it is telling you that it has found a negative concentration. It then sets it to zero per this code in ch4Mod.F90:

if (conc_ch4_rel(c,j) < 0._r8) then
   deficit = - conc_ch4_rel(c,j)*epsilon_t(c,j,1)*dz(c,j) ! Mol/m^2 added
   if (deficit > 1.e-3_r8 * scale_factor_gasdiff) then
      if (deficit > 1.e-2_r8) then
         write(iulog,*)'Note: sink > source in ch4_tran, sources are changing '// &
              ' quickly relative to diffusion timestep, and/or diffusion is rapid.'
         g = col%gridcell(c)
         write(iulog,*)'Latdeg,Londeg=',grc%latdeg(g),grc%londeg(g)
         write(iulog,*)'This typically occurs when there is a larger than normal '// &
              ' diffusive flux.'
         write(iulog,*)'If this occurs frequently, consider reducing land model (or '// &
              ' methane model) timestep, or reducing the max. sink per timestep in the methane model.'
      end if
      write(iulog,*) 'Negative conc. in ch4tran. c,j,deficit (mol):',c,j,deficit
   end if
   conc_ch4_rel(c,j) = 0._r8
   ! Subtract deficit
   ch4_surf_diff(c) = ch4_surf_diff(c) - deficit/dtime_ch4
end if
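
As a rough illustration of what that branch does numerically, here is a Python sketch of the same clamp for a single column and layer. It is not CLM code: the function name clamp_negative_ch4 and the input values are hypothetical, and scale_factor_gasdiff is simply passed in as a parameter because its value is not given in this thread.

```python
def clamp_negative_ch4(conc, epsilon_t, dz, surf_diff, dtime_ch4,
                       scale_factor_gasdiff=1.0):
    """Mimic the quoted branch for one column/layer.

    Returns (new_conc, new_surf_diff, deficit). The scale_factor_gasdiff
    default of 1.0 is an assumption made only for this sketch.
    """
    if conc >= 0.0:
        # Nothing to fix: concentration is already non-negative.
        return conc, surf_diff, 0.0
    # Amount (mol/m^2) that must be added to bring the layer back to zero.
    deficit = -conc * epsilon_t * dz
    if deficit > 1.0e-3 * scale_factor_gasdiff:
        print("Negative conc. in ch4tran. deficit (mol):", deficit)
    # Zero the concentration and take the deficit out of the surface flux.
    return 0.0, surf_diff - deficit / dtime_ch4, deficit


# Hypothetical inputs, chosen only to exercise the negative branch.
new_conc, new_flux, deficit = clamp_negative_ch4(
    conc=-0.5, epsilon_t=0.4, dz=0.1, surf_diff=0.0, dtime_ch4=1800.0
)
```

The point of the sketch is that the negative concentration is reset to zero and the missing amount is subtracted from the surface diffusive flux, so the message reports a correction rather than a fatal condition.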

On the other hand, those messages plus the soil balance error, even though it is only slightly larger than the threshold, may be an early indication that something is going wrong with the simulation.
The model is crashing because NaNs are being passed from CLM to the coupler:

22: # of NaNs = 13
22: Which are NaNs = F F F F T T T T T F T F F F F F F F F F F F F F F F F F F F F
22: F F F T T T T T T F F F F F T F F F F F F F F F F F F F F F F F F F F F F F
22: Sl_tref
22: Sl_qref
22: Sl_t
22: Sl_fv
22: Sl_ram1
22: Sl_u10
22: Fall_taux
22: Fall_tauy
22: Fall_lat
22: Fall_sen
22: Fall_lwup
22: Fall_evap
22: Flrl_rofgwl
22: gridcell index = 12153
22: ENDRUN:
22: ERROR:
22: lnd_export ERROR: One or more of the output from CLM to t
22: he coupler are NaN

So, there are 13 fields that are NaNs.
You could try turning off the methane model, but I'm not sure that would fix anything, since it shouldn't be affecting the rest of the model.
If you are sure the atmospheric forcing is OK, you could look at the output for that gridcell (12153) to see what is going wrong.
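
If it helps when scanning several log files, the T/F mask in that excerpt can be tallied with a short Python sketch like the one below. The function name parse_nan_mask is illustrative, and the sketch assumes each log line carries a "<task>:" prefix as in the excerpt above.

```python
import re


def parse_nan_mask(log_lines):
    """Collect the T/F flags from a 'Which are NaNs' block of cesm.log lines."""
    flags = []
    in_mask = False
    for line in log_lines:
        # Drop the leading "<task>:" prefix present on each log line.
        body = re.sub(r"^\s*\d+:\s*", "", line).strip()
        if body.startswith("Which are NaNs ="):
            in_mask = True
            body = body.split("=", 1)[1]
        elif in_mask and not re.fullmatch(r"[TF](\s+[TF])*", body):
            break  # first line that is not pure T/F flags ends the mask
        if in_mask:
            flags.extend(tok == "T" for tok in body.split())
    return flags


sample = [
    "22: # of NaNs = 13",
    "22: Which are NaNs = F F F F T T T T T F T F F F F F F F F F F F F F F F F F F F F",
    "22: F F F T T T T T T F F F F F T F F F F F F F F F F F F F F F F F F F F F F F",
    "22: Sl_tref",
]
mask = parse_nan_mask(sample)
print(sum(mask))  # 13
```

The count of T flags matches the "# of NaNs = 13" line, and the positions of the True entries identify which export fields are affected.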
 

KeerZ

Member
I see. Thank you, Keith! I now realize that the logs from many previous years also contain the 'Negative conc. in ch4tran' and 'soil balance error' warnings.

My guess is that the changes I made to the irrigation scheme lead to some unreasonable soil states that accumulate over the simulation until the model crashes. I'll try setting DEBUG=TRUE and see what I can find.
 

KeerZ

Member
Hello Keith, I am not sure how to find the latitude and longitude of gridcell 12153. I have pft-level output files from the simulation. Is it correct that the 12153rd entries of grid1d_lat and grid1d_lon are the lat and lon of gridcell 12153? Thank you!
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I think that would be correct. There are cases where the gridcell/landunit/column/patch indices are local to a processor rather than global across all processors, but I think this is a global index in this case.
One way to be sure you find the right lat/lon is to add something like this to the error message in the lnd_import_export code:

write(iulog,*)'latdeg: ',grc%latdeg(g)
write(iulog,*)'londeg: ',grc%londeg(g)

You'd also need to add this to the lnd_import_export module:

use GridcellType , only : grc

Once you find the lat/lon, you could use the point_of_interest module to help find what patch, column, etc. might be causing the problem as described here:


You could also maybe simplify your troubleshooting by creating a single point case that runs over the offending grid cell only, instead of troubleshooting the global simulation.
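
For the index lookup itself, a minimal Python sketch is below. It assumes, as discussed above, that 12153 is a global, 1-based (Fortran-style) gridcell index and that grid1d_lat and grid1d_lon have already been read from the history file into sequences; the function name gridcell_latlon and the toy values are illustrative only.

```python
def gridcell_latlon(grid1d_lat, grid1d_lon, fortran_index):
    """Return (lat, lon) for a 1-based gridcell index as printed in the log."""
    if len(grid1d_lat) != len(grid1d_lon):
        raise ValueError("grid1d_lat and grid1d_lon must have the same length")
    if not 1 <= fortran_index <= len(grid1d_lat):
        raise IndexError(f"gridcell index {fortran_index} is out of range")
    i = fortran_index - 1  # Python sequences are 0-based
    return grid1d_lat[i], grid1d_lon[i]


# Toy arrays standing in for grid1d_lat / grid1d_lon from a history file.
toy_lat = [-10.0, 0.0, 10.0, 20.0]
toy_lon = [100.0, 102.5, 105.0, 107.5]
lat, lon = gridcell_latlon(toy_lat, toy_lon, 3)
print(lat, lon)  # 10.0 105.0
```

The off-by-one shift is the main thing to watch: entry 12153 in the log corresponds to position 12152 in a 0-based array. If the index turns out to be processor-local, this lookup would point at the wrong cell, which is why printing latdeg and londeg from the model is the safer check.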
 

KeerZ

Member
Got it! Thanks for the detailed explanations!
 