model crashes after 2 years in F_1850_WACCM compset

Dear all,

I am currently running the F1850W compset of WACCM4/CESM 1.0.3 with constant solar MINIMUM conditions. The only change I have made in order to run this experiment is in the "&solar_inparm" section of the atm_in namelist.

After 2 years of simulation, the model crashes on DATE=0003/01/09 with the following error (from the ccsm.log file):

"QNEG3 from TPHYSBCb:m= 64 lat/lchnk= 422 Min. mixing ratio violated at 2 points. Reset to 1.0E+00 Worst = 9.6E-01 at i,k= 1 19
QNEG3 from TPHYSBCb:m= 64 lat/lchnk= 569 Min. mixing ratio violated at 1 points. Reset to 1.0E+00 Worst = 9.8E-01 at i,k= 1 22
QNEG3 from TPHYSBCb:m= 64 lat/lchnk= 198 Min. mixing ratio violated at 2 points. Reset to 1.0E+00 Worst = 9.6E-01 at i,k= 1 22

DADADJ: Convergence criterion doubled to EPS=.4000E-04 for
DRY CONVECTIVE ADJUSTMENT at Lat,Lon= 2 74

DADADJ: Convergence criterion doubled to EPS=.8000E-04 for
DRY CONVECTIVE ADJUSTMENT at Lat,Lon= 2 74

QNEG3 from TPHYSBCb:m= 64 lat/lchnk= 485 Min. mixing ratio violated at 1 points. Reset to 1.0E+00 Worst = 9.5E-01 at i,k= 1 21
BalanceCheck: soil balance error nstep = 35450 point = 11152 imbalance = -0.000001 W/m2
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 2.112939840929127E+019
1.266144134906977E+019 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 1 53
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 1.266144134906977E+019
7.375644122989000E+018 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 1 54
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 7.375644122989000E+018
3.995298524930791E+018 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 1 55
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 3.995298524930791E+018
2.010472255632644E+018 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 1 56
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 2.010472255632644E+018
1.003819814245374E+018 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 1 57
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 1.003819814245374E+018
5.169447083921386E+017 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 1 58
imp_sol: Time step 1.8000000000000E+03 failed to converge @ (lchnk,lev,col,nstep) = 538 15 1 35450
imp_sol : @ (lchnk,lev,col) = 538 15 1 failed
1 times
calc_o2srb : o2col(k:k+1),xscho2(k:k+1,i) = 1.742126696235637E+016
1.655020361423855E+016 0.000000000000000E+000 0.000000000000000E+000
@ i,k = 16 65
forrtl: severe (174): SIGSEGV, segmentation fault occurred

No errors are printed in the other (atm-ocn-ice-lnd) log files.
It seems like there is a convergence problem at Lat,Lon= 2 74 (south pole?). I have checked the DOCN-SST file and all other input files for missing values or similar problems at time step 030109 but found none. The co2vmr value (287E-06) is set in the atm_in namelist and is read correctly by the model.

Do you know what is going on in this experiment?

Has this compset been validated on a long (30-year) run?

Thank you!

Regards
 

santos

Member
Hi Gabriel,

Sorry about the delay in responding. Are you still struggling with this? I am not familiar with this bug, but I can try to check for you. I believe that for a release such as 1.0.3, a longer run probably has been done, but that was before I was really involved with CESM, so I would need to ask around.

Unfortunately, SIGSEGV is not a very specific error unless you also have some useful information in files from a core dump. Was a traceback generated anywhere (such as "core" files in the run directory)? In the meantime, have you tried to repeat this run with the DEBUG flag turned on? This should turn on floating-point trapping, which may make the source of the problem clearer.
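In case it helps with the core-file check, here is a minimal sketch in Python for listing candidate core dumps in a run directory. It assumes the dumps are written with names starting with "core" (for example "core" or "core.12345"); the actual naming depends on the machine and its settings, and find_core_files is only an illustrative helper, not part of CESM.

```python
from pathlib import Path


def find_core_files(run_dir):
    """Return regular files in run_dir whose names start with 'core', newest first.

    The 'core*' naming is an assumption; some systems use other patterns
    or disable core dumps entirely.
    """
    run_path = Path(run_dir)
    candidates = [p for p in run_path.glob("core*") if p.is_file()]
    # Sort by modification time so the most recent dump comes first.
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling find_core_files on the case run directory returns an empty list if nothing matching was written, which would suggest that core dumps are disabled or stored elsewhere on that system.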

The convergence problem should not have led to this error, since the chemistry package usually will reduce the time step and try again. I would hesitate to say that this line is related to the segmentation fault, without finding a clearer connection.
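To see whether the convergence failures cluster in the same place as the other warnings, the log can be summarized with a short script. The sketch below is in Python and assumes the messages have exactly the format quoted in the original post; other CESM versions may print them differently, and summarize_log is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns are based on the log lines quoted in this thread (assumption:
# the format is the same throughout the log file).
IMP_SOL_RE = re.compile(
    r"imp_sol: Time step\s+(\S+)\s+failed to converge @ "
    r"\(lchnk,lev,col,nstep\) =\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)
QNEG3_RE = re.compile(r"QNEG3 from (\w+):m=\s*(\d+)\s+lat/lchnk=\s*(\d+)")
DADADJ_RE = re.compile(r"DRY CONVECTIVE ADJUSTMENT at Lat,Lon=\s*(\d+)\s+(\d+)")


def summarize_log(lines):
    """Collect imp_sol failures and count QNEG3 and DADADJ messages."""
    imp_sol = []        # one dict per chemistry solver failure
    qneg3 = Counter()   # keyed by (routine, constituent index m)
    dadadj = Counter()  # keyed by (lat index, lon index)
    for line in lines:
        m = IMP_SOL_RE.search(line)
        if m:
            imp_sol.append({
                "dt": float(m.group(1)),
                "lchnk": int(m.group(2)),
                "lev": int(m.group(3)),
                "col": int(m.group(4)),
                "nstep": int(m.group(5)),
            })
            continue
        m = QNEG3_RE.search(line)
        if m:
            qneg3[(m.group(1), int(m.group(2)))] += 1
            continue
        m = DADADJ_RE.search(line)
        if m:
            dadadj[(int(m.group(1)), int(m.group(2)))] += 1
    return {"imp_sol": imp_sol, "qneg3": qneg3, "dadadj": dadadj}
```

Passing the lines of the ccsm.log file (for example from an open file object) gives the chunk, level, column and step of each imp_sol failure, which can then be compared with the locations reported by the dry convective adjustment warnings. This only organizes what is already in the log; it does not by itself show which message is tied to the segmentation fault.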

If you have already found a solution to this problem, please let me know so that I can make a note of it, and so that it will show up in future searches here. We are planning some improvements/fixes to these forums behind the scenes, hoping to make them easier to navigate, and easier for us to monitor reliably.
 

fvitt

CSEG and Liaisons
Staff member
We have run this compset for 9 years without any problems using the 1.0.3 release version. This is using the default namelist settings one gets with the F1850W compset. Can you tell us specifically what solar inputs you are using?
 