Dear @Chiru,
1) The "WARNING: water balance error" message appears to be a model warning (not an error that causes a crash), which distracts from the real problem. Here is how I determined that:
a) The warnings appear in the cesm.log from the beginning of the simulation, long before the crash.
b) The traceback at the end of the cesm.log points you to line 3561 of SnowHydrologyMod.F90.
c) Scroll some lines up in the cesm.log and you will see:
ERROR: capping procedure failed (negative mass remaining) c = 159876
h2osoi_ice_bottom = 1.746391353663057E-005 h2osoi_liq_bottom = -4.059418274617033E-013
calling getglobalwrite with decomp_index= 159876 and clmlevel= column
local column index = 159876
global column index = 241284
global landunit index = 94112
global gridcell index = 28563
gridcell longitude = 75.8500000000000
gridcell latitude = 36.1500000000000
column type = 215
landunit type = 2
ENDRUN:
ERROR: ERROR in SnowHydrologyMod.F90 at line 3561
I don't know if the lon/lat are within the domain that you intended to simulate.
I don't know what may cause the negative h2osoi_liq_bottom.
d) The atm.log ends with:
(shr_dmodel_readstrm) file ub: /scratch/civil/phd/cez218275/IMDAA_025_deg_2000_2001/Solar/Regrid_ncum_imdaa_reanl_HR_DSWRF-sfc_2000010100-2000123123.nc 4686
(shr_dmodel_readstrm) file ub: /scratch/civil/phd/cez218275/IMDAA_025_deg_2000_2001/TPWL/regrid_ncum_imdaa_reanl_HR_TPWL_2000010100-2000123123.nc 4686
(datm_comp_run) atm: model date 20000714 19800s
(datm_comp_run) atm: model date 20000714 21600s
(shr_dmodel_readstrm) file ub: /scratch/civil/phd/cez218275/IMDAA_025_deg_2000_2001/Solar/Regrid_ncum_imdaa_reanl_HR_DSWRF-sfc_2000010100-2000123123.nc 4687
(shr_dmodel_readstrm) file ub: /scratch/civil/phd/cez218275/IMDAA_025_deg_2000_2001/TPWL/regrid_ncum_imdaa_reanl_HR_TPWL_2000010100-2000123123.nc 4687
(datm_comp_run) atm: model date 20000714 23400s
If this is truly the last output from the atmosphere (which it may NOT be if some output remained in a buffer and didn't make it to the atm.log), then the negative h2osoi_liq_bottom may be caused by some inconsistency in one or more of the last datm files read by the model.
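The checks in (a) through (c) can also be scripted. Below is a minimal Python sketch, not part of CESM, that reads cesm.log text, confirms that the warnings start before the first "ERROR:" line, and pulls out the reported lon/lat and h2osoi_liq_bottom. The function names, the abbreviated sample log, and the domain bounds passed to in_domain are all hypothetical and only for illustration; substitute your own log file and the bounds of the domain you intended to simulate.

```python
import re

# Matches plain or Fortran-style floats such as 75.85 or -4.059E-013.
_NUM = r"[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?"


def find_value(text, label):
    """Return the first number that follows 'label =' in the log text, or None."""
    match = re.search(re.escape(label) + r"\s*=\s*(" + _NUM + ")", text)
    return float(match.group(1)) if match else None


def first_line_with(lines, needle):
    """Return the 0-based index of the first line containing needle, or None."""
    for index, line in enumerate(lines):
        if needle in line:
            return index
    return None


def summarize_cesm_log(text):
    """Collect the pieces of a cesm.log that matter for this kind of crash."""
    lines = text.splitlines()
    return {
        "first_warning_line": first_line_with(lines, "WARNING: water balance error"),
        "first_error_line": first_line_with(lines, "ERROR:"),
        "lon": find_value(text, "gridcell longitude"),
        "lat": find_value(text, "gridcell latitude"),
        "h2osoi_liq_bottom": find_value(text, "h2osoi_liq_bottom"),
    }


def in_domain(lon, lat, lon_min, lon_max, lat_min, lat_max):
    """True if the point lies inside a simple lon/lat bounding box."""
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


# Abbreviated excerpt built from the lines quoted in this post.
# In practice, read the real file: text = open("cesm.log").read()
SAMPLE_LOG = """\
WARNING: water balance error
ERROR: capping procedure failed (negative mass remaining) c = 159876
h2osoi_ice_bottom = 1.746391353663057E-005 h2osoi_liq_bottom =
-4.059418274617033E-013
gridcell longitude = 75.8500000000000
gridcell latitude = 36.1500000000000
ERROR: ERROR in SnowHydrologyMod.F90 at line 3561
"""

summary = summarize_cesm_log(SAMPLE_LOG)
print(summary)

# Hypothetical bounding box; replace with the domain you meant to simulate.
print(in_domain(summary["lon"], summary["lat"], 66.0, 100.0, 5.0, 40.0))
```

If the warning line index is smaller than the error line index, the warnings began before the crash and are unlikely to be its direct cause. If in_domain returns False for your own bounds, the failing gridcell lies outside the region you meant to simulate, which would be the next thing to look into.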
2) The investigation in (1) is an example of the first steps of troubleshooting that you may need to perform to resolve this and other model errors.
@Chiru I would like to take this opportunity to make a kind request of everyone on the CESM forum:
Please spend time troubleshooting things yourself before turning to the forum for help. I say this for two reasons:
1) All model users and model developers (including the most experienced) run into problems when they try new things. The way you become truly experienced is by troubleshooting things that don't work. This forum is good for posting questions if you have spent time troubleshooting and are still stuck.
2) The number of model users is constantly increasing (this is good) and @oleson and I already answer dozens of questions per week. Ideally we should only need to answer questions for users who cannot make progress in their work after all their troubleshooting.
@Chiru I apologize if my comment does not apply to you and you already did significant troubleshooting before posting your last question. Please understand that I am trying to make this point as tactfully as possible.
Sincerely,
Sam Levis