FHIST_BGC stops after running 7 years

Hi all,

I am running FHIST_BGC from 1982 to 1992. I modified flbc_file to cycle the CO2 level at its 1850 value, and flanduse_timeseries to prescribe the PFTs for 1982. The run completes successfully from 1982-01-01 to 1989-12-01, but it stops after that.

The following is my script:
1: =====spin up for CNTL 1982 LULC and CO2 preindustrial (284.7 ppm), compset is FHIST_BGC ======
cd /glade/p/cesm/releases/cesm2_1_1/cime/scripts
./create_newcase --case ~/cases/FHISTspin --res f09_g17 --compset FHIST_BGC --project UDRT0013 --run-unsupported
cd ~/cases/FHISTspin/
./xmlchange INFO_DBUG=3,STOP_N=2,STOP_OPTION=nyears

./xmlchange RUN_STARTDATE=1982-01-01
./xmlchange DOUT_S_SAVE_INTERIM_RESTART_FILES=TRUE,REST_OPTION=nmonths,REST_N=1

env_mach_specific.xml
erased "labelstdout" argument (-p "%g:") in env_mach_specific.xml
mpiexec_mpt -p "%g:" omplace ${CASEROOT}/bld/cesm.exe >&! cesm.log.$LID

./case.setup

edit/check user_nl_clm

flanduse_timeseries = '/glade/scratch/yahe/lulcdata/landuse.timeseries_0.9x1.25_hist_78pfts_CMIP6_simyr1982_c170824_netet_floor_c4g.nc'

edit/check user_nl_cam (not user_nl_datm)

flbc_type = 'CYCLICAL'
flbc_cycle_yr = 1850
flbc_file = '/glade/p/cesmdata/cseg/inputdata/atm/waccm/lb/LBC_1850climo_CMIP6_0p5degLat_c180227.nc'

INITHIST = 'MONTHLY'

./preview_namelists
qcmd -- ./case.build
./case.submit

Finished first 3 years, resubmit

./xmlchange CONTINUE_RUN=TRUE,STOP_N=8
./case.submit

There are several errors.
In cesm.log and glc.log, it shows:
dmass/dt error (Gt/y) 0.0000000000000000E+00
In cam.log, it shows:
2108 Q_qneg3 kg/kg 32 I Specific humidity QNEG3 error (cell)
2109 Q_qneg3_col kg/kg 1 I Specific humidity QNEG3 error (column)
2110 CLDLIQ_qneg3 kg/kg 32 I Grid box averaged cloud liquid amount QNEG3 error (cell)
2111 CLDLIQ_qneg3_col kg/kg 1 I Grid box averaged cloud liquid amount QNEG3 error (column)
2112 CLDICE_qneg3 kg/kg 32 I Grid box averaged cloud ice amount QNEG3 error (cell)
2113 CLDICE_qneg3_col kg/kg 1 I Grid box averaged cloud ice amount QNEG3 error (column)
2114 NUMLIQ_qneg3 kg/kg 32 I Grid box averaged cloud liquid number QNEG3 error (cell)
2115 NUMLIQ_qneg3_col kg/kg 1 I Grid box averaged cloud liquid number QNEG3 error (column)
2116 NUMICE_qneg3 kg/kg 32 I Grid box averaged cloud ice number QNEG3 error (cell)
2117 NUMICE_qneg3_col kg/kg 1 I Grid box averaged cloud ice number QNEG3 error (column)
2118 RAINQM_qneg3 kg/kg 32 I Grid box averaged rain amount QNEG3 error (cell)
2119 RAINQM_qneg3_col kg/kg 1 I Grid box averaged rain amount QNEG3 error (column)
2120 SNOWQM_qneg3 kg/kg 32 I Grid box averaged snow amount QNEG3 error (cell)
2121 SNOWQM_qneg3_col kg/kg 1 I Grid box averaged snow amount QNEG3 error (column)
2122 NUMRAI_qneg3 kg/kg 32 I Grid box averaged rain number QNEG3 error (cell)
2123 NUMRAI_qneg3_col kg/kg 1 I Grid box averaged rain number QNEG3 error (column)
2124 NUMSNO_qneg3 kg/kg 32 I Grid box averaged snow number QNEG3 error (cell)
2125 NUMSNO_qneg3_col kg/kg 1 I Grid box averaged snow number QNEG3 error (column)
2126 H2O2_qneg3 kg/kg 32 I H2O2 QNEG3 error (cell)
2127 H2O2_qneg3_col kg/kg 1 I H2O2 QNEG3 error (column)
2128 H2SO4_qneg3 kg/kg 32 I H2SO4 QNEG3 error (cell)
2129 H2SO4_qneg3_col kg/kg 1 I H2SO4 QNEG3 error (column)
2130 SO2_qneg3 kg/kg 32 I SO2 QNEG3 error (cell)
2131 SO2_qneg3_col kg/kg 1 I SO2 QNEG3 error (column)
2132 DMS_qneg3 kg/kg 32 I DMS QNEG3 error (cell)
2133 DMS_qneg3_col kg/kg 1 I DMS QNEG3 error (column)
2134 SOAG_qneg3 kg/kg 32 I SOAG QNEG3 error (cell)
2135 SOAG_qneg3_col kg/kg 1 I SOAG QNEG3 error (column)
2136 so4_a1_qneg3 kg/kg 32 I so4_a1 QNEG3 error (cell)
2137 so4_a1_qneg3_col kg/kg 1 I so4_a1 QNEG3 error (column)
2138 pom_a1_qneg3 kg/kg 32 I pom_a1 QNEG3 error (cell)
2139 pom_a1_qneg3_col kg/kg 1 I pom_a1 QNEG3 error (column)
2140 soa_a1_qneg3 kg/kg 32 I soa_a1 QNEG3 error (cell)
2141 soa_a1_qneg3_col kg/kg 1 I soa_a1 QNEG3 error (column)
2142 bc_a1_qneg3 kg/kg 32 I bc_a1 QNEG3 error (cell)
2143 bc_a1_qneg3_col kg/kg 1 I bc_a1 QNEG3 error (column)
2144 dst_a1_qneg3 kg/kg 32 I dst_a1 QNEG3 error (cell)
2145 dst_a1_qneg3_col kg/kg 1 I dst_a1 QNEG3 error (column)
2146 ncl_a1_qneg3 kg/kg 32 I ncl_a1 QNEG3 error (cell)
2147 ncl_a1_qneg3_col kg/kg 1 I ncl_a1 QNEG3 error (column)
2148 num_a1_qneg3 kg/kg 32 I num_a1 QNEG3 error (cell)
2149 num_a1_qneg3_col kg/kg 1 I num_a1 QNEG3 error (column)
2150 so4_a2_qneg3 kg/kg 32 I so4_a2 QNEG3 error (cell)
2151 so4_a2_qneg3_col kg/kg 1 I so4_a2 QNEG3 error (column)
2152 dst_a2_qneg3 kg/kg 32 I dst_a2 QNEG3 error (cell)
2153 dst_a2_qneg3_col kg/kg 1 I dst_a2 QNEG3 error (column)
2154 soa_a2_qneg3 kg/kg 32 I soa_a2 QNEG3 error (cell)
2155 soa_a2_qneg3_col kg/kg 1 I soa_a2 QNEG3 error (column)
2156 ncl_a2_qneg3 kg/kg 32 I ncl_a2 QNEG3 error (cell)
2157 ncl_a2_qneg3_col kg/kg 1 I ncl_a2 QNEG3 error (column)
2158 num_a2_qneg3 kg/kg 32 I num_a2 QNEG3 error (cell)
2159 num_a2_qneg3_col kg/kg 1 I num_a2 QNEG3 error (column)
2160 dst_a3_qneg3 kg/kg 32 I dst_a3 QNEG3 error (cell)
2161 dst_a3_qneg3_col kg/kg 1 I dst_a3 QNEG3 error (column)
2162 ncl_a3_qneg3 kg/kg 32 I ncl_a3 QNEG3 error (cell)
2163 ncl_a3_qneg3_col kg/kg 1 I ncl_a3 QNEG3 error (column)
2164 so4_a3_qneg3 kg/kg 32 I so4_a3 QNEG3 error (cell)
2165 so4_a3_qneg3_col kg/kg 1 I so4_a3 QNEG3 error (column)
2166 num_a3_qneg3 kg/kg 32 I num_a3 QNEG3 error (cell)
2167 num_a3_qneg3_col kg/kg 1 I num_a3 QNEG3 error (column)
2168 pom_a4_qneg3 kg/kg 32 I pom_a4 QNEG3 error (cell)
2169 pom_a4_qneg3_col kg/kg 1 I pom_a4 QNEG3 error (column)
2170 bc_a4_qneg3 kg/kg 32 I bc_a4 QNEG3 error (cell)
2171 bc_a4_qneg3_col kg/kg 1 I bc_a4 QNEG3 error (column)
2172 num_a4_qneg3 kg/kg 32 I num_a4 QNEG3 error (cell)
2173 num_a4_qneg3_col kg/kg 1 I num_a4 QNEG3 error (column)


In ice.log, it shows:
arwt heat error = 2.03405352514443615E-05 7.64007048303753716E-04
arwt swdn error = 1.79740741291286557E-04 1.63254534876007594E-03
arwt salt flx error = 3.26340688563296071E-05 -8.93625214842077610E-04
water flux error = -3.30904117920034131E-05 8.95038518795987200E-04


Can anyone help me fix the problem? Thanks a lot.

Best,
Yaqian
 

nusbaume

Jesse Nusbaumer
CSEG and Liaisons
Staff member
Hi Yaqian,

Sadly, none of the messages you listed should cause the model to fail (they are all general diagnostics that are written out even when the model is running well). Often the "critical" error shows up near the bottom of the cesm.log.* file (usually after a bunch of system messages stating that the model is being killed). I would also recommend checking the last few lines of each component log file (e.g. atm.log.*) to see if a unique error message shows up there as well. If you are unable to find the error, feel free to create a text file with the last ~2000 lines of the cesm.log file and attach it to this thread (or attach the entire cesm.log file if it isn't too large).
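
As a rough sketch of those checks, assuming the logs are still uncompressed and sit in the current directory (gzipped logs would need to be decompressed first). The helper names `tail_logs` and `save_tail` and the output name `cesm_tail.txt` are hypothetical, not part of CESM; add other component logs to the list as needed:

```shell
#!/bin/sh
# Print the last N lines of each existing file among the arguments.
tail_logs() {
    n="$1"; shift
    for f in "$@"; do
        if [ -f "$f" ]; then          # skip glob patterns that matched nothing
            echo "==> $f <=="
            tail -n "$n" "$f"
        fi
    done
}

# Save the last N lines (default 2000) of one log to a text file.
# Arguments: <log file> <output file> [N]
save_tail() {
    tail -n "${3:-2000}" "$1" > "$2"
}

# Look at the end of each component log for a message unique to the failure.
tail_logs 20 atm.log.* lnd.log.* ice.log.* glc.log.*

# Keep the end of the newest cesm.log so it can be attached to the thread.
latest=$(ls -t cesm.log.* 2>/dev/null | head -n 1) || true
if [ -n "$latest" ]; then
    save_tail "$latest" cesm_tail.txt
fi
```

Run in a directory without any logs, the script prints nothing and creates no file.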

Finally, I am slightly confused by the model dates you provided, but were you able to restart the model successfully at least once before? If you were never able to restart the model then it is likely a problem with your restart files or paths.

Thanks!

Jesse
 
Dear Jesse,

Thanks so much for your help! Yes, I ran successfully for 2 years and then resubmitted for 8 years; after about 5 years, it stopped.

I have checked the last few lines of all the log files as you suggested, but I could not find any other errors. I have attached the last 2500 lines of cesm.log. Could you please take a look at it?

Thanks a lot.

Best regards,
Yaqian
 

nusbaume

Hi Yaqian,

Thanks for the info! Sadly it doesn't look like your cesm.log file attached correctly. Can you try again? The button to do so should be in the lower left side of your post. If you are having issues with size then feel free to compress the file using gzip or some other compression method.

Also, if doable, please attach the last few hundred lines of your atm.log and lnd.log files as well. That way I can also check to make sure a specific component error didn't slip by.
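
One way to prepare those attachments, sketched under the assumption that the logs sit in the current directory. The helper `pack_tail` and the `attach/` output folder are hypothetical names; `gzip -c` writes a compressed copy and leaves the original log untouched:

```shell
#!/bin/sh
# Write the last N lines (default 300) of a log to attach/<name>.tail.txt.gz.
# Arguments: <log file> [N]
pack_tail() {
    log="$1"
    n="${2:-300}"
    mkdir -p attach
    tail -n "$n" "$log" | gzip -c > "attach/$(basename "$log").tail.txt.gz"
}

# Package the end of each atm and lnd log that exists here.
for f in atm.log.* lnd.log.*; do
    if [ -f "$f" ]; then
        pack_tail "$f" 300
    fi
done
```

The same `gzip -c <file> > <file>.gz` pattern can be used on a full cesm.log when it is too large to attach as-is.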

Thanks, and have a great day!

Jesse
 

nusbaume

Hi Yaqian,

It looks like your run is getting a "Signal 15", which usually means that something external has killed your run. This is likely a consequence of your model simulation going over the wallclock time allowed by your batch scheduler queue. If you have a standard output or standard error file (which should show up in your case directory), then it should state in there whether or not you went over the wallclock limit.
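
A minimal sketch of that check, assuming it is run from a directory holding both the cesm.log files and the job's standard output files. The helper `files_matching`, the `*.o<jobid>` file glob, and the search patterns are assumptions: the exact file names and message wording depend on the scheduler and the site.

```shell
#!/bin/sh
# Print the names of the given files that contain a line matching the
# extended regex (case-insensitive). Prints nothing when there is no match.
files_matching() {
    pattern="$1"; shift
    grep -l -i -E "$pattern" "$@" 2>/dev/null || true
}

# A "signal 15" (SIGTERM) line in cesm.log suggests the job was killed externally.
files_matching 'signal 15|SIGTERM' cesm.log.*

# Scheduler output files: look for a message about the wallclock limit.
files_matching 'walltime|wallclock|time limit' ./*.o[0-9]*
```

Any file name printed by the second call is worth opening to read the scheduler's message in full.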

To fix this, I would simply reduce your model run length to, say, 3 years (which can be done by setting "STOP_N" to 3 and "STOP_OPTION" to "nyears" in your "env_run.xml" file), and then re-submitting the simulation. If it runs properly with that set-up then you'll know that your original run simply went over the max wallclock time. Of course if the model still fails please let me know.
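
In terms of commands, that change could look like the sketch below. It assumes the run continues from the existing restart files (hence CONTINUE_RUN=TRUE, as in the resubmission earlier in this thread) and that it is called from the case directory; the wrapper name `shorten_and_resubmit` is hypothetical.

```shell
#!/bin/sh
# Set the segment length to N years, keep the run as a continuation,
# and resubmit. Must be called from the case directory.
shorten_and_resubmit() {
    years="$1"
    ./xmlchange CONTINUE_RUN=TRUE,STOP_N="$years",STOP_OPTION=nyears &&
    ./case.submit
}

# Guarded usage: does nothing outside a case directory.
if [ -x ./xmlchange ] && [ -x ./case.submit ]; then
    shorten_and_resubmit 3
fi
```

If several short segments are needed to reach the end of the simulation, the RESUBMIT setting in env_run.xml can queue the follow-on segments automatically; that is a suggestion beyond what is described above.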

Thanks, and good luck!

Jesse
 