Hello, everyone! Thanks for opening this thread. I need your help. I am running a fully coupled climate simulation (B1850 compset) with CESM1.2.
The model ran successfully for 3000 years. When I switched to a 'branch' run to continue the simulation, it ran successfully for six months and then failed in the seventh month. The error messages come mainly from the CAM4 module.
Here are the last few lines of the cesm.log file:
--------------------------------------------------------------------------
1823 SPHDEP: ****** MODEL IS BLOWING UP: CFL condition likely violated *********
1824 SPHDEP: ****** MODEL IS BLOWING UP: CFL condition likely violated *********
1825
1826
1827 Parcel associated with longitude 85, level 9 and latitude 9 is outside the model domain.
1828 Possible solutions: a) reduce time step
1829 b) if initial run, set "DIVDAMPN = 1." in namelist and r
1830 erun
1831 c) modified code may be in error
1832 (shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
1833
1834
1835 Parcel associated with longitude 87, level 10 and latitude 8 is outside the model domain.
1856 --------------------------------------------------------------------------
1857 slurmstepd: error: *** STEP 6181587.0 ON h02r2n12 CANCELLED AT 2022-10-26T20:20:37 ***
1858 srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
1859 srun: error: h02r2n12: task 2: Killed
1860 srun: launch/slurm: _step_signal: Terminating StepId=6181587.0
1861 srun: error: h02r2n12: task 8: Exited with exit code 233
1862 srun: error: h02r2n12: tasks 0-1,3-7,9-29: Killed
1863 srun: error: h02r2n13: tasks 30-59: Killed
--------------------------------------------------------------------------
I also checked the atm.log file:
1556 *** Original Courant limit exceeded at k,lat= 1 9 (estimate = 1.174) ***
1557 *** Original Courant limit exceeded at k,lat= 2 9 (estimate = 1.118) ***
1558 *** Original Courant limit exceeded at k,lat= 3 9 (estimate = 1.052) ***
1559 NSTEP =31544704 8.685526565645961E-05 9.114958076179490E-06 275.333 9.87396E+04 5.278570600512381E+01 1.69 0.93
1560 nstep, te 31544705 0.37097820527521362E+10 0.37097951667912240E+10 0.36258312668091194E-03 0.98739623381941856E+05
1561 COURLIM: *** Courant limit exceeded at k,lat= 1 9 (estimate = 1.168), solution has been truncated to wavenumber 26 ***
1562 COURLIM: *** Courant limit exceeded at k,lat= 2 9 (estimate = 1.117), solution has been truncated to wavenumber 27 ***
1563 COURLIM: *** Courant limit exceeded at k,lat= 3 9 (estimate = 1.038), solution has been truncated to wavenumber 29 ***
1564 *** Original Courant limit exceeded at k,lat= 1 9 (estimate = 1.168) ***
1565 *** Original Courant limit exceeded at k,lat= 2 9 (estimate = 1.117) ***
1566 *** Original Courant limit exceeded at k,lat= 3 9 (estimate = 1.038) ***
1567 NSTEP =31544705 8.697680900706860E-05 1.045402640107233E-05 275.331 9.87396E+04 5.278784553213968E+01 1.83 1.18
1568 nstep, te 31544706 0.37097809943776278E+10 0.37097951667912240E+10 0.39184547030246612E-03 0.98739646779716568E+05
--------------------------------------------------------------------------
Attached are my cesm.log and atm.log files.
I believe that, as the message suggests, the run hit the Courant limit at the beginning of the seventh month.
A "Courant limit exceeded" message is issued whenever the Courant limiter is applied.
There are two reasons I am aware of that trigger the limiter:
1) The wind fields can get extremely strong in the middle atmosphere at the winter pole. This is a natural phenomenon and occurs in the real world as well as in the model. The limiter will kick in under these circumstances and reduce the wind speed to maintain stability. This is a perfectly normal occurrence and nothing to worry about.
2) If an instability is generated by any other aspect of the model (for instance, you might have introduced a bug), it can amplify, and the Courant limiter will occasionally begin firing. The model will soon halt (something will go terribly wrong). You may see other manifestations of such an instability (i.e., other warning messages will begin appearing).
--------------------------------------------------------------------------
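To help tell the two cases apart, it can be useful to see which levels and latitudes the limiter fires at and how large the Courant estimates get. The following is a minimal sketch, not part of CESM: the function name `summarize_courant` and the regular expression are illustrative assumptions based only on the COURLIM line format shown in the atm.log excerpt above, and the exact spacing in a real log may differ.

```python
import re
from collections import defaultdict

# Pattern modelled on the COURLIM lines quoted from atm.log (assumed format).
COURLIM_RE = re.compile(
    r"COURLIM: \*\*\* Courant limit exceeded at k,lat=\s*(\d+)\s+(\d+)\s*"
    r"\(estimate =\s*([\d.]+)\), solution has been truncated to wavenumber\s+(\d+)"
)

def summarize_courant(log_text):
    """Map (k, lat) -> count, largest estimate, and lowest truncation wavenumber."""
    summary = defaultdict(
        lambda: {"count": 0, "max_estimate": 0.0, "min_wavenumber": None}
    )
    for match in COURLIM_RE.finditer(log_text):
        k, lat = int(match.group(1)), int(match.group(2))
        estimate, wavenumber = float(match.group(3)), int(match.group(4))
        entry = summary[(k, lat)]
        entry["count"] += 1
        entry["max_estimate"] = max(entry["max_estimate"], estimate)
        if entry["min_wavenumber"] is None or wavenumber < entry["min_wavenumber"]:
            entry["min_wavenumber"] = wavenumber
    return dict(summary)

# Sample text copied from the atm.log excerpt in this post.
sample = """
COURLIM: *** Courant limit exceeded at k,lat= 1 9 (estimate = 1.168), solution has been truncated to wavenumber 26 ***
COURLIM: *** Courant limit exceeded at k,lat= 2 9 (estimate = 1.117), solution has been truncated to wavenumber 27 ***
COURLIM: *** Courant limit exceeded at k,lat= 3 9 (estimate = 1.038), solution has been truncated to wavenumber 29 ***
*** Original Courant limit exceeded at k,lat= 1 9 (estimate = 1.168) ***
"""

result = summarize_courant(sample)
for (k, lat), info in sorted(result.items()):
    print(k, lat, info)
```

To run it on a full log, read the file into a string and pass it to `summarize_courant`. In the excerpt above, all hits are at levels k = 1 to 3 and latitude index 9.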
For information, I tried running with dtime = 1200, 600, and 300 and with DIVDAMPN = 1., but the run still aborted.
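For reference, the time steps tried above translate into the following number of atmosphere steps per model day. This is only an arithmetic sketch; the helper name `steps_per_day` is hypothetical, and the link to the `ATM_NCPL` setting in env_run.xml (coupling intervals per day) is an assumption about how the time step is usually controlled in a coupled CESM1.2 case, so please check it against your own case setup.

```python
SECONDS_PER_DAY = 86400

def steps_per_day(dtime_seconds):
    """Return the number of time steps per day for a time step given in seconds."""
    if dtime_seconds <= 0 or SECONDS_PER_DAY % dtime_seconds != 0:
        raise ValueError("dtime must be positive and divide 86400 s evenly")
    return SECONDS_PER_DAY // dtime_seconds

# Time steps mentioned in this post (seconds).
tried = {dtime: steps_per_day(dtime) for dtime in (1200, 600, 300)}
for dtime, n in tried.items():
    # Assumption: n would be the value to use for ATM_NCPL in a coupled case.
    print(f"dtime = {dtime:4d} s  ->  {n} steps per day")
```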
Do you have any suggestions for getting past this limit and keeping the model running?
I would really appreciate any help.
Thank you!
Best, Chen