There are a few cases where POP2 crashes during a restart due to what appears to be a convergence issue, but is actually a problem caused by reading corrupt data from a restart file. The apparent error in the restart log:
Code:
POP Exiting...
POP_SolversChronGear: solver not converged
POP_SolverRun: error in ChronGear
POP_BarotropicDriver: error in solver
Step: error in barotropic

The problem is due to a POP2 namelist setting that is not updated correctly. Some compsets have a binary init_ts_file, with the matching format setting:
init_ts_file_fmt = 'bin'
Upon restarting, the new file is a netCDF file, so this setting should be used instead:
init_ts_file_fmt = 'nc'
If the setting stays at 'bin', the netCDF file is read with the wrong format, which leads to the solver failure shown above.

In some early CESM1.1 betas, this problem happened all the time (?). After it was fixed, the problem would still occur if there was a problem in preview_namelists. There are two known cases where this has happened:
- Until recently (i.e. until later CESM1.2 betas), the chemistry preprocessor would cause CAM's configure script to fail on Yellowstone batch nodes, which would indirectly prevent POP's build-namelist from running. The solution in this case is to run preview_namelists manually after the initial run and before the first restart.
- Ryan Neely has encountered this problem on Zeus in CESM1.0.5. It is not yet clear whether this is a bug in CESM1.0.5 itself, or a problem with the port to Zeus.
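
To check for this condition before resubmitting, the value of init_ts_file_fmt in the generated POP2 namelist can be inspected. The following is only a minimal sketch, not part of CESM: the helper names read_namelist_value and restart_format_ok are hypothetical, the regular expression handles only simple `name = value` lines, and the namelist file name (pop2_in in the run directory) is an assumption that should be checked against your own case.

```python
import re


def read_namelist_value(text, name):
    """Return the value assigned to `name` in Fortran namelist text, or None.

    Hypothetical helper: it handles simple `name = 'value'` or `name = value`
    lines only, not arrays or continuation lines.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(name) + r"\s*=\s*"
        r"(?:'([^']*)'|\"([^\"]*)\"|([^\s,/]+))",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Exactly one of the three alternatives captured the value.
    return next(g for g in match.groups() if g is not None)


def restart_format_ok(text):
    """True if init_ts_file_fmt is set to 'nc', as expected on a restart."""
    value = read_namelist_value(text, "init_ts_file_fmt")
    return value is not None and value.lower() == "nc"


if __name__ == "__main__":
    # Illustrative namelist fragment; the file name below is a placeholder.
    sample = """
    &init_ts_nml
     init_ts_file = 'placeholder_ic_file'
     init_ts_file_fmt = 'bin'
    /
    """
    if not restart_format_ok(sample):
        print("init_ts_file_fmt is not 'nc'; run preview_namelists before restarting")
```

In practice the text would be read from the namelist file of the case (for example with open("pop2_in").read(), assuming that name and location). If the check fails, running preview_namelists from the case directory before the first restart, as described above, should regenerate the namelist.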