
Guidance on increasing nlev to 60 for spcam

wchuang

Wayne
New Member
What version of the code are you using?
CESM 2.2.2, located at
Code:
/glade/work/wchuang/sandboxes/cesm2_2_2_FTorch

Have you made any changes to files in the source tree?
Notable changes:
  • components/cam/bld/configure
    • Added spcam_nz as an input option
  • components/cam/bld/build-namelist
    • Changed
      Code:
      (!$simple_phys and $cfg->get('nlev') >= 60))
      to
      Code:
      (!$simple_phys and $cfg->get('nlev') > 60))
  • components/cam/bld/namelist_files/namelist_defaults_cam.xml
    • Changed
      Code:
      <effgw_oro      nlev="60"     >0.0625D0</effgw_oro>
      to
      Code:
      <effgw_oro      nlev="61"     >0.0625D0</effgw_oro>

An additional compset, SPCAMSML, for spcam_ml was also added. It currently points to spcam_sam1mom and is not exercised in this issue, but the changed files are listed here just in case:
  • cime_config/config_component.xml
  • cime_config/config_compsets.xml
  • src/physics/cam/physpkg.F90
  • bld/config_files/definition.xml
  • bld/namelist_defaults_cam.xml
  • bld/build-namelist
  • bld/configure
Describe every step you took leading up to the problem:
I am running the script
Code:
/glade/work/wchuang/cesm_scripts/spcam_control_1mSAM.csh
with a new ncdata file,
Code:
/glade/u/home/wchuang/cami_1987-01-01_0.9x1.25_L26_c060703.MMF_L60_c20241219.nc
that has 60 levels.

The compset is FSPCAMS, the resolution is f09_f09_mg17, and I apply a few xmlchange commands:
Code:
./xmlchange --append --id CAM_CONFIG_OPTS --val " -nlev 60 "
./xmlchange --append --id CAM_CONFIG_OPTS --val " -spcam_nz 50 "
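
For reference, the full case setup follows the standard CIME workflow. A minimal sketch (the case name and project code are placeholders; the csh script above wraps these same steps):

```shell
# Sketch of the case setup -- case name, project code, and paths are
# placeholders; the actual run script is the csh file referenced above.
cd cime/scripts
./create_newcase --case spcam_nlev60_test --compset FSPCAMS \
    --res f09_f09_mg17 --project PXXXXXXXX --run-unsupported

cd spcam_nlev60_test
./xmlchange --append --id CAM_CONFIG_OPTS --val " -nlev 60 "
./xmlchange --append --id CAM_CONFIG_OPTS --val " -spcam_nz 50 "

# Point CAM at the 60-level initial-conditions file via user_nl_cam
cat >> user_nl_cam << 'EOF'
 ncdata = '/glade/u/home/wchuang/cami_1987-01-01_0.9x1.25_L26_c060703.MMF_L60_c20241219.nc'
EOF

./case.setup && ./case.build && ./case.submit
```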

If this is a port to a new machine: Please attach any files you added or changed for the machine port (e.g., config_compilers.xml, config_machines.xml, and config_batch.xml) and tell us the compiler version you are using on this machine.
Please attach any log files showing error messages or other useful information.

It is not a port to a new machine.

Describe your problem or question:
I am creating an ML version of SPCAM that uses a saved model from E3SM. The saved model has 60 levels, so I want to test a non-ML SPCAM run (FSPCAMS) on CESM using an ncdata file with 60 levels. This way I can also compare this SPCAM control case with SPCAM-ML runs.

I made the changes noted above then began an FSPCAMS run at
Code:
/glade/derecho/scratch/wchuang/spcam_controlrun_2_2_2_nlev60_FSPCAMS.debug/run/
The case is located at
Code:
/glade/work/wchuang/cesm_dev/cases/spcam_controlrun_2_2_2_nlev60_FSPCAMS.debug
This run resulted in the following error:
Code:
dec0061.hsn.de.hpc.ucar.edu 1792: forrtl: severe (408): fort: (2): Subscript #1 of the array QTMPV has value 390 which is greater than the upper bound of 384
dec0061.hsn.de.hpc.ucar.edu 1792:
dec0061.hsn.de.hpc.ucar.edu 1792: Image              PC                Routine            Line        Source             
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000002C569EC  tp_core_mp_xtpv_          388  tp_core.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000002C9B87B  tp_core_mp_tpcc_         1469  tp_core.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000002BC2141  sw_core_mp_c_sw_          477  sw_core.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000001B84D2D  cd_core_                  796  cd_core.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000000B53C09  dyn_comp_mp_dyn_r        1870  dyn_comp.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000001473D67  stepon_mp_stepon_         315  stepon.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           000000000085F841  cam_comp_mp_cam_r         265  cam_comp.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           000000000081FD0B  atm_comp_mct_mp_a         354  atm_comp_mct.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           0000000000466636  component_mod_mp_         257  component_mod.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           000000000043BC53  cime_comp_mod_mp_        2206  cime_comp_mod.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           000000000045D737  MAIN__                    122  cime_driver.F90
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           000000000041AB7D  Unknown               Unknown  Unknown
dec0061.hsn.de.hpc.ucar.edu 1792: libc-2.31.so       000014A258E6229D  __libc_start_main     Unknown  Unknown
dec0061.hsn.de.hpc.ucar.edu 1792: cesm.exe           000000000041AAAA  Unknown               Unknown  Unknown
dec0058.hsn.de.hpc.ucar.edu 1727: forrtl: severe (408): fort: (3): Subscript #1 of the array QTMPV has value -204 which is less than the lower bound of -96
dec0058.hsn.de.hpc.ucar.edu 1727:
dec0058.hsn.de.hpc.ucar.edu 1727: Image              PC                Routine            Line        Source             
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000002C54E4F  tp_core_mp_xtpv_          367  tp_core.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000002C9B87B  tp_core_mp_tpcc_         1469  tp_core.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000002BC2141  sw_core_mp_c_sw_          477  sw_core.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000001B84D2D  cd_core_                  796  cd_core.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000000B53C09  dyn_comp_mp_dyn_r        1870  dyn_comp.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000001473D67  stepon_mp_stepon_         315  stepon.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           000000000085F841  cam_comp_mp_cam_r         265  cam_comp.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           000000000081FD0B  atm_comp_mct_mp_a         354  atm_comp_mct.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           0000000000466636  component_mod_mp_         257  component_mod.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           000000000043BC53  cime_comp_mod_mp_        2206  cime_comp_mod.F90
dec0058.hsn.de.hpc.ucar.edu 1727: cesm.exe           000000000045D737  MAIN__                    122  cime_driver.F90

I also attempted an FHIST run using the same ncdata file, and received the following error:
Code:
ERROR:
 chem_init: do not know how to set water vapor upper boundary when model top is
 near mesopause
 

aherring

Adam
Member
I looked into this a bit, but I still don't have a very clear idea of what's going on. I'm not that familiar with the fv dycore code, and I found tp_core.F90 hard to follow, but I was able to recreate the error with a clean checkout of cesm2.2.2 and a lower processor count.

I then decided to try running a 70-level configuration of FSPCAMS, using our default waccm inic, and it did get to the time-stepping, and therefore got past the error. If you want to look through that case directory, it's here: /glade/derecho/scratch/aherring/spcam_test.004.

Because the 70-level grid has a top above 1 Pa, I had to modify the source code a bit (you can search "arh" in SourceMods/src.cam/), and I had to reduce the time-stepping because these higher-top grids have faster winds (see my user_nl_cam). You shouldn't have to worry about the former (your grid has a top of 5 Pa), but you will probably have to worry about the latter once you get to that stage (the default 26-level FSPCAMS has a model top around 200 Pa).

This tells me that FV does not like your 60 level grid, either because there is something finicky in the FV dycore, or there is something wrong with your ncdata file. Speaking of, the metadata for your ncdata file has an ncks command with hyam,hybm,hyai,hybi in the -v argument. I wonder if it is trying to remap the hybrid coefficients instead of taking them directly from the --vrt_out file? I run into analogous issues when I use nco to horizontally regrid, in that it remaps the lat/lon arrays which are not then identical to the grid file definition used by the model.
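
One way to check that suspicion independently of the model is to validate the hybrid coefficients in the ncdata file directly: hyai/hybi should have nlev+1 entries, the implied interface pressures should increase monotonically from model top to surface, and the values should match the vertical grid file bit-for-bit. A rough sketch (the validation logic here is mine, not CAM's; it is demonstrated on synthetic coefficients, but the same function could be fed arrays read from the file with netCDF4):

```python
import numpy as np

def check_hybrid_coeffs(hyai, hybi, p0=100000.0, ps=100000.0):
    """Basic sanity checks on CAM hybrid-coordinate interface coefficients."""
    hyai, hybi = np.asarray(hyai), np.asarray(hybi)
    assert hyai.shape == hybi.shape, "hyai/hybi length mismatch"
    # Interface pressure: p(k) = hyai(k)*p0 + hybi(k)*ps
    p_int = hyai * p0 + hybi * ps
    assert np.all(np.diff(p_int) > 0), "interface pressures not monotonic"
    assert abs(hybi[0]) < 1e-12, "hybi should be 0 at the model top"
    assert abs(hyai[-1]) < 1e-12 and abs(hybi[-1] - 1.0) < 1e-12, \
        "lowest interface should be pure sigma (hyai=0, hybi=1)"
    return p_int

# Synthetic 60-level example: 61 interfaces from a 5 Pa top to the surface.
nlev = 60
p_int_target = np.geomspace(5.0, 100000.0, nlev + 1)
# Simple split: pure pressure above 10 kPa, blending to pure sigma below.
hybi = np.clip((p_int_target - 10000.0) / 90000.0, 0.0, 1.0)
hyai = p_int_target / 100000.0 - hybi

p_int = check_hybrid_coeffs(hyai, hybi)
print(len(p_int), p_int[0], p_int[-1])  # 61 interfaces, 5 Pa top
```

For the real file, one would compare `hyai`/`hybi` read from the ncdata file against the same variables in the vertical grid file passed to `--vrt_out`; any difference means ncks regenerated rather than copied them.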

If that's not the issue, I can ping someone else at NCAR who might know more, although since this is not an in-house dycore it's possible you may need to contact the developers. Another option is to try another dycore -- I see that we regularly test FSPCAMS with the FV3 dycore in cesm2.2.2. Specifically we test C48_C48_mg17, but you would probably want to use the 1deg grid C96_C96_mg17.

Your FHIST case is probably bombing out because it runs with -chem trop_mam4, which then looks for upper boundary conditions because your model top is 5 Pa. You might get past that init error if you set -chem none, which is what FSPCAMS uses. I also tried testing your ncdata file with the FMTHIST compset in our latest code base, and that gave me a strange error about level 24 in the vertical remapping, which I don't think is CFL-related and may also indicate an issue with your vertical levels. If you want, you can look at that run: /glade/derecho/scratch/aherring/cam7_nochem_FMTHIST_f09_f09_mg17_768pes_250103_nlev60.001.
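
If the trop_mam4 guess is right, disabling chemistry for the FHIST test would look something like this (hedged: the exact effect depends on what else is already in CAM_CONFIG_OPTS for that case):

```shell
# Turn chemistry off so chem_init never looks for a water-vapor
# upper boundary condition near the 5 Pa model top.
./xmlchange --append --id CAM_CONFIG_OPTS --val " -chem none "
```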