
Creating consistent cases between ELM_USRDAT and CLM_USRDAT

jonwells04

Jon Wells
New Member
I've been running sparse grid cases in E3SM-ELM and want to make comparable cases in CESM-CLM5. I've attached both the ELM and CLM setup scripts that build my cases. I know there will be some differences since ELM forked from CLM4.5, but I'm hoping the CIME infrastructure is still similar enough to generally run the same kind of sparse grids for both models. I've made the sparse grid domain and surface data for CLM5 using this GitHub repo: bishtgautam/matlab-script-for-clm-sparse-grid (MATLAB scripts to create a sparse grid surface dataset and domain file for the E3SM Land Model and CLM).

The setup:
On Cori I have compset "I1850GSWCNPECACNTBC", long name: 1850_DATM%GSWP3v1_ELM%CNPECACNTBC_SICE_SOCN_MOSART_SGLC_SWAV.
On Cheyenne there is no equivalent, so I've chosen the long name 1850_DATM%GSWP3v1_CLM50%BGC_SICE_SOCN_MOSART_SGLC_SWAV.

Both systems will use --res USRDAT as I test the effects of multiple reanalysis products on model responses for a 112x1 sparse grid of sites. The ELM script runs and builds a case that works. The CLM script has some issues that I believe are related to the compset and differences between ELM_USRDAT and CLM_USRDAT default values, but I'm no expert here and could use some insight.
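For concreteness, my CLM-side case creation looks roughly like this (the case name and project are placeholders, not my real values):

    ./create_newcase --case sparsegrid_CLM5_test \
        --compset 1850_DATM%GSWP3v1_CLM50%BGC_SICE_SOCN_MOSART_SGLC_SWAV \
        --res CLM_USRDAT --project PXXXXXXXX --run-unsupported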

The key issues to fix for my CLM script:
  1. ATM_DOMAIN_FILE does not appear to be exported to env_run.xml in CLM_USRDAT. Where would I set this in a CLM5 case?
  2. DATM_ELMNCEP_YR_ALIGN, DATM_ELMNCEP_YR_START, and DATM_ELMNCEP_YR_END don't appear to have counterparts in CLM5's env_run.xml. Is there a CLM5 equivalent? If not, how do I go about setting them?
  3. Finally, when the CLM script starts to create namelists, it errors out in the buildnml step with: "ERROR: No default value found for streamslist with attributes {'model_grid': 'CLM_USRDAT', 'datm_mode': 'CLMCRUNCEP', 'datm_co2_tseries': 'none', 'datm_presaero': 'clim_1850'}." This doesn't happen in the ELM USRDAT setup, so I'm wondering what the differences in USRDAT setup are, and whether this is the issue or something else I've missed.
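For reference, the ELM script handles these with xmlchange, roughly like this (file names and years are placeholders, not my exact values):

    ./xmlchange ATM_DOMAIN_FILE=domain.lnd.112x1_sparse.nc
    ./xmlchange LND_DOMAIN_FILE=domain.lnd.112x1_sparse.nc
    ./xmlchange ATM_DOMAIN_PATH=/path/to/domains,LND_DOMAIN_PATH=/path/to/domains
    ./xmlchange DATM_ELMNCEP_YR_ALIGN=1901,DATM_ELMNCEP_YR_START=1901,DATM_ELMNCEP_YR_END=1920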
The ELM setup script and command-line output from setup/build/submit are attached. The CLM script and command-line errors during setup/build/submit are also attached. Thank you!
 

Attachments

  • CLM_setup_output.txt (5.8 KB)
  • CLM_setup_script.txt (6.5 KB)
  • ELM_setup_output.txt (11.4 KB)
  • ELM_setup_script.txt (6 KB)

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I suspect you might be using a newer version of CTSM/CLM that uses the NUOPC driver instead of the MCT driver. See README.NUOPC_driver.md in the top level directory for an explanation.
NUOPC does away with domain files; instead, a mesh file is required, so there is no setting for ATM_DOMAIN_FILE. We have been working on methods to generate a mesh file for a sparse grid. See this issue for current status; you might post on that issue to get an update and possibly an example sparse grid case on Cheyenne that uses a mesh file.
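If you do end up on a NUOPC tag once a sparse grid mesh is available, the mesh file would be pointed to with XML variables rather than a domain file, along these lines (the path is a placeholder):

    ./xmlchange ATM_DOMAIN_MESH=/path/to/sparse_grid_ESMFmesh.nc
    ./xmlchange LND_DOMAIN_MESH=/path/to/sparse_grid_ESMFmesh.nc
    ./xmlchange MASK_MESH=/path/to/sparse_grid_ESMFmesh.nc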


There are also changes to the datm. New variables include:

"DATM_YR_ALIGN"
"DATM_YR_START"
"DATM_YR_END"

The streams files have been combined into a single file: datm.streams.xml
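So aligning a 1901-1920 forcing window, for example, would look something like this (the years are just an example):

    ./xmlchange DATM_YR_ALIGN=1901,DATM_YR_START=1901,DATM_YR_END=1920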
 

jonwells04

Jon Wells
New Member
Hi Keith,

Thank you for the great info! I am using one of the newest versions, so I think you've nailed my issue. My current plan is to roll back to an earlier CLM5 version until mesh files are worked out for sparse grids.

Could you recommend a version number that is pre-NUOPC? Or can I define MCT somehow in the case setup to revert to a run based on domain files instead of mesh files?

Thanks!
Jon
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
NUOPC became the default driver in ctsm5.1.dev062, so tags prior to that should use MCT by default. You can specify MCT by using "--driver mct" in create_newcase. Note, however, that MCT will be deprecated soon.
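For example (case name and compset long name are placeholders for your own values):

    ./create_newcase --case <casename> --compset <compset_longname> \
        --res CLM_USRDAT --driver mct --run-unsupported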
 

jonwells04

Jon Wells
New Member
Hi Keith,

setting "--driver mct" in the create_newcase call was enough to allow me to set most things similarly to ELM.

A few changes to be aware of for anyone doing something similar (a sketch of points 1 and 2 follows this list):
  1. "-bgc_spinup" changed to "-clm_accelerated_spinup" in CLM_BLDNML_OPTS. Options like "-nutrient", "-nutrient_comp_pathway", "-soil_decomp", and "-methane" have all been removed in the updated CTSM releases since the models diverged.
  2. The datm.streams.txt.CLMCRUNCEP.* file names describing DATM_MODE=CLMCRUNCEP data locations (TPQ, etc.), which you can override by adding a copy to your case folder before building (e.g. directions here), are not the same in ELM and CLM5:
    1. CLM5: datm.streams.txt.CLMCRUNCEP.TPQW
    2. ELM: datm.streams.txt.CLMCRUNCEP.TPQWF
  3. I replaced the files that use the CLMCRUNCEP DATM_MODE because my netCDFs are set up similarly. If you plan to create your own forcing files, you can check how the different DATM_MODEs interpolate each data stream, and how your netCDFs should be set up in terms of timesteps, here.
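To make points 1 and 2 concrete, the relevant pieces of my setup look roughly like this (treat it as a sketch; run from the case directory):

    # Point 1: the renamed accelerated-spinup option
    ./xmlchange --append CLM_BLDNML_OPTS="-clm_accelerated_spinup on"
    # Point 2: override a datm stream file by copying it into the case
    # directory with a user_ prefix, then editing the copy
    cp CaseDocs/datm.streams.txt.CLMCRUNCEP.TPQW user_datm.streams.txt.CLMCRUNCEP.TPQW
    chmod u+w user_datm.streams.txt.CLMCRUNCEP.TPQW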
I may update this thread if I hit any runtime errors, but the case was set up, built, and submitted on Cheyenne.

Thanks again!
 

jonwells04

Jon Wells
New Member
The 2nd point above is wrong: after further testing, the file is datm.streams.txt.CLMCRUNCEP.TPQW for both models.
 

jonwells04

Jon Wells
New Member
Hi Keith,

Everything is running and outputs are generated, but I'm having an issue with the initial job not ending. The requested 30 simulation years finish at the 11-hour mark with "SUCCESSFUL TERMINATION OF CPL7-cesm" in the cpl.log, log files are copied to $CIME_OUTPUT_ROOT/archive/$CASE/logs, and rpointer files are created in $CASEROOT. From there, though, nothing seems to happen: the job runs for another hour and exits with "Exit_status=271" as it surpasses the wallclock. Running over the wallclock errors out the st.archive job, and the resubmits don't happen. Did I set something up incorrectly that is interfering with the job ending?

File locations on Cheyenne:
ctsm directory (copied from Sean Swenson): /glade/work/jonw/ctsm
case directory: /glade/u/home/jonw/cases/TundraWarmingSites_CRUJRA_1901to2020_CLM_AD
run directory: /glade/scratch/jonw/cesm_scratch/output/TundraWarmingSites_CRUJRA_1901to2020_CLM_AD

The final script that sets my xml settings is attached. I can't find a good example of what's wrong, as there is essentially no error output in any of the log files. Thanks!
 

Attachments

  • TundraWarmingSites_CRUJRA_1901to2020_CLM.bash.txt (6.8 KB)

oleson

Keith Oleson
CSEG and Liaisons
Staff member
The model won't resubmit unless the short-term archiver completes successfully, and here the short-term archiver job ran out of wallclock time. The default walltime is normally 20 minutes, which is usually enough to complete the job. However, the short-term archiver runs in the share queue, and I've noticed problems with that queue over the last couple of days.
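You can check what the archiver job is requesting with xmlquery, for example:

    ./xmlquery JOB_WALLCLOCK_TIME,JOB_QUEUE --subgroup case.st_archive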
 

jonwells04

Jon Wells
New Member
If I change REST_N to 30, does that change the frequency of restart files to one every 30 years? And would that lessen the work the st.archive job has to do? Or should I simply increase the st.archive job's wallclock to closer to an hour?

Thanks!
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Oh, I didn't know you were asking for restart files every year. But it should probably still complete in 20 minutes regardless. You could set REST_N=$STOP_N, which would generate restart files at the end of each model run segment. Increasing the st.archiver wallclock time might also be a good idea given recent queue behavior.
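Roughly, assuming your STOP_N is 30 years (the values are just examples):

    # write restarts once per 30-year run segment
    ./xmlchange REST_OPTION=nyears,REST_N=30
    # give the short-term archiver more wallclock time
    ./xmlchange JOB_WALLCLOCK_TIME=01:00:00 --subgroup case.st_archive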
 