Guidance on Running with LME datasets with CESM2.1.3 (CLM5)

oleson · Jun 24, 2025

Unfortunately, we can't provide any specific answers since we don't have any experience with this type of workflow. Our suggestion is to experiment with a short sequence of simulations, e.g., just a few years for each segment, to see how the workflow should be implemented.

wvsi3w · Jun 25, 2025

Thank you Keith for your suggestion.

From 500 to 1850, it is better to use the I1850 compset, and then from 1850 onwards, using a transient compset like IHIST1850 is suggested.

So, because the compset needs to change at some point in time, it is better to divide the simulation into (at least) two parts:
from year 500 to 1850 with the I1850 compset (using atm forcings from 500 to 1850), and then from 1850 onwards with the IHIST1850 compset (setting the atm forcing accordingly).

But do you have a rough idea about the stream files? I mean, is there an alternative to writing thousands of lines of file names? I can definitely write them into the stream file, but I was thinking maybe you know another way.
Because, let's say I am doing the two-part simulation, then from 500 to 1850 my stream files will look like:

<filePath>
/home/USER/projects/def-xxx/USER/inputdata/atm/datm7/atm_forcing.datm7.GSWP3.0.5d.v1.c170516/Precip
</filePath>
<fileNames>
0500-01.nc
0500-02.nc
0500-03.nc
0500-04.nc
...
1850-11.nc
1850-12.nc
</fileNames>

which is 16,212 lines of file names (1,351 years × 12 months)!
It is fine if this is the norm, but I thought maybe you know some other way of getting the names into the stream file when there are this many files.
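
For what it's worth, the list does not have to be typed by hand. Below is a minimal sketch in Python that builds the names in the YYYY-MM.nc pattern shown above. The function name monthly_file_names is hypothetical (it is not part of CESM or CIME); the output would still need to be pasted between the <fileNames> tags of the stream file.

```python
def monthly_file_names(first_year, last_year):
    """Return one 'YYYY-MM.nc' name per month, for every year in the range (inclusive)."""
    return [
        f"{year:04d}-{month:02d}.nc"
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]


# Hypothetical usage for the 500-1850 segment discussed above.
names = monthly_file_names(500, 1850)
file_names_block = "\n".join(names)

print(len(names))            # how many lines the <fileNames> section needs
print(names[0], names[-1])   # first and last file name
```

The same function can be called once per stream (Solar, Precip, TPQW) if the files in each directory follow the same naming pattern.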

oleson · Jun 25, 2025

I think you could change the paths to your data by modifying:

cime/src/components/data_comps/datm/cime_config/namelist_definition_datm.xml

I think the list of file names would then be set up automatically by setting DATM_CLMNCEP_YR_START and DATM_CLMNCEP_YR_END.

The names would be different from yours, e.g., clmforc.GSWP3.c2011.0.5x0.5.Prec.1901-01.nc versus 1901-01.nc, but you could either change your names to match or link your files to those filenames.
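
The linking option could be sketched roughly as below, in Python. This is only an illustration: the function names and the directory arguments are hypothetical, and the prefix is the precipitation example quoted above, so the prefixes expected for the other streams would need to be checked in the generated stream files. With dry_run=True nothing is touched on disk and the planned (source, link) pairs are just returned for inspection.

```python
import os


def expected_name(year, month, prefix="clmforc.GSWP3.c2011.0.5x0.5.Prec."):
    """Name in the pattern quoted above (assumed prefix for the Precip stream)."""
    return f"{prefix}{year:04d}-{month:02d}.nc"


def link_forcing_files(src_dir, dst_dir, first_year, last_year, dry_run=True):
    """Plan (and optionally create) symlinks from YYYY-MM.nc files to the expected names."""
    pairs = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            src = os.path.join(src_dir, f"{year:04d}-{month:02d}.nc")
            dst = os.path.join(dst_dir, expected_name(year, month))
            pairs.append((src, dst))
            # Only create the link when asked to, the source exists,
            # and nothing is already sitting at the destination.
            if not dry_run and os.path.exists(src) and not os.path.lexists(dst):
                os.symlink(src, dst)
    return pairs


# Dry run with placeholder directories: prints the first planned link only.
planned = link_forcing_files("/path/to/Precip", "/path/to/links", 500, 500)
print(planned[0])
```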

wvsi3w · Jun 27, 2025

Dear Keith, @oleson
Sorry, I have a question regarding the datm_in file.
In this thread related to the "1" in the user_datm*, you suggested changing that "1" to 1901, and the align issue got fixed. That person's problem is not my problem, but I found that thread because I want to clarify something:

In my spin-up, which worked fine, this was my datm_in for the IHIST compset (in which I modified the years):

streams = "datm.streams.txt.CLMGSWP3v1.Solar 500 500 500",
"datm.streams.txt.CLMGSWP3v1.Precip 500 500 500",
"datm.streams.txt.CLMGSWP3v1.TPQW 500 500 500",
"datm.streams.txt.presaero.trans_1850-2000 1850 1850 1850",
"datm.streams.txt.topo.observed 1 1 1",
"datm.streams.txt.co2tseries.20tr 1850 1850 1850"
taxmode = "cycle", "cycle", "cycle", "cycle", "cycle", "cycle"

And this was for I1850 compset:

streams = "datm.streams.txt.CLMGSWP3v1.Solar 1 500 500",
"datm.streams.txt.CLMGSWP3v1.Precip 1 500 500",
"datm.streams.txt.CLMGSWP3v1.TPQW 1 500 500",
"datm.streams.txt.presaero.clim_1850 1 1850 1850",
"datm.streams.txt.topo.observed 1 1 1"
taxmode = "cycle", "cycle", "cycle", "cycle", "cycle"

I am assuming those 1s are the align values in the I1850 compset (?)

As this compset is better suited to constant variables (such as urban, CO2, etc.), I used I1850 in my last spin-up and it worked fine; everything reached equilibrium.

I am now setting up my transient run, with start date 500 and end date 1849, so what should I do with align here? Do I leave it as 1? Is the format below correct for the transient run?

streams = "datm.streams.txt.CLMGSWP3v1.Solar 1 500 1849",
"datm.streams.txt.CLMGSWP3v1.Precip 1 500 1849",
"datm.streams.txt.CLMGSWP3v1.TPQW 1 500 1849",
"datm.streams.txt.presaero.clim_1850 1 1850 1850",
"datm.streams.txt.topo.observed 1 1 1"
taxmode = "cycle", "cycle", "cycle", "cycle", "cycle"

Or should I change that 1 for a transient run from year 500 to 1849? Since the first part of my transient run is from 500 to 1849, I will have constant values for CO2, aerosol, urban, fire, ... and only my forcings (Precip, Solar, TPHWL) are transient (from 500 to 1849).

Also, I think the taxmode part should be different? How about the other parts of my datm_in (e.g. tintalgo, mapalgo, readmode, ...)?

P.S. Will align being "1" in the spin-up affect the spin-up results? Should I redo it with "500 500 500"?

wvsi3w · Jun 27, 2025

Because I remember when we used IHIST compset the datm_in for the CO2 part and taxmode shows this:

"datm.streams.txt.co2tseries.20tr 1850 1850 2014"
taxmode = "cycle", "cycle", "cycle", "cycle", "cycle", "extend"

which means the CO2 stream is "extend", and I thought that since IHIST is a transient compset from 1850 onwards, we should change the taxmode (and other things like tintalgo, ...?) for all transient runs. But since the first part of my transient run is from 500 to 1849, technically I won't have much that is transient except for the 7 prescribed variables (Precip, Solar, TPHWL).

Do you see what I mean?

wvsi3w · Jun 27, 2025

I believe the most important question here is the align value being 1, and what it means when align is 1.

Because if I don't know what that "1" does, then I cannot trust my spin-up results. It would also affect my understanding of the current transient run.

Thank you in advance for your support

wvsi3w · Jun 30, 2025

Dear Keith (@oleson) and dear Sam (@slevis),

Based on this thread and other CESM documentation, I want to check my understanding about the role of the align value and taxmode in the datm streams:

In this thread, you mentioned that the align value sets which forcing year gets mapped to the simulation start year. And this link says that if you want model year 1850 to use forcing year 1901 (when you only have 1901–2010 data), you set align=1901, start=1901, end=1920, and cycle those years for the early part of the run. And when using I1850 (a constant, non-transient compset), the align value is "1".

So, let me know if I am right:

Based on what I learned, this "1" means the first year in the forcing record (which is also likely 1 in a constant file, or just the only available year in a climatology). Because the forcing is the same every year (either one year or a repeated climatological mean), the align value is essentially arbitrary: the model always uses the same forcing. taxmode = cycle tells datm to repeat that year until the end date, so the align value being "1" doesn't matter for my spin-up simulation and does not affect my results, because I am using only one year (500) in this non-transient case. Is that right?

Also, taxmode = "cycle" for all of the streams makes sense in this non-transient, one-year forcing case, because it is technically cycling over one year. But the real question is what to do for transient runs. When I want to run 500-1849, I have the forcings from 500 onwards, so I need to set these:
./xmlchange DATM_CLMNCEP_YR_ALIGN=500
./xmlchange DATM_CLMNCEP_YR_START=500
./xmlchange DATM_CLMNCEP_YR_END=1849

Am I right?
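
To make the align question concrete, here is a minimal sketch in Python of how I understand the year mapping for taxmode = "cycle". It rests on an assumption rather than anything confirmed in this thread: that the forcing year is found by modular arithmetic, with align being the model year that lines up with the start forcing year. The function name forcing_year is hypothetical.

```python
def forcing_year(model_year, yr_align, yr_start, yr_end):
    """Forcing year used for a model year under the assumed 'cycle' rule.

    Assumption: model year yr_align lines up with forcing year yr_start,
    and the yr_start..yr_end block repeats before and after that point.
    """
    n_years = yr_end - yr_start + 1
    return yr_start + (model_year - yr_align) % n_years


# Spin-up case: one forcing year (500), align = 1.
# With a single year in the cycle, align cannot change the result.
print(forcing_year(1, 1, 500, 500))      # 500
print(forcing_year(900, 1, 500, 500))    # 500

# Transient case: align = start = 500, end = 1849.
# Each model year reads its own forcing year, and nothing repeats before 1850.
print(forcing_year(900, 500, 500, 1849))   # 900
print(forcing_year(1849, 500, 500, 1849))  # 1849
```

If that assumption holds, the align value is irrelevant for a one-year cycle, and with align = start = 500 the 500-1849 run never wraps around within its own period.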

And for a transient CO₂ stream, I guess I should set taxmode="extend" to hold the last value after the data ends. But for the main atmospheric streams (Precip, Solar, TPHWL), since I have a full time series covering 500–1849, it seems taxmode="cycle" is fine and won't cause any unwanted cycling?

If I should change the CO2 taxmode to extend, how about the other taxmodes? Should I change them from cycle to extend as well?

I don't see why we should change them here, because the start year is 500 and I have the forcing from that year onwards, so it shouldn't cycle within this period when the run is transient and its start and end are clearly defined.

Also, I saw "cycle" for taxmode in the IHIST compset too, which means my assumption could be correct (?)

How about the simulation from 1850 to 2000 (where I have all the forcing as well)? Should I set taxmode to cycle and the xml variables like this (?):
./xmlchange DATM_CLMNCEP_YR_ALIGN=1850
./xmlchange DATM_CLMNCEP_YR_START=1850
./xmlchange DATM_CLMNCEP_YR_END=2000

Thanks again for your clarification.

wvsi3w · Jun 30, 2025

I guess for the 1850–2000 simulation, since the CO2 stream uses "fco2_datm_global_simyr_1750-2014_CMIP6_c180929.nc", which covers 1750-2014, setting the taxmode to either cycle or extend works. taxmode = extend only matters for time series where the simulation exceeds the data range: if the simulation goes past 2014 (the last year of the CO₂ file), then we must set taxmode = extend for CO₂, so that the model continues to use the 2014 value for all future years. Because my simulation ends in 2000, taxmode = cycle is OK and the choice doesn't matter. Right?

And for the 500-1849 simulation it should be taxmode = cycle, right? Because we are using a constant CO2 value of 280 ppm for this period.
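
The reasoning about "extend" can be written down as a small Python sketch. It assumes that extend simply holds the first or last available data year outside the data range; both function names are hypothetical and only illustrate the idea.

```python
def extend_year(model_year, data_first, data_last):
    """Data year used under the assumed 'extend' rule: hold the nearest end value."""
    return min(max(model_year, data_first), data_last)


def run_is_covered(sim_first, sim_last, data_first, data_last):
    """True when the simulation never leaves the data range."""
    return data_first <= sim_first and sim_last <= data_last


# CO2 file covering 1750-2014, simulation 1850-2000: the run stays inside the data.
print(run_is_covered(1850, 2000, 1750, 2014))  # True

# A run continuing past 2014 would keep reading the 2014 value under 'extend'.
print(extend_year(2020, 1750, 2014))           # 2014
```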

slevis · Jun 30, 2025

Sounds correct.

wvsi3w · Aug 7, 2025

Hello Sam and Keith, @slevis @oleson
I have a specific question about the continue_run:

My simulation stopped at year 900 for some reason, and I want to continue it on another cluster. As you know, my forcing dataset starts in the year 500 and ends in 1850.
In my initial case, I set these:

./xmlchange DATM_CLMNCEP_YR_END=1849,DATM_CLMNCEP_YR_START=500,DATM_CLMNCEP_YR_ALIGN=500,CLM_CO2_TYPE=constant,DATM_PRESAERO=clim_1850,CCSM_CO2_PPMV=280.0,RUN_STARTDATE=0500-01-01,RESUBMIT=269,STOP_DATE=18500101

Now that I want to continue the run in a new case on another cluster, should I change DATM_CLMNCEP_YR_START to 900? How about DATM_CLMNCEP_YR_ALIGN? Or, if I leave start and align at 500, will the model read the rpointer and restart files from year 900 and adjust the simulation accordingly?

Thanks.

wvsi3w · Aug 7, 2025

How about RUN_STARTDATE? Should I set it to 900-01-01, or should I leave it unchanged and let the model read the date from the rpointer and restart files?

I cannot do a branch run because I moved to a different cluster and a different system. So, I think I should create a case similar to the one I had on the previous cluster, and the run type would be startup.

But I am not sure what to do with RUN_STARTDATE, DATM_CLMNCEP_YR_START, and DATM_CLMNCEP_YR_ALIGN.

slevis · Aug 7, 2025

You should not have to modify anything in the case.
On the new computer:
- Create a new case that looks like the case on the old computer.
- Confirm that it runs before doing anything else.
- Make sure the settings in the new case match the settings in the case on the old computer.
- Copy everything that the simulation would have had access to from the old computer to the new computer.
- Continue the simulation on the new computer as if you were on the old computer.

wvsi3w · Aug 8, 2025

Hello Sam. Thank you for your support.

- I have tested the model on the new system and it works with all the compsets.

- I have copied the restart file from year 900 CE and the rpointer files from the old case to this new one. I also used the rest of the settings from the old case (CLM_CO2_TYPE=constant, DATM_PRESAERO=clim_1850, CCSM_CO2_PPMV=280.0, STOP_DATE=18500101, etc.).

- I looked at the threads on the forum, and as far as I understand I should leave DATM_CLMNCEP_YR_START and DATM_CLMNCEP_YR_ALIGN at 500, because when user_nl_clm points finidat to the year-900 restart file from the old case and the rpointer files from 900 CE are copied over, the model will know to read the DATM forcing from 900 and align it accordingly (am I right?).

- Also, from what I read on the forum, I think I only need to change one thing: in this new case on the new system, RUN_STARTDATE should be 900-01-01 (am I correct?)

** Based on your response, I believe this would be the way.

slevis · Aug 11, 2025

I may have misunderstood. I understood that you wanted to continue a case (same case name) on the second computer that started on the first computer. If so, then my recommendation was to copy everything from the first to the second computer and otherwise leave everything unchanged. This is my reasoning:

- On the first computer you would simply set CONTINUE_RUN=TRUE and RESUBMIT>0 and let the simulation continue.
- You should be able to do the same thing on the second computer after copying all the necessary files.

If you change the start date, then I think that CONTINUE_RUN=TRUE will fail. If you prefer to change the start date, then you could start this as a "hybrid" or a "startup" run. If DATM_CLMNCEP_YR_START and DATM_CLMNCEP_YR_ALIGN were 500 before, I would leave them the same.

wvsi3w · Aug 18, 2025

Hello again,
My case has run into some issues.
It crashed in year-month 1206-05 because one of my forcing files (1206-05.nc) was broken, so I had to create a replacement by averaging 1206-04 and 1206-06 with the cdo (or ncra) command:

cdo ensmean

I have checked everything, but it crashes with this error:
ERROR: (shr_stream_verifyTCoord) ERROR: elapsed seconds must be increasing

I have read the threads about this error, and I concluded that maybe my file is corrupted, even though it was made correctly.
What do you think the issue is here?
Could it be because April and June have 30 days and May has 31 days? But my data is monthly, so I don't know if the 31 days are relevant here.

wvsi3w · Aug 18, 2025

I even tried copying the file for the same month from the year before (cp 1205-05.nc 1206-05.nc) and it failed with the same error.
So I don't think the cdo ensmean (or ncra) step was incorrect.

The issue could be something else that I don't know yet.
Maybe the model reads the global attributes, recognizes that this file was made from another file, and crashes because of its name?

slevis · Aug 18, 2025

Here are some thoughts:
- It does seem problematic to use a 30-day file for a 31-day month.
- I like your idea of copying 1205-05 to 1206-05. I don't know why that fails.
- Since the model worked for 1205-05 and 1204-05 (and so forth going back), I suggest looking carefully at the differences between these files (not the data but attributes, as you suggested). And then how your new file differs from them. This way I hope that you will eventually find the problem.

wvsi3w · Aug 19, 2025

slevis said:
Here are some thoughts:
- It does seem problematic to use a 30-day file for a 31-day month.
- I like your idea of copying 1205-05 to 1206-05. I don't know why that fails.
- Since the model worked for 1205-05 and 1204-05 (and so forth going back), I suggest looking carefully at the differences between these files (not the data but attributes, as you suggested). And then how your new file differs from them. This way I hope that you will eventually find the problem.

Dear Samuel,
Thank you so much for your answer.

I double-checked the time variable and its attributes, and I realized this:

The time units are "days since 0500-01-01 00:00:00".

For ncdump -v time 1203-05.nc:
time = 256746 (which equals the days since 500 CE: ((1203-500)*365) + 151 days for the first 5 months of the year)

For ncdump -v time 1200-05.nc:
time = 255651 (which equals the days since 500 CE)

So when I copy a file from the previous year (or even make an average of neighbouring months), the time variable is wrong for the new month, and that's why we get the error message "elapsed seconds must be increasing".

Therefore, the time in 1206-05.nc needs to be 257,841 days since 500 CE.
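
The arithmetic above can be checked with a short Python sketch. It assumes a 365-day (noleap) calendar and follows the convention of the values quoted above, where May carries the 151 cumulative days through the end of May. The function name is hypothetical.

```python
# Days per month in a 365-day (noleap) calendar, an assumption that matches the values above.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def noleap_days_since(year, month, ref_year=500):
    """Days from ref_year-01-01 through the end of `month` in `year` (noleap calendar)."""
    whole_years = (year - ref_year) * 365
    days_into_year = sum(DAYS_IN_MONTH[:month])
    return whole_years + days_into_year


print(noleap_days_since(1203, 5))  # 256746, matches 1203-05.nc
print(noleap_days_since(1200, 5))  # 255651, matches 1200-05.nc
print(noleap_days_since(1206, 5))  # 257841, value needed in 1206-05.nc
```

Any replacement file built from another month or year needs its time value reset to the number computed for its own month before the model will accept it.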

wvsi3w · Aug 25, 2025

Hello,
I have another question for the scientists out there:

In the path below, we have this solar forcing file, which covers 1995 to 2005:
inputdata/atm/cam/solar/SolarForcing1995-2005avg_c160929.nc (attached is the header of this file).

What does this file do in our simulations? What is its purpose?
To force the model in my land-only simulation, I usually use FSDS as the Solar input, and FLDS in the TPHWL input (the L).
But I am not sure why there is this SolarForcing file in the cam/solar path of our inputdata. Moreover, when I open the file, the tsi values are all identical (1361.1), which is weird!

As far as I remember, the land-only simulation only needs FSDS and FLDS in the input data as forcing. The reason I am asking is that I have two TSI files. The header of one of them is shown below: tsi is one-dimensional and the file has only these three variables, not all of the variables in the SolarForcing1995-2005 file that I attached. So should I consider using these TSI files? I mean, is it reasonable to use my file in this cam/solar path instead of the SolarForcing1995-2005 file, which has ten more variables in it?

Code:

  dimensions:
    time = 2016;
  variables:
    double time(time=2016);
      :_Fillvalue = -9999.0; // double
      :units = "days since 0000-01-01 00:00:00";
      :time_origin = "01-JAN-0000";
      :axis = "T";
      :calendar = "noleap";

    double tsi(time=2016);
      :_Fillvalue = -9999.0; // double
      :reference = "Reconstructed TSI since 0AD";
      :long_name = "TSI values ....";
      :units = "W/m^2";

    int date(time=2016);
      :format = "YYYYMMDD";

  // global attributes:
  :modification_date = "08-Apr-2024 14:44:56";
}

wvsi3w · Aug 25, 2025

How about when we have a volcanic forcing file for a long period? Which file should we replace with it?

These two paths are the closest I found, but I don't know if I can use my volcanic forcing in either one:
inputdata/atm/cam/physprops/
inputdata/atm/cam/chem/emis/CMIP6_emissions_2000climo/

This is the header of the volcanic forcing file I have (let me know if it is usable and where I should put it):

Code:

dimensions:
    time = UNLIMITED;   // (24002 currently)
    date = 24002;
    lev = 8;
    lat = 64;
  variables:
    double date(date=24002);
      :units = "yyyymmdd";
      :FillValue = -9999.0f; // float

    double lat(lat=64);
      :units = "degrees_north";

    double lev(lev=8);
      :units = "hPa";

    double time(time=24002);

    double MMRVOLC(time=24002, lev=8, lat=64);
      :units = "kg kg-1";
      :FillValue = -9999.0f; // float
      :long_name = "layer volcanic aerosol mass mixing ratio";

    double colmass(time=24002, lat=64);
      :FillValue = -999.0f; // float

    int datesec(time=24002);
      :FillValue = -999; // int

}
