Guidance on Running with LME datasets with CESM2.1.3 (CLM5)

oleson

Keith Oleson
CSEG and Liaisons
Staff member
1. I think we answered your question about how to create a domain file for your 1deg atmospheric forcing data already, right?
"To create a domain file, as detailed in the link you shared, you basically need to create a SCRIP grid file and a SCRIP mapping file that can be used as input to gen_domain. Generally speaking, you could use components/clm/tools/mkmapgrids/mkscripgrid.ncl to create the SCRIP grid file, then cime/tools/mapping/gen_mapping_files/gen_ESMF_mapping_file/create_ESMF_map.sh to create the mapping file, and then cime/tools/mapping/gen_domain_files/gen_domain to create the domain file. Another option is to create a domain file using an offline script following the structure of the GSWP3 domain file example given above."

And I think we already answered your question about how to use your own atmospheric files:

"Basically, the datm is controlled by these files (at least in release-cesm2.1.5) that can be found in the CaseDocs directory in your case directory for a transient run:

datm_in
datm.streams.txt.CLMGSWP3v1.Precip
datm.streams.txt.CLMGSWP3v1.Solar
datm.streams.txt.CLMGSWP3v1.TPQW
datm.streams.txt.co2tseries.20tr
datm.streams.txt.presaero.trans_1850-2000
datm.streams.txt.topo.observed

To use your own atmospheric forcing, you need to modify those files, in particular, for 6-hourly meteorological forcing, the first four files.
You do this by copying those files into your case directory and pre-pending "user_" to the file names.
A snippet of the Precip file should look like this by default (GSWP3 is the default atmospheric forcing):

<?xml version="1.0"?>
<file id="stream" version="1.0">
<dataSource>
GENERIC
</dataSource>
<domainInfo>
<variableNames>
time time
xc lon
yc lat
area area
mask mask
</variableNames>
<filePath>
/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/atm_forcing.datm7.GSWP3.0.5d.v1.c170516
</filePath>
<fileNames>
domain.lnd.360x720_gswp3.0v1.c170606.nc
</fileNames>
</domainInfo>
<fieldInfo>
<variableNames>
PRECTmms precn
</variableNames>
<filePath>
/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/atm_forcing.datm7.GSWP3.0.5d.v1.c170516/Precip
</filePath>
<fileNames>
clmforc.GSWP3.c2011.0.5x0.5.Prec.1901-01.nc
clmforc.GSWP3.c2011.0.5x0.5.Prec.1901-02.nc
...

So you'll need to replace this with your own domain file, path to your forcing files, and your forcing files."
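
The <fileNames> block is usually the longest part to edit when swapping in your own forcing. A minimal Python sketch for generating one file name per month, following the GSWP3 naming pattern in the snippet above. The function name, and the idea that your own files follow the same <prefix>.YYYY-MM.nc pattern, are assumptions for illustration and not part of CESM:

```python
def monthly_file_names(prefix, start_year, end_year):
    """Return one file name per month, e.g. <prefix>.1901-01.nc."""
    names = []
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            names.append(f"{prefix}.{year:04d}-{month:02d}.nc")
    return names


# Reproduce the first entries of the default Precip stream shown above.
names = monthly_file_names("clmforc.GSWP3.c2011.0.5x0.5.Prec", 1901, 1901)
print("\n".join(names[:2]))
```

The printed lines could then be pasted between <fileNames> and </fileNames> in the user_ copy of the stream file, with the prefix replaced by whatever your own forcing files are called.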

2. No, landuse timeseries files are not like streams files, you can't assign start and end years. If you are going to run a simulation from a start year to an end year, then the file will need to contain all of the years.

3. The "surfdata.pftdyn" files you are pointing to are transient landcover datasets. In older code, they were pointed to by "fpftdyn" in lnd_in, now it is called flanduse_timeseries in lnd_in. Whether you can directly use them in newer code is unknown.
 

wvsi3w

wvsi3w
Member
Dear Keith,
Thank you very much for your kind response. I appreciate clarifying my 2nd and 3rd questions.

But for my first question, I think I should mention a few things.
As you said in this thread (Guidance on Running with LME datasets with CESM2.1.3 (CLM5)) = "and since you are fine with running at 0.9x1.25 for the surface data, you don't have to do that anymore."

Let me explain what I did:
Since you said there was no need to go through all those steps, I tried it my own way. I created a TEST case, removed the forcings from the datm7 directory, put one year of my LME data in their place, set the forcing start and end years to that one year, and renamed my files to match the original file names (I am truly embarrassed, I just wanted to check whether it would work). The test case failed with the 0.5deg domain file, so I regridded my one year of LME data to 0.5deg, which was easy since I have regridded this data several times already. The next test produced ERRORs of the form "Index exceeds dimension bound", which I thought were caused by EDGE variables that the GSWP3 files have and mine lacked. I fixed that by giving my one year of LME data exactly the same header as the GSWP3 files I replaced (I edited them with NCO commands in sh scripts to look like the original files, which again I don't think is a good idea). That run also failed today, with the attached ERROR (related to a NaN issue).

So right now I am lost, I don't think I understood exactly what you mean by that "you don't have to do that anymore" part. Because, as far as I understand, when we have a domain that is 0.9x1.25 (or 0.5x0.5 or whatever) and we have the same resolution for our forcings, then we don't need to create the domain (right?). So do you think it is ok this way and are there any general steps to take in this case?

Oh, also, the stream files from my TEST case show the names I assigned to my modified LME forcings. Since I was removing the GSWP3 forcings and replacing them with mine, I assumed there was no need for user_ stream files, because those are used to point to another directory where we keep our forcings, and in my current TEST this is not applicable.

- A final thing: I am not 100 percent sure that the forcing files I gathered from the LME and NorESM simulation outputs are organized well enough to be used as forcing in CLM5. I gathered the essential variables (Precipitation, T, FLDS, FSDS, PS, Q, and Wind calculated from the low-level U and V) and did a lot of the usual processing to make them look like the GSWP3 forcings, but I am still not sure whether there is a tool in CESM to check them for possible mistakes or issues before running. As you can see, I came across many missing things, like LAT/LON EDGEs and some other floats, that I managed to put in the files; however, I am still unsure whether it actually worked or whether I only changed the header of the files and not the actual variables.
 

Attachments

  • ERROR after running with my method of renaming and editing LME files to be exactly like GSWP3's.txt
    2 KB · Views: 1

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I think there is confusion between the processes for preparing atmospheric forcing data and preparing surface/timeseries datasets.
What resolution is your atmospheric forcing data at (1deg X 1deg?)?
What resolution are your surface/timeseries datasets at?
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
To answer your specific question "as far as I understand, when we have a domain that is 0.9x1.25 (or 0.5x0.5 or whatever) and we have the same resolution for our forcings, then we don't need to create the domain (right?)".
You still need a domain file to describe the atm grid you are using, whether you create it or use an existing one.
For example, the GSWP3 data is at 0.5x0.5 and so it uses a domain file:

/glade/campaign/cesm/cesmdata/inputdata/atm/datm7/atm_forcing.datm7.GSWP3.0.5d.v1.c170516/domain.lnd.360x720_gswp3.0v1.c170606.nc

which is specified in the datm streams.
If your atm resolution is 1deg X 1deg, then you'll need a 1deg X 1deg domain file. If your atm resolution is 0.9x1.25, you'll need a 0.9x1.25 domain file, and so on.
 

wvsi3w

wvsi3w
Member
Ok, so all of the atmospheric forcings I have ((these: Precip, Solar, TPHWL) for both LME and NorESM datasets) are now 1deg (0.9x1.25).

When I noticed the domain file for GSWP3 was 0.5deg and I didn't want to go through the steps to create a domain file, I simply regridded the 1deg data I had to 0.5deg resolution. The regridding was no issue and I can do it as many times as we need. But, again, I know I shouldn't do that, it's silly.

Moreover, I see that the IHistClm50BgcCrop case gives us 0.5deg atm forcing and 0.5deg domain (the domain which you also mentioned its path), but this case also has these paths for the surface dataset and landuse timeseries with both being 1deg (0.9x1.25):

fsurdat = '/home/XXX/projects/def-xxx/XXX/inputdata/lnd/clm2/surfdata_map/release-clm5.0.18/surfdata_0.9x1.25_hist_78pfts_CMIP6_simyr1850_c190214.nc'

flanduse_timeseries = '/home/XXX/projects/def-xxx/XXX/inputdata/lnd/clm2/surfdata_map/landuse.timeseries_0.9x1.25_hist_78pfts_CMIP6_simyr1850-2015_c170824.nc'

So I assume it's ok to have 0.5deg atm forcing and 1deg surfdata and timeseries because the model already worked fine with these (?).

This was the "IHistClm50BgcCrop" case, so you asked, "What resolution are your surface/timeseries datasets at?" Technically, it should be like what we have in the above-mentioned case (1deg, 0.9x1.25). And since we can not assign start and end years for these files (as you mentioned too), I need to create a file that contains the whole duration of my simulation (if my NorESM test starts from year 500 onwards, then these surfdat-timeseries files should have the same year), but again I see no need to change its resolution as it was 1deg in IHist case I mentioned.

The process of making these surface and timeseries files for having data from 500 onward or from 850 onward is simple, and we just need to copy data from the year 1850, and copy it 1000 times to have a file from 850, and copy 1350 times to have a file from the year 500.
BUT, you mentioned here in this thread (Guidance on Running with LME datasets with CESM2.1.3 (CLM5)) that " the mapping files for 0.9x1.25 already exist, go directly to creating a surface dataset, You could then modify the surface dataset and landuse timeseries file". So, is my approach of modifying the surfdat-timeseries close to what you already mentioned here? Or there is a way I am missing!

Regarding your next message, "You still need a domain file to describe the atm grid you are using, whether you create it or use an existing one." Yes, I agree, maybe I didn't convey my message thoroughly, I exactly meant that. Thanks.


----------- So -----------
If I understood correctly you are saying there is no need for me to regrid from 0.9x1.25 to 0.5x0.5 deg as I can simply use a domain that is similar to my atm forcing resolution (a 1deg domain), and when I have my 1deg (0.9x1.25) forcing files, I can go through the steps of making the domain, which you explained multiple times in this thread (and I am sorry if I misunderstood it).

But to me, this part where you say "Another option is to create a domain file using an offline script following the structure of the GSWP3 domain file" is still not clear. Also if we ignore this part where it was unclear a bit, I am still uncertain about the whole domain creation process, I will try again. I failed before and I explained it here but I will try again soon (this time with 0.9x1.25 res) and let you know again how it goes.

ok, let's say I must do the following, and correct me if I am wrong, please (thanks a million):

A) - create new case (IHistClm50BgcCrop 0.9x1.25).
B) - Go through the steps you mentioned for the domain file creation (a bit vague as my issues with it remain unsolved but I will try with 0.9x1.25 res again).
C) - copy for example 1000 years backward into these files to cover the simulation years I need to do "landuse.timeseries_0.9x1.25_hist_78pfts_CMIP6_simyr0850-2015_c170824.nc" and "surfdata_0.9x1.25_hist_78pfts_CMIP6_simyr0850_c190214.nc"
D) - Do the same thing for the CO2 file (as 1850s CO2 was clearly different than 850s or 500s)
E) - case.setup + copy restart files and rpointers from last spin-up to the current directory
F) - case.build
G) - for my second spin-up using LME, I do ./xmlchange DATM_CLMNCEP_YR_START=850 + END=851
H) - Do I need to do AD spin-up again? (./xmlchange CLM_ACCELERATED_SPINUP="on") or this time I can simply do ND (./xmlchange RUN_TYPE=startup).
I) - change "user_nl_clm" by pointing to the restart file "finidat" and the rest of the things I need to put there, like my soil layer structure etc.
- case.submit

GG) - after my second spin-up reached equilibrium, I should start doing the transient with a 100-year at a time (850-950, 950-1050, ...) so the DATM_START and END will be set accordingly + CO2 too.
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
"So I assume it's ok to have 0.5deg atm forcing and 1deg surfdata and timeseries because the model already worked fine with these (?)."

Yes, the datm will interpolate the 0.5deg atmospheric forcing data to the 0.9x1.25 land grid.

"So, is my approach of modifying the surfdat-timeseries close to what you already mentioned here?"

Sure, if you have an 1850 surface dataset, then it seems ok to copy that multiple times to create a landuse timeseries file for 500-1850. Obviously you won't have any landcover change during that time. I thought you were going to try to use some of these datasets: Instructions | Community Earth System Model
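
As a quick check of the bookkeeping, here is a small Python sketch (the function name is made up for illustration) that lists the years such a constant-landcover file would need to contain. It assumes every year from the start year through 1850 must be present, in line with the earlier point in this thread that the file needs to contain all of the simulated years:

```python
def landuse_years(first_year, base_year=1850):
    """Years a constant-landcover timeseries must cover, inclusive."""
    return list(range(first_year, base_year + 1))


for first_year in (850, 500):
    years = landuse_years(first_year)
    copies = len(years) - 1  # slices added in front of the 1850 slice
    print(first_year, len(years), copies)
```

So a file starting in 850 would hold 1001 yearly slices (the 1850 slice plus 1000 copies of it), and one starting in 500 would hold 1351.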

"If I understood correctly you are saying there is no need for me to regrid from 0.9x1.25 to 0.5x0.5 deg as I can simply use a domain that is similar to my atm forcing resolution (a 1deg domain), and when I have my 1deg (0.9x1.25) forcing files, I can go through the steps of making the domain, which you explained multiple times in this thread (and I am sorry if I misunderstood it). But to me, this part where you say "Another option is to create a domain file using an offline script following the structure of the GSWP3 domain file" is still not clear."

Correct. And there are at least a couple of ways to create the domain file: 1) as per my instructions, create a SCRIP file, etc., or 2) create an offline script (e.g., python) to generate a file similar in form to the GSWP3 domain file but with lat/lon/mask information consistent with your 0.9x1.25 atmospheric forcing file. If you have any more trouble creating this domain file, then send me a sample of your atmospheric forcing files and I can see if we have a domain file that will work for you. One consideration is whether your atmospheric forcing files have data for every 0.9x1.25 grid cell. If not, the mask variable on the domain file will need to contain data (1s and 0s) that says whether there is valid forcing at each grid cell.
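
To make option 2 a little more concrete, here is a rough Python sketch of the fields such an offline script would compute. It is not an official CESM tool. It only builds the xc, yc, mask and area arrays named in the stream snippet earlier in this thread; writing them to netCDF is left out, and a real domain file may carry additional variables that this sketch omits. The function name, the assumption of a regular grid with uniform longitude spacing, cell edges halfway between centers, and area expressed in radians^2 are all assumptions to check against the GSWP3 domain file:

```python
import math


def regular_domain(lat_centers, lon_centers):
    """Build xc, yc, mask and area (radians^2) for a regular lat/lon grid."""
    nlat, nlon = len(lat_centers), len(lon_centers)
    # Cell edges halfway between neighbouring centers, closed at the poles.
    lat_edges = [-90.0]
    for j in range(nlat - 1):
        lat_edges.append(0.5 * (lat_centers[j] + lat_centers[j + 1]))
    lat_edges.append(90.0)
    dlon = math.radians(360.0 / nlon)  # uniform longitude spacing assumed

    xc = [list(lon_centers) for _ in range(nlat)]
    yc = [[lat_centers[j]] * nlon for j in range(nlat)]
    mask = [[1] * nlon for _ in range(nlat)]  # 1 = valid forcing everywhere
    area = []
    for j in range(nlat):
        south = math.sin(math.radians(lat_edges[j]))
        north = math.sin(math.radians(lat_edges[j + 1]))
        area.append([dlon * (north - south)] * nlon)
    return {"xc": xc, "yc": yc, "mask": mask, "area": area}


# Hypothetical 192 x 288 grid with cell-centered latitudes.
lats = [-90.0 + (j + 0.5) * 180.0 / 192 for j in range(192)]
lons = [i * 1.25 for i in range(288)]
dom = regular_domain(lats, lons)
total_area = sum(sum(row) for row in dom["area"])
print(len(dom["yc"]), len(dom["xc"][0]), total_area)
```

A useful sanity check is that the cell areas sum to the area of the unit sphere, 4*pi. If some grid cells have no valid forcing, the corresponding mask entries would be set to 0 instead of 1.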
 

wvsi3w

wvsi3w
Member
- About the surface dataset, I received a response from the colleague whose NorESM output I am using as forcing, and he mentioned: "in our past1K or 1.5K simulation, the model reads in the constant (1850) land surface data and use it throughout the run from 500 to 1860 without any need of copying it many times."
So do you think this option of using a constant 1850 surfdat + landuse for the simulations prior to 1850 is available only in NorESM or do we have it too in CESM?

- I have put the steps I took for the domain creation in the attached text file (also the errors). It failed again.

- About the domain file which you kindly suggested to look at, first, I found one domain with 1deg res in this path:
inputdata/share/domains/
domain.lnd.fv0.9x1.25_gx1v7.151020.nc
Do you suggest using this domain for my forcings (I doubt it)? In case that's not useful, I have uploaded 3 different forcings on my drive here in this link in zip files as you suggested to look at (I am very grateful for your help): Shared_forcings - Google Drive



These are one-year samples of my NorESM monthly 1deg forcing (for two members) and my LME 6-hourly forcing (which is 2.5 GB in size). I couldn't upload my forcings as attachments because of size limits. If there are any issues with them, let me know so that I can share them another way.

- And about the other dataset for surfdat and landuse, which starts from 850, since you said this about the landuse files vs pfts: "Whether you can directly use them in newer code is unknown", I became hesitant, and honestly, it's not worth the risk and time, you are right (I will stick to the 1850 data, either using constant option or copying it many times to have that file).

- lastly, in the README file in this path "my_cesm_sandbox/components/clm/tools/mkmapdata" it says:

" regridbatch.sh ------- Script to run mkmapdata.sh for many resolutions on cheyenne"

I am not on Cheyenne, do you think that might be the problem? However, the error is the same even when I did the ./mkmapdata.sh alone. I still think something is wrong with my forcing data.
 

Attachments

  • steps of testing domain creation.txt
    4 KB · Views: 1

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I'm not back in the office until next week, but a couple of quick thoughts:
I suppose you could set up a transient compset and then remove the transient landuse part of it, instead just using the 1850 surface dataset.
I think you could use that domain file. But, you'd have to set the mask everywhere to 1, right now it is masked for ocean (0), and your atm forcing data appears to be valid at every grid cell. And your latitudes in your forcing file are not the typical fv0.9x1.25. Yours are:

-89.53125, -88.59375, ...

While the domain file has:

-90, -89.0575916230368, ...

So you'd have to modify the coordinates in the domain file to match. Or modify the coordinates of the forcing files to match (regrid).
Your longitudes appear to match.
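
The two latitude sets quoted above can be reproduced with a short Python sketch, which may help when deciding which file to adjust. The function names are made up for illustration. The sketch assumes 192 latitudes in both cases: the standard fv0.9x1.25 grid, which includes both poles, and a cell-centered grid, which is inferred here from the 0.9375 degree spacing of the forcing values quoted above rather than read from the actual files:

```python
def fv_latitudes(nlat=192):
    """Latitudes that include both poles: spacing 180/(nlat-1)."""
    return [-90.0 + j * 180.0 / (nlat - 1) for j in range(nlat)]


def centered_latitudes(nlat=192):
    """Cell-center latitudes: spacing 180/nlat, half a cell in from the pole."""
    return [-90.0 + (j + 0.5) * 180.0 / nlat for j in range(nlat)]


fv = fv_latitudes()
centered = centered_latitudes()
print(fv[:2])        # starts at -90.0, then about -89.0576
print(centered[:2])  # [-89.53125, -88.59375]
max_diff = max(abs(a - b) for a, b in zip(fv, centered))
print(max_diff)
```

Under these assumptions both grids have the same number of latitudes, but the coordinates differ by up to about half a degree near the poles, which is why either the domain file coordinates or the forcing file coordinates have to be changed.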
 