questions about CLM5 biogeochemistry

xgao304 · May 18, 2022

I would like to do some sensitivity analyses of CLM5 biogeochemistry. My experiment design is to run only the CLM5 biogeochemistry component
at the flux-site level, with the land-use change, crop/irrigation (agriculture), and fire components turned off. Here is the detailed experiment design:

1. Perform the simulations with the "I1850CLM50BGC" compset (stub ICE, SOCN, SROF, SGLC, and SWAV) in accelerated decomposition (AD) spin-up mode,
followed by regular spin-up mode, with aerosol, N deposition, and CO2 set to 1850 values, to ensure that equilibrium has been reached.
I will cycle the 20-year GSWP3 forcing from 1901-1920 during both spin-up runs.

2. Once equilibrium has been reached, perform the transient run with the "IHISTCLM50BGC" compset (stub ICE, SOCN, SROF, SGLC, and SWAV)
with transient time series of aerosol, N deposition, and CO2, and forcing from 1901-2014.

Here are my questions:

1. I saw different studies have used different lengths of AD and regular spin-up to ensure the equilibrium. Since it is not feasible to check the equilibrium at each fluxnet site (we have more than 80 sites to evaluate), what would be the recommended lengths for AD and regular spin-up, respectively?

2. Is this a suitable experiment design? Or should I use one compset (e.g., I2000CLM50BGC) for both the AD/regular spin-up and transient runs and only change the
aerosol/N deposition/CO2 values correspondingly for each run? Also, there are two surface datasets to choose from, as follows, and I am not sure which one should be used for IHISTCLM50BGC if we go with the 1)+2) experiment design.

surfdata_360x720cru_hist_16pfts_Irrig_CMIP6_simyr1850_cXXXXX.nc
surfdata_360x720cru_hist_16pfts_Irrig_CMIP6_simyr2000_cXXXXX.nc

Both were generated using the "mksurfdata.pl" tool.

3. Although I want to turn land-use change, irrigation/crop, and fire off by not specifying the required components, the corresponding "lnd_in" file still specifies
irrigation (true) as well as the data for population density (related to fire), urban, and lightning frequency (related to fire). I am wondering whether these data are simply placeholders that will not actually affect the biogeochemistry calculation?

Thanks,

Xiang

erik · May 20, 2022

1. I saw different studies have used different lengths of AD and regular spin-up to ensure the equilibrium. Since it is not feasible to check the equilibrium at each fluxnet site (we have more than 80 sites to evaluate), what would be the recommended lengths for AD and regular spin-up, respectively?

I'm not the best person to answer this, but I'll try. What's documented in the CLM User's Guide is 600 years. In practice, though, global simulations are often run for up to 2000 years to bring the high latitudes to equilibrium. You shouldn't need that long for sites at mid latitudes. Since we know that the colder high-latitude sites are the ones that need the longest spinup, you could just check your coldest site and assume the rest will have reached equilibrium well before it does.
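A rough sketch of how the "check your coldest site" step could be automated is below. It assumes you have already extracted one value per simulated year of some carbon pool total (for example, total ecosystem carbon) into a plain list. The function names, the unit, and the tolerance are hypothetical choices for illustration, not an official CLM criterion. Averaging over whole 20-year forcing cycles keeps the repeating 1901-1920 forcing from masquerading as drift.

```python
def drift_per_year(annual_totals, cycle_years=20):
    """Mean change per year between the last two full forcing cycles.

    annual_totals: one value per simulated year (e.g. gC/m2), oldest first.
    cycle_years:   length of the repeated forcing cycle (20 for 1901-1920).
    """
    if len(annual_totals) < 2 * cycle_years:
        raise ValueError("need at least two full forcing cycles")
    last = annual_totals[-cycle_years:]
    prev = annual_totals[-2 * cycle_years:-cycle_years]
    mean_last = sum(last) / cycle_years
    mean_prev = sum(prev) / cycle_years
    # The two cycle means are centred cycle_years apart.
    return (mean_last - mean_prev) / cycle_years


def is_equilibrated(annual_totals, cycle_years=20, tol=1.0):
    """True if the absolute drift is below tol (hypothetical threshold)."""
    return abs(drift_per_year(annual_totals, cycle_years)) < tol
```

If the coldest site passes a check like this at the end of the regular spin-up, the same spin-up length can reasonably be applied to the warmer sites.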

2. Is this a suitable experiment design? Or should I use one compset (e.g., I2000CLM50BGC) for both the AD/regular spin-up and transient runs and only change the
aerosol/N deposition/CO2 values correspondingly for each run? Also, there are two surface datasets to choose from, as follows, and I am not sure which one should be used for IHISTCLM50BGC if we go with the 1)+2) experiment design

This is the standard way that spinup is done. The only reason I'd suggest something different is if you had to be really careful with your computing costs. Or if you had other constraints.

For the IHist compset you use the 1850 surface dataset and the 1850-2005 landuse.timeseries file that goes with it.

3. Although I want to turn land-use change, irrigation/crop, and fire off by not specifying the required components, the corresponding "lnd_in" file still specifies
irrigation (true) as well as the data for population density (related to fire), urban, and lightning frequency (related to fire). I am wondering whether these data are simply placeholders that will not actually affect the biogeochemistry calculation?

They are going to affect your simulations. But you can turn their effects off by adding settings to your user_nl_clm file that either disable them or make them cycle over a restricted range of years.
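As a minimal sketch, a small helper like the one below could generate the same user_nl_clm overrides for many site cases. The helper functions are hypothetical. The two example settings (fire_method and irrigate) are given on the assumption that they match your CLM5 version, so check each name and value against that version's namelist definition before relying on them. Stream year ranges could be written the same way.

```python
def namelist_line(name, value):
    """Format one Fortran-namelist style assignment."""
    if isinstance(value, bool):
        text = ".true." if value else ".false."
    elif isinstance(value, str):
        text = "'" + value + "'"
    else:
        text = str(value)
    return f"{name} = {text}"


def render_user_nl(settings):
    """Join all assignments into text suitable for appending to user_nl_clm."""
    return "\n".join(namelist_line(k, v) for k, v in settings.items()) + "\n"


# Assumed example settings; verify against your CLM version before use.
overrides = {
    "fire_method": "nofire",  # assumption: disables the fire model
    "irrigate": False,        # assumption: switches irrigation off
}
```

Calling render_user_nl(overrides) returns the text to append to the user_nl_clm file in each case directory. After building the namelists, inspect the regenerated lnd_in to confirm the settings took effect.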

xgao304 · May 24, 2022

Hi Erik,

Thanks for the reply. I have some follow-up questions.

I would like to use the PFT specified at each Fluxnet site to drive the simulations instead of using the PFTs from the CLM5 dataset.
However, some work is required to map between the Fluxnet and CLM5 PFTs. The vegetation at the Fluxnet sites is classified
based on IGBP, which contains the following broad categories relevant to my study:

Five forests:
evergreen needleleaf
evergreen broadleaf
deciduous needleleaf
deciduous broadleaf
mixed forest

Four shrub/savanna types:
closed shrublands
open shrublands
woody savannas
savannas

Permanent wetlands
Urban and built-up lands
Grassland
Water body

In CLM5, each broadleaf/needleleaf evergreen/deciduous forest can have boreal, temperate, and tropical categories. Shrubs are categorized
as broadleaf evergreen temperate, broadleaf deciduous temperate, and broadleaf deciduous boreal; grassland is subdivided into
C3 arctic, C3 grass, and C4 grass.

I am wondering if there is a simple rule of thumb to map between IGBP and CLM5. I tried to look further into PTCLM.dat which contains 44 fluxnet sites, but could not find a clear pattern - as the same IGBP type can be mapped into different CLM5 PFTs. I don't know what criteria were used when this dataset was generated.

Another question is related to permanent wetlands (we have 37 Fluxnet wetland sites). I know that, starting from CLM4.5, the wetland land unit was replaced with a surface water store. My question is: when I use the "singlept" tool to generate site-based domain and surface datasets, how should
I specify the following variables:

overwrite_single_pft =
dominant_pft
zero_nonveg_pfts
uniform_snowpack
no_saturation_excess

The same question also applies to the IGBP types urban and water body (I have a couple of sites for each of these). Should "waterbody" and "urban" correspond to the "lake" and "urban" landunits in CLM5, respectively? If so, how should I specify the above variables when generating the domain and surface datasets?

Sorry for so many questions. Thanks a lot.

oleson · May 25, 2022

I've attached a Python script (written by another member of our group) that was used to define PFTs, etc. for a number of tower sites that had IGBP classifications.
Some climate rules were used to classify into boreal, temperate, tropical.
For wetlands, baseflow_scalar was set to 0 in user_nl_clm.
Waterbody could be lake, although our lakes are generally deep lakes, so you might have to adjust lake depths.
There are three urban landunits available in general (tall building district (TBD), high density (HD), and medium density (MD)), so you would need to pick the most appropriate one, or pick one and adjust the urban parameters on the surface dataset accordingly.
I think that in general for these sites,
uniform_snowpack = True
no_saturation_excess = True
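For readers without the attachment, the general idea of combining an IGBP class with a climate rule might look like the sketch below. This is not the attached script. The temperature thresholds are placeholder assumptions, the function and table names are hypothetical, and the PFT name strings should be checked against your surface dataset or parameter file. It also shows why one IGBP class can map to different CLM5 PFTs at different sites.

```python
def climate_zone(coldest_month_c, tropical_min=15.5, boreal_max=-19.0):
    """Classify a site from its coldest-month mean temperature (deg C).

    Thresholds are placeholder assumptions, not the attached script's rules.
    """
    if coldest_month_c > tropical_min:
        return "tropical"
    if coldest_month_c <= boreal_max:
        return "boreal"
    return "temperate"


# Candidate CLM5 PFTs per IGBP class and climate zone (only a few shown).
CANDIDATES = {
    "ENF": {"temperate": "needleleaf_evergreen_temperate_tree",
            "boreal": "needleleaf_evergreen_boreal_tree"},
    "EBF": {"tropical": "broadleaf_evergreen_tropical_tree",
            "temperate": "broadleaf_evergreen_temperate_tree"},
    "DBF": {"tropical": "broadleaf_deciduous_tropical_tree",
            "temperate": "broadleaf_deciduous_temperate_tree",
            "boreal": "broadleaf_deciduous_boreal_tree"},
}


def igbp_to_pft(igbp, coldest_month_c):
    """Pick a PFT name for an IGBP code, or fail loudly if none fits."""
    zone = climate_zone(coldest_month_c)
    options = CANDIDATES[igbp]
    if zone not in options:
        raise ValueError(f"no {zone} PFT listed for IGBP class {igbp}")
    return options[zone]
```

Classes such as mixed forest or savanna do not correspond to a single PFT and would need a mix of PFT fractions rather than one entry in a table like this.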

xgao304 · May 26, 2022

Thanks a lot.

Thread participants: xgao304 (Member); erik (Erik Kluzek, CSEG and Liaisons); oleson (Keith Oleson, CSEG and Liaisons)