Regarding using high-resolution USDA CropScape crop cover data for improved crop yield in CLM

uzairrahil

Mohammad Uzair Rahil
New Member
Dear Scientists,

I am currently working on a regional case study focused on simulating crop yields and projecting their future trends in Lower Michigan. The configuration I’m using is:

  • CTSM Version: alpha-ctsm5.2.mksrf.23_ctsm5.1.dev171
  • Resolution: 0.05° over Lower Michigan
  • Compset: IHISTCLM50_BGC_Crop
  • Meteorological Forcing: CONUS404
Unfortunately, the crop yield results from my simulation are not satisfactory. To improve accuracy, I am considering incorporating USDA CropScape data, given its high spatial resolution (10 m–30 m). Specifically, I aim to replace the default data for these crop types: rainfed_temperate_corn, rainfed_temperate_soybean, and spring_wheat.

I have a few uncertainties and would appreciate your guidance:

  1. Do you think incorporating CropScape data will improve yield results, particularly for the above crop types? Given its higher resolution, I am hopeful it will offer better spatial representation, but I’m interested in your experience or opinion. What other input data could we replace to improve the results?
  2. Regarding implementation, I’ve come across two approaches in various threads:
    • The first is to modify the raw land cover files and use mksurfdata_map to regenerate the surface dataset after that. I find this complex.
    • The second, which I’m considering, is to generate the surface and dynamic land use time series files for the period 1980–2022 (already generated), and then modify the crop fractions within the dynamic land use time series using CropScape data.
    • Which approach would you recommend? Are there any specific threads, papers, or resources you suggest I review?
  3. Based on your experience, what are the most critical factors I should pay close attention to when working on such a case? I’d be grateful for any tips or lessons learned you can share.
  4. Which kind of spinup do you recommend? I am considering doing the spinup as follows:
Create a new case for spinup, and then repeat the first year of forcing for every year:
./xmlchange RUN_STARTDATE=0001-01-01
./xmlchange DATM_YR_ALIGN=1
./xmlchange DATM_YR_END=1980
./xmlchange DATM_YR_START=1980

and run the case from 1980 till 2022, then use the resulting finidat file for the actual run (1980-2022).

I truly appreciate your support and guidance in advance.

Warm regards,
MUR
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Regarding 2), the most robust way is to modify the raw input files. Then you will be taking advantage of the aggregation rules that are built into the mksurfdata_map code. And you'll be able to generate surface datasets at other resolutions or other regions if needed.
Regarding 4), we generally recommend spinning up over multiple atmospheric years to avoid biasing the model. E.g., if the single year you've chosen to spin up over is a very dry or very wet year, then the initial conditions you generate will be biased toward very dry or very wet conditions.
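To illustrate the point about single-year spinup, the sketch below shows how a data-atmosphere year loop maps model years onto forcing years. It assumes the usual cycling convention for DATM_YR_ALIGN, DATM_YR_START and DATM_YR_END (the model year equal to DATM_YR_ALIGN uses DATM_YR_START, and the forcing wraps around after DATM_YR_END). The function name and the 1980-1989 window are illustrative only, not a recommendation made in this thread.

```python
def forcing_year(model_year, yr_align, yr_start, yr_end):
    """Return the forcing year used for a model year (assumed cycling rule)."""
    n_years = yr_end - yr_start + 1
    return yr_start + (model_year - yr_align) % n_years

# Single-year loop, as in the xmlchange settings quoted in the question:
# every model year is driven by 1980.
single = [forcing_year(y, 1, 1980, 1980) for y in range(1, 6)]
print(single)

# Hypothetical multi-year loop: model years cycle through 1980-1989,
# so one unusually wet or dry year does not dominate the spinup state.
multi = [forcing_year(y, 1, 1980, 1989) for y in range(1, 13)]
print(multi)
```

With a one-year window the initial conditions inherit whatever anomaly 1980 carried; a longer window averages over several years of forcing.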
 

uzairrahil

Mohammad Uzair Rahil
New Member
Thank you very much for your response, dear @oleson. I greatly appreciate your guidance. I do, however, have a few remaining questions and would value your insight on them.
My goal is to use the USDA CropScape Cropland Data Layer (CDL, 30 m) for a regional case in Lower Michigan to improve crop yield with the IHIST_CLM50_BGC_CROP compset. I use a dynamic land use time series for 1980-2022 and also created the surface data with mksurfdata_esmf. I am using CONUS404 forcing at 0.05° resolution.

The generated "landuse_timeseries_hist_1980-2022_78pfts.txt" file lists the raw data paths that were used, with four global files for each year, as below:
=================================
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/rawdata/CTSM53RawData/globalctsm53histTRENDY2024Deg025_240728/mksrf_landuse_ctsm53_histTRENDY2024_1980.c240728.nc 1980
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/rawdata/CTSM53RawData/globalctsm53histTRENDY2024Deg025_240728/mksrf_landuse_ctsm53_histTRENDY2024_1980.c240728.nc 1980
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/rawdata/gao_oneill_urban/historical/urban_properties_GaoOneil_05deg_ThreeClass_1980_cdf5_c20220910.nc 1980
/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/rawdata/lake_area/mksurf_lake_0.05x0.05_hist_clm5_hydrolakes_1980.cdf5.c20220325.nc 1980

and so on for the other years through 2022 (my simulation end year). The first two entries are identical to each other, and this is repeated for every year.
================================






1. Can you please confirm whether the first two files for each simulation year, those containing variables such as PCT_CFT, PCT_NAT_PFT, and FERTNITRO_CFT, are indeed the ones I need to modify?

2. Besides these files, are there any other files I should modify? If so, which variables in those files need to be updated?

3. Should I regrid the USDA CDL dataset to match my simulation resolution of 0.05°, or can CLM use the original 30-meter resolution directly?

4. Since the raw land use/land cover data is global and at a coarser resolution, should I clip the dataset to match my forcing domain (simulation mesh grid) before modification, or can I modify the entire global dataset? If I clip it, will CLM still accept and properly interpret the resulting file during simulation? What is the proper way to clip or subset?

5. The CDL for my study area is available for 2007-2022 but not for the initial years (1980-2007). How can CLM handle this, given that the files from 2007 onwards will have fine-resolution PCT_PFT (if that is the variable I should modify) while the earlier ones will stay at the coarser default resolution (0.25°)?

6. After the modifications, do I need to add anything to user_nl_clm or change any other configuration?


I look forward to your kind response with patience.

Best regards,
Rahil
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Given the complications you've listed regarding mismatches in space and time, I've reconsidered my suggestion to use mksurfdata_esmf. I think it would be simplest to modify the landuse timeseries file you've already created at 0.05deg. You shouldn't need to modify the surface dataset since it is for 1980 and your new data starts in 2007. The potential variables to modify would be FERTNITRO_CFT, PCT_CROP, PCT_CROP_MAX, PCT_NAT_PFT, PCT_CFT, PCT_CFT_MAX. I think you'd only need to modify PCT_CROP and PCT_CROP_MAX if you changed the total crop area within a gridcell. Modifying PCT_NAT_PFT should only be necessary if you change the distribution of natural vegetated pfts.
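A minimal sketch of one way such an edit could look, for a single gridcell and a single year, using plain Python lists. It assumes the CDL pixels that fall inside one 0.05° cell have already been extracted, that CDL class codes 1, 5 and 23 stand for corn, soybeans and spring wheat (check these against the CDL legend), and that the positions of those crops along the cft dimension are known. The index values, function names and sample numbers are hypothetical. In this sketch the combined share of the three crops within PCT_CFT is kept and only redistributed among them, so PCT_CFT still sums to 100 and the total crop area (PCT_CROP) is untouched.

```python
from collections import Counter

# Assumed CDL class codes (verify against the CDL legend for your year).
CDL_CODES = {"corn": 1, "soybean": 5, "spring_wheat": 23}

def cdl_shares(pixels):
    """Fraction of each target crop among the target-crop pixels of one gridcell."""
    counts = Counter(pixels)
    totals = {name: counts.get(code, 0) for name, code in CDL_CODES.items()}
    n = sum(totals.values())
    if n == 0:
        return None  # CDL shows none of the target crops in this cell
    return {name: c / n for name, c in totals.items()}

def update_pct_cft(pct_cft, cft_index, shares):
    """Redistribute the share held by the target crops according to CDL.

    pct_cft   : percentages for one gridcell (expected to sum to 100)
    cft_index : hypothetical mapping of crop name -> position in pct_cft
    shares    : output of cdl_shares, or None to leave the cell unchanged
    """
    new = list(pct_cft)
    if shares is None:
        return new
    pool = sum(pct_cft[i] for i in cft_index.values())
    for name, i in cft_index.items():
        new[i] = pool * shares[name]
    return new

# Toy example: four CFT slots; the last one is a crop that is left as-is.
# Codes other than 1, 5 and 23 stand in for non-target CDL classes.
pixels = [1, 1, 1, 5, 5, 23, 141, 176]
index = {"corn": 0, "soybean": 1, "spring_wheat": 2}
old = [40.0, 40.0, 10.0, 10.0]
new = update_pct_cft(old, index, cdl_shares(pixels))
print(new, sum(new))
```

If the CDL is also used to change how much of the gridcell is cropland, PCT_CROP and PCT_CROP_MAX would need to be updated as well, which this sketch does not attempt.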
 

uzairrahil

Mohammad Uzair Rahil
New Member
Dear Oleson,

Thank you very much for your helpful response.

For final confirmation, I’d like to clarify that the CDL includes crop cover for corn, soybean, and wheat—these are the crops whose coverage I intend to update in the land use time series. In this case, do you think modifying only the following variables would be sufficient: PCT_CFT, PCT_CROP, and PCT_CROP_MAX? Or would you recommend adjusting any additional variables as well?


Thank you again for your guidance.
 

slevis

Moderator
Staff member
You may not need to modify other variables, as long as the data in the file remain internally consistent (e.g. percentages add to 100 when needed).
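The internal-consistency condition can be checked with a short script before running the model. The sketch below works on plain Python lists for one gridcell and one year. It assumes that PCT_CFT and PCT_NAT_PFT should each sum to 100 and that PCT_CROP should not exceed PCT_CROP_MAX; the function name and tolerance are illustrative choices, not something prescribed by CTSM.

```python
def check_gridcell(pct_cft, pct_nat_pft, pct_crop, pct_crop_max, tol=1e-6):
    """Return a list of human-readable problems found for one gridcell."""
    problems = []
    for name, values in (("PCT_CFT", pct_cft), ("PCT_NAT_PFT", pct_nat_pft)):
        total = sum(values)
        if abs(total - 100.0) > tol:
            problems.append(f"{name} sums to {total:.4f}, expected 100")
        if any(v < 0.0 for v in values):
            problems.append(f"{name} contains negative values")
    if pct_crop > pct_crop_max + tol:
        problems.append("PCT_CROP exceeds PCT_CROP_MAX")
    return problems

# A consistent cell, and one where the crop types no longer add up
# and the crop area exceeds its recorded maximum.
ok = check_gridcell([45.0, 30.0, 15.0, 10.0], [60.0, 40.0], 35.0, 40.0)
bad = check_gridcell([45.0, 30.0, 15.0, 5.0], [60.0, 40.0], 45.0, 40.0)
print(ok)
print(bad)
```

Running a check like this over every gridcell and year of the modified file gives an early warning before the model reads it.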
 