Creating CLM4.5 surface datasets

sacks
Creating CLM4.5 surface datasets

Here are questions asked by Alan Rhoades, which I am re-posting here and will answer shortly:

 

Hi Bill,

 

I spoke with you several months ago via email, along with Colin Zarzycki, Erik Kluzek, and Michael Levy, when I was creating CLM4.0 surface/initial condition datasets for some of the variable resolution global climate modeling (VRGCM) that my research team is conducting.  I'm now in the process of generating CLM4.5 surface and initial condition datasets for my VRGCM grids (which use the FAMIPC5 compset), using the methodology learned from CLM4.0, and I had a few questions for you.

 

1) I'm currently trying to generate CLM4.5 surface datasets and I have successfully created all of the necessary mapping files for our VRGCM 14km and 28km grids (for CLM4.0 too, which we got operational last year); however, when I use ./mksurfdata.pl (or more specifically, execgy ./mksurfdata.pl -res usrspec -usr_gname AR_30_x4 -usr_gdate 150130) I get the following error: 

 

 Attempting to make inversion-derived CH4 parameters.....

 domain_read_dims_2d read lon and lat dims from lsmlon/lsmlat

 domain_read initialized domain

 domain_read read mask

(gridmap_map_read) reading mapping matrix data...

(gridmap_map_read) * file name                  : ../../shared/mkmapdata/map_360x720cru_cruncep_to_AR_30_x4_nomask_aave_da_c150130.nc

 * matrix dimensions rows x cols :      259200  x       75062

 * number of non-zero elements:       165933

 Open CH4 parameter file:

 /glade/p/cesm/cseg/inputdata/lnd/clm2/rawdata/mksrf_ch4inversion_360x720_cruncep_simyr2000.c130322.nc

 max_bad_r8 ERROR: f0 =    1.00000000000000       greater than    1.00000000000000       at        52451
 max_bad_r8 ERROR: f0 =    1.00000000000000       greater than    1.00000000000000       at        58647
 max_bad_r8 ERROR: f0 =    1.00000000000000       greater than    1.00000000000000       at        62882
 max_bad_r8 ERROR: f0 =    1.00000000000000       greater than    1.00000000000000       at        72880
 max_bad_r8 ERROR: f0 =    1.00000000000000       greater than    1.00000000000000       at        73168

 

...and it will not proceed to create the 1850 and/or 2000 surface datasets.  Any ideas on how to best get around this issue?

 

2)  At the NCAR tutorial this past August, Erik Kluzek mentioned that there is a 3' or 5' resolution PFT file available.  Is there anywhere I could get a hold of this for both 1850 and 2000?  If so, do they "embed" themselves into any other part of the CLM file generation process or do I simply specify this PFT file in the user_nl_clm file (via fpftcon) and send the simulation to the queue?  I will be running CAM and CLM at variable resolutions ranging from 1 degree to 0.25 and 0.125 degree, so I thought it might be interesting to see how the model responds to a higher resolution PFT file to capitalize on the fact that we are running at <0.5 degree resolutions (since the standard resolution for the PFT file is 0.5 degree).

 

3) I've heard from Mark Flanner (and his graduate students) that CLM4.5 has benefits in terms of better snowpack/canopy interactions, better surface energy budgets (splitting surfaces into snow/vegetated cover and non-vegetated cover), and other added benefits.  This improvement has led my research team to want to move from CLM4.0 to CLM4.5 in our CESM production runs.  But, I've also heard that there was an issue with forest fires?  Was this ever an issue and if so has it been alleviated?  I'm currently using CESM version 1.2, so I'm just making sure that my version of the model wouldn't have that issue.

 

4) In the FAMIPC5 compset, I believe the default CLM surface dataset is 1850.  I think we would like to use the 2000 surface dataset for our runs (a few runs will be from 1979-2005 and then future RCP scenario runs 2060-2080 and 2080-2100).  Are there major differences between the two datasets?  How hard would it be to use the FAMIPC5 compset with 2000 surface/initial condition datasets?  I diff'ed the 1850 and 2000 surface dataset files we generated several months ago and plotted them in NCL and it appears that globally there may be large discrepancies in PFTs (which would impact the results of our climate simulations), but over our region of interest (western USA) there doesn't appear to be a significant difference other than non-Arctic grasses and croplands. 

 

Any insights (or other points of contact) on these questions would be extremely helpful.

 

All the best,

 

AR       

-- 

 

Alan Rhoades

PhD Student, Atmospheric Science Graduate Group

Climate Change Water and Society (CCWAS) NSF IGERT Trainee

University of California, Davis

alan.m.rhoades@gmail.com 

 

amrhoades@ucdavis.edu

Bill Sacks

CESM Software Engineer

sacks

(1) Yes, sorry: In the CESM release code base (and older versions of the development code), this error check is too strict. To fix it, open models/lnd/clm/tools/clm4_5/mksurfdata_map/src/mkCH4inversionMod.F90, find the line that sets max_valid_f0 (around line 79), and change it to:

  real(r8), parameter :: max_valid_f0    = 1._r8 + 1.0e-14_r8

If necessary, you could increase the tolerance beyond 1e-14; please let me know if you need to do that.
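
As a rough illustration of why the log reports f0 = 1.00000000000000 as "greater than" 1.00000000000000, here is a small Python sketch (it is not part of mksurfdata_map). It assumes the offending values overshoot 1.0 only by rounding error, on the order of one unit in the last place, and that r8 is IEEE 754 double precision; the variable names are made up for the example.

```python
import sys

# Assumption: the remapped f0 overshoots 1.0 by one unit in the last place.
# For doubles, 1.0 + epsilon is the smallest representable value above 1.0.
f0 = 1.0 + sys.float_info.epsilon

strict_limit = 1.0              # original max_valid_f0
relaxed_limit = 1.0 + 1.0e-14   # suggested max_valid_f0

# With 14 decimal places, as in the log, f0 looks identical to 1.0.
printed = f"{f0:.14f}"

fails_strict = f0 > strict_limit     # check trips with the original limit
fails_relaxed = f0 > relaxed_limit   # check passes with the relaxed limit

print(printed, fails_strict, fails_relaxed)
```

If the real overshoot were larger than about 1e-14, the relaxed limit would still trip, which is the case where the tolerance would need to be increased further.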

 

(2) The high res (3') pft file is pointed to during the surface dataset generation process. In order to use it, add the option:

-hirespft

to your mksurfdata.pl command. However, I think this is only available for year 2000, not 1850 (at least, that's what the documentation says).

More importantly, based on question #4, it sounds like you want to run transient simulations, which require fpftdyn in addition to the surface dataset. We do not have transient data at the 3' resolution, so unfortunately you will not be able to use the high-res pft option in this case.

 

(3) Yes, you have heard correctly on all counts. The main fire bug was fixed in recent CLM development versions, but the latest release code base (the CESM1.2 series) still has this bug. For offline CLM runs that have been done, this bug hasn't been too problematic, but it caused excessive fires in the Amazon when coupled with CAM, due to CAM simulating a too-dry climate there (setting up positive feedbacks). 

Also note that CLM4.5 is not considered scientifically validated for use in coupled simulations, for this and other reasons. I'm not clear on whether you're using CAM or a different atmosphere model. Maybe you're doing your own careful vetting of the coupled simulation anyway, so that this lack of scientific validation isn't an issue? (See http://www2.cesm.ucar.edu/models/scientifically-supported).

 

(4) I *think* that the default for AMIP compsets is to use an 1850 surface dataset but also use an fpftdyn file. This provides transient PFTs, so that the appropriate PFT breakdown is used for each year. You can confirm this by checking whether fpftdyn is set in your lnd_in file. When fpftdyn is given, it is irrelevant whether you use a year-1850 or year-2000 surface dataset (but the model expects you to provide a year-1850 dataset, and some error checks may fail if you try to give it a year-2000 surface dataset). To answer your more general question: the only difference between year-1850 and year-2000 surface datasets is in the PFT distributions, as you have already investigated, and again, these are overridden by fpftdyn in a transient simulation.
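
The fpftdyn check can be sketched in a few lines of Python. This is only an illustration: the function name is made up, the sample namelist text and its paths are placeholders rather than real inputdata files, and it assumes fpftdyn appears as a quoted string on its own line, as namelist entries usually do.

```python
import re

# Matches an uncommented "fpftdyn = '...'" line in namelist text.
_FPFTDYN = re.compile(
    r"^\s*fpftdyn\s*=\s*['\"]([^'\"]*)['\"]",
    re.MULTILINE | re.IGNORECASE,
)

def fpftdyn_value(namelist_text):
    """Return the fpftdyn path set in the namelist, or None if unset/blank."""
    match = _FPFTDYN.search(namelist_text)
    if match is None:
        return None
    value = match.group(1).strip()
    return value or None

# Hypothetical lnd_in excerpt (placeholder paths).
sample = """&clm_inparm
 fsurdat = '/path/to/surfdata_1850.nc'
 fpftdyn = '/path/to/surfdata.pftdyn_1850-2000.nc'
/
"""

print(fpftdyn_value(sample))
```

A non-None result suggests the run is using transient PFTs. Running grep -i fpftdyn on the lnd_in file gives the same information from the command line.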

Note that, in order to create a fpftdyn file, you will need to add the -y option to mksurfdata.pl:

-y 1850-2000

and for RCP runs you need:

-y 1850-2100 -rcp 2.6,4.5,6,8.5

 

(that will generate output for ALL of the RCPs; you can use any one of the generated files for the 1850-2005 period, since they should all be identical for that period).

Bill Sacks

CESM Software Engineer

alan.m.rhoades@...

Hi Bill,

Thanks for all of the feedback you provided.  All of it was extremely helpful and clear.

I had a follow-up question.  When I try to use the -hires option (I tried -hirespft, but it doesn't seem to be the right syntax) I get the following error output:

"** trouble getting vegtyp file with: ./../../../bld/queryDefaultNamelist.pl -csmdata /glade/p/cesm/cseg/inputdata -silent -justvalue -phys clm4_0 -onlyfiles  -res 3x3min -options sim_year=1850 -var mksrf_fvegtyp -namelist clmexp"

Does the -hires option only work for CLM4.5?  Do I have to specify something regarding the 2000 year (you mentioned previously that there isn't a 3' PFT file for 1850, so I was curious if I may have neglected a detail in the command line argument)?  If so, what would be the appropriate call to mksurfdata.pl?

Thanks again,

AR

Alan Rhoades
PhD Candidate, Atmospheric Science Graduate Group
University of California, Davis

sacks

Hi Alan,

-hires should work for CLM4.0. You're right: there are slightly different options for CLM4.0 vs CLM4.5:

For CLM4.5, most input data come in at high resolution by default, and you can just choose between low res vs high res pfts using the -hirespft option. That is only available for year-2000 datasets.

For CLM4.0, most input data come in at low resolution by default, but you can choose to use high-res raw data for a bunch of different fields (pfts and others) using the -hires option. But the vegtyp file only exists at high res for year-2000 datasets, so you need to specify -y 2000 to mksurfdata.pl. In principle it should be possible to use the low-res raw data for PFTs in CLM4.0 but the high-res raw data for other fields. You would need to do that in order to produce CLM4.0 surface datasets for year 1850 or transient which use as much high-res raw data as possible. At a glance, it looks like this can't be done out-of-the-box, but I think you could change a few lines in some of the xml files to get that to work. If you need this, let me know, and I can point you to the necessary changes.

Bill Sacks

CESM Software Engineer

alan.m.rhoades@...

Hi Bill,

Sounds great.  I just tried creating the hires dataset for the year 2000 in CLM4.0 and CLM4.5, but ran into the following error message in both:

" * matrix dimensions rows x cols :      259200  x       75062

 * number of non-zero elements:       230569

 Successfully made VOC Emission Factors

 domain_clean: cleaning       259200

 pctpft < 0.0 =  -5.643112366282566E-011  suma, pcturb_excess, sumpft =

   96.1725369169036       0.279375325201691        7.29922977721126

abort:

ERROR in mksurfdata_map: 34304"

This was the command line argument I used for CLM4.0...

"execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname AR_30_x4 -usr_gdate 150129"

...and this was the command line argument I used for CLM4.5...

"execgy ./mksurfdata.pl -hirespft -y 2000 -res usrspec -usr_gname AR_30_x4 -usr_gdate 150130"

Any ideas on what is going on?  Seems like there may be a script tolerance level issue again?

Thanks in advance,

AR

Alan Rhoades
PhD Candidate, Atmospheric Science Graduate Group
University of California, Davis

sacks

Hi Alan,

I agree that this just looks like a script tolerance issue. Recent versions of mksurfdata_map (in the development code) have addressed this, I think, for CLM4.5. But in the meantime, you can search for the given error message in mksurfdat.F90 and replace the conditional in this line (around line 1153 in the clm4.0 code):

                   if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*50.0_r8 )then
                      call abort()
                   end if

 with something like: 

                   if ( abs(pctpft(n,m)) > 1.e-10_r8 )then
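
To make the two thresholds concrete, here is a small Python sketch (not part of mksurfdata_map) that compares the value from the abort message with each one. It assumes r8 is IEEE 754 double precision, so that Fortran's epsilon() matches Python's float epsilon of about 2.2e-16; the variable names are illustrative.

```python
import sys

# Value reported in the "pctpft < 0.0" message for the AR_30_x4 grid.
pctpft = -5.643112366282566e-11

# Original check: epsilon(pctpft) * 50, roughly 1.1e-14 for doubles.
original_tol = sys.float_info.epsilon * 50.0
# Suggested replacement threshold.
relaxed_tol = 1.0e-10

# True means the conditional is satisfied and abort() would be called.
aborts_original = abs(pctpft) > original_tol
aborts_relaxed = abs(pctpft) > relaxed_tol

print(f"{original_tol:.3e}", aborts_original, aborts_relaxed)
```

The small negative value is several thousand times larger than the original threshold but still below 1.e-10, which is why the relaxed check lets it through.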

 

Bill Sacks

CESM Software Engineer

alan.m.rhoades@...

Hi Bill,

Looks like that modification worked for both CLM4.0 and CLM4.5.  I was able to create a year 2000 surface dataset for both.  

This was the command line argument I used for both:

"execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname AR_30_x4 -usr_gdate 150130"

To confirm your previous thought on fpftdyn files, I also tried to create a 1850-2000 dataset and an error occurred right away.  So, it doesn't look like a transient surface dataset can be generated for historical runs.  Is this also the case for RCP scenarios?  Or, could I use a -y 2000-2100 option to generate a 3' fpftdyn for any of the RCP scenarios? 

Alan Rhoades
PhD Candidate, Atmospheric Science Graduate Group
University of California, Davis

sacks

I'm glad that worked for you. Thanks for the confirmation. You also canNOT use -hirespft for RCP scenarios.

Bill Sacks

CESM Software Engineer

alan.m.rhoades@...

Thanks for the info on the 3' PFT datasets.  Last question: since coupled atm/lnd CESM simulations are now being pushed to resolutions of ~25km (or even ~10km in some cases), it would be advantageous to have higher resolution surface datasets for these simulations (especially for RCP climate analysis). Are there plans to create 3' hires datasets for RCP scenarios and/or for fpftdyn 1850-2005 surface datasets?  If not, are there capabilities/tools to create your own or, as I would imagine, would this take a substantial effort?

Thanks again for all of the prompt replies, they were super helpful.

AR

Alan Rhoades
PhD Candidate, Atmospheric Science Graduate Group
University of California, Davis

sacks

The next iteration of transient datasets (for CMIP6) will likely be produced at 1/4 degree rather than the present 1/2 degree - but I realize that's still far from 3'. I think you're right that producing higher-resolution versions of these datasets would take substantial effort. However, I'm forwarding this message to Peter Lawrence, who may be able to comment further.

Bill Sacks

CESM Software Engineer

rosh.me91@...

Hi,

I'm getting a similar error when creating surface datasets.

I'm trying to generate surface data files for res 1.9x2.5 with crop on for years 1850 to 2005. I'm using the following command:

 

./mksurfdata.pl -crop -years 1850-2000 -res 1.9x2.5.

 

When I run this, I get the following error:

** trouble getting vegtyp file with: ./../../../bld/queryDefaultNamelist.pl -res 1.9x2.5 -csmdata /scratch/cas/phd/asz122525/inputdata silent -justvalue -phys clm4_0 -onlyfiles  -res 0.5x0.5 -options sim_year=1850,crop='on' -var mksrf_fvegtyp -namelist clmexp

 

I'm not sure why I'm getting this error. I'm not using hirespft. I also tried the same in clm4.0 with -irrig on but got the same error.

Please let me know how to resolve this.

 

Thank you in advance!

 

Roshni Mathur

IIT, Delhi

alan.m.rhoades@...

Reposting...

Hi Bill and Erik,

 

I'm currently generating some variable-resolution CESM (VR-CESM) CLM4.0 surface datasets (both 1850-2000 0.5 deg and 2000 -hires) for a set of three CONUS grids (55, 28, and 14 km) given to me by Colin Zarzycki.  

 

I'm using cesm1_2_rel06 and cesm1_3_beta01 to generate the files, which has been successful in previous VR-CESM cases that I have generated.

 

Unfortunately, I'm facing similar issues to the ones I found previously with tolerance issues in mksurfdat.F90 and/or mkgridmapMod.F90.

 

Here is the command line argument...

 

execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname conus_30_x2 -usr_gdate 170831

 

Here is the error output (ERROR in mksurfdata_map: 34304)...

 

 Attempting to make VOC emission factors .....

 domain_read read lon and lat dims

 domain_read initialized domain

 domain_read read LANDMASK

 Open VOC file:

 /glade/p/cesm/cseg/inputdata/lnd/clm2/rawdata/mksrf_vocef_0.5x0.5_simyr2000.c110531.nc

(gridmap_map_read) reading mapping matrix data...

(gridmap_map_read) * file name                  : ../../shared/mkmapdata/map_0.5x0.5_AVHRR_to_conus_30_x2_nomask_aave_da_c170831.nc

 * matrix dimensions rows x cols :      259200  x       55082

 * number of non-zero elements:       226392

 Successfully made VOC Emission Factors

 

 domain_clean: cleaning       259200

 pctpft < 0.0 =  -2.678134269906707E-009  suma, pcturb_excess, sumpft =

   77.0426326381054        4.70448108348547        20.4922497590901

abort:

ERROR in mksurfdata_map: 34304

 

I have modified mksurfdat.F90 to...

 

!!!! AR MODIFICATION FOR TOLERANCE
!                   if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*50.0_r8 )then
                   if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*1.e-18_r8 )then

 

...and modified mkgridmapMod.F90, even though the mapping files seem to be OK, too...

 

!!!! AR MODIFICATIONS FOR TOLERANCE 9/1/17
!    real(r8), parameter   :: tol = 1.0e-5_r8  ! tolerance for checking that mapping data
                                              ! are within expected bounds
    real(r8), parameter   :: tol = 1.0e-4_r8  ! tolerance for checking that mapping data
                                              ! are within expected bounds

 

Neither of these strategies seems to work, though.  I don't want to keep pushing the mksurfdat.F90 tolerance down further, as it is already much lower than the original.

 

Do you have any leads on how to address this?  

 

Thanks in advance,

 

AR

EMAIL 2

Hi Alan,

One problem is that you are going the wrong direction on the tolerance for abs(pctpft). You should change

                   if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*50.0_r8 )then

to something like

                   if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*100.0_r8 )then

rather than multiplying by a smaller number, which makes the check tighter rather than looser. If the multiplier is less than 1000.0 that still seems fairly reasonable to me.

If you like I can explain why that's the case, as the above statement is a bit complicated.

Briefly: My suggestion of using if ( abs(pctpft(n,m)) > 1.e-10_r8 ) is consistent with what Erik is saying. Your expression of epsilon(...) * 1.e-18_r8 is an extremely small number (roughly 1e-16 * 1e-18, or 1e-34). In your case it looks like you need a tolerance on the order of 1e-9 (NOT epsilon(...) * 1.e-9_r8). That might actually be reasonable: I see that in the latest code, we use a much looser tolerance of 1e-4. That said, we also do some other corrections in the latest code so that using such a loose tolerance won't cause other problems down the line. So I make no guarantees about whether simply relaxing the tolerance to about 1.e-9_r8 will work, or if you'll run into other downstream problems.
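
The thresholds mentioned in this exchange can be compared directly against the value in the conus_30_x2 log. The following Python sketch is only an illustration, not part of mksurfdata_map: it assumes r8 is IEEE 754 double precision (so epsilon is about 2.2e-16), and the 1.e-8 entry is added here for comparison and does not come from the thread.

```python
import sys

# Value reported in the "pctpft < 0.0" message for the conus_30_x2 grid.
pctpft = -2.678134269906707e-09

eps = sys.float_info.epsilon  # stands in for Fortran's epsilon(pctpft(n,m))

# Candidate thresholds, from tightest to loosest.
thresholds = {
    "epsilon*1.e-18": eps * 1.0e-18,  # about 2.2e-34, far tighter than the original
    "epsilon*50": eps * 50.0,         # original check
    "epsilon*100": eps * 100.0,
    "1.e-10": 1.0e-10,
    "1.e-9": 1.0e-9,
    "1.e-8": 1.0e-8,                  # not from the thread; for comparison only
    "1.e-4": 1.0e-4,
}

# True means the conditional is satisfied and abort() would still be called.
aborts = {name: abs(pctpft) > tol for name, tol in thresholds.items()}

for name, tol in thresholds.items():
    print(f"{name:>15} = {tol:.3e}  aborts: {aborts[name]}")
```

For this particular value, about 2.7e-9 in magnitude, a threshold of exactly 1.e-9 would still be exceeded. Of the values listed, only 1.e-8 and 1.e-4 let it through.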

EMAIL 3

Ahhh, follow through error with the epsilon multiplication (apologies, I should have caught that).  I noticed that in my cesm1_3_beta01 build I accidentally kept that epsilon multiplication with the small tolerance number, however in the cesm1_2_rel06 I didn't.

I changed the tolerance to...

                   if ( abs(pctpft(n,m)) > 1.e-4_r8 )then

...in mksurfdat.F90 and it successfully created the -hires surface dataset for all three VR cases. Command line was...

execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname conus_30_x2 -usr_gdate 170831

Alan Rhoades
PhD Candidate, Atmospheric Science Graduate Group
University of California, Davis
