
Creating CLM4.5 surface datasets

sacks

Bill Sacks
CSEG and Liaisons
Staff member
Here are questions asked by Alan Rhodes, which I am re-posting here and will answer shortly:

Hi Bill,

I spoke with you several months ago via email, along with Colin Zarzycki, Erik Kluzek, and Michael Levy, when I was creating CLM4.0 surface/initial condition datasets for some of the variable resolution global climate modeling (VRGCM) that my research team is conducting. I'm now in the process of generating CLM4.5 surface and initial condition datasets for my VRGCM grids (which use the FAMIPC5 compset), using the methodology learned from CLM4.0, and I had a few questions for you.

1) I'm currently trying to generate CLM4.5 surface datasets, and I have successfully created all of the necessary mapping files for our VRGCM 14km and 28km grids (for CLM4.0 too, which we got operational last year). However, when I use ./mksurfdata.pl (or more specifically, execgy ./mksurfdata.pl -res usrspec -usr_gname AR_30_x4 -usr_gdate 150130) I get the following error:

    Attempting to make inversion-derived CH4 parameters.....
    domain_read_dims_2d read lon and lat dims from lsmlon/lsmlat
    domain_read initialized domain
    domain_read read mask
    (gridmap_map_read) reading mapping matrix data...
    (gridmap_map_read) * file name                  : ../../shared/mkmapdata/map_360x720cru_cruncep_to_AR_30_x4_nomask_aave_da_c150130.nc
     * matrix dimensions rows x cols :      259200  x       75062
     * number of non-zero elements:       165933
    Open CH4 parameter file: /glade/p/cesm/cseg/inputdata/lnd/clm2/rawdata/mksrf_ch4inversion_360x720_cruncep_simyr2000.c130322.nc
    max_bad_r8 ERROR: f0 =    1.00000000000000       greater than   1.00000000000000       at        52451
    max_bad_r8 ERROR: f0 =    1.00000000000000       greater than   1.00000000000000       at        58647
    max_bad_r8 ERROR: f0 =    1.00000000000000       greater than   1.00000000000000       at        62882
    max_bad_r8 ERROR: f0 =    1.00000000000000       greater than   1.00000000000000       at        72880
    max_bad_r8 ERROR: f0 =    1.00000000000000       greater than   1.00000000000000       at        73168

...and it will not proceed to create the 1850 and/or 2000 surface datasets. Any ideas on how to best get around this issue?

2) At the NCAR tutorial this past August, Erik Kluzek mentioned that there is a 3' or 5' resolution PFT file available. Is there anywhere I could get a hold of this for both 1850 and 2000? If so, do they "embed" themselves into any other part of the CLM file generation process, or do I simply specify this PFT file in the user_nl_clm file (via fpftcon) and send the simulation to the queue? I will be running CAM and CLM at variable resolutions ranging from 1 degree to 0.25 and 0.125 degree, so I thought it might be interesting to see how the model responds to a higher resolution PFT file to capitalize on the fact that we are running at
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
(1) Yes, sorry: In the CESM release code base (and older versions of the development code), this error check is too strict. To fix it, open models/lnd/clm/tools/clm4_5/mksurfdata_map/src/mkCH4inversionMod.F90, find the line that sets max_valid_f0 (around line 79), and change it to:

    real(r8), parameter :: max_valid_f0    = 1._r8 + 1.0e-14_r8

If necessary, you could increase the tolerance beyond 1e-14; please let me know if you need to do that.

(2) The high res (3') pft file is pointed to during the surface dataset generation process. In order to use it, add the option:

    -hirespft

to your mksurfdata.pl command. However, I think this is only available for year 2000, not 1850 (at least, that's what the documentation says).

However: Based on question #4, it sounds like you want to run transient simulations, which require fpftdyn in addition to the surface dataset. We do not have transient data at the 3' resolution, so unfortunately you will not be able to use the high-res pft option in this case.

(3) Yes, you have heard correctly on all counts. The main fire bug was fixed in recent CLM development versions, but the latest release code base (the CESM1.2 series) still has this bug. For offline CLM runs that have been done, this bug hasn't been too problematic, but it caused excessive fires in the Amazon when coupled with CAM, due to CAM simulating a too-dry climate there (setting up positive feedbacks). Also note that CLM4.5 is not considered scientifically validated for use in coupled simulations, for this and other reasons. I'm not clear on whether you're using CAM or a different atmosphere model. Maybe you're doing your own careful vetting of the coupled simulation anyway, so that this lack of scientific validation isn't an issue? (See http://www2.cesm.ucar.edu/models/scientifically-supported).

(4) I *think* that the default for AMIP compsets is to use an 1850 surface dataset but also use an fpftdyn file. This provides transient PFTs, so that the appropriate PFT breakdown is used for each year. You can confirm this by looking at whether fpftdyn is given in your lnd_in file. When fpftdyn is given, it is irrelevant whether you use a year-1850 or year-2000 surface dataset (but the model expects you to provide a year-1850 dataset, and some error-checks may fail if you try to give it a 2000 surface dataset). To answer your more general question: the only difference between year-1850 and year-2000 surface datasets is in the pft distributions, as you have already investigated, and again, these are overridden by fpftdyn in a transient simulation.

Note that, in order to create a fpftdyn file, you will need to add the -y option to mksurfdata.pl:

    -y 1850-2000

and for RCP runs you need:

    -y 1850-2100 -rcp 2.6,4.5,6,8.5

(that will generate output for ALL of the RCPs; you can use any one of the generated files for the 1850-2005 period, since they should all be identical for that period).
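The check suggested in (4), looking for fpftdyn in lnd_in, can be sketched as a small shell function. This is only an illustration: the function name has_fpftdyn, the sample namelist contents, and the file paths are made up for the demo, and a real lnd_in will contain many more entries.

```shell
#!/bin/sh
# has_fpftdyn FILE: succeed if the namelist file assigns fpftdyn on some line.
# (Hypothetical helper; it only looks for "fpftdyn =" at the start of a line.)
has_fpftdyn() {
    grep -Eq '^[[:space:]]*fpftdyn[[:space:]]*=' "$1"
}

# Demo on a throwaway namelist. The contents below are invented for
# illustration; point the function at your case's lnd_in instead.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
&clm_inparm
 fsurdat = '/path/to/surfdata_1850.nc'
 fpftdyn = '/path/to/surfdata.pftdyn_1850-2000.nc'
/
EOF

if has_fpftdyn "$tmp"; then
    echo "fpftdyn is set: transient PFTs override the surface dataset's PFT distribution"
else
    echo "fpftdyn is not set: PFTs come from the surface dataset"
fi
rm -f "$tmp"
```

In a real case directory, calling has_fpftdyn on the generated lnd_in would give the same yes/no answer without opening the file by hand.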
 
Hi Bill,

Thanks for all of the feedback you provided. All of it was extremely helpful and clear.

I had a follow up question. When I try to use the -hires option (I tried -hirespft, but it doesn't seem to be the right syntax) I get the following error output:

    ** trouble getting vegtyp file with: ./../../../bld/queryDefaultNamelist.pl -csmdata /glade/p/cesm/cseg/inputdata -silent -justvalue -phys clm4_0 -onlyfiles  -res 3x3min -options sim_year=1850 -var mksrf_fvegtyp -namelist clmexp

Does the -hires option only work for CLM4.5? Do I have to specify something regarding the 2000 year (you mentioned previously that there isn't a 3' PFT file for 1850, so I was curious if I may have neglected a detail in the command line argument)? If so, what would be the appropriate call to mksurfdata.pl?

Thanks again,
AR
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
Hi Alan,

-hires should work for CLM4.0. You're right: there are slightly different options for CLM4.0 vs CLM4.5:

For CLM4.5, most input data come in at high resolution by default, and you can just choose between low res vs high res pfts using the -hirespft option. That is only available for year-2000 datasets.

For CLM4.0, most input data come in at low resolution by default, but you can choose to use high-res raw data for a bunch of different fields (pfts and others) using the -hires option. But the vegtyp file only exists at high res for year-2000 datasets, so you need to specify -y 2000 to mksurfdata.pl.

In principle it should be possible to use the low-res raw data for PFTs in CLM4.0 but the high-res raw data for other fields. You would need to do that in order to produce CLM4.0 surface datasets for year 1850 or transient which use as much high-res raw data as possible. At a glance, it looks like this can't be done out-of-the-box, but I think you could change a few lines in some of the xml files to get that to work. If you need this, let me know, and I can point you to the necessary changes.
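As a rough sketch of the option choice described above, the following shell snippet maps each CLM version to its high-resolution flag and prints (rather than runs) the resulting mksurfdata.pl command. The helper name hires_flag and the grid name and date in GNAME and GDATE are placeholders chosen for illustration; only the flags themselves come from this thread.

```shell
#!/bin/sh
# hires_flag PHYS: print the high-resolution option for each CLM version.
# (Hypothetical helper; the version labels follow the clm4_0 / clm4_5 naming.)
hires_flag() {
    case "$1" in
        clm4_0) echo "-hires" ;;      # high-res raw data for PFTs and other fields
        clm4_5) echo "-hirespft" ;;   # high-res PFTs; other fields are high-res by default
        *) echo "unknown physics version: $1" >&2; return 1 ;;
    esac
}

# Placeholders for a user-specified grid name and mapping-file date.
GNAME=my_grid
GDATE=150130

for phys in clm4_0 clm4_5; do
    flag=$(hires_flag "$phys") || exit 1
    # -y 2000 because the high-res PFT/vegtyp data exist only for year 2000.
    echo "$phys: ./mksurfdata.pl $flag -y 2000 -res usrspec -usr_gname $GNAME -usr_gdate $GDATE"
done
```

Each command would be run from the mksurfdata_map tool directory of the matching CLM version.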
 
Hi Bill,

Sounds great. I just tried creating the hires dataset for the year 2000 in CLM4.0 and CLM4.5, but ran into the following error message in both:

     * matrix dimensions rows x cols :      259200  x       75062
     * number of non-zero elements:       230569
    Successfully made VOC Emission Factors
    domain_clean: cleaning       259200
    pctpft < 0.0 =  -5.643112366282566E-011
    suma, pcturb_excess, sumpft =   96.1725369169036       0.279375325201691        7.29922977721126
    abort:ERROR in mksurfdata_map: 34304

This was the command line argument I used for CLM4.0...

    execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname AR_30_x4 -usr_gdate 150129

...and this was the command line argument I used for CLM4.5...

    execgy ./mksurfdata.pl -hirespft -y 2000 -res usrspec -usr_gname AR_30_x4 -usr_gdate 150130

Any ideas on what is going on? Seems like there may be a script tolerance level issue again?

Thanks in advance,
AR
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
Hi Alan,

I agree that this just looks like a script tolerance issue. Recent versions of mksurfdata_map (in the development code) have addressed this, I think, for CLM4.5. But in the meantime, you can search for the given error message in mksurfdat.F90 and replace the conditional in this line (around line 1153 in the clm4.0 code):

    if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*50.0_r8 )then
       call abort()
    end if

with something like:

    if ( abs(pctpft(n,m)) > 1.e-10_r8 )then
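To see why the original conditional aborts and the relaxed one does not, here is a small shell sketch of the arithmetic. It assumes that epsilon() for an r8 real equals the IEEE double-precision machine epsilon (about 2.22e-16); the helper name check_tol is invented for this illustration, and the pctpft value is the one reported in the error output above.

```shell
#!/bin/sh
# Assumption: epsilon(pctpft) for real(r8) is the IEEE double machine epsilon.
EPS=2.220446049250313e-16
# Value reported as "pctpft < 0.0" in the error output.
PCTPFT=-5.643112366282566e-11

# check_tol VALUE TOL: print "abort" if |VALUE| > TOL, otherwise "ok".
check_tol() {
    awk -v v="$1" -v t="$2" 'BEGIN {
        v += 0; t += 0                 # force numeric comparison
        a = (v < 0) ? -v : v           # absolute value
        if (a > t) print "abort"; else print "ok"
    }'
}

# Tolerance used by the original conditional: epsilon * 50.
strict_tol=$(awk -v e="$EPS" 'BEGIN { printf "%.3e", e * 50.0 }')

echo "epsilon*50 = $strict_tol -> $(check_tol "$PCTPFT" "$strict_tol")"
echo "1.e-10                 -> $(check_tol "$PCTPFT" 1e-10)"
```

With the value from the log, the epsilon*50 threshold (on the order of 1e-14) is exceeded, while the fixed 1.e-10 threshold is not.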
 
Hi Bill,

Looks like that modification worked for both CLM4.0 and CLM4.5. I was able to create a year 2000 surface dataset for both. This was the command line argument I used for both:

    execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname AR_30_x4 -usr_gdate 150130

To confirm your previous thought on fpftdyn files, I also tried to create a 1850-2000 dataset and an error occurred right away. So, it doesn't look like a transient surface dataset can be generated for historical runs. Is this also the case for RCP scenarios? Or, could I use a -y 2000-2100 option to generate a 3' fpftdyn for any of the RCP scenarios?
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
I'm glad that worked for you. Thanks for the confirmation. You also canNOT use -hirespft for RCP scenarios.
 
Thanks for the info on the 3' PFT datasets. Last question: since coupled atm/lnd CESM simulations are now being pushed to resolutions of ~25km (or even ~10km in some cases), it would be advantageous to have higher resolution surface datasets for these simulations (especially for RCP climate analysis). Are there plans to create 3' hires datasets for RCP scenarios and/or for fpftdyn 1850-2005 surface datasets? If not, are there capabilities/tools to create your own or, as I would imagine, would this take a substantial effort?

Thanks again for all of the prompt replies, they were super helpful.
AR
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
The next iteration of transient datasets (for CMIP6) will likely be produced at 1/4 degree rather than the present 1/2 degree - but I realize that's still far from 3'. I think you're right that producing higher-resolution versions of these datasets would take substantial effort. However, I'm forwarding this message to Peter Lawrence, who may be able to comment further.
 
Hi,

I'm getting a similar error in creating surface datasets. I'm trying to generate surface data files for res 1.9x2.5 with crop on for years 1850 to 2005. I'm using the following command:

    ./mksurfdata.pl -crop -years 1850-2000 -res 1.9x2.5

On using this I'm getting the following error:

    ** trouble getting vegtyp file with: ./../../../bld/queryDefaultNamelist.pl -res 1.9x2.5 -csmdata /scratch/cas/phd/asz122525/inputdata silent -justvalue -phys clm4_0 -onlyfiles  -res 0.5x0.5 -options sim_year=1850,crop='on' -var mksrf_fvegtyp -namelist clmexp

I'm not sure why I'm getting this error. I'm not using hirespft. Also, I tried the same in clm4.0 with -irrig on but got the same error.

Please let me know how to resolve this. Thank you in advance!

Roshni Mathur
IIT, Delhi
 
Reposting...

Hi Bill and Erik,

I'm currently generating some variable-resolution CESM (VR-CESM) CLM4.0 surface datasets (both 1850-2000 0.5 deg and 2000 -hires) for a set of three CONUS grids (55, 28, and 14 km) given to me by Colin Zarzycki. I'm using cesm1_2_rel06 and cesm1_3_beta01 to generate the files, which has been successful in previous VR-CESM cases that I have generated. Unfortunately, I'm facing issues similar to the ones I found previously with tolerances in mksurfdat.F90 and/or mkgridmapMod.F90.

Here is the command line argument...

    execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname conus_30_x2 -usr_gdate 170831

Here is the error output (ERROR in mksurfdata_map: 34304)...

    Attempting to make VOC emission factors .....
    domain_read read lon and lat dims
    domain_read initialized domain
    domain_read read LANDMASK
    Open VOC file: /glade/p/cesm/cseg/inputdata/lnd/clm2/rawdata/mksrf_vocef_0.5x0.5_simyr2000.c110531.nc
    (gridmap_map_read) reading mapping matrix data...
    (gridmap_map_read) * file name                  : ../../shared/mkmapdata/map_0.5x0.5_AVHRR_to_conus_30_x2_nomask_aave_da_c170831.nc
     * matrix dimensions rows x cols :      259200  x       55082
     * number of non-zero elements:       226392
    Successfully made VOC Emission Factors
    domain_clean: cleaning       259200
    pctpft < 0.0 =  -2.678134269906707E-009
    suma, pcturb_excess, sumpft =   77.0426326381054        4.70448108348547        20.4922497590901
    abort:ERROR in mksurfdata_map: 34304

I have modified mksurfdat.F90 to...

    !!!! AR MODIFICATION FOR TOLERANCE
    !                   if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*50.0_r8 )then
                       if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*1.e-18_r8 )then

...and modified mkgridmapMod.F90, even though the mapping files seem to be OK, too...

    !!!! AR MODIFICATIONS FOR TOLERANCE 9/1/17
    !    real(r8), parameter   :: tol = 1.0e-5_r8  ! tolerance for checking that mapping data
                                                  ! are within expected bounds
        real(r8), parameter   :: tol = 1.0e-4_r8  ! tolerance for checking that mapping data
                                                  ! are within expected bounds

Neither of these strategies seems to work though. I don't want to continue to push the mksurfdat.F90 tolerance down further, as it is already a much lower tolerance. Do you have any leads on how to address this?

Thanks in advance,
AR

EMAIL 2

Hi Alan,

One problem is that you are going the wrong direction on the tolerance for abs(pctpft). You should do something like change

    if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*50.0_r8 )then

to

    if ( abs(pctpft(n,m)) > epsilon(pctpft(n,m))*100.0_r8 )then

rather than multiply by a smaller number, which makes it tighter rather than looser. If the multiplier is less than 1000.0 that still seems fairly reasonable to me.

If you like I can explain why that's the case, as the above statement is a bit complicated.

Briefly: My suggestion of using if ( abs(pctpft(n,m)) > 1.e-10_r8 ) is consistent with what Erik is saying. Your expression of epsilon(...) * 1.e-18_r8 is an extremely small number (roughly 1e-16 * 1e-18, or 1e-34). In your case it looks like you need a tolerance of 1e-9 (NOT epsilon(...) * 1.e-9_r8). That might actually be reasonable: I see that in the latest code, we use a much looser tolerance of 1e-4. That said, we also do some other corrections in the latest code so that using such a loose tolerance won't cause other problems down the line. So I make no guarantees about whether simply relaxing the tolerance to 1.e-9_r8 will work, or if you'll run into other downstream problems.

EMAIL 3

Ahhh, follow through error with the epsilon multiplication (apologies, I should have caught that). I noticed that in my cesm1_3_beta01 build I accidentally kept that epsilon multiplication with the small tolerance number, however in the cesm1_2_rel06 I didn't. I changed the tolerance to...

    if ( abs(pctpft(n,m)) > 1.e-4_r8 )then

...in mksurfdat.F90 and it successfully created the -hires surface dataset for all three VR cases. Command line was...

    execgy ./mksurfdata.pl -hires -y 2000 -res usrspec -usr_gname conus_30_x2 -usr_gdate 170831
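The direction of the tolerance change discussed in this exchange can be checked with a short shell sketch. It assumes epsilon() for an r8 real is the IEEE double machine epsilon, uses the |pctpft| value from the conus_30_x2 error output, and only mimics the comparison in the conditional (the helper names scaled and verdict are invented here); it says nothing about downstream effects of a looser tolerance.

```shell
#!/bin/sh
# Assumption: epsilon(pctpft) for real(r8) is the IEEE double machine epsilon.
EPS=2.220446049250313e-16
# |pctpft| from the conus_30_x2 error output.
PCTPFT=2.678134269906707e-09

# scaled MULT: print epsilon multiplied by MULT.
scaled() {
    awk -v e="$EPS" -v m="$1" 'BEGIN { printf "%.3e", e * m }'
}

# verdict TOL: print "abort" if PCTPFT exceeds TOL, otherwise "ok".
verdict() {
    awk -v v="$PCTPFT" -v t="$1" 'BEGIN {
        v += 0; t += 0                 # force numeric comparison
        if (v > t) print "abort"; else print "ok"
    }'
}

# Tolerances of the form epsilon * multiplier: a smaller multiplier is tighter.
for mult in 1e-18 50 100 1000; do
    tol=$(scaled "$mult")
    echo "epsilon*$mult = $tol -> $(verdict "$tol")"
done

# Fixed tolerances with no epsilon factor.
for tol in 1e-10 1e-9 1e-4; do
    echo "$tol -> $(verdict "$tol")"
done
```

For this particular value, every epsilon-scaled threshold in the loop is far below 2.7e-9, and among the fixed thresholds only those above about 2.7e-9 let the check pass, which is consistent with 1.e-4_r8 working.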
 