
Time dimension in CESM2 Large Ensemble output data stored on Campaign

I am working with CESM2 Large Ensemble data at /campaign/cgd/cesm/CESM2-LE/atm/proc/tseries/ and would like some clarification regarding six-hourly and daily data.

For six-hourly data…

Question #1

Are T, Q, U, V, and Z3 output variables instantaneous or time-averaged? If they are instantaneous, what does the time_bnds variable tell us - is the instantaneous value taken at the first or second time_bnd?

Question #2

When concatenating all of the files together (1850-2100), there are 366,462 six-hourly times. This is two more than there should be given a 365-day year. By investigating the “time_bnds” variable, I found two times that look like extra data:

1. First time index in file 1850010100-1859123100, which has time_bnds of “1850-01-01 00hrs” to “1850-01-01 00hrs”. This data point is very similar, but not identical, to the second time index in this file, which has time_bnds of “1850-01-01 00hrs” to “1850-01-01 06hrs”.

2. First time index in file 2015010100-2024123100, which has time_bnds of “2015-01-01 00hrs” to “2015-01-01 00hrs”. This data point is very similar, but not identical, to the last time index in file 2010010100-2014123100, which has time_bnds of “2014-12-31 18hrs” to “2015-01-01 00hrs”.

Are these the two extra times? What do these values represent if their time_bnds do not span a six-hourly period? If constructing a six-hourly time series between 1850-01-01 00:00:00 and 2100-12-31 24:00:00, should these two values be removed?

For daily data…

Question #1

I assume that T, Q, U, V, and Z3 are all time-averages. Is this assumption correct?

Question #2

When concatenating all of the files together (1850-2100), there are 91,617 daily times. This is also two more than there should be given a 365-day year. As with the six-hourly files, investigating “time_bnds” showed what appear to be two extra times at the beginning of file 18500101-18591231 and file 20150101-20241231.

Likewise, are these extra times that should be removed if creating a time series of daily data from 1850-01-01 to 2100-12-31?
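For reference, the counts above can be checked against the model's 365-day (noleap) calendar. A minimal sketch of the arithmetic, using only the file counts stated in the questions:

```python
# Sketch: expected sample counts for a 365-day (noleap) calendar,
# 1850-2100 inclusive, to confirm the concatenated series really
# does carry two extra timesteps.
years = 2100 - 1850 + 1            # 251 years, both endpoints included
expected_6h = years * 365 * 4      # 366,460 six-hourly samples
expected_daily = years * 365       # 91,615 daily samples

print(366462 - expected_6h)        # 2 extra six-hourly times
print(91617 - expected_daily)      # 2 extra daily times
```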
 

asphilli

Adam Phillips
CVCWG Liaison
Staff member
You can tell whether the data is time-averaged or instantaneous by looking at the attributes of the variable in the file in question. For example, running ncdump on a daily file:

ncdump -h /glade/campaign/cgd/cesm/CESM2-LE/atm/proc/tseries/day_1/T/b.e21.BSSP370cmip6.f09_g17.LE2-1301.010.cam.h6.T.20750101-20841231.nc

gives the following output:
<snip>
double T(time, lev, lat, lon) ;
T:_FillValue = -900. ;
T:mdims = 1 ;
T:units = "K" ;
T:long_name = "Temperature" ;
T:cell_methods = "time: mean" ;

That tells you that the variable is time-averaged, that the time coordinate is set at the very end of the averaging period (for CESM2 and earlier data), and that the time_bnds array represents the averaging period at each time step.

Looking at a 6-hourly file:

ncdump -h /glade/campaign/cgd/cesm/CESM2-LE/atm/proc/tseries/hour_6/T/b.e21.BSSP370smbb.f09_g17.LE2-1191.010.cam.h5.T.2095010100-2100123100.nc

gives the following output:
<snip>
float T(time, lev, lat, lon) ;
T:mdims = 1 ;
T:units = "K" ;
T:long_name = "Temperature" ;

If the cell_methods attribute is not present, the variable is instantaneous. In this case the time_bnds variable should be ignored; the time array in the file contains the correct time at which the instantaneous value was taken.
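The rule above can be sketched in a few lines of Python. The attribute dictionaries here are illustrative stand-ins for the ncdump output shown earlier; in practice you would read them with e.g. xarray (ds["T"].attrs) or netCDF4:

```python
# Sketch of the rule above: a variable is time-averaged when its
# cell_methods attribute contains "time: mean", and instantaneous
# when the attribute is absent.

def sampling_type(attrs):
    """Classify a CAM history variable from its netCDF attributes."""
    if "mean" in attrs.get("cell_methods", ""):
        return "time-averaged"
    return "instantaneous"

# Attribute dicts mirroring the two ncdump listings above
daily_T = {"units": "K", "long_name": "Temperature",
           "cell_methods": "time: mean"}
sixhourly_T = {"units": "K", "long_name": "Temperature"}

print(sampling_type(daily_T))       # time-averaged
print(sampling_type(sixhourly_T))   # instantaneous
```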

With regard to the extra time steps seen in both the daily and 6-hourly data: your investigation of the time_bnds array was the right approach, and it showed that something is different about those timesteps. This is a "feature" of CAM in which an extra instantaneous timestep (for sub-monthly output) is written at the beginning of a run. It can be identified using the time_bnds array, as you have done. The solution is simply to not use these identified timesteps.
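One way to drop those timesteps programmatically is to look for time_bnds entries that span zero length (lower bound equal to upper bound). A minimal sketch with synthetic 6-hourly bounds in days since 1850-01-01; with the actual CESM2-LE files you would load time_bnds via xarray and subset with ds.isel(time=keep):

```python
import numpy as np

# Sketch: flag spurious timesteps whose time_bnds span zero length.
# The synthetic bounds below stand in for the real time_bnds array.
time_bnds = np.array([
    [0.00, 0.00],   # extra instantaneous timestep: zero-width bounds
    [0.00, 0.25],   # normal 6-hourly averaging period
    [0.25, 0.50],
    [0.50, 0.75],
])

keep = time_bnds[:, 0] != time_bnds[:, 1]   # True where bounds span a period
print(np.flatnonzero(~keep))                # index of the extra timestep: [0]
print(int(keep.sum()))                      # 3 valid timesteps remain
```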

Let me know if you have any further questions!
 