
Post processing CESM1 Cheyenne

Hi all,

The very last thing I have left to do before vacating Cheyenne is redoing a timeseries postprocessing run that failed. About two months ago I ran timeseries postprocessing on an in-progress simulation to check how it looked 20 years in. That ran fine and the data in the files looked good, so I continued the simulation for another 100 years. When I re-ran the timeseries postprocessing, it seemed to hit a glitch on Cheyenne: it worked fine for about 15 minutes and was generating files, but then just hung for 2 hours doing nothing, so I killed the script. It looked like it had simply stalled, because the last thing it did was generate a bunch of temp files in the ocean proc directory, and the log file just repeats 'NetCDF: HDF error' many times at the end.

While it was still working, before it stalled out, it created timeseries files starting from the time step it had left off at when I looked at the initial 20 years. That means I have the 2100-2120 files that were generated correctly a few months back, but then a broken set of files for 2120-2170 and 2170-end. I assume this means that if the script is run more than once, it creates files in 50-year blocks starting from whatever timestep it left off on.

Ideally I would like to have one clean, functional set of files like 2100-2150, 2150-2200, etc. To do this, do I need to delete everything in the proc folders and just run the timeseries script again? If I ran it again now, would it clean up the temp files?

Thanks!!
 

dbailey

CSEG and Liaisons
Staff member
Oof. I am moving this to the new diagnostics forum. The timeseries tool on Cheyenne has been very fragile for a little while. A couple of tips:

1. Set TIMESERIES_GENERATE_ALL to FALSE in env_postprocess.xml.

2. Customize the component timeseries by setting tseries_create for each component one at a time. This is tedious, but can sometimes get through.

3. Are your files netCDF-3? 64-bit offset? netCDF-4? You might have to change this.
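On tip 3: a quick way to check a file's variant is `ncdump -k`, which prints the format name. If that utility is not handy, the variant can also be read from the file's leading magic bytes with a few lines of stdlib Python. This is a generic sketch, not part of the CESM postprocessing workflow, and the function name is my own:

```python
def netcdf_format(path):
    """Guess the netCDF variant of a file from its leading magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(8)
    if magic.startswith(b"CDF\x01"):
        return "netCDF-3 classic"
    if magic.startswith(b"CDF\x02"):
        return "netCDF-3 64-bit offset"
    if magic.startswith(b"\x89HDF\r\n\x1a\n"):
        return "netCDF-4 (HDF5-based)"
    return "unknown"
```

If the broken files report a different variant than the ones that worked, that points at the format mismatch Dave mentions.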

Dave
 

dbailey

CSEG and Liaisons
Staff member
We are definitely working on an option for Derecho and Casper as well. So, worst case, you can copy your archive to /glade/derecho/scratch.
 
I ended up trying something different before I saw these messages. I renamed the proc folders to proc1, assuming that when I re-ran the timeseries script it would generate new proc folders and new data files. I did it this way so that if anything went wrong with the re-run, I would still have the incomplete set of files in the old proc1 folders.

This ran without issue, but the result is a bit odd. The new proc folders were created, but they are populated with timeseries files that go from 2121-2171 and 2172-end, with no data for 2100-2120 (despite those times still being in hist). So the script behaved as though the files from that first time block were already there. I'm not sure why, when it was creating all-new empty folders, the timeseries files were not generated as expected from 2100-2150 etc., but I think this might be workable.

The only files that were incomplete temps before were the post-2122 ones, so if I grab all of the pre-2121 files from the proc1 folders and put them in the new proc folders alongside the post-2122 files, I should then have a full set, just with different time ranges in each variable's file set than I normally have. I am about to go through the folders one by one to see if everything is really there, but if so, I'll call this good enough so I can finish getting everything backed up off Cheyenne.
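For the merge step, rather than moving files by hand, a short script can select only the files whose date range ends by a cutoff year. This is a hypothetical sketch, not part of any CESM tooling: it assumes timeseries filenames end in a `YYYYMM-YYYYMM.nc` date range (a common CESM convention), and both the regex and the function name are my own. Adjust to match your actual filenames before trusting it:

```python
import re
import shutil
from pathlib import Path

# Assumed filename convention: ...VARNAME.YYYYMM-YYYYMM.nc
# (adjust this regex if your timeseries files are named differently).
DATE_RANGE = re.compile(r"\.(\d{6})-(\d{6})\.nc$")

def copy_files_ending_by(src_dir, dst_dir, last_year):
    """Copy timeseries files whose date range ends in or before last_year."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(src.glob("*.nc")):
        m = DATE_RANGE.search(f.name)
        if m and int(m.group(2)[:4]) <= last_year:
            shutil.copy2(f, dst / f.name)
            copied.append(f.name)
    return copied
```

For example, `copy_files_ending_by("proc1/ocn", "proc/ocn", 2120)` would pull over only the pre-2121 files, leaving the broken temp files behind.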
 