Timeseries post-processing on derecho

dbailey

CSEG and Liaisons
Staff member
We have a mostly working version of the timeseries post-processing on derecho. It does not work on casper yet because it relies on impi, which is not yet available there.

Option #1 :: if you own $CASEROOT:

cd $CASEROOT

module use /glade/work/bdobbins/Software/Modules

module load cesm_postprocessing_derecho

create_postprocess -caseroot=`pwd`

cd postprocess/

cp /glade/u/home/dbailey/timeseries .


Edit timeseries to fix the CASENAME


Option #2 :: if you do NOT own $CASEROOT:

cp /glade/work/nanr/cesm_tags/CASE_tools/pp-offline/derecho/LR/pp-offline-standAlone-derecho.csh MY_TOOLS_DIRECTORY
Edit pp-offline-standAlone-derecho.csh to set the CASENAME and PATHS.
./pp-offline-standAlone-derecho.csh
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Just noting that I tried this on clm primary (h0) history files and it worked for me, thanks!
Minor: I edited CASE and CASEROOT in the timeseries script, not CASENAME.
 

mlague

New Member
I was able to follow these instructions to get time series for both the lnd and ice model components, but not the atm. Specifically, I'm trying to make time series of variables from the monthly history output files. I'm not sure what I'm missing: I made the same changes to the atmosphere part of the scripts as to the land part, which successfully produced my land time series, but the atmosphere part doesn't even create the "atm/proc" directory. Any tips?
 

mlague

New Member
I've attached the env_timeseries.xml file (as plain text) - forgot to attach to original message.
 

Attachments

  • env_timeseries.txt
    18.5 KB · Views: 17

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Are your primary history files *.cam.h0.*.nc? Mine from a recent run are *.cam.h0a.*.nc.
 

mlague

New Member
Oooooooooooooooo you're absolutely right, they are indeed h0a now! Just switched that in the env_timeseries and we'll see how it goes! Thanks!!!!
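
For anyone hitting the same mismatch, here is a minimal sketch for listing which history stream labels (h0, h0a, h1a, ...) actually appear in a set of file names, so the suffix patterns in env_timeseries.xml can be compared against them. The helper name `count_streams` and the file name pattern are assumptions based on the names quoted above (`*.cam.h0a.*.nc`); they are not part of the post-processing tools.

```python
import re
from collections import Counter

# Assumed file name layout: "<case>.<model>.<stream>.<date>.nc",
# e.g. "mycase.cam.h0a.0001-01.nc". Adjust if your names differ.
STREAM_RE = re.compile(
    r"\.(?P<model>[a-z0-9_]+)\.(?P<stream>h\d+[a-z]?)\.[^/]*\.nc$"
)


def count_streams(filenames, model="cam"):
    """Count history files per stream label for one model component."""
    counts = Counter()
    for name in filenames:
        match = STREAM_RE.search(name)
        if match and match.group("model") == model:
            counts[match.group("stream")] += 1
    return counts


# Illustrative names only; in practice pass the result of something like
# glob.glob("<archive>/atm/hist/*.nc") for your own case.
example = [
    "mycase.cam.h0a.0001-01.nc",
    "mycase.cam.h0a.0001-02.nc",
    "mycase.cam.h1a.0001-01-01-00000.nc",
    "mycase.clm2.h0.0001-01.nc",
]
print(dict(count_streams(example)))
```

If the labels printed for your archive do not match the `suffix` attributes in env_timeseries.xml, that is the first thing to fix.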
 

michelle_dvorak

Michelle Dvorak
Member
I have not had success in starting a post-processing job on Derecho using these instructions. The job fails on submission with the following in the log file:

Extra data: line 1 column 1470 (char 1469)
cesm_tseries_generator: DEBUG... Running on 1 cores
Extra data: line 1 column 1470 (char 1469)
dec2276.hsn.de.hpc.ucar.edu: rank 117 exited with code 1
cesm_tseries_generator: DEBUG... Running on 1 cores
Extra data: line 1 column 1470 (char 1469)
dec2276.hsn.de.hpc.ucar.edu: rank 112 died from signal 15

I've tried to take a look at the offending "cesm_tseries_generator" file, which I believe is located in /opt/ncar/cesm_postprocessing/cesm-env2/bin/ , according to the timeseries file, but I suspect there is an environment issue that is over my head. I am told the directory does not exist.

Help appreciated! Thank you --

Michelle
 

dbailey

CSEG and Liaisons
Staff member
Hi there.

Sounds like a syntax error in your env_timeseries.xml or env_postprocess.xml file. Did you modify either of these files?
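
One quick way to rule this out is to check that both files are still well-formed XML after editing. This is a minimal sketch using only the Python standard library; the helper name `find_xml_error` is made up for illustration, and it only checks well-formedness, not whether the values are ones the post-processing tools accept.

```python
import os
import xml.etree.ElementTree as ET


def find_xml_error(xml_data):
    """Return None if xml_data (str or bytes) parses, else the parser message."""
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        # The message includes the line and column of the first problem.
        return str(err)
    return None


if __name__ == "__main__":
    # Run from the postprocess/ directory; files that are absent are skipped.
    for filename in ("env_timeseries.xml", "env_postprocess.xml"):
        if not os.path.exists(filename):
            continue
        with open(filename, "rb") as handle:
            problem = find_xml_error(handle.read())
        print(filename, "OK" if problem is None else problem)
```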
 

agilbert

Ash Gilbert
New Member
I haven't been able to finish a timeseries post-processing job on Derecho. The timeseries job fails with the error "module cftime has no attribute utime" when opening the atm archive files and saving the timeseries. The case I am trying to run this on is 2 months long. I've looked at the timeseries python files, but none of them use the cftime module, so I don't know where this error is coming from. Any help with this is appreciated. I've included the log file and my env_timeseries and env_postprocess xml files.

Thanks
Ash
 

Attachments

  • env_postprocess.txt
    9.3 KB · Views: 7
  • env_timeseries.txt
    18.5 KB · Views: 2
  • timeseries.log.20240716-173504.txt
    8.4 KB · Views: 3

katelynfitzgerald

Katelyn FitzGerald
New Member
Confusingly, cftime is often not used directly, so it might not be imported explicitly in your Python scripts.

I suspect what you're seeing is a compatibility issue between your code and the Python environment you have set up. It looks like utime was removed from the cftime package back in 2021 with version 1.5.0 (see the release notes in the Unidata/cftime repository on GitHub). You may need to install an older version to get things to run.
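
To see which situation you are in, a small check like the following can be run with the same Python interpreter the job uses. It is only a sketch: the helper name `describe_attribute` is hypothetical, and it reports what is installed without changing anything.

```python
import importlib
import importlib.metadata


def describe_attribute(module_name, attribute):
    """Report the installed version of a module and whether it has an attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name} is not installed in this environment"
    try:
        version = importlib.metadata.version(module_name)
    except importlib.metadata.PackageNotFoundError:
        # Standard-library modules and some local installs have no metadata.
        version = "unknown"
    status = "has" if hasattr(module, attribute) else "does not have"
    return f"{module_name} {version} {status} attribute {attribute!r}"


print(describe_attribute("cftime", "utime"))
```

If this reports a cftime version of 1.5.0 or later without `utime`, the environment is newer than the code calling it expects.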
 

agilbert

Ash Gilbert
New Member
Gotcha, it's a version issue. Where should my python environment for the post processing be? I'm a little confused as the instructions didn't say anything about setting one up.
 

katelynfitzgerald

Katelyn FitzGerald
New Member
Hopefully someone else will chime in here. It looks like some of the environment info may be a bit buried.

I work on some other Python tools so have familiarity with some of these Python packages, but not this workflow in particular.
 

dbailey

CSEG and Liaisons
Staff member
Hi Ash. Did you try doing:

conda activate npl

This should not be necessary as the container setup should have all you need. Can you compare your timeseries script to mine?

/glade/u/home/dbailey/timeseries
 

aswann2

Abigail Swann
Member
Hi! I'm trying to use this script and could use some guidance. Specifically, I was able to get it to make monthly history output files for cam, but it reached 12 hours of wallclock time before it finished. Is there a way to request a subset of variables? Or how else can I get it to complete the cam files? I also need to do ocean and land, but I suppose I could run those in a separate instance.
 

aswann2

Abigail Swann
Member
I think the issue is that I accidentally made not just monthly files but also daily ones. I'm confused about why, though, because my env_timeseries.xml file says FALSE for everything (even monthly cam!). Where else could this be set?

snippet of env_timeseries.xml

Code:
<components>
  <comp_archive_spec name="cam">
    <rootdir>atm</rootdir>
    <multi_instance>True</multi_instance>
    <default_calendar>noleap</default_calendar>
    <files>
      <file_extension suffix=".h0.[0-9]">
        <subdir>hist</subdir>
        <tseries_create>FALSE</tseries_create>
        <tseries_output_format>netcdf4c</tseries_output_format>
        <tseries_tper>month_1</tseries_tper>
        <tseries_filecat_tper>years</tseries_filecat_tper>
        <tseries_filecat_n>50</tseries_filecat_n>
      </file_extension>
      <file_extension suffix=".h1.[0-9]">
        <subdir>hist</subdir>
        <tseries_create>FALSE</tseries_create>
        <tseries_output_format>netcdf4c</tseries_output_format>
        <tseries_tper>day_1</tseries_tper>
        <tseries_filecat_tper>years</tseries_filecat_tper>
        <tseries_filecat_n>10</tseries_filecat_n>
      </file_extension>
 

aswann2

Abigail Swann
Member
I solved my immediate problem: the env_postprocess.xml file had everything set to TRUE. I changed it to <entry id="TIMESERIES_GENERATE_ALL" value="FALSE" /> and changed the diagnostic packages as well. I also set the env_timeseries.xml flags for the monthly output of specific components to TRUE, and it seems to have worked.
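
To double-check which streams are switched on before submitting, the flags in env_timeseries.xml can be listed with a few lines of Python. This is a sketch that assumes the element names shown in the snippet earlier in the thread (`comp_archive_spec`, `file_extension`, `tseries_create`, `tseries_tper`); the function name `enabled_streams` is made up. It only reads env_timeseries.xml, so TIMESERIES_GENERATE_ALL in env_postprocess.xml still needs to be checked separately.

```python
import xml.etree.ElementTree as ET


def enabled_streams(xml_data):
    """List (component, suffix, period) for entries with tseries_create TRUE."""
    root = ET.fromstring(xml_data)
    enabled = []
    for comp in root.iter("comp_archive_spec"):
        for ext in comp.iter("file_extension"):
            flag = (ext.findtext("tseries_create") or "").strip().upper()
            if flag == "TRUE":
                enabled.append(
                    (comp.get("name"), ext.get("suffix"), ext.findtext("tseries_tper"))
                )
    return enabled


# Cut-down example in the same shape as the snippet quoted in this thread.
sample = """<components>
  <comp_archive_spec name="cam">
    <files>
      <file_extension suffix=".h0.[0-9]">
        <tseries_create>TRUE</tseries_create>
        <tseries_tper>month_1</tseries_tper>
      </file_extension>
      <file_extension suffix=".h1.[0-9]">
        <tseries_create>FALSE</tseries_create>
        <tseries_tper>day_1</tseries_tper>
      </file_extension>
    </files>
  </comp_archive_spec>
</components>"""

for component, suffix, period in enabled_streams(sample):
    print(component, suffix, period)
```

For a real case, read the file as bytes (`open("env_timeseries.xml", "rb").read()`) and pass that to `enabled_streams`.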
 

aswann2

Abigail Swann
Member
Ok I have a different issue related to these postprocessing tools! I was successful in running the tool 6 days ago, but now the same timeseries script hangs on the last line. My timeseries script is identical to /glade/u/home/dbailey/timeseries except for CASE, CASEROOT, and the charge account.

Could someone provide guidance?

This is the line from timeseries that hangs:
Code:
mpiexec singularity run -B /glade,/var /glade/work/bdobbins/Containers/CESM_Postprocessing/image /opt/ncar/cesm_postprocessing/cesm-env2/bin/cesm_tseries_generator.py  --caseroot $CASEROOT/$CASE/postprocess >> ${log_filename} 2>&1
 