
SC-WACCM: How to reduce output to a single monthly mean tape (cam2.h0)?

Dear CESM and WACCM experts

I am trying to reduce the amount of output in the SC-WACCM runs.
The goal is to output only a relatively small set of fields
to the atmosphere monthly mean tape (cam2.h0).
I want to prevent any additional output.

In order to achieve this, I used the following setup in atm_in (and its
build script cam.buildnml.csh):

empty_htapes = .true.
fincl1 =

I removed any additional "fincl" or "fexcl" namelist parameters.
Also, CESM/SC-WACCM is configured with the additional configuration
flag to disable "age of air" tracers, and hopefully expedite the run:
CAM_CONFIG_OPTS="-phys cam4 -chem waccm_ghg -noage_of_air_trcs"

Unfortunately, trying to restrict output to a single monthly mean tape
causes the run to stop with a *segmentation fault*.

**

However, if I use this setup (just to exclude the "age of air" tracers):

fexcl1 = 'AOA1', 'AOA1SRC', 'AOA2', 'AOA2SRC',
'HORZ', 'HORZSRC', 'VERT', 'VERTSRC'

with no further "fincl" or "fexcl" namelist parameters,
the run proceeds normally.
Nevertheless, it *also* produces a daily data
tape (cam2.h1), which I don't really want or need.
(My goal is to reduce the I/O to a single monthly mean tape!)

QUESTIONS:

1) Is there a bug in the code that forces the cam2.h1 daily data
tape to be mandatory?

2) Is there any other possible workaround via namelist, that would
let me output *only* the monthly means file (cam2.h0)?

Thank you,
Gus Correa
 
Hi Gus,

Francis Vitt advises that you should set fincl2, 3, etc. to single quotes with a blank space between, i.e.

fincl2 = ' '

Do this in addition to setting empty_htapes to true, and that should get rid of the extra history tapes.
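
Put together, the relevant part of atm_in might look like the sketch below. The two field names in fincl1 are only placeholders for whatever you want on the monthly tape, and the sketch assumes the six lists fincl1 through fincl6.

```fortran
! placeholder field names in fincl1; replace with your own list
empty_htapes = .true.
fincl1 = 'T', 'U'
fincl2 = ' '
fincl3 = ' '
fincl4 = ' '
fincl5 = ' '
fincl6 = ' '
```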

Cheers,
Mike
 
Thank you very much, Mike and Francis.

This works!

With the 20-20 hindsight you provided, now following the code logic
in cam_history.F90 I can see why my run failed before.
Empty finclX lists must be set to a single blank character.

Maybe some cautionary words could be added to the
"empty_htapes" entry in the namelist documentation:

http://www.cesm.ucar.edu/cgi-bin/eaton/namelist/nldef2html-cam5_1

Something like:
"If you set empty_htapes=.true., you must also set finclX=' ' (single blank)
for each tape X that you want to suppress from the output."

Many thanks,
Gus Correa
 
Hi Mike, Francis

Sorry, I spoke too soon.
Your suggestion works for the startup run.
However, when I try the first continuation (after 1 month
of the startup) I get the segmentation fault again.

The relevant part of my atm_in namelist is (in case I am missing an essential variable,
please let me know):

empty_htapes = .true.
fincl1 = 'AEROD_v','ATMEINT','CFC11',
'CFC12','CH4','CLDHGH',
'CLDLOW','CLDMED','CLDTOT',
'CLOUD','FLDS','FLDSC',
'FLNS','FLNSC','FLNT',
'FLNTC','FLUT','FLUTC',
'FREQSH','FREQZM','FSDS',
'FSDSC','FSDTOA','FSNS',
'FSNSC','FSNT','FSNTC',
'FSNTOA','FSNTOAC','FSUTOA',
'H2O','HDEPTH','ICEFRAC',
'LANDFRAC','LHFLX','LWCF',
'MAXQ0','MSKtem','N2O',
'OCNFRAC','OMEGA','PBLH',
'PCONVB','PCONVT','PHIS',
'PRECC','PRECCDZM','PRECL',
'PRECSC','PRECSH','PRECSL',
'PRECT','PS','PSL',
'Q','QFLX','QREFHT',
'QRL_TOT','QRS_TOT','RELHUM',
'RHREFHT','SFCLDICE','SFCLDLIQ',
'SHFLX','SNOWHICE','SNOWHLND',
'SOLIN','SRFRAD','SWCF',
'T','T700','T850',
'TAUGWX','TAUGWY','TAUX',
'TAUY','TGCLDIWP','TGCLDLWP',
'TH','TMQ','TREFHT',
'TREFMNAV','TREFMXAV','TROP_P',
'TROP_T','TROP_Z','TS',
'TSMN','TSMX','TTPXMLC',
'U','U10','US',
'V','VS','WSPEED',
'Z3'
fincl2 = ' '
fincl3 = ' '
fincl4 = ' '
fincl5 = ' '
fincl6 = ' '

***

The error message in the ccsm.log is:
(It may not be relevant, but note that the segfault happens right after
opening/reading the cam restart file.)

[1,1]: Opened existing file FWSC_naat_monk.cam2.rs.0001-02-01-00000.nc 30
[1,14]:forrtl: severe (174): SIGSEGV, segmentation fault occurred
[1,14]:Image PC Routine Line Source
[1,14]:ccsm.exe 00000000005EB2AE Unknown Unknown Unknown
[1,14]:ccsm.exe 0000000000A51D78 Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000008B5833 Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000007741EE Unknown Unknown Unknown
[1,14]:ccsm.exe 000000000059A4E9 Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000005879A8 Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000004CBABF Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000004D7349 Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000004BE702 Unknown Unknown Unknown
[1,14]:libc.so.6 0000003DED61D8B4 Unknown Unknown Unknown
[1,14]:ccsm.exe 00000000004BE629 Unknown Unknown Unknown
[1,17]:forrtl: severe (174): SIGSEGV, segmentation fault occurred
[1,17]:Image PC Routine Line Source
[1,17]:ccsm.exe 00000000005EB2AE Unknown Unknown Unknown

... repeated for all 32 processors ...

--------------------------------------------------------------------------
mpiexec has exited due to process rank 28 with PID 11985 on
node node31 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------

***

Thank you for any help to sort this out.

Gus Correa
 
Is the restart file from this run with the single history tape, or from a different run? Are you just continuing the run, or trying to restart from a previous run? You might try rebuilding with DEBUG set to TRUE in your env_build.xml to get more information about where the crash is happening.
 
Hi Mike

Thank you for your answer, and sorry for my late reply.

I was just continuing the run.
There were no changes in the history files.

The first run, which finished correctly, was a startup/initial run
and lasted one month (January):

Run type flag (NSREST) 0=initial, 1=restart, 3=branch 0

The second run, which failed with segfault, was a restart/continue,
and was supposed to run for another month (February):

Run type flag (NSREST) 0=initial, 1=restart, 3=branch 1

I ran both for one month only, to test quickly whether the amount
of output could be reduced.


The relevant part of atm_in namelist,
which is the same in both cases (startup and continue),
is what I posted before.

The program segfaults during the atm component initialization (on the restart run),
and the segfault is reported in the ccsm log file (by all processes).
The other components don't even start.

Hence, I am still unable to reduce the amount of output in SC-WACCM.

I have yet to try this reduced output (namelist in my previous posting)
in the "F" compset, to see if it works there.
Somehow the F compset outputs by default only the "h0" history file,
whereas SC-WACCM seems to have "h0" and "h1" as default (if I remember
right).
Would this perhaps be related to the segfault that I've got in SC-WACCM?


Thank you,
Gus
 
Gus,

We recently discovered a bug in cam_history associated with removing fields from the waccm default auxiliary history files (h1). It will crash (seg fault) on restarts when outfld is called.

The fix is to add this block of code at the beginning of subroutine bld_htapefld_indices in models/atm/cam/src/control/cam_history.F90:


Code:
   ! reset all the active flags to false
   ! this is needed so that restarts work properly -- fvitt
   listentry=>masterlinkedlist
   do while(associated(listentry))
      listentry%actflag(:) = .false.
      listentry%act_sometape = .false.
      listentry=>listentry%next_entry
   end do

Please let us know if that solves your problem.
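
For readers who want to see what this loop does in isolation, here is a minimal standalone sketch. It is not the CAM code: the module name, entry_t, push_entry and count_active are made up for illustration, and ptapes = 6 is an assumed value. Only actflag, act_sometape and next_entry follow the names used in the fix above.

```fortran
! Minimal stand-in for a linked list of history fields.
! Hypothetical names throughout, except actflag, act_sometape and
! next_entry, which mirror the posted fix.
module flag_reset_sketch
   implicit none
   private
   public :: entry_t, push_entry, reset_flags, count_active

   integer, parameter :: ptapes = 6   ! assumed number of history tapes

   type :: entry_t
      character(len=16)      :: name = ' '
      logical                :: actflag(ptapes) = .false.
      logical                :: act_sometape = .false.
      type(entry_t), pointer :: next_entry => null()
   end type entry_t

contains

   ! Prepend a field to the list and mark it active on one tape.
   subroutine push_entry(head, name, tape)
      type(entry_t), pointer       :: head
      character(len=*), intent(in) :: name
      integer, intent(in)          :: tape
      type(entry_t), pointer       :: new_entry

      allocate(new_entry)
      new_entry%name = name
      new_entry%actflag(tape) = .true.
      new_entry%act_sometape = .true.
      new_entry%next_entry => head
      head => new_entry
   end subroutine push_entry

   ! Walk the whole list and clear every flag (same loop shape as the fix).
   subroutine reset_flags(head)
      type(entry_t), pointer :: head
      type(entry_t), pointer :: listentry

      listentry => head
      do while (associated(listentry))
         listentry%actflag(:) = .false.
         listentry%act_sometape = .false.
         listentry => listentry%next_entry
      end do
   end subroutine reset_flags

   ! Count the entries still flagged as active on some tape.
   function count_active(head) result(n)
      type(entry_t), pointer :: head
      type(entry_t), pointer :: listentry
      integer :: n

      n = 0
      listentry => head
      do while (associated(listentry))
         if (listentry%act_sometape) n = n + 1
         listentry => listentry%next_entry
      end do
   end function count_active

end module flag_reset_sketch
```

Calling reset_flags on a list built with push_entry leaves count_active at zero. In this sketch that is the clean state from which the per-tape flags would then be set again.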
 
Hi Mike

Yes, this bugfix, which Francis Vitt sent me offline, works.
Now I can output only the h0 monthly mean files,
using the namelist in my previous posting,
and the run restarts correctly.

Thank you, Francis, and Dan for your help.

Gus Correa
 
Hello,

I tried that:
1. added the block of code suggested at the beginning of subroutine bld_htapefld_indices in models/atm/cam/src/control/cam_history.F90 (in SourceMods/src.cam/)
2. I configure with user_nl_cam:

&cam_inparm
qbo_cyclic = .false.
qbo_use_forcing = .false.
fincl1 = 'U',
fincl2 = ' '
fincl3 = ' '
fincl4 = ' '
fincl5 = ' '
fincl6 = ' '
/

It works!

If I try to use '' instead of ' ' in the fincl lists, it gives this error immediately after configure:
Generating resolved namelist, prestage, and build scripts
ERROR(Build::Namelist::_parse_next): expect a F90 constant for a namelist instead got: ''
ERROR: generate_resolved.csh error for atm template
configure error: configure generated error in attempting to created resolved scripts

3. The build fails with the errors shown at the end of this message (excerpts from /run/atm.bldlog.120207-162347).

Could you help me?
Best,
Serge



/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/dynamics/fv/interp_mod.F90(76): warning #6178: The return value of this FUNCTION has not been defined. [THISLAT]
function get_interp_lat() result(thislat)
-----------------------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/dynamics/fv/interp_mod.F90(82): warning #6178: The return value of this FUNCTION has not been defined. [THISLON]
function get_interp_lon() result(thislon)
-----------------------------------^
mpif90 -c -I. -I/cm/shared/apps/netcdf/intel/64/4.1.1/include -I/cm/shared/apps/netcdf/intel/64/4.1.1/include -I/usr/mpi/qlogic/include -I. -I/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/chemistry/pp_waccm_ghg -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/chemistry/mozart -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/physics/waccm -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/chemistry/bulk_aero -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/chemistry/utils -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/physics/cam -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/dynamics/fv -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/cpl_mct -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/cpl_share -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/control -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/utils -I/scratch/scratch/ucaksgu/cesm1_0_3/models/atm/cam/src/utils/pilgrim -I/home/ucaksgu/Scratch/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/lib/include -DCO2A -DMAXPATCH_PFT=numpft+1 -DLSMLAT=1 -DLSMLON=1 -DPLON=144 -DPLAT=96 -DPLEV=66 -DPCNST=7 -DPCOLS=16 -DPTRM=1 -DPTRN=1 -DPTRK=1 -DSTAGGERED -DSPMD -DWACCM_GHG -DWACCM_PHYS -DMCT_INTERFACE -DHAVE_MPI -DCO2A -DLINUX -DSEQ_ -DFORTRANUNDERSCORE -DNO_R16 -DNO_MPI2 -DNO_SHR_VMATH -g -fp-model precise -convert big_endian -assume byterecl -ftz -traceback -O2 /scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5399): error #6236: A specification statement cannot appear in the executable section.
integer :: f
---^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5400): error #6236: A specification statement cannot appear in the executable section.
integer :: t
---^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5405): error #6236: A specification statement cannot appear in the executable section.
type(master_entry), pointer :: listentry
---^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5374): error #6404: This name does not have a type, and must have an explicit type. [LISTENTRY]
listentry=>masterlinkedlist
^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5374): error #6795: The target must be of the same type and kind type parameters as the pointer. [LISTENTRY]
listentry=>masterlinkedlist
^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5376): error #6460: This is not a field name that is defined in the encompassing structure. [ACTFLAG]
listentry%actflag(:) = .false.
------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5376): error #6303: The assignment operation or the binary expression operation is invalid for the data types of the two operands.
listentry%actflag(:) = .false.
-------------------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5377): error #6460: This is not a field name that is defined in the encompassing structure. [ACT_SOMETAPE]
listentry%act_sometape = .false.
------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5377): error #6303: The assignment operation or the binary expression operation is invalid for the data types of the two operands.
listentry%act_sometape = .false.
---------------------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5378): error #6460: This is not a field name that is defined in the encompassing structure. [NEXT_ENTRY]
listentry=>listentry%next_entry
-----------------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5378): error #6796: The variable must have the TARGET attribute or be a subobject of an object with the TARGET attribute, or it must have the POINTER attribute. [LISTENTRY]
listentry=>listentry%next_entry
-------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5407): error #6404: This name does not have a type, and must have an explicit type. [T]
do t = 1, ptapes
------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5407): error #6063: An INTEGER or REAL data type is required in this context. [T]
do t = 1, ptapes
------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5408): error #6404: This name does not have a type, and must have an explicit type. [F]
do f = 1, nflds(t)
---------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5408): error #6063: An INTEGER or REAL data type is required in this context. [F]
do f = 1, nflds(t)
---------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5409): error #6795: The target must be of the same type and kind type parameters as the pointer. [LISTENTRY]
listentry => get_entry_by_name(masterlinkedlist, tape(t)%hlist(f)%field%name)
---------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5416): error #6303: The assignment operation or the binary expression operation is invalid for the data types of the two operands.
listentry%act_sometape = .true.
----------------------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5417): error #6303: The assignment operation or the binary expression operation is invalid for the data types of the two operands.
listentry%actflag(t) = .true.
--------------------------------^
/scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90(5418): error #6460: This is not a field name that is defined in the encompassing structure. [HTAPEINDX]
listentry%htapeindx(t) = f
-------------------^
compilation aborted for /scratch/scratch/ucaksgu/cesm1_0_3/scripts/F_2000_WACCM_SC_gwcorrect_calibtrial/SourceMods/src.cam/cam_history.F90 (code 1)
gmake: *** [cam_history.o] Error 1
 
Serge,

From your build error messages, it looks like you have put the block of code before the variable type declarations within the subroutine. All executable code must be after the type declarations (integer, pointer, etc.). The corrected block of code follows. This should be released next week as part of CESM1.0.4.



Code:
   integer :: f
   integer :: t

!
!  Initialize htapeindx to an invalid value.
!
   type(master_entry), pointer :: listentry

   ! reset all the active flags to false 
   ! this is needed so that restarts work properly -- fvitt
   listentry=>masterlinkedlist
   do while(associated(listentry))
      listentry%actflag(:) = .false.
      listentry%act_sometape = .false.
      listentry=>listentry%next_entry
   end do

   do t = 1, ptapes
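
The ordering rule itself can be shown with a generic example that has nothing to do with CAM (the module and function names are hypothetical). Within a subroutine or function, all declarations form the specification part and must come before the first executable statement.

```fortran
! Generic illustration of declaration order; not taken from CAM.
module decl_order_sketch
   implicit none
contains

   ! Sum the integers 1..n.
   function sum_to(n) result(total)
      ! Specification part: every declaration comes first.
      integer, intent(in) :: n
      integer :: total
      integer :: i

      ! Execution part: assignments, loops and calls start only here.
      total = 0
      do i = 1, n
         total = total + i
      end do
   end function sum_to

end module decl_order_sketch
```

Moving "total = 0" above "integer :: i" would be an instance of the same mistake: a declaration would then follow an executable statement, which is what the "specification statement cannot appear in the executable section" errors in the log above report.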
 