
Problem with CAM4 when running a restart run

Dear All,

I am running a CAM4 initial run for 1 day and then trying to restart it for the remaining time period. I used the following namelist options for the restart.

namelist :
****************
/shome/2009ast3222/CAM4/ccsm4_0/models/atm/cam/bld/build-namelist -test -config /shome/2009ast3222/CAM4/Phd_Work/CAM4_runs_0.47x0.63_Resolution_30years/bld_32P_0.47x0.63/config_cache.xml -ignore_ic_date -namelist "&camexp start_ymd=19900101 stop_ymd=20001231 start_type='continue' stop_option='ndays' stop_n=31 nhtfrq=0,-24,-24,-24,-3,-3 ndens=2 mfilt=1,30,30,30,240,240 empty_htapes=.true. sstcyc= .false. bndtvs= '/shome/2009ast3222/CAM4/inputdata/atm/cam/sst/sst_HadOIBl_bc_0.47x0.63_1850_2008_c100128.nc' stream_year_first=1970 stream_year_last=2000 fincl1='TREFHT:A','PRECT:A','PS:A','U:A','V:A','OMEGA:A','T:A','Q:A','RELHUM:A','CLOUD:A','TAUX:A','TAUY:A','SNOWHICE:A','SNOWHLND:A','Z3:A','OMEGA500:A','OMEGA850:A','Q200:A','Q850:A','T300:A','T850:A','U200:A','U850:A','V200:A','V850:A','Z300:A','Z500:A','Z700:A' fincl2='TREFHT:A','PRECT:A','PS:A','RELHUM:A','SOLIN:A' fincl3='TREFHT:M' fincl4='TREFHT:X' fincl5='U:A' fincl6='V:A' scenario_ghg = 'RAMPED' bndtvghg= '/shome/2009ast3222/CAM4/inputdata/atm/cam/ggas/ghg_hist_1850-2005_c090419.nc'/"
********************

But the run aborts with the following error message in the log file.
****************
(seq_frac_check) [ice init] ifrac min/max = 0.00000000000000000 0.00000000000000000
(seq_frac_check) [ocn init] afrac min/max = 1.00000000000000000 1.00000000000000000
(seq_frac_check) [ocn init] ofrac min/max = 0.00000000000000000 1.00000000000000000
(seq_frac_check) [ocn init] ifrac min/max = 0.00000000000000000 0.00000000000000000
(seq_frac_check) [atm init] afrac min/max = 1.00000000000000000 1.00000000000000000
(seq_frac_check) [atm init] lfrac min/max = 0.00000000000000000 1.00000000000000000
(seq_frac_check) [atm init] ofrac min/max = 0.00000000000000000 1.00000000000000000
(seq_frac_check) [atm init] ifrac min/max = 0.00000000000000000 0.00000000000000000
(seq_frac_check) [atm init] lfrin min/max = 0.00000000000000000 1.00000000000000000 min/max = 0.00000000000000000 1.00000000000000000
(seq_frac_check) [atm init] sum min/max = 1.00000000000000000 1.00000000000000000
(seq_frac_check) [atm init] sum ncnt/maxerr = 0 0.00000000000000000

(seq_frac_check) [atm init] sum min/max = 1.00000000000000000 1.00000000000000000
(seq_frac_check) [atm init] sum ncnt/maxerr = 0 0.00000000000000000
(seq_mct_drv) : Calling map_ice2atm_mct for mapping i2x_ix to i2x_ax
(seq_mct_drv) : Calling mrg_x2a_run_mct
(seq_mct_drv) : Calling atm_init_mct
(GETFIL): attempting to find local file camrun.cam2.rs.1990-01-02-00000.nc
(GETFIL): using camrun.cam2.rs.1990-01-02-00000.nc in current working directory
Opened existing file camrun.cam2.rs.1990-01-02-00000.nc 48
FV subcycling - n2 nsplit = 4 4
Divergence damping: use 2nd order damping
nstep, te 49 0.33337629555473680E+10 0.33337718070631623E+10 0.49053362075683518E-03 0.98524346065247999E+05
ENDRUN:OUTFLD: invalid avgflag=^P
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 6[cli_6]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 6
ENDRUN:OUTFLD: invalid avgflag=
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 19[cli_19]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 19
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
cam 00000000005369FF Unknown Unknown Unknown
cam 0000000000536076 Unknown Unknown Unknown
cam 0000000000BA8DB1 Unknown Unknown Unknown
cam 0000000000DE3AFE Unknown Unknown Unknown
cam 0000000000AB76AD Unknown Unknown Unknown
cam 00000000005218DF Unknown Unknown Unknown
cam 00000000004F3D3F Unknown Unknown Unknown
cam 00000000005732DB Unknown Unknown Unknown
cam 0000000000404D82 Unknown Unknown Unknown
libc.so.6 000000391FC1D974 Unknown Unknown Unknown
cam 0000000000404CA9 Unknown Unknown Unknown
ENDRUN:OUTFLD: invalid avgflag=^P
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 20[cli_20]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 20
ENDRUN:OUTFLD: invalid avgflag=

**************************************************************
I can't upload the log file due to the size limit.
Please have a look and advise.

Thank you in anticipation.
Ram
Indian Institute of Technology Delhi
New Delhi-INDIA
 

eaton

CSEG and Liaisons
Please post the configure and build-namelist commands that you're using for the initial run.
 
Sir,

The configure and build-namelist commands I used are given below.

configure command:
*************
/shome/2009ast3222/CAM4/ccsm4_0/models/atm/cam/bld/configure -dyn fv -hgrid 0.47x0.63 -ntasks 32 -nosmp -ldflags "-lmpichf90 -lmpich -lpthread -lrt" -fc ifort -cc icc -test

*********************
Namelist command for 1 day initial run:

*******************
/shome/2009ast3222/CAM4/ccsm4_0/models/atm/cam/bld/build-namelist -test -config /shome/2009ast3222/CAM4/Phd_Work/CAM4_runs_0.47x0.63_Resolution_30years/bld_32P_0.47x0.63/config_cache.xml -ignore_ic_date -namelist "&camexp start_ymd=19900101 stop_ymd=19900102 stop_option='ndays' stop_n=1 nhtfrq=0,-24,-24,-24,-3,-3 ndens=2 mfilt=1,30,30,30,240,240 empty_htapes=.true. sstcyc= .false. bndtvs= '/shome/2009ast3222/CAM4/inputdata/atm/cam/sst/sst_HadOIBl_bc_0.47x0.63_1850_2008_c100128.nc' stream_year_first=1970 stream_year_last=2000 fincl1='TREFHT:A','PRECT:A','PS:A','U:A','V:A','OMEGA:A','T:A','Q:A','RELHUM:A','CLOUD:A','TAUX:A','TAUY:A','SNOWHICE:A','SNOWHLND:A','Z3:A','OMEGA500:A','OMEGA850:A','Q200:A','Q850:A','T300:A','T850:A','U200:A','U850:A','V200:A','V850:A','Z300:A','Z500:A','Z700:A' fincl2='TREFHT:A','PRECT:A','PS:A','RELHUM:A','SOLIN:A' fincl3='TREFHT:M' fincl4='TREFHT:X' fincl5='U:A' fincl6='V:A' scenario_ghg = 'RAMPED' bndtvghg= '/shome/2009ast3222/CAM4/inputdata/atm/cam/ggas/ghg_hist_1850-2005_c090419.nc'/"
**********************************
Namelist command for restart run:
*************************
/shome/2009ast3222/CAM4/ccsm4_0/models/atm/cam/bld/build-namelist -test -config /shome/2009ast3222/CAM4/Phd_Work/CAM4_runs_0.47x0.63_Resolution_30years/bld_32P_0.47x0.63/config_cache.xml -ignore_ic_date -namelist "&camexp start_ymd=19900102 stop_ymd=20001231 start_type='continue' stop_option='ndays' stop_n=4014 nhtfrq=0,-24,-24,-24,-3,-3 ndens=2 mfilt=1,30,30,30,240,240 empty_htapes=.true. sstcyc= .false. bndtvs= '/shome/2009ast3222/CAM4/inputdata/atm/cam/sst/sst_HadOIBl_bc_0.47x0.63_1850_2008_c100128.nc' stream_year_first=1970 stream_year_last=2000 fincl1='TREFHT:A','PRECT:A','PS:A','U:A','V:A','OMEGA:A','T:A','Q:A','RELHUM:A','CLOUD:A','TAUX:A','TAUY:A','SNOWHICE:A','SNOWHLND:A','Z3:A','OMEGA500:A','OMEGA850:A','Q200:A','Q850:A','T300:A','T850:A','U200:A','U850:A','V200:A','V850:A','Z300:A','Z500:A','Z700:A' fincl2='TREFHT:A','PRECT:A','PS:A','RELHUM:A','SOLIN:A' fincl3='TREFHT:M' fincl4='TREFHT:X' fincl5='U:A' fincl6='V:A' scenario_ghg = 'RAMPED' bndtvghg= '/shome/2009ast3222/CAM4/inputdata/atm/cam/ggas/ghg_hist_1850-2005_c090419.nc'/"
**********************

I made a few runs earlier in the same fashion (as above), but now it is causing this problem. Yesterday I tried removing all the output field specifications (using only the defaults), and then the model restarted successfully.
So please have a look and suggest a specific way to define the output history file specifications to achieve what I need.

Thank you.
Ram
Indian Institute of Technology
Delhi- INDIA
 

eaton

CSEG and Liaisons
I have been able to reproduce this problem and confirm that it is a bug. It turns out that the bug is in the empty_htapes functionality. This was recently discovered and fixed on CAM's trunk. The fix will be part of the cesm1_0_4 release which should happen in a month or so. Until then there are two possible workarounds:

1. remove empty_htapes from the namelist and instead use fexcl1 to remove the unneeded history fields from the default output. The output fields are listed in the logfile which can be used as a source for the field names to use in the fexcl1 specification. But there are quite a few of them so this isn't very convenient.

2. Here is a patch for the ccsm4_0 version of cam_history.F90.

520c520
< if ( htapes_defined ) then
---
> if ( nsrest==1 ) then
1442c1442
< htapes_defined = .true.
---
>
1876c1876
< htapes_defined = .true.
---
>
6121a6122,6130
>
> listentry=>masterlinkedlist
> do while(associated(listentry))
> listentry%actflag(:) = .false.
> listentry%act_sometape = .false.
> listentry=>listentry%next_entry
> end do
> htapes_defined = .true.
>
 
Dear Eaton,

Thank you so much. I prefer the 2nd option to fix the issue.

Before applying it, I tried one run after removing the following two output fields from the fincl2 history file:

1. 'RELHUM:A'
2. 'SOLIN:A'
Both were daily averaged.
Then it restarted successfully, so I wonder whether including these output fields is what triggers the issue?

Anyway, your suggestions are quite valuable.

Thank You so much.

Ram
Indian Institute of Technology
Delhi-INDIA
 

eaton

CSEG and Liaisons
I don't think this bug is associated with any particular output fields. It appears that some variables used to keep track of which fields are active in the various history files were not properly initialized during a restart.
 
I am having the same problem that is described in this post. I don't understand the patch. What do "520c520", "6121a6122,6130" and the other such symbols mean? Do they refer to line numbers somehow? Where do I put the written code fixes, right at these line numbers? Sorry, I'm confused. Your help is much appreciated.

Cara-Lyn
 
lappen said:
I am having the same problem that is described in this post. I don't understand the patch. What do "520c520", "6121a6122,6130" and the other such symbols mean? Do they refer to line numbers somehow? Where do I put the written code fixes, right at these line numbers? Sorry, I'm confused. Your help is much appreciated.

Cara-Lyn

Yes, those are the line numbers. 'a' stands for add, 'c' stands for change, and the < or > shows which file each line comes from (< for the first file given to diff, > for the second).

This Wikipedia page (diff) should help.
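
To make the notation concrete, here is a small Python sketch that decodes command lines such as "520c520" from normal-format diff output. The describe function and its wording are my own illustration, not part of diff or patch; 'd' (delete) is the third command letter that diff can emit.

```python
import re

# Normal-format diff command: first-file range, a/c/d, second-file range.
COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")
ACTIONS = {"a": "add", "c": "change", "d": "delete"}


def describe(command):
    """Explain one diff command line in words."""
    m = COMMAND.match(command)
    if m is None:
        raise ValueError(f"not a normal-diff command: {command!r}")
    old_start, old_end, op, new_start, new_end = m.groups()
    old = old_start if old_end is None else f"{old_start}-{old_end}"
    new = new_start if new_end is None else f"{new_start}-{new_end}"
    return f"{ACTIONS[op]}: first file line(s) {old}, second file line(s) {new}"


for cmd in ["520c520", "6121a6122,6130"]:
    print(cmd, "->", describe(cmd))
```

So "6121a6122,6130" says that the lines marked with > are added after line 6121 of the first file and become lines 6122 to 6130 of the second file.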
 

eaton

CSEG and Liaisons
I see that the patch I posted for ccsm4_0 was corrupted by cut and paste into the reply window. I'll try it again as an attachment.

Also note that this output from the diff utility can be used with the patch utility to apply the changes. If you apply them to the ccsm4_0 version of cam_history.F90 you'll then be able to see how to apply the same changes to later released versions of cam_history.F90. This fix didn't get into the release code until cesm1_0_4.
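
For anyone who wants to see what the patch utility does with this kind of diff output, here is a rough Python sketch that applies a normal-format diff to a list of lines. It is only an illustration under simplifying assumptions (hunks in ascending order, no verification of the < lines); use the real patch utility for actual source files. The sample text is made up and is not from cam_history.F90.

```python
import re

COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def apply_normal_diff(original_lines, diff_text):
    """Apply a normal-format diff to a list of lines and return the new list."""
    hunks = []
    for line in diff_text.splitlines():
        m = COMMAND.match(line)
        if m:
            start = int(m.group(1))
            end = int(m.group(2) or start)
            hunks.append((start, end, m.group(3), []))
        elif line.startswith(">") and hunks:
            # "> text" lines carry the content of the second (patched) file.
            hunks[-1][3].append(line[2:])
    result = list(original_lines)
    # Work from the bottom up so earlier line numbers stay valid.
    for start, end, op, new_lines in reversed(hunks):
        if op == "a":
            result[start:start] = new_lines        # insert after line `start`
        elif op == "d":
            del result[start - 1:end]              # drop lines start..end
        else:
            result[start - 1:end] = new_lines      # replace lines start..end
    return result


original = ["line one", "line two", "line three"]
diff_text = "\n".join([
    "2c2",
    "< line two",
    "---",
    "> line 2 (changed)",
    "3a4",
    "> line four (added)",
])
print(apply_normal_diff(original, diff_text))
```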
 