
Error running CAM5 with 1 degree resolution

Hello,

I'm trying to run compset B_1850_CAM5_CN at 1-deg resolution (f09_g16) on an IBM AIX Power6 machine similar to bluefire. We can run the compset at 0.5 or 2-deg resolution using release 1.0.3 or 1.0.2, but at 1-deg resolution it fails during initialization with the following output in the ccsm logfile:

0: Reading setup_nml
0: Reading grid_nml
0: Reading ice_nml
0: Reading tracer_nml
0:CalcWorkPerBlock: Total blocks: 512 Ice blocks: 157 IceFree blocks: 302 Land blocks: 53
0: PartitionCurve: nblocks,nproc 459 96
0: -------------------------wSFC-----------------------
0:MCT::m_Router::initp_: GSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: GSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: GSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
0:MCT::m_Router::initp_: GSMap indices not increasing...Will correct
0:(shr_mct_sMatReaddnc) NetCDF: Not a valid ID
0:(shr_mct_sMatReaddnc) NetCDF: Not a valid ID
0:MCT::m_AttrVectComms::GSM_scatter_: myID = 0. Invalid input, GSMap%gsize = 55296, lsize(iV) = 0
0:shr_mct_sMatReaddnc: Error on scatter of areasrc0
0:000.MCT(MPEU)::die.: from shr_mct_sMatReaddnc()

From the error messages, it seems to get as far as initializing the ice component and then fails, perhaps while broadcasting NetCDF variables. I've tried different NetCDF library versions, but that doesn't seem to help (I usually have netcdf/4.1.1_nc3 and parallel-netcdf/1.1.1 loaded).


Again, runs with CAM4 at 1-degree resolution, or with CAM5 at other resolutions, don't show this error. Does anyone recognize what's going on?

Thanks,
Lawrence.
 
Any solutions to this? I am having a similar issue. I can run CESM 1.0.3 with the E_2000 compset at 1.9x2.5 and 0.9x1.25 resolutions without problems, but I get this error when trying to run at 0.47x0.63 or 0.23x0.31 resolutions:


Opened existing file
/usr/local/cluster/geophysics/Data/CESM/1.0/inputdata//lnd/clm2/surfdata/surfdata_0.47x0.63_simyr2000_c091023.nc 20
Reading setup_nml
Reading grid_nml
Reading ice_nml
Reading tracer_nml
CalcWorkPerBlock: Total blocks: 512 Ice blocks: 171 IceFree blocks: 305 Land blocks: 36
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
(shr_mct_sMatReaddnc) NetCDF: Not a valid ID
(shr_mct_sMatReaddnc) NetCDF: Not a valid ID
MCT::m_AttrVectComms::GSM_scatter_: myID = 0. Invalid input, GSMap%gsize = 221184, lsize(iV) = 0

ccsm.exe:16196 terminated with signal 11 at PC=103ed27 SP=7fff11fc98d0. Backtrace:
./ccsm.exe(m_attrvectcomms_mp_gsm_scatter__+0x1287)[0x103ed27]
 

eaton

CSEG and Liaisons
I just tried the same configuration that Lawrence was using on NCAR's bluefire. It worked out of the box for me.

The error I see in both the log files in this thread is

(shr_mct_sMatReaddnc) NetCDF: Not a valid ID

My guess is that the problem is a missing dataset. One of the component log files may end with a message naming the dataset it was trying to open. Another way to find the problem, though tedious, is to go through all the namelist input files and confirm that every listed input dataset is available. The CESM scripts are supposed to check this for you, but there may be a bug in that check. Intermittent filesystem problems can also prevent files that are actually there from being opened. That kind of problem is usually not reproducible, so resubmitting the failed run would test for that possibility.
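
The manual namelist check could be scripted roughly as follows. This is only a minimal sketch, not part of the CESM scripts: the function name `find_missing_datasets` is made up for illustration, and it assumes that datasets appear in the input files as quoted absolute paths. Anything given as a relative path or built up at run time would be missed.

```python
import os
import re
import sys

# Assumption: input datasets are written as quoted absolute paths,
# e.g.  ncdata = '/some/inputdata/dir/file.nc'
PATH_RE = re.compile(r"""['"](/[^'"]+)['"]""")


def find_missing_datasets(namelist_text):
    """Return the quoted absolute paths in namelist_text that do not exist on disk."""
    missing = []
    for line in namelist_text.splitlines():
        # Skip lines that are entirely a Fortran namelist comment.
        if line.lstrip().startswith("!"):
            continue
        for path in PATH_RE.findall(line):
            if not os.path.exists(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Usage (file names are examples only): python check_inputs.py atm_in lnd_in ice_in
    for name in sys.argv[1:]:
        with open(name) as f:
            for path in find_missing_datasets(f.read()):
                print(f"{name}: missing {path}")
```

Run it from the run directory on each namelist input file. Any path it prints is a candidate for the dataset behind the "Not a valid ID" message. A file that exists but is unreadable or truncated would still pass this check, so an empty result does not rule out a dataset problem.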
 