Building CAM standalone in serial

lneef@geomar_de

New Member
I posted earlier about difficulties running CAM (standalone) in serial mode, but my problems there turned out to be due to heaps of other stuff. Having cleared those problems up, I now find that I can't even build CAM for running in serial. Here is what I do and what happens:

- Create my case using create_newcase, compset F1850C5, resolution 1.9x2.5 -- works fine.

- Open env_conf.xml and change USE_MPISERIAL to TRUE

- run ./configure -case --> this works fine as well

- run the build script. This fails and directs me to the following log file:

.../F1850C5_1.9x2.5_1.9x2.5/mct/mct.bldlog.110412-114542

which has the following error message:

make[1]: Entering directory `/work/bb0519/CESM/cesm1_0_2/exerootdir/b325004/F1850C5_1.9x2.5_1.9x2.5/mct/mpi-serial'
cp -f mpif.real4double8.h mpif.h
chmod -w mpif.h
chmod: mpif.h: new permissions are r-xrwxr--, not r-xr-xr--
make[1]: *** [lib] Error 1
make[1]: Leaving directory `/work/bb0519/CESM/cesm1_0_2/exerootdir/b325004/F1850C5_1.9x2.5_1.9x2.5/mct/mpi-serial'
make: *** [subdirs] Error 2

All I understand from this is that something is wrong with the MPI-serial libraries. Does anyone know how to fix this problem?

(by the way, doing everything described above but leaving USE_MPISERIAL as FALSE works fine for me.)

Thanks,
Lisa
 

lneef@geomar_de

New Member
Well, I've managed to build CAM4 serially. My fix to the above problem was weird and not exactly something I can justify, but it worked: I simply went into the subdirectory mct/mpi-serial and ran gmake there, and everything in it compiled just fine. Then running the build script again worked successfully.

However, I still have problems running. No matter whether I submit the job via LoadLeveler or launch ccsm.exe interactively, I get the same error message:

(seq_comm_setcomm) initialize ID ( 7 GLOBAL ) pelist = 0 0 1 ( npes = 1) ( nthreads = 1)
MPI_Group_range_incl: more than 1 proc in group
Abort (core dumped)

OK, so it dies when it tries to use seq_comm_setcomm, so my guess is that the core dump has something to do with the MCT code (?) used for sequential running. Maybe my fix to the compilation didn't help after all! I hope someone can help me move forward, because now I am just puzzled.
 

eaton

CSEG and Liaisons
I'm not sure whether the serial build is working from the CESM scripts. I'd recommend just using the CAM standalone scripts to do a serial run. Samples of standalone scripts are in models/atm/cam/bld/run-*.csh. To build CAM in serial mode, just pass the arguments '-nospmd -nosmp' to the configure command. The sample scripts are set up to use MPI, so you'll also need to remove the MPI job launcher from the run command. For example, to run serial CAM the command line starts with ./cam instead of something like 'mpirun cam'.
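A minimal dry-run sketch of the serial setup described above, assuming a POSIX shell. The `cfgdir` default and the `-dyn fv -hgrid 1.9x2.5` arguments are illustrative (the grid options appear elsewhere in this thread); only `-nospmd -nosmp` and launching `./cam` directly come from the advice itself. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Dry run: print the serial configure and launch commands, do not execute them.

# Assumption: configure sits next to the sample run-*.csh scripts.
cfgdir="${cfgdir:-models/atm/cam/bld}"

# -nospmd turns off the MPI (SPMD) build, -nosmp turns off the threaded build.
serial_flags="-nospmd -nosmp"

# Illustrative dycore and grid arguments; adjust to your own case.
configure_cmd="$cfgdir/configure -dyn fv -hgrid 1.9x2.5 $serial_flags"

# Serial launch: call the executable directly, with no 'mpirun' in front.
run_cmd="./cam"

echo "configure: $configure_cmd"
echo "run:       $run_cmd"
```

Once the printed commands look right, they can be pasted into a copy of one of the sample run scripts in place of the MPI versions.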
 

lneef@geomar_de

New Member
Thanks for your feedback -- I started clean with one of the scripts in the CAM dir and indeed, it compiled successfully. Great!

Unfortunately, my brand new standalone CAM gets to a core dump pretty quickly. Here is the error message:

Initial dataset is: /work/bb0519/b325005/CESM/cesm1_0_1/inputdata/atm/cam/inic/fv/cami_0000-01-01_1.9x2.5_L26_c070408.nc
Topography dataset is: /work/bb0519/CESM/cesm1_0_2/inputdata/atm/cam/topo/USGS-gtopo30_1.9x2.5_remap_c050602.nc
Time-invariant (absorption/emissivity) factor dataset is: /work/bb0519/CESM/cesm1_0_2/inputdata/atm/cam/rad/abs_ems_factors_fastvx.c030508.nc
Time-invariant modal aerosol optics datset is: /work/bb0519/b325005/CESM/cesm1_0_1/inputdata/atm/cam/physprops/modal_optics_3mode.nc
Run type flag (NSREST) 0=initial, 1=restart, 3=branch 0
Summary of restart module options:
Restart pointer file is: ./rpointer.atm
Initial conditions history files will be written yearly.
Time filter coefficient (EPS) 0.060
DEL2 Horizontal diffusion coefficient (DIF2) 0.000E+00
DEL4 Horizontal diffusion coefficient (DIF4) 0.000E+00
Number of levels Courant limiter applied 5
Lowest level for dry adiabatic adjust (NLVDRY) 3
Frequency of Shortwave Radiation calc. (IRADSW) 2
Frequency of Longwave Radiation calc. (IRADLW) 2
ISCCP calcs and history IO will NOT be done
********** Time Manager Configuration **********
Calendar type: GREGORIAN
Timestep size (seconds): 1800
Start date (yr mon day tod): 0 1 1 0
Stop date (yr mon day tod): 0 1 2 0
Reference date (yr mon day tod): 0 1 1 0
Current step number: 0
Current date (yr mon day tod): 0 1 1 0
************************************************
CNST_CHK_DIM: number of advected tracer 23 not equal to pcnst = 25
ENDRUN: called without a message string

---------------------------------------------------------------------

So evidently it counts up to 23 advected tracers which doesn't match the 25 that are prescribed somewhere. I have no idea where those are prescribed, or how to fix this.

Thanks for your help!
Lisa
 

eaton

CSEG and Liaisons
What is the configure commandline that was used?

25 constituents (pcnst=25) is correct for the standard cam5 physics package.
 

lneef@geomar_de

New Member
Hi,

this is what my configure command looks like:

$cfgdir/configure \
    -verbose \
    -dyn fv \
    -hgrid 1.9x2.5 \
    -nlev 26 \
    -nospmd -nosmp \
    -ice cice \
    -cppdefs "-DDISABLE_TIMERS " \
    -esmf_libdir "/work/bb0519/b325004/esmf/lib/libg/AIX.default.64.mpiuni.default" \
    || echo "configure failed" && exit 1

I couldn't find much in either the CAM or CESM manuals telling me how the pcnst setting relates to the various configure options.
 

eaton

CSEG and Liaisons
Assuming that you're working with a cesm1_0_x release, the default physics is cam5, and that physics package does not work with -nlev 26 (it works with 30 levels). You should remove the -nlev argument from the configure command. If you want to use the cam4 physics package (which does use 26 levels) then specify '-phys cam4' on the configure commandline.
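The level rule above can be sketched as a small pre-flight check, assuming a POSIX shell. `check_levels` is a hypothetical helper, not part of CAM's configure, and it encodes only what this post states: cam4 physics uses 26 levels and cam5 physics works with 30.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the level count matches the physics package.
# The 26/30 values are the ones quoted in this thread for cesm1_0_x.
check_levels() {
    phys=$1
    nlev=$2
    case "$phys" in
        cam4) expected=26 ;;
        cam5) expected=30 ;;
        *)    echo "unknown physics package: $phys" >&2; return 2 ;;
    esac
    [ "$nlev" -eq "$expected" ]
}

# Try the combination from the failing configure command and two valid ones.
for combo in "cam5 26" "cam5 30" "cam4 26"; do
    set -- $combo
    if check_levels "$1" "$2"; then
        echo "-phys $1 -nlev $2: consistent"
    else
        echo "-phys $1 -nlev $2: mismatch"
    fi
done
```

In practice the simpler route is the one recommended here: drop -nlev and let configure pick the default for the chosen -phys package.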

The 25 constituents for the cam5 physics package include 20 species for the modal aerosol code, 4 for the MG microphysics, and 1 for water vapor. If you tried to override the default cam5 microphysics scheme by setting microp_scheme='RK' in your namelist, then you'd get 2 fewer constituents, since RK uses only 2. That would explain why at runtime the code finds only 23 constituents.
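The constituent count above works out as follows. This is only the arithmetic from this post written as POSIX shell; the variable names are illustrative and are not CAM identifiers.

```shell
#!/bin/sh
# Constituent counts as quoted in this post.
modal_aerosol=20     # species for the modal aerosol code
mg_microphysics=4    # MG microphysics (cam5 default)
rk_microphysics=2    # RK microphysics
water_vapor=1

# Default cam5 package: this is the compiled-in pcnst.
pcnst_cam5=$((modal_aerosol + mg_microphysics + water_vapor))

# Same build, but with microp_scheme='RK' selected in the namelist.
found_with_rk=$((modal_aerosol + rk_microphysics + water_vapor))

echo "pcnst for default cam5:    $pcnst_cam5"
echo "constituents found (RK):   $found_with_rk"
```

The two totals, 25 and 23, are the numbers reported in the CNST_CHK_DIM error message.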

Note that to support one of CAM's major roles as a research tool, there is a lot of flexibility provided in the configure and build-namelist tools. But this also means that it is possible to configure the model in many ways that will not work. Unless you need the flexibility, the only safe way to change the configuration of the physics package is to use the -phys option to specify the entire cam4 or cam5 package. In that case the configure and build-namelist utilities will set up consistent default values.
 