ERROR: cpu distribution

Hi,

I have a problem running CCSM on an SGI Altix 3700 with the Intel compiler.
I run CCSM with no changes to any code, timesteps, or initial conditions. The only things I changed are the Makefile macros in ./model/bld/Macros.* and the number and distribution of processors.
My Macros.Linux-ia64 contains:
**********************************
# CVS $Id: Macros.Linux-ia64,v 1.11 2004/07/12 14:09:50 gerardo Exp $
# CVS $Source: /fs/cgd/csm/models/CVS.REPOS/shared/bld/Macros.Linux,v $
# CVS $Name: ccsm3_0 $
#===============================================================================
# Makefile macros for "Linux-ia64", supports Intel 8.x compilers
#===============================================================================

# INCLDIR := -I. -I/usr/local/include -I$(INCROOT) -I/include

# SLIBS := -L/usr/local/lib -lnetcdf -lmpi -lscs
INCLDIR := -I. -I/usr/local/include -I/usr/include -I/usr/local/netcdf3.6.2/include -I$(INCROOT)
SLIBS := -L/usr/local/lib -L/usr/lib -L/usr/local/netcdf3.6.2/lib -lnetcdf -lmpi -lscs
ULIBS := -L$(LIBROOT) -lesmf -lmct -lmpeu -lmph
CPP := NONE
CPPFLAGS :=
CPPDEFS := -DSGI_IA64 -DLINUX -DNO_SHR_VMATH
CC := icc
CFLAGS := -c -cpp -O2 -ftz -tpp2 -fno-alias -fno-fnalias -ip
FIXEDFLAGS :=
FREEFLAGS :=
FC := ifort
FFLAGS := -c -cpp -r8 -i4 -O2 -ftz -mp -convert big_endian

MOD_SUFFIX := mod
LD := $(FC)
LDFLAGS := -Wl,-noinhibit-exec -Vaxlib -posixlib
AR := ar

ifeq ($(MODEL),pop)
CPPDEFS := $(CPPDEFS) -DPOSIX -Dimpvmix -Dcoupled -DNPROC_X=$(NX) -DNPROC_Y=$(NY)
FIXEDFLAGS := -132 -assume byterecl
endif

ifeq ($(MODEL),csim)
CPPDEFS := $(CPPDEFS) -Dcoupled -DNPROC_X=$(NX) -DNPROC_Y=$(NY) -D_MPI
FIXEDFLAGS := -132 -assume byterecl
endif

ifeq ($(THREAD),TRUE)
# CPPFLAGS := $(CPPFLAGS) -D_OPENMP
# -D_OPENMP is redundant if -openmp is enabled in ifort/efc/icc/ecc
CPPDEFS := $(CPPDEFS) -DTHREADED_OMP
# FREEFLAGS := $(FREEFLAGS) -mp
FREEFLAGS := $(FREEFLAGS) -openmp
FIXEDFLAGS := $(FIXEDFLAGS) -openmp
CFLAGS := $(CFLAGS) -openmp
# LDFLAGS := $(LDFLAGS) -mp
LDFLAGS := $(LDFLAGS) -openmp
endif

**********************************
When I set the CPU distribution in the file env_mach.altix as follows:
set ntasks_atm = 8; set nthrds_atm = 1
set ntasks_lnd = 4; set nthrds_lnd = 1
set ntasks_ice = 2; set nthrds_ice = 1
set ntasks_ocn = 4; set nthrds_ocn = 1
set ntasks_cpl = 1; set nthrds_cpl = 1
the run completes without problems. But when I change
set ntasks_cpl = 1; set nthrds_cpl = 1
to
set ntasks_cpl = 2; set nthrds_cpl = 1

the run ends with an error:
**************************************
(main) -------------------------------------------------------------------------
(main) partial map data init, frac init
(main) -------------------------------------------------------------------------
(cpl_map_init) initialize map: map_Fo2a
(cpl_map_init) scatter matrix by column...
[1] Exit 1 mpirun -v -d /lun/home/ccl/model/T42/A2/all -np 2 cpl : -np 2 csim : -np 4 clm ...
Tue Apr 21 14:18:54 CST 2009 -- CSM EXECUTION HAS FINISHED
Model did not complete - see cpl.log.090421-141801
**************************************

The last few lines in cpl.log.090421-141801 are:
*************************************
(main) -------------------------------------------------------------------------
(main) partial map data init, frac init
(main) -------------------------------------------------------------------------
(cpl_map_init) initialize map: map_Fo2a
(cpl_map_read) reading mapping matrix data...
(cpl_map_read) * file name : map_gx1v3_to_T42_aave_da_010709.nc
(cpl_map_read) * matrix dimensions rows x cols : 122880 x 8192
(cpl_map_read) * number of non-zero elements: 145548
(cpl_map_read) ... done reading file
(cpl_map_init) skipping map test, dbug level = 1
(cpl_map_init) scatter matrix by column...
*************************************
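
For reference, the total number of MPI tasks implied by the failing configuration can be tallied with a small script. This is only a sketch for checking the layout against the available CPUs, not a diagnosis of the crash: it assumes the `set ntasks_* = N; set nthrds_* = M` format shown above, and `tally_cpus` is a hypothetical helper name, not part of CCSM.

```python
import re

# Settings copied from env_mach.altix in the failing configuration (ntasks_cpl = 2).
ENV_MACH = """
set ntasks_atm = 8; set nthrds_atm = 1
set ntasks_lnd = 4; set nthrds_lnd = 1
set ntasks_ice = 2; set nthrds_ice = 1
set ntasks_ocn = 4; set nthrds_ocn = 1
set ntasks_cpl = 2; set nthrds_cpl = 1
"""


def tally_cpus(text):
    """Return {component: (ntasks, nthrds)} parsed from csh-style 'set' lines.

    Hypothetical helper: assumes every setting looks like
    'set ntasks_<comp> = N' or 'set nthrds_<comp> = N'.
    """
    values = {}
    pattern = r"set\s+(ntasks|nthrds)_(\w+)\s*=\s*(\d+)"
    for kind, comp, num in re.findall(pattern, text):
        values.setdefault(comp, {})[kind] = int(num)
    # Default to 1 when a component lists only one of the two settings.
    return {c: (v.get("ntasks", 1), v.get("nthrds", 1)) for c, v in values.items()}


layout = tally_cpus(ENV_MACH)
total_tasks = sum(tasks for tasks, _ in layout.values())
total_cpus = sum(tasks * threads for tasks, threads in layout.values())

for comp, (tasks, threads) in layout.items():
    print(f"{comp}: {tasks} tasks x {threads} threads")
print("total MPI tasks:", total_tasks)
print("total CPUs needed:", total_cpus)
```

With these settings the script reports 20 MPI tasks, compared with 19 for the working configuration where ntasks_cpl = 1.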

This seems to mean that the number of cpl tasks cannot be larger than 1. What is the problem, and how can I fix it?
Thanks!
 