CCSM3.0 runtime error

Hi!
I am a newbie to CCSM. To get started, I set up the following test using create_test:
testcase: ER.01a
grid: T31_gx3v5
compset: B
machine : generic_linux

The configuration I am using is:
-------------------------- Configuration ------------------------------
Dual Intel® Xeon® quad core E5365 (3 GHz, 1333 MHz FSB) cpu
Lustre based cluster file system
20Gbps 4x DDR Infiniband Interconnect
Linux (kernel version : 2.6.9-55.9hp.4sp.XCsmp)
hpmpi (2.02.05.01)
Intel compilers (10.0.026)
LSF
-------------------------------------------------------------------------------------

The default processor distribution is used:
cpl - 2
csim - 8
clm - 8
pop - 24
cam - 16

Batch information is added to the test and run scripts as follows:
#===============================================================
# This is a CCSM batch job script for eka
#===============================================================
## BATCH INFO
#BSUB -J TER.01a.T31_gx3v5.B.generic_linux.114959
#BSUB -oo log.out -eo log.err
#BSUB -n 56


The model builds successfully.
I submit the script with:

>> bsub < TESTCASE.test

But I get the following warning when I try to run:

XLSF_UIDDIR=/opt/hptc/lsf/top/6.2/linux2.6-glibc2.3-x86_64-slurm/lib/uid
COMP_ATM=cam
COMP_LND=clm
COMP_ICE=csim
COMP_OCN=pop
COMP_CPL=cpl
RAMP_CO2_START_YMD=00000000
Tue Aug 5 15:43:41 IST 2008 -- CSM EXECUTION BEGINS HERE
cpl
(main) =============================================================
(main) CCSM Coupler, version 6 (cpl6)
(main) CVS tag $Name: ccsm3_0_rel04 $
(main) date & time: 2008-08-05 15:43:41
(main) ==============================================================
(cpl_comm_init) setting up communicators, name = cpl
===================================
(main) ==============================================================
(main) CCSM Coupler, version 6 (cpl6)
(main) CVS tag $Name: ccsm3_0_rel04 $
(main) date & time: 2008-08-05 15:43:41
(main) =============================================================
(cpl_comm_init) setting up communicators, name = cpl
===================================
warning: global processor 0 is overlapped
(cpl_comm_init) cpl_comm_comp, size: 43 2
warning: global processor 1 is overlapped
(cpl_comm_init) cpl_comm_comp, size: 43 2

===================================================================

What does the above warning mean? How should I handle it?

I am aware that I have to tailor the scripts to make use of bsub, but I don't know how :(

Thanks & Regards,
Sandip.
 
Hello Sandip and list

You seem to have oversubscribed the processors/cores
(more MPI tasks than allocated slots), which MPI jobs generally do not handle well.

The total number of tasks is 58 ( = 2 + 8 + 8 + 24 + 16).
However, your LSF script requests only 56 processors.

You may want to change the line:

#BSUB -n 56

to

#BSUB -n 58

If you don't have 58 cores available, try reducing the number of tasks on
some components.
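
The mismatch can be checked with a few lines of shell before submitting. This is only a sketch: the variable names are made up for illustration and are not CCSM settings, and the numbers are copied by hand from the processor distribution and the #BSUB -n line quoted in this thread.

```shell
#!/bin/sh
# Task counts per component, copied from the processor distribution above.
# These variable names are hypothetical, not CCSM script variables.
ntasks_cpl=2
ntasks_csim=8
ntasks_clm=8
ntasks_pop=24
ntasks_cam=16

# Slots requested from LSF on the "#BSUB -n" line.
requested=56

# Total MPI tasks the coupled model will start.
total=$((ntasks_cpl + ntasks_csim + ntasks_clm + ntasks_pop + ntasks_cam))
echo "total MPI tasks: $total"

if [ "$total" -gt "$requested" ]; then
    echo "oversubscribed: need $total slots, requested $requested"
else
    echo "ok: $requested slots cover $total tasks"
fi
```

With the values above the script reports 58 tasks against 56 requested slots, which is the two-task shortfall.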

I hope this helps.

Gus Correa
 