CCSM3.0 runtime error

Hi!
I am new to CCSM. To get started, I set up the following test using create_test:
testcase: ER.01a
grid: T31_gx3v5
compset: B
machine: generic_linux

The configuration I am using is:
-------------------------- Configuration ------------------------------
Dual Intel® Xeon® quad core E5365 (3 GHz, 1333 MHz FSB) cpu
Lustre based cluster file system
20Gbps 4x DDR Infiniband Interconnect
Linux (kernel version : 2.6.9-55.9hp.4sp.XCsmp)
hpmpi (2.02.05.01)
Intel compilers (10.0.026)
LSF
-------------------------------------------------------------------------------------

The default processor distribution is used:
cpl - 2
csim - 8
clm - 8
pop - 24
cam - 16

The batch information is added to the test and run scripts as follows:
#==============================================================
# This is a CCSM batch job script for eka
#===============================================================
## BATCH INFO
#BSUB -J TER.01a.T31_gx3v5.B.generic_linux.114959
#BSUB -oo log.out -eo log.err
#BSUB -n 56


The model builds successfully.
I submit the script with:

>> bsub < TESTCASE.test

But I get the following warnings when I try to run it:

XLSF_UIDDIR=/opt/hptc/lsf/top/6.2/linux2.6-glibc2.3-x86_64-slurm/lib/uid
COMP_ATM=cam
COMP_LND=clm
COMP_ICE=csim
COMP_OCN=pop
COMP_CPL=cpl
RAMP_CO2_START_YMD=00000000
Tue Aug 5 15:43:41 IST 2008 -- CSM EXECUTION BEGINS HERE
cpl
(main) =============================================================
(main) CCSM Coupler, version 6 (cpl6)
(main) CVS tag $Name: ccsm3_0_rel04 $
(main) date & time: 2008-08-05 15:43:41
(main) ==============================================================
(cpl_comm_init) setting up communicators, name = cpl
===================================
(main) ==============================================================
(main) CCSM Coupler, version 6 (cpl6)
(main) CVS tag $Name: ccsm3_0_rel04 $
(main) date & time: 2008-08-05 15:43:41
(main) =============================================================
(cpl_comm_init) setting up communicators, name = cpl
===================================
warning: global processor 0 is overlapped
(cpl_comm_init) cpl_comm_comp, size: 43 2
warning: global processor 1 is overlapped
(cpl_comm_init) cpl_comm_comp, size: 43 2

===================================================================

What does the above warning mean? How should I handle it?

I am aware that I have to tailor the scripts to use bsub, but I don't know how. :(

Thanks & Regards,
Sandip.
 
Hello Sandip and list

You seem to have oversubscribed the processors/cores,
which is not very good for MPI.

The total number of tasks is 58 (= 2 + 8 + 8 + 24 + 16).
However, your LSF script requests only 56 processors,
which would leave two tasks without a core of their own.

You may want to change the line:

#BSUB -n 56

to

#BSUB -n 58

If you don't have 58 cores available, try reducing the number of tasks on
some components.
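
If you want to double-check the arithmetic before resubmitting, here is a minimal Python sketch. It is not part of the CCSM scripts; the names TASKS, HEADER, requested_slots and shortfall are made up for illustration, and the numbers are copied from your post. It only compares the sum of the component task counts with the value on the "#BSUB -n" line.

```python
import re

# Per-component MPI task counts, copied from the original post.
TASKS = {"cpl": 2, "csim": 8, "clm": 8, "pop": 24, "cam": 16}

# Batch header lines, copied from the original post.
HEADER = """\
#BSUB -J TER.01a.T31_gx3v5.B.generic_linux.114959
#BSUB -oo log.out -eo log.err
#BSUB -n 56
"""


def requested_slots(header):
    """Return the slot count from the '#BSUB -n N' line, or None if absent."""
    match = re.search(r"^#BSUB\s+-n\s+(\d+)\s*$", header, re.MULTILINE)
    return int(match.group(1)) if match else None


def shortfall(tasks, slots):
    """Return how many more slots the components need than were requested."""
    return max(0, sum(tasks.values()) - slots)


slots = requested_slots(HEADER)
print("tasks needed:", sum(TASKS.values()))
print("slots requested:", slots)
print("short by:", shortfall(TASKS, slots))
```

A shortfall of zero means the request covers every task. Anything above zero means you should either raise the "#BSUB -n" value or reduce the task counts of some components.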

I hope this helps.

Gus Correa
 