Performance issue while running a job in CAM 5.3

Hello, we have the PGI CDK 13.9 compiler installed on a Linux (CentOS 6.2) cluster with the Sun Grid Engine (SGE) job scheduler. The cluster has 1 master node and 9 compute nodes, with 12 processors in each compute node. I am building CAM as follows:


$ /home/2012asz8344/cam/cam5/cesm1_0/models/atm/cam/bld/configure -dyn fv -hgrid 1.9x2.5 -ntasks 32 -nosmp -fc pgf90 -cc pgcc -test >& config.log &
$ gmake -j8
$ /home/2012asz8344/cam/cam5/cesm1_0/models/atm/cam/bld/build-namelist -test -config /home/2012asz8344/cam5/x1/bld/config_cache.xml >& bld.log &

---------- Job scheduler SGE script for CAM ----------
#!/bin/sh
#$ -pe mpi 32
#$ -cwd
#$ -j y
#$ -S /bin/bash
#export PGI=/opt/pgi
cd /home/2012asz8344/cam5/x1/run
/opt/pgi/linux86-64/2013/mpi/mpich/bin/mpirun -np 32 -machinefile /home/2012asz8344/list_node /home/2012asz8344/cam5/x1/bld/cam
-------------------------------------------------------

I am seeing a performance problem on my cluster. When I submit the job to the queue, it is allocated 32 cores on three compute nodes, and the qstat command shows the job running on those three nodes with 32 cores. However, the qhost command shows that the job is also using all the memory on the other compute nodes, and some CAM processes are running on those other nodes as well. As a result, the job takes a long time to run and the performance of the whole cluster drops. I don't know why this is happening. Please suggest what I should do, and let me know if you need more details.

Thanks and regards,
Ankush
 

eaton

CSEG and Liaisons
A CAM5 run with FV, 1.9x2.5, and 32 tasks reported memory usage of 528 MB max and 344 MB min over the 32 tasks. Assuming the average is under 500 MB per task, the total memory use on one node running 12 tasks should be less than 6 GB. As a rough performance guide, on Intel Sandy Bridge processors with hyperthreading in use, that model configuration runs at about 4 model years/day. You'll need to work with your system administrators to understand how to run efficiently on your cluster.
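
The memory estimate above can be checked with a few lines of shell. This is only a rough sketch using the figures quoted in this thread (32 tasks, 12 cores per node, 344 to 528 MB per task); the 500 MB average is the assumed upper bound from the paragraph above, and the function names nodes_needed and node_mem_mb are made up for illustration.

```shell
#!/bin/sh
# Rough per-node memory estimate for the run discussed in this thread.
# AVG_MB is an assumed upper bound on the average, not a measured value.

TASKS=32            # MPI tasks requested (-ntasks 32, -pe mpi 32)
CORES_PER_NODE=12   # cores per compute node, as stated in the question
AVG_MB=500          # assumed upper bound on average memory per task
MAX_MB=528          # largest per-task memory reported in the reply

# Nodes needed when each core runs one task (ceiling division).
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Memory in MB on a fully loaded node: tasks on the node times MB per task.
node_mem_mb() {
    echo $(( $1 * $2 ))
}

echo "nodes needed:             $(nodes_needed "$TASKS" "$CORES_PER_NODE")"
echo "per-node MB (average):    $(node_mem_mb "$CORES_PER_NODE" "$AVG_MB")"
echo "per-node MB (worst case): $(node_mem_mb "$CORES_PER_NODE" "$MAX_MB")"
```

With these inputs the script reports 3 nodes, 6000 MB per node at the assumed average, and 6336 MB if every task on a node hit the reported maximum. If qhost shows memory use well beyond that, or on nodes outside the three that were allocated, the extra load is probably not the model's own expected footprint.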
 