rambhari0123@gmail_com
Member
Hello, we have the PGI CDK 13.9 compiler installed on a Linux (CentOS 6.2) cluster with the Sun Grid Engine (SGE) job scheduler. The cluster has 1 master node and 9 compute nodes, with 12 processors in each compute node. I am building CAM as follows:
$ /home/2012asz8344/cam/cam5/cesm1_0/models/atm/cam/bld/configure -dyn fv -hgrid 1.9x2.5 -ntasks 32 -nosmp -fc pgf90 -cc pgcc -test >& config.log &
$ gmake -j8
$ /home/2012asz8344/cam/cam5/cesm1_0/models/atm/cam/bld/build-namelist -test -config /home/2012asz8344/cam5/x1/bld/config_cache.xml >& bld.log &

----- Job scheduler SGE script for CAM -----
#!/bin/sh
#$ -pe mpi 32
#$ -cwd
#$ -j y
#$ -S /bin/bash
#export PGI=/opt/pgi
cd /home/2012asz8344/cam5/x1/run
/opt/pgi/linux86-64/2013/mpi/mpich/bin/mpirun -np 32 -machinefile /home/2012asz8344/list_node /home/2012asz8344/cam5/x1/bld/cam
--------------------------------------------

I am seeing a performance problem on my cluster. When I submit the job to the queue, qstat shows it running on three compute nodes with 32 cores. However, qhost shows the job using all the memory on the other compute nodes as well, and some CAM processes are also running on those other nodes. As a result the job takes a long time to execute, and the performance of the whole cluster drops.

I don't know why this is happening. Please suggest what I should do, and let me know if you need more details.

Thanks and regards,
Ankush
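One thing that may be worth checking: the script above passes mpirun a fixed machinefile (/home/2012asz8344/list_node), so the hosts mpirun starts processes on need not match the hosts SGE granted to the job. The following is only a rough sketch of how the granted hosts could be turned into a machinefile instead. It assumes that SGE sets $PE_HOSTFILE inside the job with lines of the form "host slots queue processor-range", and that this mpirun accepts "host:slots" entries; neither has been verified on this cluster. The function name pe_to_machinefile is made up for illustration.

```shell
#!/bin/sh
# Sketch: build an MPICH-style machinefile from the hosts SGE granted.
# Assumption: each line of $PE_HOSTFILE is "host slots queue processor-range".

pe_to_machinefile() {
    # $1 = path to the SGE PE hostfile; prints one "host:slots" line per host
    awk 'NF >= 2 { print $1 ":" $2 }' "$1"
}

# Inside a job SGE sets PE_HOSTFILE and NSLOTS; outside a job this is skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    machines="${TMPDIR:-/tmp}/machines.$$"
    pe_to_machinefile "$PE_HOSTFILE" > "$machines"
    echo "Granted hosts written to $machines (${NSLOTS:-?} slots)"
fi
```

In the job script, the mpirun line could then use -np $NSLOTS and -machinefile "$machines" in place of the hard-coded 32 and list_node, so that the processes start only on the nodes qstat reports for the job.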