How to submit my jobs?

I want to run CCSM 3.0 on my Linux cluster, but LSF (Load Sharing Facility) does not accept the mpirun command.
So the line "mpirun -pg mpirun.pgfile ./$COMPONENTS" has to be modified, but I don't know how. Please help me!
 
Could you please explain this in more detail?
hare said:
the LSF (Load Sharing Facility) cannot bear the command : mpirun.
Do you see any error messages?

In general, the mpirun utility is part of an MPI distribution, not of LSF.
Please check your $PATH variable and whether your MPI is properly installed.
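
A quick way to do the $PATH check is sketched below. It only reports which launchers the shell can find; the list of names (mpirun, mpiexec, mpirun.lsf) is an assumption taken from the commands mentioned in this thread, and the helper name find_launcher is hypothetical.

```shell
#!/bin/sh
# Report which MPI launchers are visible on $PATH.
# find_launcher is a hypothetical helper name used only in this sketch.
find_launcher() {
    # command -v prints the resolved path and returns non-zero if nothing is found
    command -v "$1" 2>/dev/null
}

# Launcher names are assumptions based on the commands mentioned in this thread.
for launcher in mpirun mpiexec mpirun.lsf; do
    if resolved=$(find_launcher "$launcher"); then
        echo "$launcher: $resolved"
    else
        echo "$launcher: not found on PATH"
    fi
done
```

If a launcher is reported as not found, the directory that holds the MPI binaries probably still needs to be added to $PATH in the job script or the login environment.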

If you are using an MPI-2 implementation, you may need to use mpiexec instead of mpirun. Read the respective manuals and correct the script accordingly, because mpiexec uses a different command-line syntax.

I have the following in my run script:

# -------------------------------------------------------------------------
# Create processor count input files
# -------------------------------------------------------------------------

cd $EXEROOT/all
@ PROC = 0 # counts total number of tasks
#rm -rf mpirun.pgfile1 #Create new pgfile
#rm -rf mpirun.pgfile #Create new pgfile
#echo "0" >! mpirun.pgfile1;
rm -rf gforker.cmdline

foreach n (1 2 3 4 5)
    set comp = $COMPONENTS[$n]
    set model = $MODELS[$n]
    set nthrd = $NTHRDS[$n]
    set ntask = $NTASKS[$n]
    @ M = 0

    ln -s $EXEROOT/$model/$comp $EXEROOT/all/. # link binaries into all dir
    echo -n "-n $ntask $EXEROOT/all/$comp " >>! gforker.cmdline
    if ($n < 5) then
        echo ": " >>! gforker.cmdline
    endif

    while ( $M < $ntask )
        if (($n == 1) && ($M == 0)) then
            # echo "skipping first model"
        else
            # echo "1 $EXEROOT/all/$comp" >>! mpirun.pgfile1;
        endif
        @ M++
        @ PROC++
    end
    # ln -s $EXEROOT/$model/$comp $EXEROOT/all/. # link binaries into all dir
end
# -------------------------------------------------------------------------
# Run the model
# -------------------------------------------------------------------------

env | egrep '(MP_|LOADL|XLS|FPE|DSM|OMP|MPC)' # document env vars

cd $EXEROOT/all
#paste ${PBS_NODEFILE} mpirun.pgfile1 > mpirun.pgfile
#echo local $PROC > mpirun.pgfile
echo "`date` -- CSM EXECUTION BEGINS HERE"
#mpirun -p4pg mpirun.pgfile ./$COMPONENTS[1]
mpiexec `cat gforker.cmdline`
wait
echo "`date` -- CSM EXECUTION HAS FINISHED"


I didn't delete the original lines; I only commented them out so that you can easily find the corresponding part of the script.

My script generates the file gforker.cmdline in the following format:
-n 1 cpl :
-n 2 clm :
-n 4 pop :
-n 4 csim :
-n 16 cam

This file contains the command line for mpiexec (I use the gforker variant), which is substituted into the mpiexec call with `cat` inside backquotes.
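
The same idea can be sketched in plain sh, for readers who do not use csh. The function name build_cmdline and the temporary file location are assumptions made for this illustration; the component names and task counts are the ones from the example file above. The sketch only prints the final command instead of launching it, so it can be tried without MPI installed.

```shell
#!/bin/sh
# Build an argument list of the form "-n <tasks> <exe> : -n <tasks> <exe> ...".
# Arguments are pairs: component name followed by its task count.
# build_cmdline is a hypothetical helper name used only in this sketch.
build_cmdline() {
    cmdline=""
    while [ "$#" -ge 2 ]; do
        segment="-n $2 $1"
        if [ -z "$cmdline" ]; then
            cmdline="$segment"
        else
            cmdline="$cmdline : $segment"
        fi
        shift 2
    done
    printf '%s\n' "$cmdline"
}

# Temporary file location is an assumption; the post writes into $EXEROOT/all.
cmdfile="${TMPDIR:-/tmp}/gforker.cmdline.$$"
build_cmdline cpl 1 clm 2 pop 4 csim 4 cam 16 > "$cmdfile"

# The unquoted substitution splits the file contents into separate words,
# which is what lets mpiexec see each "-n", count, and name as its own argument.
# echo prints the command rather than running it.
echo mpiexec $(cat "$cmdfile")

rm -f "$cmdfile"
```

Because the shell splits the substituted text on any whitespace, it does not matter whether the file holds one long line or one segment per line, as in the file shown above.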
 

rneale

Rich Neale
CAM Project Scientist
Staff member
I have attached the wrapper mpirun.lsf, which is used to run the mpirun command through poe on one of the NCAR systems with LSF. Hopefully you can mine it for something useful!






#! /bin/sh
#$Id: mpirun.lsf,v 1.13 2005/12/16 19:05:18 llee Exp $
#
# ---------------------------------------------------------------
# mpirun.lsf is used for lammpi, mpich_gm, poe, and mpichp4
# it takes the following options:
# mpirun.lsf [-pam "pam_options"] [mpi_options] job [job_options]
# mpirun.lsf generates a pam command line like this:
# pam "pam_options" -g PJL_wrapper mpi_options job job_options
# The name of PJL_wrapper is obtained from the environment variable LSF_PJL_TYPE
# set by esub. LSF_PJL_TYPE=lammpi|mpich_gm|poe|mpichp4|mvapich
#
# the user has to define this environment variable if esub is not
# invoked during job submission.
# ---------------------------------------------------------------

# to start TotalView to debug MPI jobs.
# User should add correct path of totalview to $PATH
# at submission host.
TV="totalview"
Corefile=$TVCORE #TV's corefile name.

# -------------------------------------------------------
# parse command line options
# -------------------------------------------------------
PAM_OPTS=""
TV_ARGS=""
MPI_JOB_CMDLN=""
while [ "$#" -gt 0 ]
do
case "$1" in
-pam)
shift
PAM_OPTS="$PAM_OPTS $1"
shift
;;
-tvopt)
shift
TV_ARGS="$TV_ARGS $1"
shift
;;
*)
MPI_JOB_CMDLN="$MPI_JOB_CMDLN $1"
shift
;;
esac
done

# we want to break pam options into two groups: one contains the
# debug options, which start with "-pass" (all options after it are
# assumed to be debug options as well);
# everything before "-pass" is considered a non-debug pam option
_PAM_OPTS="$PAM_OPTS"
PAM_OPTS=""
PASS_OPTS=""
PASS=""
for OPTION in $_PAM_OPTS; do
if [ "$OPTION" = "-pass" ]; then
PASS="$OPTION"
fi
if [ -n "$PASS" ]; then
PASS_OPTS="$PASS_OPTS $OPTION"
else
PAM_OPTS="$PAM_OPTS $OPTION"
fi
done

# -------------------------------------------------------
# get pam debug flags from pam options
# -------------------------------------------------------

# -------------------------------------------------------
# find out PJL_TYPE dynamically, if LSF_PJL_TYPE empty
# -------------------------------------------------------

if [ "$LSF_PJL_TYPE" = "" ]
then
. $LSF_ENVDIR/lsf.conf
[ "$LSB_DEFAULT_PJLTYPE" = "" ] && {
# MPI Autodetection is not set. Print an error message and exit
echo "mpirun.lsf: LSF_PJL_TYPE is undefined. Exit ..." 1>&2
exit 1
}

[ "$LSB_MCPU_HOSTS" = "" ] && {
echo "mpirun.lsf: LSB_MCPU_HOSTS is undefined. Aborting." 1>&2
exit 1
}

case `uname` in
SunOS) AWK="nawk";;
*) AWK="awk";;
esac

hosts=`echo $LSB_MCPU_HOSTS |
sed 's/[ ][ ]*[1-9][0-9]*[ ]/ /g;s/[ ][ ]*[1-9][0-9]*$//'`

#
# we want both fast processing and preserved preference order
# thus an associative array (tbl) and an auxiliary indexed array (arr)
#
PJL_TYPE=`lshosts $hosts |
$AWK '
BEGIN {
typesz = split("'"$LSB_DEFAULT_PJLTYPE"'",typearr);
for (i = 1; i
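
The wrapper listing above is cut off at this point. One step that is fully visible in it is the sed expression that turns LSB_MCPU_HOSTS into a plain host list. The sketch below runs that same expression on a sample value; the sample host names and slot counts are assumptions, as is the helper name strip_slot_counts.

```shell
#!/bin/sh
# Strip the per-host slot counts from an LSB_MCPU_HOSTS-style string
# ("host count host count ..."), using the sed expression from the wrapper.
# strip_slot_counts is a hypothetical helper name used only in this sketch.
strip_slot_counts() {
    echo "$1" |
        sed 's/[ ][ ]*[1-9][0-9]*[ ]/ /g;s/[ ][ ]*[1-9][0-9]*$//'
}

# Sample value is an assumption; in a real job LSF sets LSB_MCPU_HOSTS itself.
sample="hostA 4 hostB 2"
strip_slot_counts "$sample"
```

For the sample value this prints the two host names without their counts, which is the list the wrapper then passes to lshosts.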
 
Thanks for your answer!

The command is: mpirun.lsf -v -pg mpirun.pgfile ./$COMPONENTS[1]
But I cannot run it. The output is as follows:
Aug 31 09:32:25 2006 13573 3 6.1 PAM: pjlSpawn: cannot exec the PJL: No such file or directory
Aug 31 09:32:26 2006 13558 3 6.1 PAM: An error occurred starting the PJL.

Can anyone help me?
 