Test case run problem

wlee@purdue_edu

New Member
Dear gurus of CCSM software development,

After a long CCSM build, I have reached the stage of running test cases.

I am running a test case under PBS on a Linux Intel cluster.
I set the MPI library path on the compute nodes with "module load mpich2-nemesis-pgi64".

However, when I run the case, I get the following:
--------------------------------------------------------------------------
[wlee@tg-steele TER.01a.T42_gx1v3.B.steele.233425]$ cat TestStatus.out
doing a 10 day initial test
--------------------------------------------------------------------------


The PBS error file shows this:
--------------------------------------------------------------------------
[wlee@tg-steele TER.01a.T42_gx1v3.B.steele.233425]$ cat TER.01a.T42_gx1.e116806
FORTRAN STOP
/grp/tgportal/CCSM/ccsm-data/archive/wlee/TER.01a.T42_gx1v3.B.steele.233425/cpl: No such file or directory.
--------------------------------------------------------------------------


The PBS output file ends like this:
-----------------------------------------------------------------------------------------------------
[wlee@tg-steele TER.01a.T42_gx1v3.B.steele.233425]$ tail TER.01a.T42_gx1.o116806
-------------------------------------------------------------------------
- CCSM BUILD HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
skipping first model
PBS_MOMPORT=15003
OMP_NUM_THREADS=1
COMP_ATM=cam
COMP_LND=clm
COMP_ICE=csim
COMP_OCN=pop
COMP_CPL=cpl
RAMP_CO2_START_YMD=00000000
Tue Jul 22 05:42:14 EDT 2008 -- CSM EXECUTION BEGINS HERE
Unrecognized argument mpirun.pgfile ignored.
(main) =========================================================================
(main) CCSM Coupler, version 6 (cpl6)
(main) CVS tag $Name: ccsm3_0_rel04 $
(main) date & time: 2008-07-22 05:42:15
(main) =========================================================================
(cpl_comm_init) setting up communicators, name = cpl
===================================
ERROR: no filetag in mph_processors_map.in
Tue Jul 22 05:42:15 EDT 2008 -- CSM EXECUTION HAS FINISHED
Model did not complete - see cpl.log.080722-054152
---------------------------------------------------------------------------------------------------------------------

My run script, which is called from the test script, looks like this (only the part that seems to cause the error):
--------------------------------------------------------------------------------------------------------
cd $EXEROOT/all
paste ${PBS_NODEFILE} mpirun.pgfile1 > mpirun.pgfile
echo "`date` -- CSM EXECUTION BEGINS HERE"
mpirun -pg mpirun.pgfile ./$COMPONENTS[1]
 
Hi,

I don't know whether you have solved this problem yet. I am using mpiexec rather than mpirun, but here is an example that uses mpirun. You might want to take a look at it.
http://nf.apac.edu.au/facilities/software/CCSM3/CCSM3_ac.html
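
As a minimal sketch of the mpiexec route: the log line "Unrecognized argument mpirun.pgfile ignored." suggests the launcher did not accept the -pg option, so one thing to try is the colon-separated MPMD form that MPICH2's mpiexec accepts (one "-n <tasks> ./<exe>" segment per program). The sketch below is bash rather than csh, the helper name build_mpmd_cmd is made up for illustration, the task counts are hypothetical placeholders, and the executable names are assumed to match the component names in the log (cpl, cam, clm, csim, pop). It only prints the command as a dry run so it can be checked before anything is launched.

```shell
#!/bin/bash
# Sketch only: assemble an MPMD launch line of the form
#   mpiexec -n <tasks> ./<exe> : -n <tasks> ./<exe> : ...
# build_mpmd_cmd is a hypothetical helper, not part of the CCSM scripts.

build_mpmd_cmd() {
    # Each argument is a "name:ntasks" pair, e.g. cpl:1
    local cmd="mpiexec" sep="" pair name ntasks
    for pair in "$@"; do
        name=${pair%%:*}      # text before the colon: executable name
        ntasks=${pair##*:}    # text after the colon: number of MPI tasks
        cmd="$cmd$sep -n $ntasks ./$name"
        sep=" :"              # every segment after the first is preceded by " :"
    done
    printf '%s\n' "$cmd"
}

# Dry run with placeholder task counts (assumed, not taken from this case).
# Replace them with the counts your case is actually configured for.
build_mpmd_cmd cpl:1 cam:8 clm:2 csim:2 pop:8
```

If the printed line looks right, it could be run from $EXEROOT/all in place of the "mpirun -pg mpirun.pgfile" line. The order of the segments and the task counts would presumably need to agree with what the case setup expects, so treat this as a starting point rather than a fix.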

Good luck!

-odden

