wl@eimb_ru
Member
I am trying to run CCSM3 on a Cray XD1.
uname -a gives:
$ uname -a
Linux c613n6 2.6.5_H_01_03 #41 SMP Wed Sep 13 09:21:55 CDT 2006 x86_64 x86_64 x86_64 GNU/Linux
Compiler: PGI 6.1.4
MPI: mpich-1.2.6
Since I am limited in the number of physical processors I can use (16 at most, and other researchers also use some of them), I asked the support staff to raise the limit on the number of processes that can run simultaneously on a compute node (the "ppn" resource of the PBS queues).
It was 2; they increased it to 8.
However, when the number of processes per node (ppn) goes above 2, model execution becomes terribly slow, taking from 30 min to 1 hour per modelling day.
I have already run CCSM3 on an SMP computer with 4 physical processors (dual-core AMD Opterons) under Linux, compiled with PGI.
I had no problems running 24 CCSM3 processes (2 cpl, 2 csim, 2 pop, 2 clm and 16 cam), that is, 6 processes per processor (plus other system processes: kernel, X, daemons, etc.). Even more CCSM3 processes were possible. The maximum simulation speed I achieved was about 2 min per modelling day.
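To make the process counts on the SMP machine explicit, here is a small sketch of the arithmetic. The variable names are only illustrative, and the per-core figure assumes that "dual core" means 2 cores on each of the 4 processors, i.e. 8 cores in total.

```python
# Component layout of the SMP run described above.
layout = {"cpl": 2, "csim": 2, "pop": 2, "clm": 2, "cam": 16}

physical_processors = 4   # from the description of the SMP machine
cores_per_processor = 2   # assumption: dual-core Opterons, so 8 cores in total

total_processes = sum(layout.values())
per_processor = total_processes / physical_processors
per_core = total_processes / (physical_processors * cores_per_processor)

print("total CCSM3 processes:", total_processes)
print("processes per processor:", per_processor)
print("processes per core:", per_core)
```

So the SMP machine was already oversubscribed several times over and still ran at about 2 min per modelling day.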
I took the same CCSM3 sources and transferred them to the Cray, which is more powerful, yet the simulation there is this slow.
Where could the problem be?