liushan@mail_iap_ac_cn
New Member
Hi everyone,
Does anyone know how to set up CCSM3 to run under MPICH 1.2.7?
In my run script I wrote "/opt/pgi/linux86-64/7.1/mpi/mpich/bin/mpirun -p4pg $EXEROOT/mpirun.pgfile ./$COMPONENTS[1]", but there seems to be something wrong with it.
I got error messages like the following.
ccsm.o contains:
…
p23_27852: p4_error: : 197
rm_l_23_27871: (1.363281) net_send: could not write to fd=5, errno = 32
(cpl_comm_init) cpl_comm_comp, size: 137 4
(cpl_comm_init) comm world : comm,npe,pid 133 32 22
(cpl_comm_init) comm component: comm,npe,pid 137 4 2
(cpl_comm_init) comm world pe0: atm,ice,lnd,ocn,cpl,me 8 1 2 6 0 6
(cpl_comm_init) mph cid : atm,ice,lnd,ocn,cpl,me 1 2 3 4 5 4
…
p6_25585: p4_error: net_recv read: probable EOF on socket: 1
P4 procgroup file is /mnt/storage-space/disk1/lius/exe/TDB .01a .T31_gx3v5.B.jazz.205822/mpirun.pgfile.
…
And ccsm.e contains:
23 - MPI_CART_SHIFT : Null communicator
[23] Aborting program !
[23] Aborting program!
PGFIO/stdio: Bad file descriptor
PGFIO-F-/OPEN/unit=10/error code returned by host stdio - 9.
File name = mph_processors_map.in formatted, sequential access record = 8
In source file /mnt/storage-space/disk1/lius/ccsmroot/ccsm3/models/csm_share/shr/shr_msg_mod.F90, at line number 102
22 - MPI_CART_SHIFT : Null communicator
[22] Aborting program !
[22] Aborting program!
Here is my system information:
Machine: Intel Xeon cluster running Linux, 16 CPUs per node
MPI: MPICH 1.2.7, launched with mpirun
Compiler: PGI 7.1
Batch: PBS
Network: Gigabit Ethernet
Can anyone give some advice?
Thanks.
Liu. S