MPICH on Linux cluster

lillyw@...
Hi,
I configured MPICH 1.2.7 with the ch_p4 device and --with-comm=shared on a Linux cluster. The MPI test program runs fine on a single SMP machine, but it fails when a remote machine is involved. machines.LINUX includes the remote machine's name:
>
>****************************
>breeze:~/mpich-1.2.7/bin/mpirun -np 4 -v -machinefile machines.LINUX ~/mpich-1.2.7/examples/basic/cpi
>running /home/joy/mpich-1.2.7/examples/basic/cpi on 4 LINUX ch_p4 processors
>Created /home/joy/mpich-1.2.7/examples/basic/PI20515
>rm_7455: p4_error: rm_start: net_conn_to_listener failed: 45830
>p0_20647: p4_error: Child process exited while making connection to remote process on haze: 0
>p0_20647: (14.683511) net_send: could not write to fd=7, errno = 32

Does every machine in the machine list need a common filesystem, i.e. do I have to mount a shared filesystem on all of them?
Or will copying the files to each machine work?
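
To make the "copy" option concrete, here is a minimal sketch of what it could look like: it reads a machines file and prints one scp command per remote host, placing the binary at the same absolute path as on the local machine. It is a dry run and executes nothing. The machines-file contents (`breeze:2`, `haze:2`) and the helper names `parse_machines` and `copy_commands` are assumptions for illustration; only the host names and the cpi path come from the output above. Whether copying alone is enough for ch_p4 is the open question here, so treat this as an illustration rather than a fix.

```python
import shlex


def parse_machines(text):
    """Return the unique host names from machines-file text.

    Assumed format: one host per line, optionally followed by ':N'
    (number of CPUs); '#' starts a comment; blank lines are ignored.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        host = line.split(":", 1)[0].strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def copy_commands(hosts, binary, local_host):
    """Build (but do not run) one scp command per remote host.

    The binary is copied to the same absolute path it has locally.
    """
    cmds = []
    for host in hosts:
        if host == local_host:
            continue  # the local machine already has the binary
        target = host + ":" + binary
        cmds.append("scp " + shlex.quote(binary) + " " + shlex.quote(target))
    return cmds


if __name__ == "__main__":
    # Hypothetical machines.LINUX contents; the real file is not shown above.
    sample = "breeze:2\nhaze:2\n"
    binary = "/home/joy/mpich-1.2.7/examples/basic/cpi"
    for cmd in copy_commands(parse_machines(sample), binary, "breeze"):
        print(cmd)
```

The commands are only printed so they can be checked before anything is copied.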

Thanks,
Joy
