
MPICH on linux cluster

lillyw@...

Hi,
I configured MPICH 1.2.7 with the ch_p4 device and --with-comm=shared on a Linux cluster. The MPI test program runs fine on a single SMP machine, but it fails when a remote machine is involved. machines.LINUX includes the remote machine name:
>
>****************************
>breeze:~/mpich-1.2.7/bin/mpirun -np 4 -v -machinefile machines.LINUX ~/mpich-1.2.7/examples/basic/cpi
>running /home/joy/mpich-1.2.7/examples/basic/cpi on 4 LINUX ch_p4 processors
>Created /home/joy/mpich-1.2.7/examples/basic/PI20515
>rm_7455: p4_error: rm_start: net_conn_to_listener failed: 45830
>p0_20647: p4_error: Child process exited while making connection to remote process on haze: 0
>p0_20647: (14.683511) net_send: could not write to fd=7, errno = 32

Is a common filesystem required on all machines in the machine list? Does that mean I have to mount a shared filesystem on every machine, or will copying the files to each machine work?
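
To make the question concrete, here is a minimal sketch of how one could check whether the executable is present at the same path on every machine. It assumes machines.LINUX has one "host" or "host:ncpus" entry per line with '#' comments, and that ssh is the remote shell (substitute rsh if that is what mpirun was configured with). The helper names list_hosts and print_checks are made up for illustration. It only prints the check commands and does not run them, and a missing executable may not be the actual cause of the p4_error above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MPICH.

# List the unique host names in an MPICH-1 style machines file.
# Assumes one "host" or "host:ncpus" entry per line, with '#' comments.
list_hosts() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" |
        awk -F: 'NF && !seen[$1]++ { print $1 }'
}

# Print (do not run) one check command per host.
# $1 = machines file, $2 = absolute path of the MPI executable.
# ssh is an assumption; replace it with rsh if needed.
print_checks() {
    for host in $(list_hosts "$1"); do
        echo "ssh $host test -x $2"
    done
}
```

Example use: print_checks machines.LINUX /home/joy/mpich-1.2.7/examples/basic/cpi prints one line per machine. Running a printed command and looking at its exit status shows whether that machine can see the executable at that path.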

Thanks,
Joy
