Bug in parallel computing

I intermittently get the following message when running a parallel job with an OpenFOAM application compiled with the Intel compiler (icc) and Intel MPI. A single running job works fine, but when multiple jobs run at the same time, all of them crash.

lsb_launch(): Failed while waiting for tasks to finish.
[mpiexec@ys0271] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:101): one of the processes terminated badly; aborting
[mpiexec@ys0271] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:18): bootstrap device returned error waiting for completion
[mpiexec@ys0271] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:521): bootstrap server returned error waiting for completion
[mpiexec@ys0271] main (./ui/mpich/mpiexec.c:548): process manager error waiting for completion
 

jedwards

CSEG and Liaisons
Staff member
This looks like a system issue, not a CESM issue. Please consult your system administrators. If you are using CESM to generate this error, try something simple, like a hello_world program, instead.
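
One way to try the hello_world suggestion is a minimal MPI smoke test like the sketch below. It assumes mpi4py is installed and built against the same MPI library as the failing application; the file name `hello_mpi.py` and the `greeting` helper are made up for illustration. If mpi4py is missing, it says so and runs as a single process, which does not exercise MPI at all.

```python
"""Minimal MPI smoke test (a sketch; assumes mpi4py is available for the MPI path)."""
import socket


def greeting(rank, size, host):
    # Hypothetical helper: formats one line per MPI rank.
    return f"Hello from rank {rank} of {size} on {host}"


def main():
    try:
        # Assumption: mpi4py is installed and linked against the MPI in use.
        from mpi4py import MPI
    except ImportError:
        print("mpi4py not found; running as a single process (MPI is NOT being tested)")
        print(greeting(0, 1, socket.gethostname()))
        return 0

    comm = MPI.COMM_WORLD
    print(greeting(comm.Get_rank(), comm.Get_size(), MPI.Get_processor_name()))
    # Wait for every rank, so a rank that dies early shows up as a launcher error.
    comm.Barrier()
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Launched with something like `mpiexec -n 4 python hello_mpi.py` inside the same kind of batch job, and then submitted several times at once, this could help show whether the crash also happens without OpenFOAM. If it does, that points toward the MPI launcher or the scheduler setup rather than the application.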
 