CAM hangs when run with more than one node

Hi all,

I'm trying to run CAM 3.1 on a Sun Opteron Linux cluster, but I'm having problems using more than one node. I can build and run CAM successfully on multiple processors, but only if I stay within a single node. As soon as I try to run across more than one node, CAM hangs: no further output appears, and the job eventually times out. The last line in my output is something like this:

nlong( 64 )= 128 wnummax(64 ) = 42

The system is running SuSE with the PathScale 2.5 compiler and Voltaire InfiniBand. Also, when I run configure with the -test option, MPI (and everything else) seems to check out just fine.

Has anyone experienced something like this before? Any thoughts you may have would be greatly appreciated.

Thanks,

Matt Higgins
 
These questions might help:
--> Was CAM configured with SPMD?
--> How many processes are you trying to use, and at what resolution? With the eul and sld dycores in CAM 3.1, you cannot run on more processes than nlat (the number of latitudes).
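
To make the second question concrete, here is a rough Python sketch of the check, not anything taken from CAM itself. The function name check_task_count is made up for illustration, and nlat = 64 is only an assumption read off the "nlong( 64 )= 128" line in the output above (which looks like a 64 x 128 grid).

```python
def check_task_count(ntasks: int, nlat: int) -> bool:
    """Return True if ntasks respects the 'no more processes than nlat' rule.

    Hypothetical helper for illustration only; it is not part of CAM.
    """
    if ntasks < 1 or nlat < 1:
        raise ValueError("ntasks and nlat must both be positive")
    return ntasks <= nlat


# Assumption: 64 latitudes, inferred from the "nlong( 64 )= 128" output line.
NLAT = 64

for ntasks in (16, 64, 128):
    verdict = "ok" if check_task_count(ntasks, NLAT) else "too many processes"
    print(f"{ntasks} MPI tasks with nlat={NLAT}: {verdict}")
```

If the total task count across all nodes passes this check, the limit is probably not the cause of the hang and the SPMD question is the one to look at first.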
 