Tutorial gets error on yellowstone: something to do with too many tasks

Hi!

I am teaching a course running CESM, using a tutorial I taught from two years ago. With the same script that ran successfully then, I now get the following error:
Execute poe command line: poe  /glade/scratch/mahowald/TestCLM_1/bld/cesm.exe
ATTENTION: 0031-393  Ignoring -resd/MP_RESD specified for batch job
ATTENTION: 0031-408  64 tasks allocated by Resource Manager, continuing...
ATTENTION: 0031-606 Unrecognized environment variable, MP_EAGER_LIMIT_LOCAL.
ERROR: 0031-758 AFFINITY: [ys5922] Oversubscribe: 32 tasks in total, each task requires 1 resource,
but there are only 16 available resource. Affinity can not be applied
ERROR: 0031-161  EOF on socket connection with node ys5922-ib
INFO: 0031-639  Exit status from pm_respond = -1

The run directory for the CESM is:
~mahowald/TestCLM_1

the scratch directory is:
/glade/scratch/mahowald/TestCLM_1/

Could you help me figure out what changed on Yellowstone in the last two years that might have impacted this, and/or how to fix this problem? CISL suggested I add the following before submitting the CESM run script: "setenv MP_TASK_AFFINITY cpu", or better yet to put it directly in the CESM run script somewhere before the command mpirun.lsf.

Which I did, but it still didn't work. Does anyone have any other suggestions?
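For reference, here is roughly where I put it in the run script (only the relevant lines are shown; the exe path is the one from the error output above):

setenv MP_TASK_AFFINITY cpu                                   # the line CISL suggested
mpirun.lsf /glade/scratch/mahowald/TestCLM_1/bld/cesm.exe     # the existing launch line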

Thanks very much.
Natalie
 

jedwards

CSEG and Liaisons
Staff member
In env_mach_pes.xml change MAX_TASKS_PER_NODE to 16. This should solve the problem. Currently you are trying to use 32 MPI tasks per node. Each node has 16 CPUs; they can run up to 32 threads, but we recommend not using more than 16 MPI tasks per node.
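Concretely, you can make that change from the case directory either by editing env_mach_pes.xml by hand or with the xmlchange utility, roughly like this (the flags shown are the CESM1-era form, so adjust to whatever your scripts version accepts):

cd ~mahowald/TestCLM_1                                             # the case directory from your post
./xmlchange -file env_mach_pes.xml -id MAX_TASKS_PER_NODE -val 16
# equivalently, edit env_mach_pes.xml so the entry reads:
#   <entry id="MAX_TASKS_PER_NODE" value="16" />
# then resubmit the run (clean/reconfigure first if your workflow requires it)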
 
Thanks, this fixed it! I think I could also have fixed it by just changing ptile=16 (instead of 32) in the run script; I tried that as well and it worked. Thanks!

Natalie
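For anyone else who goes the run-script route, the ptile setting is in the LSF directives at the top of the script; the changed line looks roughly like this (other #BSUB directives omitted):

#BSUB -n 64                        # total MPI tasks (matches the 64 tasks in the error output)
#BSUB -R "span[ptile=16]"          # at most 16 tasks per node, instead of ptile=32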
 