rmokos@ncsa_illinois_edu
Hi, I'm helping a user get CESM 2.0 working on Blue Waters (Cray XE6). We check out the code like this:
--------------------
git clone -b release-clm5.0 https://github.com/ESCOMP/ctsm.git clm5.0
cd clm5.0
./manage_externals/checkout_externals
--------------------
After adjusting some parameters and modules, we were able to get it to build, but it fails when submitting the case using qsub. The subprocess.Popen call in clm5.0/cime/scripts/lib/CIME/utils.py returns nothing in "output", which leads to the following error:
--------------------
...
Check case OK
submit_jobs case.run
job is case.run
Submit job case.run
Submitting job script qsub -q normal -l walltime=24:00:00 -A fyy case.run
ERROR: Couldn't match jobid_pattern '^(\S+)$' within submit output:
''
--------------------
I tried adding the following logger statement to utils.py to look at the inputs to Popen:
--------------------
if (verbose != False and (verbose or logger.isEnabledFor(logging.DEBUG))):
    logger.info(" arg_stdout=%s arg_stderr=%s stdin=%s from_dir=%s env=%s"%(arg_stdout,arg_stderr,stdin,from_dir,env))
--------------------
I then ran the submit script with --debug, but it doesn't seem to provide any further insight:
--------------------
...
Check case OK
RUN: /mnt/bwpy/single/usr/bin/xmllint --format --output /mnt/a/u/staff/rmokos/tickets/BWAPPS-3553_Cannot_Locate_XML_in_INC_CESM_2.0/clm5.0/cime/scripts/testI/env_run.xml -
arg_stdout=-1 arg_stderr=-1 stdin=-1 from_dir=None env=None
RUN: /mnt/bwpy/single/usr/bin/xmllint --format --output /mnt/a/u/staff/rmokos/tickets/BWAPPS-3553_Cannot_Locate_XML_in_INC_CESM_2.0/clm5.0/cime/scripts/testI/env_batch.xml -
arg_stdout=-1 arg_stderr=-1 stdin=-1 from_dir=None env=None
submit_jobs case.run
job is case.run
Submit job case.run
Submitting job script qsub -q normal -l walltime=24:00:00 -A fyy case.run
RUN: qsub -q normal -l walltime=24:00:00 -A fyy case.run
arg_stdout=-1 arg_stderr=-2 stdin=None from_dir=None env=None
> /mnt/a/u/staff/rmokos/tickets/BWAPPS-3553_Cannot_Locate_XML_in_INC_CESM_2.0/clm5.0/cime/scripts/lib/CIME/utils.py(49)expect()
-> raise exc_type("{} {}".format(error_prefix, error_msg))
(Pdb)
--------------------
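As a side check on the jobid_pattern error above, here is a minimal sketch of how that pattern behaves on different submit outputs. It assumes the pattern is '^(\S+)$' (with the backslash that the error message appears to have lost) and that it is applied with Python's re.search without extra flags; neither has been confirmed against the CIME source. The helper name find_jobid and the sample strings are made up for illustration.

```python
from __future__ import print_function
import re

# Assumed pattern: the one from the error message, with the backslash restored.
JOBID_PATTERN = r"^(\S+)$"


def find_jobid(output, flags=0):
    """Hypothetical helper: return the job id matched in output, or None."""
    match = re.search(JOBID_PATTERN, output, flags)
    return match.group(1) if match else None


empty = ""  # what the failing submit appears to capture
noisy = "INFO: Job submitted to account: fyy\n8612426.bw"  # wrapper text, then the id

print(find_jobid(empty))                    # None: nothing to match
print(find_jobid(noisy))                    # None: ^ and $ anchor the whole string
print(find_jobid(noisy, re.MULTILINE))      # 8612426.bw
print(find_jobid(noisy.splitlines()[-1]))   # 8612426.bw
```

If these assumptions hold, an empty capture is not the only way to hit this error: output that carries INFO/WARNING lines ahead of the job id would also fail to match unless the pattern is applied per line.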
The user created a test script to replicate the issue by running the same qsub command, but instead of failing like the CESM code, it works. Test script:
--------------------
import os, sys
import subprocess
sys.path.append(os.path.abspath("/u/staff/rmokos/tickets/BWAPPS-3553_Cannot_Locate_XML_in_INC_CESM_2.0/clm5.0/cime/scripts/lib"))
from CIME.utils import run_cmd_no_fail
cmd='qsub -q normal -l walltime=24:00:00 -A fyy case.run'
arg_stdout=subprocess.PIPE
arg_stderr=subprocess.STDOUT
stdin=None
from_dir=None
env=None
input_str=None
print 'arg_stdout=%s arg_stderr=%s stdin=%s from_dir=%s env=%s'%(arg_stdout,arg_stderr,stdin,from_dir,env)
print '*****direct call subprocess.Popen******'
proc= subprocess.Popen(cmd, shell=True, stdout=arg_stdout, stderr=arg_stderr, stdin=stdin, cwd=from_dir, env=env)
output, errput = proc.communicate(input_str)
output = output.strip() if output is not None else output
errput = errput.strip() if errput is not None else errput
stat = proc.wait()
print 'output=%s'%output
print 'errput=%s'%errput
print 'stat=%s'%stat
print '*****call CIME.utils.run_cmd_no_fail******'
output = run_cmd_no_fail(cmd, combine_output=True)
print 'output=%s'%output
--------------------
Output:
--------------------
> python test.py
arg_stdout=-1 arg_stderr=-2 stdin=None from_dir=None env=None
*****direct call subprocess.Popen******
output=INFO: The qsub '-V' option is deprecated. Please include your environment variables directly in your job script.
INFO: The qsub '-V' option is deprecated. Please include your environment variables directly in your job script.
INFO: The qsub '-V' option is deprecated. Please include your environment variables directly in your job script.
WARNING: Job script does not invoke any 'aprun'/'ccmrun' command.
Job will be submitted as usual, but please ensure your job script eventually
invokes 'aprun'/'ccmrun' command to execute tasks on allocated compute nodes.
Please contact help+bw@ncsa.illinois.edu if you need any assistance.
INFO: Job submitted to account: fyy
8612426.bw
errput=None
stat=0
*****call CIME.utils.run_cmd_no_fail******
output=INFO: The qsub '-V' option is deprecated. Please include your environment variables directly in your job script.
INFO: The qsub '-V' option is deprecated. Please include your environment variables directly in your job script.
INFO: The qsub '-V' option is deprecated. Please include your environment variables directly in your job script.
WARNING: Job script does not invoke any 'aprun'/'ccmrun' command.
Job will be submitted as usual, but please ensure your job script eventually
invokes 'aprun'/'ccmrun' command to execute tasks on allocated compute nodes.
Please contact help+bw@ncsa.illinois.edu if you need any assistance.
INFO: Job submitted to account: fyy
8612427.bw
--------------------
Two jobs are submitted, which is what should happen. We're using Python 2.7.14. Note that Blue Waters has a special Python environment (details are here: https://bluewaters.ncsa.illinois.edu/python). Because the version of Perl in the bwpy module needs updating, we use the system default. As a result, we issue the following commands to enter the Python environment before building and submitting the case:
--------------------
module load bwpy/0.3.2
module load /u/staff/rmokos/tickets/BWAPPS-3553_Cannot_Locate_XML_in_INC_CESM_2.0/CESM-ENV # sets CESMDATAROOT
export PERL5LIB=/usr/lib/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi:$PERL5LIB
export PATH=~/bin:$PATH
bwpy-environ
--------------------
The same is also done before running the test script. Any help would be appreciated.

Ryan
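For anyone comparing the two call paths (the case submit versus the standalone test script), one option is to record what each Python process actually sees when it launches the command. The following is only a sketch: describe_run is a hypothetical helper, not part of CIME, the list of environment variables is a guess at what might differ, and "echo" stands in for the real qsub command line.

```python
from __future__ import print_function
import os
import subprocess


def describe_run(cmd, env_keys=("PATH", "PERL5LIB", "PYTHONPATH")):
    """Hypothetical helper: run cmd through the shell and collect the details
    that could differ between two call sites."""
    proc = subprocess.Popen(cmd, shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)  # keep stderr separate from stdout
    out, err = proc.communicate()
    return {
        "cmd": cmd,
        "cwd": os.getcwd(),
        "returncode": proc.returncode,
        "stdout": out.decode("utf-8", "replace").strip(),
        "stderr": err.decode("utf-8", "replace").strip(),
        "env": dict((key, os.environ.get(key)) for key in env_keys),
    }


# Stand-in command: writes an id to stdout and a notice to stderr.
report = describe_run("echo 8612426.bw; echo 'INFO: note' 1>&2")
for key in sorted(report):
    print("%s=%r" % (key, report[key]))
```

Calling the same helper from both places and diffing the printed reports would show whether the working directory, return code, stderr content, or environment differs between the failing and working runs.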