Errors when running ./case.submit

pjiang

New Member
Hello,
When running ./case.submit, I encountered the following errors:
# ./case.submit
Setting resource.RLIMIT_STACK to -1 from (8388608, -1)
- Prestaging REFCASE (/root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01) to /root/cesm/scratch/mycase/run
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.lnd
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.atm
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.tavg.5
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.rof
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.drv
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ice
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.restart
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.glc
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.ovf
Creating component namelists
Calling /opt/CESM/CESM-release-cesm2.1.3/components/cam//cime_config/buildnml
CAM namelist copy: file1 /opt/CESM/CESM-release-cesm2.1.3/cime/scripts/mycase/Buildconf/camconf/atm_in file2 /root/cesm/scratch/mycase/run/atm_in
Calling /opt/CESM/CESM-release-cesm2.1.3/components/clm//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/cice//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/pop//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/mosart//cime_config/buildnml
Running /opt/CESM/CESM-release-cesm2.1.3/components/cism//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/ww3//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/cime/src/components/stub_comps/sesp/cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/cime/src/drivers/mct/cime_config/buildnml
Finished creating component namelists
Checking that inputdata is available as part of case submission
Setting resource.RLIMIT_STACK to -1 from (-1, -1)
Loading input file list: 'Buildconf/refcase.input_data_list'
Loading input file list: 'Buildconf/cam.input_data_list'
Loading input file list: 'Buildconf/clm.input_data_list'
Loading input file list: 'Buildconf/cice.input_data_list'
Loading input file list: 'Buildconf/pop.input_data_list'
Loading input file list: 'Buildconf/mosart.input_data_list'
Loading input file list: 'Buildconf/cism.input_data_list'
Loading input file list: 'Buildconf/ww3.input_data_list'
Loading input file list: 'Buildconf/cpl.input_data_list'
- Prestaging REFCASE (/root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01) to /root/cesm/scratch/mycase/run
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.lnd
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.atm
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.tavg.5
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.rof
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.drv
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ice
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.restart
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.glc
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.ovf
- Prestaging REFCASE (/root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01) to /root/cesm/scratch/mycase/run
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.lnd
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.atm
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.tavg.5
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.rof
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.drv
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ice
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.restart
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.glc
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.ovf
Creating component namelists
Finished creating component namelists
Check case OK
submit_jobs case.run
Submit job case.run
Starting job script case.run
Setting resource.RLIMIT_STACK to -1 from (-1, -1)
Generating namelists for /opt/CESM/CESM-release-cesm2.1.3/cime/scripts/mycase
- Prestaging REFCASE (/root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01) to /root/cesm/scratch/mycase/run
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.lnd
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.atm
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.tavg.5
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.rof
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.drv
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ice
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.restart
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.glc
Copy rpointer /root/cesm/inputdata/cesm2_init/b.e20.B1850.f19_g17.release_cesm2_1_0.020/0301-01-01/rpointer.ocn.ovf
Creating component namelists
Calling /opt/CESM/CESM-release-cesm2.1.3/components/cam//cime_config/buildnml
CAM namelist copy: file1 /opt/CESM/CESM-release-cesm2.1.3/cime/scripts/mycase/Buildconf/camconf/atm_in file2 /root/cesm/scratch/mycase/run/atm_in
Calling /opt/CESM/CESM-release-cesm2.1.3/components/clm//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/cice//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/pop//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/mosart//cime_config/buildnml
Running /opt/CESM/CESM-release-cesm2.1.3/components/cism//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/components/ww3//cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/cime/src/components/stub_comps/sesp/cime_config/buildnml
Calling /opt/CESM/CESM-release-cesm2.1.3/cime/src/drivers/mct/cime_config/buildnml
Finished creating component namelists
-------------------------------------------------------------------------
- Prestage required restarts into /root/cesm/scratch/mycase/run
- Case input data directory (DIN_LOC_ROOT) is /root/cesm/inputdata
- Checking for required input datasets in DIN_LOC_ROOT
-------------------------------------------------------------------------
2021-03-05 11:52:20 MODEL EXECUTION BEGINS HERE
run command is mpiexec --allow-run-as-root --mca btl ^openib -np 576 /root/cesm/scratch/mycase/bld/cesm.exe >> cesm.log.$LID 2>&1
ERROR: RUN FAIL: Command 'mpiexec --allow-run-as-root --mca btl ^openib -np 576 /root/cesm/scratch/mycase/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /root/cesm/scratch/mycase/run/cesm.log.210305-115216
[root@v10sp2b03 mycase]#

# cat /root/cesm/scratch/mycase/run/cesm.log.210305-115216
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 576
slots that were requested by the application:

/root/cesm/scratch/mycase/bld/cesm.exe

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:

1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[root@v10sp2b03 mycase]#

How can I change 576 to 1? Please tell me how to do it step by step. Thank you in advance.
 

fischer

CSEG and Liaisons
Staff member
In your case directory run the following commands.
./xmlchange NTASKS=1
./xmlchange ROOTPE=0

But this will run very slowly, and may not even run successfully. From the error message, it looks like you're
trying to request more processors than you have access to. You'll need to find out how many processors you
have access to. Also, it looks like you have an extra special character "^" in "--mca btl ^openib".
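
As a rough sketch of the steps above, the script below caps the task count at the number of cores the machine reports and then applies it with xmlchange. It assumes a single machine with no hostfile or resource manager, in which case the Open MPI message quoted earlier says the slot count defaults to the number of processor cores. The helper name pick_ntasks is made up for illustration. The case.setup --reset and case.build steps are what is typically needed after a PE layout change and may need adjusting for your setup.

```shell
#!/bin/sh
# Sketch only: cap the requested MPI task count at the cores available here.
# Assumption: single node, no hostfile or resource manager.

# pick_ntasks REQUESTED AVAILABLE -> prints the smaller of the two (hypothetical helper)
pick_ntasks() {
    if [ "$1" -le "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

requested=576   # task count from the failing run command
# Cores visible to this shell; fall back to getconf if nproc is missing.
available=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
ntasks=$(pick_ntasks "$requested" "$available")
echo "Using NTASKS=$ntasks"

# Only touch the case when run from inside a case directory.
if [ -x ./xmlchange ]; then
    ./xmlchange NTASKS="$ntasks"
    ./xmlchange ROOTPE=0
    ./xmlquery NTASKS        # confirm the new value
    ./case.setup --reset     # regenerate the PE layout for the new task count
    ./case.build             # rebuild before resubmitting (typically required)
fi
```

Run it from the case directory (here /opt/CESM/CESM-release-cesm2.1.3/cime/scripts/mycase), then run ./case.submit again. A B1850 case at f19_g17 resolution may still be too large for a small core count, as noted above.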

Chris