
CASE submit error

demibing

New Member
I submitted the BGC_spinup case, following section 1.5.5 "Spinup of CLM5.0-BGC-Crop" of the ctsm release-clm5.0 documentation:
scripts/BGC_spinup$ ./case.submit
and got this error:

ERROR: CONTINUE_RUN is true but this case does not appear to have restart files staged in /home/clm/clm45/CASE_DIR/BGC_spinup/run rpointer.drv
pic for CASE_DIR/BGC_spinup/run
How can I solve this problem? I look forward to your advice. Thank you.

Best wishes.
 

Attachments

  • QQ图片20211003180918.jpg

demibing

New Member
I have just solved this problem: after setting CONTINUE_RUN to FALSE, the error is no longer reported.
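
For readers who hit the same message: the error says CONTINUE_RUN is true while no restart pointer files are staged in the run directory, so a first submission has to use FALSE. Below is a minimal shell sketch of that check. It assumes it is run from the case directory, where the CIME tools ./xmlchange and ./xmlquery live; the RUN_DIR default is simply the path from the error message, and the helper name has_restart_pointers is made up for illustration.

```shell
# Hypothetical helper: succeed if the run directory holds any rpointer.* file.
has_restart_pointers() {
    for f in "$1"/rpointer.*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Run directory taken from the error message; override RUN_DIR for another case.
RUN_DIR="${RUN_DIR:-/home/clm/clm45/CASE_DIR/BGC_spinup/run}"

if has_restart_pointers "$RUN_DIR"; then
    echo "restart pointers found in $RUN_DIR; CONTINUE_RUN=TRUE is usable"
else
    echo "no rpointer files in $RUN_DIR; this has to be an initial run"
    # The CIME case tools only exist inside a case directory, hence the guard.
    if [ -x ./xmlchange ]; then
        ./xmlchange CONTINUE_RUN=FALSE
        ./xmlquery CONTINUE_RUN
    fi
fi
```

Once a run has finished and written restart files, CONTINUE_RUN can be switched back to TRUE to extend the simulation.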
 

demibing

New Member
But another error appeared once the case had been submitted:
run command is mpirun -np 24 /home/clm/CASE_DIR/BGC_spinup/bld/cesm.exe >> cesm.log.$LID 2>&1
Exception from case_run: ERROR: RUN FAIL: Command 'mpirun -np 24 /home/clm/CASE_DIR/BGC_spinup/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /home/clm/CASE_DIR/BGC_spinup/run/cesm.log.211003-200335

How can I solve this problem? I look forward to your advice. Thank you!
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
First, look in the cesm.log file reported above; the error will be somewhere in it, and that should help you start tracking it down. Next, check the other component log files (atm.log, lnd.log, etc.). For a given submission they all end in the same suffix, here ".log.211003-200335", which is the date and time the job was submitted.
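
The log check described above could be sketched in shell roughly as follows. This is only an illustration: the function name scan_case_logs and the search pattern are assumptions, and the default directory and suffix are the ones quoted in this thread.

```shell
# Hypothetical helper: print error-looking lines from every component log
# that shares the suffix of one submission.
scan_case_logs() {
    run_dir="$1"   # case run directory
    lid="$2"       # log suffix, e.g. 211003-200335
    for log in "$run_dir"/*.log."$lid"; do
        [ -f "$log" ] || continue
        # -H prints the file name, -n the line number; no match is not a failure
        grep -n -i -H -E 'error|abort|not enough' "$log" || true
    done
    return 0
}

scan_case_logs "${RUN_DIR:-/home/clm/CASE_DIR/BGC_spinup/run}" "${LID:-211003-200335}"
```

Each output line has the form file:line:text, so the first hit in cesm.log is usually a good place to start reading.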

There's a troubleshooting chapter in the User's Guide for CLM5.0 here...

 

demibing

New Member
cesm.log.211003-200335
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 24 slots that were requested by the application:

/home/clm/CASE_DIR/BGC_spinup/bld/cesm.exe

Either request fewer slots for your application, or make more slots available for use.

A "slot" is the Open MPI term for an allocatable unit where we can launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:

1. Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number of hardware threads instead of the number of processor cores, use the --use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the number of available slots when deciding the number of processes to launch.
_______________________________________________________________________________________________________________________
Does this mean I should change the values of "MAX_TASKS_PER_NODE" and "MAX_MPITASKS_PER_NODE" in config_machines.xml from 12 to 6?
Thank you!
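
As a rough aid for the question above, the following shell sketch compares the 24 ranks requested by the run command with the number of cores the machine reports. It is a sketch under assumptions, not a definitive fix: the helper names count_cores and fits_in_slots are hypothetical, lscpu is only available on Linux, and whether editing config_machines.xml is the right remedy depends on the machine.

```shell
# Hypothetical helper: count physical cores, since Open MPI counts cores
# (not hardware threads) by default; fall back to logical CPUs without lscpu.
count_cores() {
    n=0
    if command -v lscpu >/dev/null 2>&1; then
        n=$(lscpu -p=CORE,SOCKET 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ') || n=0
    fi
    if [ "${n:-0}" -lt 1 ]; then
        n=$(getconf _NPROCESSORS_ONLN)
    fi
    echo "$n"
}

# Hypothetical helper: succeed if the requested rank count fits the slots.
fits_in_slots() {
    [ "$1" -le "$2" ]
}

requested=24   # from "mpirun -np 24" in the run command
available=$(count_cores)
if fits_in_slots "$requested" "$available"; then
    echo "$requested ranks fit in $available cores"
else
    echo "$requested ranks exceed $available cores"
    echo "options: request fewer tasks, or pass --use-hwthread-cpus or --oversubscribe to mpirun"
fi
```

If the count comes out below 24, the log's own suggestions apply: request fewer slots, or allow hardware threads or oversubscription. In a CIME case, the task count is normally lowered from the case directory with ./xmlchange NTASKS=N (N being a value that fits), followed by a fresh setup and rebuild of the case.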
 