
How to create dependencies for slurm jobs

Hi all,

We need some help making jobs (i.e. model runs) dependent on each other. For example, we would like to submit a spinup run and a historical run so that the historical run starts as soon as the spinup exits successfully.

We use Slurm to run jobs. The question is not how this is done in Slurm (--dependency=afterok:<job_id>) but how to set this option on a case-by-case basis in CESM/CTSM, where the -d option and the job_id need to be passed.

Would using xmlchange to modify BATCH_COMMAND_FLAGS be a solution, or would this mess up the dependency that already exists for case.st_archive?

Or maybe there is another way that everyone uses but we have missed?

We are working with the latest CTSM:
ctsm1.0.dev104
branch_tags/cime5.8.24_a01
cism2_1_68
mosart1_0_36
rtm1_0_71
sci.1.30.0_api.8.0.0
PTCLM2_20200121

Thanks,
Fernando
 

jedwards

CSEG and Liaisons
Staff member
Hi Fernando,

The case.submit script has options for that:

--prereq PREREQ
    Specify a prerequisite job id, this job will not start until the
    job with this id is completed (batch mode only).
--prereq-allow-failure
    Allows starting the run even if the prerequisite fails.
    This also allows resubmits to run if the original failed and the
    resubmit was submitted to the queue with the original as a dependency,
    as in the case of --resubmit-immediate.
 
Ah! Thank you. Exactly what I needed.

Now I just need to capture the job_id. Is there a tool or variable so I can automatically get the job id at job submission? That would spare me some extra coding.

I'd be happy to read documentation, but I haven't found anything related.

Fernando
 

jedwards

CSEG and Liaisons
Staff member
There is code in CIME to capture the job id and use it in subsequent submits of the same case. However, there isn't any feature that extends this beyond the scope of a single case.
 

For info, this seems to do the trick:
# Save the submit output, then take the id from the last "Submitted job id is" line
./case.submit > submit_out
JOB_ID=$(grep "Submitted job id is" submit_out | tail -1 | cut -f 5 -d' ')
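
Putting the pieces of this thread together, chaining two cases might look roughly like the sketch below. It is only an illustration under a few assumptions: the function names and the SPINUP_CASE and HIST_CASE variables are hypothetical, and the parsing relies on case.submit printing a line of the form "Submitted job id is <id>" to standard output, as in the grep above. If a case submits more than one job, the id from the last such line is used, as with tail -1 above.

```shell
#!/bin/bash
# Sketch only: the function names and the case paths in the usage
# example are hypothetical, not part of CIME.

# Print the id from the last "Submitted job id is <id>" line read on stdin.
extract_job_id() {
    grep "Submitted job id is" | tail -1 | cut -f 5 -d' '
}

# Submit the case in directory $1, optionally with prerequisite job id $2,
# and print the id of the last job that was submitted.
submit_case() {
    local case_dir=$1
    local prereq=${2:-}
    (
        cd "$case_dir" || exit 1
        if [ -n "$prereq" ]; then
            ./case.submit --prereq "$prereq"
        else
            ./case.submit
        fi
    ) | extract_job_id
}

# Example usage (set the two paths to your own case directories):
# SPINUP_ID=$(submit_case "$SPINUP_CASE")
# submit_case "$HIST_CASE" "$SPINUP_ID"
```

Wrapping the cd in a subshell keeps the caller's working directory unchanged, so several cases can be submitted one after another from the same script.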
 