
Job submit limit, user's size and/or time limits when running case.submit

CGL
Member
Hi everyone. I am trying to run CESM2.1.3 and I got this sbatch error:
ERROR: Command: 'sbatch --time 0:20:00 -p cpu_parallel --dependency=afterok:7408958 case.st_archive --resubmit' failed with error 'sbatch: error: QOSMinCpuNotSatisfied
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)' from dir '/data/sxh/CESM2/CESM/cgl/scratch/test2'

It seems like the case.st_archive job is hitting this limit. Where can I change or set the limit?
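
For reference, this message comes from the Slurm scheduler's accounting/QOS limits rather than from CESM itself, so it can help to look up what the cluster allows. A rough sketch, assuming a standard Slurm setup (exact field names can vary between Slurm versions):

Show the limits on the defined QOS levels:
sacctmgr show qos format=Name,MaxWall,MaxSubmitJobsPerUser

Show the limits attached to your own account/association:
sacctmgr show assoc where user=$USER

List your currently queued jobs, to see whether a submit limit is already reached:
squeue -u $USER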
 

nusbaume (Jesse Nusbaumer)
CSEG and Liaisons
Staff member
Hi CGL,

It looks like you have two different issues listed on this thread. For your first issue you can change the case.st_archive program’s queue and wallclock time by running the following commands in your case directory:

Job queue:
./xmlchange --subgroup case.st_archive --id JOB_QUEUE --val <value>

Wallclock time:
./xmlchange --subgroup case.st_archive --id JOB_WALLCLOCK_TIME --val <value>

Where <value> is whatever job queue name or wallclock time length you want.
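
For example, using placeholder values (the queue name and time below are just illustrations, not recommendations):

./xmlchange --subgroup case.st_archive --id JOB_QUEUE --val cpu_parallel
./xmlchange --subgroup case.st_archive --id JOB_WALLCLOCK_TIME --val 0:30:00

You can then confirm the new settings with xmlquery (assuming your CIME version also accepts --subgroup there):

./xmlquery --subgroup case.st_archive JOB_QUEUE,JOB_WALLCLOCK_TIME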

In terms of your second issue, it looks like the error is happening in the atmosphere model (CAM). I would first check whether there is a useful error message in the atm.log file. Otherwise, I would try running with debugging on, which can be done as follows:

./xmlchange --id DEBUG --val TRUE

And then re-building and re-running the model, which should then hopefully give you a more specific error message.
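
Putting those steps together, a typical sequence would be (a sketch; the clean step assumes a standard CIME case.build):

./xmlchange --id DEBUG --val TRUE
./case.build --clean-all
./case.build
./case.submit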

Hope that helps, and have a great day!

Jesse
 

CGL
Member
It seems like xmlchange does not have the DEBUG option.
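
One thing worth double-checking (just a guess on my part) is that the flags are typed with two ASCII hyphens (--id, --val) rather than dashes, and that DEBUG is a recognized variable in the case, for example:

./xmlquery DEBUG
./xmlchange DEBUG=TRUE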
 

CGL
Member
I think the first issue is related to the second. Maybe some system limit is blocking the job, or something else is misconfigured. I will check this. Thanks for your reply. :)
 

CGL
Member
Hello, have you solved the second issue? I have the same problem.
I changed to a different cluster to run the model, and it worked. I think the reason is that the processes or environment got the wrong configuration when running in parallel. You should contact your cluster administrators.
 