
Questions on NTASKS, ROOTPE, and submission

xiangli

Xiang Li
Member
Hi all,

I have several questions on pelayout and task submission.

1) The default setting of NTASKS in the env_mach_pes.xml for a B1850 case is like this:

1706643672918.png

By running ./pelayout, I have:

1706644531890.png

It looks like one node corresponds to 32 tasks. Can I, or should I, change this correspondence (1 node ~ 32 tasks), considering that each node has 92 CPUs on my supercomputer?

Can I uniformly set all NTASKS to 32? In other words, how should I set these NTASKS values so that the model runs more efficiently?

2) With respect to ROOTPE, why are there 4 and 2 nodes for OCN and ICE, respectively, but 0 nodes for the others?

1706644234897.png

Can I change the "-4" and "-2" to "1" or "0"?

3) If I am going to submit the CESM job to another partition, should I add a "#SBATCH -p" line at the top of the case.submit file, like this:

1706644386736.png

However, it did not work because the partition did not change. How can I change the partition correctly?

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
The relationship between nodes and tasks is defined in config_machines.xml and used by pelayout. If
pelayout is getting it wrong, then you should review your machine definition.

The case.submit script is not itself submitted; case.submit prepares the case for submission and submits the hidden .case.run or .case.test
scripts. If the header of these scripts is not correct, it is because it is not defined correctly in config_batch.xml.
Did you read the CIME porting guide?
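For reference, the node-to-task relationship that pelayout uses normally comes from the machine entry in config_machines.xml. A rough sketch (the MACH name is a placeholder, and 92 is just the per-node CPU count mentioned above, not a recommendation):

Code:
<machine MACH="yourmachine">
  ...
  <!-- maximum PEs (MPI tasks x OpenMP threads) allowed on one node -->
  <MAX_TASKS_PER_NODE>92</MAX_TASKS_PER_NODE>
  <!-- MPI tasks placed per node; this is what converts NTASKS into a node count -->
  <MAX_MPITASKS_PER_NODE>92</MAX_MPITASKS_PER_NODE>
  ...
</machine>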
 

xiangli

Xiang Li
Member
Hi Jim,

I read this guide in detail and modified my config_batch.xml, config_machines.xml, and config_compilers.xml.

I am currently testing my configuration by running ./scripts_regression_tests.py. The output looks generally good. However, the test jobs submitted by this script did not start, even though they were given priority and there were adequate computing resources. Any possible reasons for that?

1707322389953.png

1707322480117.png

Looking forward to your suggestions.

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
This is a system issue, and you may need to consult your system administrator. But it looks as if you have set a runtime of 90 days for those jobs, and that can't be right.
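If the scheduler is Slurm, one way to confirm the limit actually attached to the pending jobs (a generic sketch, nothing CESM-specific) is:

Code:
# %l prints each job's time limit, %M the time used so far, %R the pending reason
squeue -u $USER -o "%.10i %.12P %.30j %.10M %.12l %.6D %R"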
 

xiangli

Xiang Li
Member
Hi Jim,

I have been actively communicating with our system administrator about this issue, but we have not been able to figure it out yet.

I set the max wall time to 120 hours, but I cannot tell why the runtime becomes 90 days after submission.

Here is how I set my configuration. I would appreciate it if you could take a look and provide some hints on the possible fault.

config_batch.xml:

1707323879670.png

config_machines.xml:

1707323945326.png

1707323968232.png

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
I think that the walltime_format field should just be 00:00:00.
Try playing with different values of the walltime; I'm pretty sure that's where the problem is.
You can create a single test like:
cd cime/scripts
./create_test SMS.f19_g17.X
then cd to the test directory and try different values of wallclock time with
./xmlchange JOB_WALLCLOCK_TIME=00:10:00 (for example)
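After changing the value, you can check what actually ends up in the batch header by regenerating and inspecting the hidden run script. A sketch (depending on your config_batch.xml, the walltime may appear as a header directive or only as a submit argument):

Code:
./xmlchange JOB_WALLCLOCK_TIME=00:10:00
./case.setup --reset
head -n 20 .case.run    # look for the scheduler directive lines that will be submitted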
 

xiangli

Xiang Li
Member
Hi Jim,

This was what I did:

1707339457287.png

I used a much smaller walltimemax, and the JOB_WALLCLOCK_TIME in env_workflow.xml changed correspondingly.

1707339511373.png

However, the time limit of this test job was still 90 days. I tried 00:00:00 for walltime_format as well. That made no difference.

1707339595712.png

Actually, the time limit for a bash job was also 90 days. The bash job was created by running this:

1707339689649.png

Therefore, I think the 90-day time limit may have nothing to do with the CESM configuration.

My test job was still not able to run. Happy to hear your opinion!

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
In config_batch.xml you need to add some submit_args; the following is for the machine perlmutter, and yours should be similar.

Code:
<submit_args>                                                                                                                 
      <arg flag="--time" name="$JOB_WALLCLOCK_TIME" />                                                                           
      <arg flag="-q" name="$JOB_QUEUE" />                                                                                         
      <arg flag="--account" name="$PROJECT" />                                                                                   
    </submit_args>
 

xiangli

Xiang Li
Member
Hi Jim,

It turned out that there was an error when I added one or all of these submit_args. In contrast, the case could be successfully created, built, and submitted if I did not add the submit_args.

1707342182620.png

Here are the most recent updates.

Without adding the submit_args, my config_batch.xml looks like this:

1707421769259.png

I did 3 kinds of tests.

1) ./create_test SMS.f19_g17.X

This test was successful. The case could be created, built, and submitted, and it ran for 2 minutes.

Here is CaseStatus:

1707421940693.png

Here is TestStatus:

1707421999590.png

2) I also submitted several B1850 cases, but they did not start to run. Some of them were pending on resources, but I checked and the resources were adequate. We plan to run B1850 cases for research.

1707422118651.png

1707422132071.png

3) ./scripts_regression_tests.py

There were some test runs submitted by this script, and some of them finished running, with some output:

1707422283864.png

However, some test runs are still pending, and I'm not sure whether they will start to run. Resources should be adequate.

1707422356957.png


1707422369491.png

Looking forward to your comments and suggestions.

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
It looks like, rather than correcting the error you introduced with the submit_args, you just abandoned that approach. Since you didn't provide any information other than the error, I can't really be sure what the problem was, but it looks to me like you ordered the batch_system section incorrectly. submit_args should immediately precede the <directives> entry and follow any <batch_...> fields provided.
 

xiangli

Xiang Li
Member
Hi Jim,

I reordered the section like this:

1707427299606.png

But there was an error when creating the case, as I mentioned previously:

1707427346115.png

I also tried several other orderings, which made no difference.

Any suggestions would be appreciated.

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
The error says that the order of fields in this file matters. submit_args should follow the batch_mail_type and be followed by the directives and queues.
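In other words, a skeleton of the batch_system block in the expected order might look like the following. This is only a sketch; the element values are placeholders (including the queue name), and you should keep whatever your file already defines:

Code:
<batch_system MACH="yourmachine" type="slurm">
  <batch_submit>sbatch</batch_submit>
  <batch_mail_flag>--mail-user</batch_mail_flag>
  <batch_mail_type_flag>--mail-type</batch_mail_type_flag>
  <batch_mail_type>none, all, begin, end, fail</batch_mail_type>
  <submit_args>
    <arg flag="--time" name="$JOB_WALLCLOCK_TIME" />
    <arg flag="-q" name="$JOB_QUEUE" />
    <arg flag="--account" name="$PROJECT" />
  </submit_args>
  <directives>
    <directive>--nodes={{ num_nodes }}</directive>
    <directive>--ntasks-per-node={{ tasks_per_node }}</directive>
  </directives>
  <queues>
    <queue walltimemax="05:00:00" nodemax="5" default="true">hulab</queue>
  </queues>
</batch_system>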
 

xiangli

Xiang Li
Member
Hi Jim,

Thanks! I adjusted the order and the ./create_test SMS.f19_g17.X test ran successfully! Here is my config_batch.xml:

1707506076803.png

However, the B1850 case was still not able to start running. I am requesting 4 nodes with 8 tasks per node; resources should be enough.

It should be noted that the TIME LIMIT was successfully changed!

1707506172521.png

1707506282590.png

The ./scripts_regression_tests.py test was not able to finish, perhaps because two test jobs were always pending:

1707506393321.png

There were only 2 FAILs in the output, and all the others were OK.

1707506459521.png

1707506531284.png

Looking forward to your comments and suggestions.

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
The scripts regression tests print the nature of the failure later in the output. In your B case you
have NTASKS_ATM=32 but ROOTPE_OCN=16, so they are overlapping and cannot progress.
Change ROOTPE_OCN to 32 and, likewise, change ROOTPE_ICE to 16.

I'm confused by your having 92 CPUs per node but only using 32 of them. In config_machines.xml this is set in the variables MAX_TASKS_PER_NODE and MAX_MPITASKS_PER_NODE.
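To apply just the two changes suggested above, from the case directory (then re-check the layout):

Code:
./xmlchange ROOTPE_OCN=32
./xmlchange ROOTPE_ICE=16
./pelayout    # OCN should now start on the PE right after ATM's last task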
 

xiangli

Xiang Li
Member
Hi Jim,

As you see, the partition hulab only has 5 nodes with 92 CPUs per node, but some of the CPUs in each node may already be allocated.

Here is the default setting of NTASKS and ROOTPE, which requests 6 nodes (but we only have 5 in our partition).

1707509484103.png

1707509507006.png

I tried to reduce ROOTPE to reduce the number of nodes. If I halve the ROOTPE for OCN and ICE, I should also halve the NTASKS for all components, right?

Yes, currently, MAX_TASKS_PER_NODE and MAX_MPITASKS_PER_NODE are set to 8.

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
We generally use systems with dedicated nodes; shared-node systems introduce a huge complication, and frankly I just don't have any experience using them.
 

xiangli

Xiang Li
Member
Hi Jim,

Our system administrator has installed all the prerequisites following this list:

1707940278974.png

But we got this error when testing:

Code:
xl468@dcc-hulab-01 /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/scripts $ module load CESM/prereqs
OpenMPI/4.1.6
NetCDF/c-4.9.2
NetCDF/fortran-4.6.1
cmake/3.28.3
OpenBLAS 3.23
Subversion/1.14.3
CESM/prereqs

Loading CESM/prereqs
  Loading requirement: OpenMPI/4.1.6 NetCDF/c-4.9.2 NetCDF-F/fortran-4.6.1 cmake/3.28.3 OpenBLAS/3.23 Subversion/1.14.3

xl468@dcc-hulab-01 /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/scripts $ module list
Currently Loaded Modulefiles:
 1) OpenMPI/4.1.6   2) NetCDF/c-4.9.2   3) NetCDF-F/fortran-4.6.1   4) cmake/3.28.3   5) OpenBLAS/3.23   6) Subversion/1.14.3   7) CESM/prereqs

Key:
auto-loaded

xl468@dcc-hulab-01 /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/scripts $ ./create_test SMS.f19_g17.X
Testnames: ['SMS.f19_g17.X.duke_gnu']
No project info available
Creating test directory /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk
RUNNING TESTS:
  SMS.f19_g17.X.duke_gnu
Starting CREATE_NEWCASE for test SMS.f19_g17.X.duke_gnu with 1 procs
Finished CREATE_NEWCASE for test SMS.f19_g17.X.duke_gnu in 1.407000 seconds (PASS)
Starting XML for test SMS.f19_g17.X.duke_gnu with 1 procs
Finished XML for test SMS.f19_g17.X.duke_gnu in 0.313794 seconds (PASS)
Starting SETUP for test SMS.f19_g17.X.duke_gnu with 1 procs
Finished SETUP for test SMS.f19_g17.X.duke_gnu in 1.512333 seconds (PASS)
Starting SHAREDLIB_BUILD for test SMS.f19_g17.X.duke_gnu with 1 procs
Finished SHAREDLIB_BUILD for test SMS.f19_g17.X.duke_gnu in 4.181885 seconds (FAIL). [COMPLETED 1 of 1]
    Case dir: /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk
    Errors were:
    b'Building test for SMS in directory /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk\nERROR: /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/build_scripts/buildlib.gptl FAILED, cat /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gptl.bldlog.240214-145317'
Due to presence of batch system, create_test will exit before tests are complete.
To force create_test to wait for full completion, use --wait
At test-scheduler close, state is:
FAIL SMS.f19_g17.X.duke_gnu (phase SHAREDLIB_BUILD)
    Case dir: /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk
test-scheduler took 7.832472324371338 seconds

xl468@dcc-hulab-01 /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/scripts $ cat /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gptl.bldlog.240214-145317
make -f /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/Makefile install -C /hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gnu/openmpi/nodebug/nothreads/gptl MACFILE=/hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/Macros.make MODEL=gptl GPTL_DIR=/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing GPTL_LIBDIR=/hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gnu/openmpi/nodebug/nothreads/gptl SHAREDPATH=/hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gnu/openmpi/nodebug/nothreads
make: Entering directory '/hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gnu/openmpi/nodebug/nothreads/gptl'
mpicc -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -std=gnu99 -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/gptl.c
mpicc -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -std=gnu99 -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/GPTLutil.c
mpicc -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -std=gnu99 -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/GPTLget_memusage.c
mpicc -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -std=gnu99 -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/GPTLprint_memusage.c
mpicc -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -std=gnu99 -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/gptl_papi.c
mpicc -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -std=gnu99 -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/f_wrappers.c
mpifort -c -I/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing -fconvert=big-endian -ffree-line-length-none -ffixed-line-length-none -O -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU -DHAVE_MPI -ffree-form /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/perf_utils.F90
make: Leaving directory '/hpc/group/hulab/xl468/cesm2.1/scratch/SMS.f19_g17.X.duke_gnu.20240214_145312_ldx4wk/bld/gnu/openmpi/nodebug/nothreads/gptl'
/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/gptl.c: In function ‘GPTLpr_summary_file’:
/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/gptl.c:3090:8: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
 3090 |   if (((int) comm) == 0)
      |       ^
/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/perf_utils.F90:282:18:

  282 |    call MPI_BCAST(vec,lsize,MPI_INTEGER,0,comm,ierr)
      |                  1
......
  314 |    call MPI_BCAST(vec,lsize,MPI_LOGICAL,0,comm,ierr)
      |                  2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/LOGICAL(4)).
make: *** [/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/Makefile:63: perf_utils.o] Error 1
ERROR: /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/gptl.c: In function GPTLpr_summary_file :
/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/gptl.c:3090:8: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
 3090 |   if (((int) comm) == 0)
      |       ^
/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/perf_utils.F90:282:18:

  282 |    call MPI_BCAST(vec,lsize,MPI_INTEGER,0,comm,ierr)
      |                  1
......
  314 |    call MPI_BCAST(vec,lsize,MPI_LOGICAL,0,comm,ierr)
      |                  2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/LOGICAL(4)).
make: *** [/hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/src/share/timing/Makefile:63: perf_utils.o] Error 1
xl468@dcc-hulab-01 /hpc/group/hulab/xl468/cesm2.1/my_cesm_sandbox/cime/scripts $

Here is our config_compilers.xml:

1707940523666.png

Any suggestions would be appreciated!

Thanks,
Xiang
 

jedwards

CSEG and Liaisons
Staff member
For recent gnu compiler versions you will need to add the flags
-fallow-argument-mismatch and -fallow-invalid-boz to the FCFLAGS.
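Assuming the FCFLAGS mentioned above correspond to the <FFLAGS> entry in the XML-based config_compilers.xml shown earlier, one place the flags could go is an append block for the gnu compiler on your machine. This is a sketch, so adapt it to whatever your file already contains:

Code:
<compiler COMPILER="gnu" MACH="yourmachine">
  <FFLAGS>
    <!-- gfortran 10+ turns the MPI_BCAST argument-type mismatch into an error
         unless these flags are present -->
    <append> -fallow-argument-mismatch -fallow-invalid-boz </append>
  </FFLAGS>
</compiler>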
 