
Usage of NTASKS, NTHRDS

Shruti

Shruti Joshi
Member
Hello,

In config_machines.xml, the value of "MAX_TASKS_PER_NODE" seems to be the one that shows up in the command "mpirun -np value".
How is this related to NTASKS? Could you please clarify?

I was checking how to change the number of tasks and threads and came across the following post
https://bb.cgd.ucar.edu/cesm/threads/modifying-batch-settings-cesm1-versus-cesm2.5151/#post-36023

According to the post, we change the NTHRDS and NTASKS values using ./xmlchange and then run ./case.setup --reset.
Is there any way to set these values from the configuration files instead, i.e. config_machines.xml or config_compilers.xml?
 

jedwards

CSEG and Liaisons
Staff member
There is a file config_pes.xml where you can set default pe layouts for your machine.
 

Shruti

Shruti Joshi
Member
Thank you for the response.

I tried changing NTASKS and NTHRDS in the env_mach_pes.xml file and re-ran ./case.setup, ./case.build and ./case.submit.
Where will the timing file be generated for the updated NTASKS and NTHRDS values?
The one in the timing folder doesn't seem to have been updated either.

Also, after the initial setup of a CESM case (any particular combination of compset and resolution), does any change (e.g. NTASKS, NTHRDS) require all three steps, i.e. setup, build and submit, to be re-executed?
 

jedwards

CSEG and Liaisons
Staff member
If you are interested in performance studies with different pelayouts, I recommend the PFS test. It runs 20 days with minimal IO and saves the env_mach_pes.xml for each run to make it easy to keep track.
After changing NTASKS or NTHRDS you must rerun all steps:
./case.setup --reset
./case.build
./case.submit
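
Put together with the ./xmlchange step discussed earlier in the thread, the full sequence can be sketched as a small shell helper. This is only an illustration: the function name change_pe_layout is made up here, it has to be called from inside the case directory, and it assumes that the un-suffixed NTASKS and NTHRDS names apply the same value to every component.

```sh
# Hypothetical helper (not part of CIME): set one task count and one thread
# count, then redo setup, build and submit. Call it from the case directory.
# Each step runs only if the previous one succeeded.
change_pe_layout() {
    ntasks="$1"
    nthrds="$2"
    ./xmlchange NTASKS="$ntasks",NTHRDS="$nthrds" &&
    ./case.setup --reset &&
    ./case.build &&
    ./case.submit
}
```

For example, `change_pe_layout 8 2` would request 8 tasks and 2 threads before rebuilding and resubmitting.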
 

Shruti

Shruti Joshi
Member
Just needed another clarification. Once NTASKS and NTHRDS are changed with ./xmlchange, doesn't a new timing file get generated?

Also, in the config_pes.xml file I tried changing the NTASKS and NTHRDS values in the first section (any machine, any grid, any res), but the change doesn't seem to be reflected in env_mach_pes.xml. What other changes do I need to consider?
 

jedwards

CSEG and Liaisons
Staff member
New timing files are generated each time a run completes. If you aren't getting new timing files, make sure that the run is successful.

Changes in config_pes.xml will be reflected in a case created after that change has been made, but not in preexisting cases.
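
To confirm what a newly created case actually picked up, the values can be queried from inside the case directory. A minimal sketch, assuming a made-up helper name show_pe_layout and that xmlquery accepts a comma-separated list of variable names:

```sh
# Hypothetical helper (not part of CIME): print the task and thread settings
# of the case located at the given directory. The subshell keeps the caller's
# working directory unchanged.
show_pe_layout() {
    case_dir="$1"
    (cd "$case_dir" && ./xmlquery NTASKS,NTHRDS)
}
```

Running it on a case created before the config_pes.xml edit and on one created afterwards should make the difference visible.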
 

Shruti

Shruti Joshi
Member
Generally I check whether the timing file has been generated to confirm a successful run. How do I make sure the run is successful in this case?
 

Shruti

Shruti Joshi
Member
Also,

Changes in config_pes.xml will be reflected in a case created after that change has been made, but not in preexisting cases.
>> I made the changes in the config_pes.xml file and then created a new case, but NTASKS and NTHRDS for the components still show the values -1 and 1 respectively. Should I make any other changes?
 

jedwards

CSEG and Liaisons
Staff member
Check the cesm.log file and the cpl.log files in the run directory to make sure the run has completed.
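
As a rough sketch of that check, the helper below looks at the newest cpl.log in a directory and searches it for a completion message. The function name run_completed is made up for illustration, and the search string "SUCCESSFUL TERMINATION" is an assumption about what the coupler writes at the end of a clean run, so compare it against a log from a run known to have finished. Because finished logs may be gzipped (as the attachments later in this thread are), the file is read through gzip -cdf, which is expected to pass uncompressed files through unchanged.

```sh
# Hypothetical helper (not part of CIME): report whether the most recent
# coupler log in a directory contains the assumed completion message.
# Returns 0 if found, 1 if not found, 2 if no cpl.log exists there.
run_completed() {
    log_dir="$1"
    latest=$(ls -t "$log_dir"/cpl.log.* 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        echo "no cpl.log found in $log_dir"
        return 2
    fi
    # -c writes to stdout, -d decompresses, -f passes plain text through.
    if gzip -cdf "$latest" | grep -q "SUCCESSFUL TERMINATION"; then
        echo "completed: $latest"
    else
        echo "not completed: $latest"
        return 1
    fi
}
```

Call it with the run directory, or with whichever directory the logs were moved to after the run.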

You must be matching another entry in the config_pes.xml file.
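
One way to narrow down which one is to list every <grid>, <mach> and <pes> opening tag in the file with its line number, then compare the name, pesize and compset attributes against the grid, machine and compset of the new case. A minimal sketch with a made-up helper name; how CIME chooses among several matching entries is not covered in this thread, so this only shows the candidates.

```sh
# Hypothetical helper (not part of CIME): print the opening <grid>, <mach>
# and <pes> tags of a config_pes.xml file, prefixed with their line numbers.
# Closing tags and the <ntasks>/<nthrds> blocks are not listed.
list_pes_entries() {
    grep -n -E '<(grid|mach|pes)[ >]' "$1"
}
```

For example, `list_pes_entries config_pes.xml` prints one line per opening tag, in file order.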
 

Shruti

Shruti Joshi
Member
Hi Edwards,
I am really stumped by this problem and I need your help.

For compset X and resolution f19_g16, I tried both methods of changing NTASKS and NTHRDS:
  1. config_pes.xml file(cime_config/)
  2. env_mach_pes.xml file
1. config_pes.xml file - I made the following changes before the ./create_newcase step:
<grid name="any">
  <mach name="any">
    <pes pesize="any" compset="any">
      <ntasks>
        ... changed NTASKS values ...
      </ntasks>
      <nthrds>
        ... changed NTHRDS values ...
      </nthrds>
    </pes>
  </mach>
</grid>
But some other file seems to be overwriting these values. I even tried modifying the "config_pes.xml" in the folder <cime/config/cesm/machines/userdefined_laptop_template>, but the changes are not reflected anywhere.
Is there another section of this file that you would recommend editing so that the changes take effect?

Note: I have set all the attributes to "any" (I am assuming this means the section applies irrespective of the input values) and placed the section at the beginning of the file so that it appears before the default section.
- I also tried with the machine name; that didn't work either.

2. env_mach_pes.xml file: I tried this method too, but no timing file was generated.
I checked the build log files of the various components (X compset) in <projects/scratch/case/bld>. There don't seem to be any errors in these files.

Please find attached the output of the ./case.submit step and the config_pes.xml and config_batch.xml files.
(Machine name: amd. The config_batch.xml contents were updated with reference to the other sections in the same file.)
Please let me know if you need any further information.

Any help would be appreciated.
 

Attachments

  • case.submit_output.txt
  • config_batch.txt
  • config_pes.txt

jedwards

CSEG and Liaisons
Staff member
The config_pes.xml file that you should change is in cime/src/drivers/mct/cime_config/config_pes.xml

In the run directory check the contents of $RUNDIR/cesm.log.346.*
 

Shruti

Shruti Joshi
Member
Hello Edwards,

>> The config_pes.xml file that you should change is in cime/src/drivers/mct/cime_config/config_pes.xml
I tried changing the config_pes.xml file in the above location, but it doesn't seem to be working for me.
Neither the timing folder nor the run.$CASE files are generated. The preview_run output is shown in the attached file.

>>In the run directory check the contents of $RUNDIR/cesm.log.346.*
1. With changes in the config_pes.xml file -> this file is not created.
2. With changes in the env_mach_pes.xml file -> I assumed a new cesm.log file would be created, but that doesn't seem to be happening.
Please find attached the cpl.log and cesm.log files (these files are present after initializing a case and then making changes with the xmlchange command).

- Also, the cesm.log and cpl.log files are present in the project/scratch/archive folder. Is this OK?

Please let me know what I should be checking.
 

Attachments

  • cesm.log.364.200515-164631.gz
  • previw_run.txt
  • cpl.log.364.200515-164631.gz

jedwards

CSEG and Liaisons
Staff member
Logs show that you are running to completion. I don't know why you aren't seeing timing files; perhaps going through the porting and testing procedure would illuminate the cause of the problem.
 

Shruti

Shruti Joshi
Member
OK, sure, I will go through it again.

Is there any particular combination of threads/tasks and compset that has worked for you?
I have been working on this problem for many days without being able to resolve it, so I am a little desperate. :(

Also:
the cesm.log and cpl.log files are present in the project/scratch/archive folder. Is this OK?
 

Shruti

Shruti Joshi
Member
Just another thing: the attached log files seem to be from the initial run, not from the run after the changes in the env_mach_pes.xml file.
New files aren't being generated. Any thoughts on this?
Is there any other flag that I should enable to generate timing files after changes are made with the xmlchange command?
 

jedwards

CSEG and Liaisons
Staff member
Adjusting the pe layout should have nothing to do with getting timing files. If you run the cime scripts_regression_tests.py, it should report any problems with timing file generation on your system.
 