CAM6 with SE dynamical core performance decrease from Cheyenne to Derecho

Joas Müller
New Member
Hi Derecho / Cheyenne / CESM users,

I am planning to run some simulations, but while setting up and running some test simulations I realised that there is a performance issue that seems to occur on Derecho.
Performance is strongly degraded, and we do not understand why. The model cost has typically been under 2,000 core-hours per simulated year, but it is now over 15,000.

Some information:
--res ne30pg3_ne30pg3_mg17

I ran two test simulations, one with the default NTASKS and one with NTASKS set to 256:
Run directories:
/glade/derecho/scratch/jmueller/f.e22.F2000.NE30.F2000_ne30_test_defaultNTASKS.1./run/
/glade/derecho/scratch/jmueller/f.e22.F2000.NE30.F2000_ne30_test_256NTASKS.1./run/
Case directories:
/glade/work/jmueller/VRM_CESM/cases/F2000_ne30_test_defaultNTASKS/f.e22.F2000.NE30.F2000_ne30_test_defaultNTASKS.1./
/glade/work/jmueller/VRM_CESM/cases/F2000_ne30_test_defaultNTASKS/f.e22.F2000.NE30.F2000_ne30_test_256NTASKS.1./
We do not have a perfectly comparable run, but some similar ones that ran on Cheyenne can be found here:
/glade/u/home/rwills/cesm_runs/*ne30*

We would be grateful for any hints or help that could resolve the issue!
Thanks in advance!
Cheers,
Joas
P.S.:

I forgot to link the other test run with NTASKS set to 256 (see above).

Additionally, one important bit of information I did not mention:
In the process of figuring out the performance issue, we changed a problematic code block in `fvm_consistent_se_cslam.F90` in the CESM2 source code, following the instructions given here:

Thanks again!
Cheers,
Joas
 

aherring

Adam
Member
Joas, it occurs to me that you're using a rather small core count. To get 2000 core-hours per simulated year, you need ~1800 processors. What if you set NTASKS=1920?
 

aherring

Adam
Member
Oh, my mistake. A similar cost in core-hours per simulated year should still hold at small core counts. I was conflating this with the SYPD metric, which depends on the number of cores. Please ignore my advice.
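For anyone else mixing up the two metrics, here is a minimal sketch of how they relate. It assumes the cost is simply cores times wall-clock hours, so it ignores any whole-node charging. The function names are hypothetical and the numbers are illustrative, not taken from the runs in this thread.

```python
def core_hours_per_sim_year(n_cores: int, sypd: float) -> float:
    """Cost in core-hours per simulated year.

    One simulated year takes 24 / sypd wall-clock hours, and all
    n_cores are occupied for that whole time.
    """
    if sypd <= 0:
        raise ValueError("sypd must be positive")
    return n_cores * 24.0 / sypd


def sypd_for_cost(n_cores: int, cost: float) -> float:
    """Throughput (simulated years per day) implied by a given cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return n_cores * 24.0 / cost


# With ideal scaling the cost stays the same at any core count;
# only the throughput (SYPD) changes with the number of cores.
for n_cores in (256, 1920):
    print(n_cores, "cores ->", sypd_for_cost(n_cores, 2000.0), "SYPD at 2000 core-hours/year")
```

So a jump from under 2000 to over 15,000 core-hours per simulated year at a fixed core count means the model itself is running several times slower, which a larger NTASKS would not fix.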
 

Joas Müller
New Member
Hi all,
Based on a suggestion by Peter Lauritzen, we realized that DEBUG was unintentionally set to TRUE. Changing it to FALSE and running the simulation for at least 20 days to get reliable timing statistics resulted in the expected timings.
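In a case directory the setting can be checked with `./xmlquery DEBUG` and changed with `./xmlchange DEBUG=FALSE`, typically followed by a clean rebuild. For checking several cases at once, the sketch below reads the value from the text of an `env_build.xml` file. The sample XML is a hypothetical, trimmed-down stand-in for a real file, and `debug_enabled` is an illustrative helper, not part of CIME.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for a case's env_build.xml.
# A real file contains many more groups and entries.
SAMPLE_ENV_BUILD = """<file id="env_build.xml" version="2.0">
  <group id="build_def">
    <entry id="DEBUG" value="TRUE">
      <type>logical</type>
    </entry>
  </group>
</file>
"""


def debug_enabled(xml_text: str) -> bool:
    """Return True if the DEBUG entry in the given XML text is set to TRUE."""
    root = ET.fromstring(xml_text)
    for entry in root.iter("entry"):
        if entry.get("id") == "DEBUG":
            return entry.get("value", "").strip().upper() == "TRUE"
    raise KeyError("no DEBUG entry found")


if debug_enabled(SAMPLE_ENV_BUILD):
    print("DEBUG is TRUE: expect much slower timings than an optimized build")
```

To use it on a real case, pass the contents of that case's `env_build.xml` instead of the sample string.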

Thanks again for the help!
Cheers,
Joas
 