Timing File Inconsistency

Jbuzan

Jonathan R. Buzan
Member
Hello DiscussCESM users:

I ran a simulation with 4608 cores on Eiger (similar to Derecho). My understanding is that CESM scaling should be strong up until the hard limits for CAM and POP are hit. However, I am seeing an inconsistency in performance: the slowest component was ATM at 12.41 myears/wday, yet the total throughput was only 6.35 myears/wday. I was not expecting this outcome.

Cheers,
-Jonathan

---------------- TIMING PROFILE ---------------------
Case : intel_cesm3_0_beta01_BLT1850_v0c_O4608_01
LID : 3210216.240715-160339
Machine : eiger
Caseroot : /capstor/scratch/cscs/jbuzan/cesm3_0_beta01/cases/intel_cesm3_0_beta01_BLT1850_v0c_O4608_01
Timeroot : /capstor/scratch/cscs/jbuzan/cesm3_0_beta01/cases/intel_cesm3_0_beta01_BLT1850_v0c_O4608_01/Tools
User : jbuzan
Curr Date : Mon Jul 15 16:15:04 2024
Driver : CMEPS
grid : a%ne30np4.pg3_l%ne30np4.pg3_oi%tx2_3v2_r%r05_g%gris4_w%null_z%null_m%tx2_3v2
compset : 1850_CAM%DEV%LT%GHGMAM4_CLM51%BGC-CROP_CICE_MOM6_MOSART_CISM2%GRIS-NOEVOLVE_SWAV_SESP
run type : startup, continue_run = FALSE (inittype = TRUE)
stop option : ndays, stop_n = 5
run length : 5 days (4.958333333333333 for ocean)
component     comp_pes  root_pe  tasks x threads  instances (stride)
---------     --------  -------  ---------------  -------------------
cpl = cpl         4608        0     4608 x 1          1 (1)
atm = cam         3840        0     3840 x 1          1 (1)
lnd = clm         1280        0     1280 x 1          1 (1)
ice = cice        2304     1280     2304 x 1          1 (1)
ocn = mom          768     3840      768 x 1          1 (1)
rof = mosart      1024        0     1024 x 1          1 (1)
glc = cism          64     3712       64 x 1          1 (1)
wav = swav          64     3776       64 x 1          1 (1)
esp = sesp           1        0        1 x 1          1 (1)
total pes active : 4608
mpi tasks per node : 128
pe count for cost estimate : 4608
Overall Metrics:
Model Cost: 17427.71 pe-hrs/simulated_year
Model Throughput: 6.35 simulated_years/day
Init Time : 345.573 seconds
Run Time : 186.512 seconds 37.302 seconds/day
Final Time : 14.170 seconds
Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components
TOT Run Time: 186.512 seconds 37.302 seconds/mday 6.35 myears/wday
CPL Run Time: 7.561 seconds 1.512 seconds/mday 156.54 myears/wday
ATM Run Time: 95.364 seconds 19.073 seconds/mday 12.41 myears/wday
LND Run Time: 14.947 seconds 2.989 seconds/mday 79.18 myears/wday
ICE Run Time: 10.890 seconds 2.178 seconds/mday 108.69 myears/wday
OCN Run Time: 65.010 seconds 13.002 seconds/mday 18.21 myears/wday
ROF Run Time: 0.577 seconds 0.115 seconds/mday 2051.59 myears/wday
GLC Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
WAV Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
ESP Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 80.675 seconds 16.135 seconds/mday 14.67 myears/wday
NOTE: min:max driver timers (seconds/day):
CPL (pes 0 to 4607)
ATM (pes 0 to 3839)
LND (pes 0 to 1279)
ICE (pes 1280 to 3583)
OCN (pes 3840 to 4607)
ROF (pes 0 to 1023)
GLC (pes 3712 to 3775)
WAV (pes 3776 to 3839)
ESP (pes 0 to 0)
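
As a side note on reading the profile above: the myears/wday figures are just the seconds/mday figures converted, assuming a 365-day model year. A minimal sketch of the conversion, with the values copied from the profile:

# Minimal sketch: convert the "seconds/mday" column of the timing
# profile into "myears/wday" (simulated years per wall-clock day),
# assuming a 365-day model year.
SECONDS_PER_WALL_DAY = 86400.0
DAYS_PER_MODEL_YEAR = 365.0

def myears_per_wday(seconds_per_mday):
    # model days completed per wall-clock day, then converted to model years
    return (SECONDS_PER_WALL_DAY / seconds_per_mday) / DAYS_PER_MODEL_YEAR

print(round(myears_per_wday(19.073), 2))  # ATM: ~12.41
print(round(myears_per_wday(37.302), 2))  # TOT: ~6.35

So the puzzle is why the total cost per model day (37.3 s) is roughly double the ATM cost alone (19.1 s).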

More info on coupler timing:
 

fischer

CSEG and Liaisons
Staff member
Hi Jonathan,

CPL, ATM, and CLM all run consecutively, so the total run time will be longer than that of any individual component. Having said that, I would expect your total run time to be a little faster than this. I usually do a 20-day run to look at performance, though, so the short 5-day run might skew the results some.
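
As a rough illustration only (some coupler communication overlaps other work, so this is not an exact accounting), summing the per-model-day costs of the pieces that share processors in your layout lands close to the reported total:

# Rough accounting sketch (assumption: CPL, ATM, LND, and the coupler
# communication all run on overlapping PEs, so their per-model-day
# costs add, while OCN runs concurrently on its own PEs and is hidden).
# Numbers are the seconds/mday values from the timing profile above.
shared_pe_costs = {
    "CPL":      1.512,
    "ATM":     19.073,
    "LND":      2.989,
    "CPL COMM": 16.135,
}
print(sum(shared_pe_costs.values()))  # ~39.7 s/mday vs. reported TOT of 37.3 s/mday

That is why the total throughput (6.35 myears/wday) ends up well below ATM alone (12.41 myears/wday), even though ATM is the slowest single component.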

Could you attach your env_mach_pes.xml so I can try the run on Derecho?

Thanks
Chris
 

Jbuzan

Jonathan R. Buzan
Member
Hi Chris,

Thanks for your quick reply. My guess is that it is related to the CPL layout. In the Derecho configuration for 18 nodes, CPL gets the same core allocation as ATM, but in my allocation here the coupler uses the full set of cores (including the OCN cores). Attached is my env_mach_pes.xml.
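
To show what I mean, here is a minimal sketch (plain range arithmetic using the root_pe/tasks values from the timing profile above, not anything generated by CESM):

# Which components share PEs with the coupler in my current layout?
# (root_pe, ntasks) pairs are copied from the timing profile above.
layout = {
    "ATM": (0, 3840),
    "LND": (0, 1280),
    "ICE": (1280, 2304),
    "OCN": (3840, 768),
    "ROF": (0, 1024),
    "GLC": (3712, 64),
    "WAV": (3776, 64),
    "ESP": (0, 1),
}
cpl = set(range(0, 4608))  # CPL spans all 4608 PEs in this case

for comp, (root_pe, ntasks) in layout.items():
    shared = cpl & set(range(root_pe, root_pe + ntasks))
    print(f"{comp}: {len(shared)} of {ntasks} PEs shared with CPL")

# Every component, including OCN, overlaps the coupler here, whereas in
# the 18-node Derecho layout CPL matches the ATM allocation.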

Cheers,
-Jonathan
 

Attachments

  • env_mach_pes.xml.txt
    7.1 KB