Hello,
I hope this is the right section of the forum.
I am trying to get CESM 2.2.2 running on a new machine (Mahti, 128 cores per node, GNU compilers, OpenMPI). I was able to run a test case, but I cannot get a good load balance.
This is the output of ./describe_version:
Code:
$ ./describe_version
------------------------------------------------------------------------
git describe:
release-cesm2.2.2-0-g779b0a3
------------------------------------------------------------------------
We are missing some modules, but they do not matter for the cases that are important to us.
Have you made any changes to files in the source tree?
I changed the XML file to reflect the machine we are running on (compilation, bash, pes).
Describe every step you took leading up to the problem:
I created a test case using:
Code:
/projappl/project_2008521/cesm2.2.2/cime/scripts/create_newcase --compset FWmadSD --res f09_f09_mg17 --case test_omp_atm_36_lnd_5_ice_1_thrds_2_cpt_4 --mach mahti
I have tried different configurations, with different numbers of nodes assigned to the LND and ICE components, and I cannot get more than 2.86 simulated years per day.
Here are the best timings I got:
18 nodes for ATM, 64 tasks per node, 2 threads per task:
Code:
Case : test_omp_atm_18_lnd_5_ice_1_thrds_2_cpt_2
LID : 4423577.250425-161416
Machine : mahti
Caseroot : /users/cristian/test_omp_atm_18_lnd_5_ice_1_thrds_2_cpt_2
Timeroot : /users/cristian/test_omp_atm_18_lnd_5_ice_1_thrds_2_cpt_2/Tools
User : cristian
Curr Date : Fri Apr 25 16:25:26 2025
grid : a%0.9x1.25_l%0.9x1.25_oi%0.9x1.25_r%r05_g%null_w%null_z%null_m%gx1v7
compset : HIST_CAM60%WCMD%SDYN_CLM50%SP_CICE%PRES_DOCN%DOM_MOSART_SGLC_SWAV_SIAC_SESP
run type : startup, continue_run = FALSE (inittype = TRUE)
stop option : ndays, stop_n = 5
run length : 5 days (4.979166666666667 for ocean)
component   comp_pes   root_pe   tasks x threads   instances   (stride)
---------   --------   -------   ---------------   ---------   --------
cpl = cpl       4608         0          1152 x 2           1   (1 )
atm = cam       4608         0          1152 x 2           1   (1 )
lnd = clm       1280         0           320 x 2           1   (1 )
ice = cice       256      1088            64 x 2           1   (1 )
ocn = docn       256      1152            64 x 2           1   (1 )
rof = mosart    1024         0           256 x 2           1   (1 )
glc = sglc       256         0            64 x 2           1   (1 )
wav = swav       256         0            64 x 2           1   (1 )
iac = siac         2         0             1 x 1           1   (1 )
esp = sesp         2         0             1 x 1           1   (1 )
total pes active : 4864
mpi tasks per node : 64
pe count for cost estimate : 1216
Overall Metrics:
Model Cost: 10771.50 pe-hrs/simulated_year
Model Throughput: 2.71 simulated_years/day
Init Time : 226.028 seconds
Run Time : 436.840 seconds 87.368 seconds/day
Final Time : 0.002 seconds
Actual Ocn Init Wait Time : 382.250 seconds
Estimated Ocn Init Run Time : 0.000 seconds
Estimated Run Time Correction : 0.000 seconds
(This correction has been applied to the ocean and total run times)
Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components
TOT Run Time: 436.840 seconds 87.368 seconds/mday 2.71 myears/wday
CPL Run Time: 2.336 seconds 0.467 seconds/mday 506.66 myears/wday
ATM Run Time: 416.642 seconds 83.328 seconds/mday 2.84 myears/wday
LND Run Time: 12.373 seconds 2.475 seconds/mday 95.66 myears/wday
ICE Run Time: 2.939 seconds 0.588 seconds/mday 402.71 myears/wday
OCN Run Time: 0.058 seconds 0.012 seconds/mday 20321.21 myears/wday
ROF Run Time: 0.817 seconds 0.163 seconds/mday 1448.67 myears/wday
GLC Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
WAV Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
IAC Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
ESP Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 26.208 seconds 5.242 seconds/mday 45.16 myears/wday
NOTE: min:max driver timers (seconds/day):
CPL (pes 0 to 1151)
ATM (pes 0 to 1151)
LND (pes 0 to 319)
ICE (pes 1088 to 1151)
OCN (pes 1152 to 1215)
ROF (pes 0 to 255)
GLC (pes 0 to 63)
WAV (pes 0 to 63)
IAC (pes 0 to 0)
ESP (pes 0 to 0)
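To make the timing summary above easier to read, here is a minimal Python sketch that parses the "Run Time" lines and reports each component's share of the total run time. The numbers are copied from the timing output above; the helper names `parse_run_times` and `shares_of_total` are hypothetical and are not part of CESM or CIME.

```python
import re

# Run-time lines copied verbatim from the timing summary above.
TIMING_LINES = """\
TOT Run Time: 436.840 seconds 87.368 seconds/mday 2.71 myears/wday
CPL Run Time: 2.336 seconds 0.467 seconds/mday 506.66 myears/wday
ATM Run Time: 416.642 seconds 83.328 seconds/mday 2.84 myears/wday
LND Run Time: 12.373 seconds 2.475 seconds/mday 95.66 myears/wday
ICE Run Time: 2.939 seconds 0.588 seconds/mday 402.71 myears/wday
OCN Run Time: 0.058 seconds 0.012 seconds/mday 20321.21 myears/wday
ROF Run Time: 0.817 seconds 0.163 seconds/mday 1448.67 myears/wday
CPL COMM Time: 26.208 seconds 5.242 seconds/mday 45.16 myears/wday
"""

# Matches "XXX Run Time: <seconds> seconds" and "CPL COMM Time: <seconds> seconds".
PATTERN = re.compile(r"^(\w+(?: COMM)?) (?:Run )?Time:\s+([\d.]+) seconds")


def parse_run_times(text):
    """Return {component name: run time in seconds} for each matching line."""
    times = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            times[match.group(1)] = float(match.group(2))
    return times


def shares_of_total(times):
    """Return each component's run time as a percentage of the TOT run time."""
    total = times["TOT"]
    return {name: 100.0 * t / total for name, t in times.items() if name != "TOT"}


if __name__ == "__main__":
    shares = shares_of_total(parse_run_times(TIMING_LINES))
    for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{name:9s} {pct:6.2f} % of total run time")
```

With these numbers, ATM accounts for roughly 95% of the total run time, while LND and ICE together account for less than 4%.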
If this is a port to a new machine: Please attach any files you added or changed for the machine port (e.g., config_compilers.xml, config_machines.xml, and config_batch.xml) and tell us the compiler version you are using on this machine.
Describe your problem or question:
I am trying to run CESM 2.2.2 as efficiently as possible. Unfortunately, I was not able to balance the run time against the resources used. The LND component takes 12-14 s, while the ICE component takes 2-4 s. I tried using more nodes for LND, but performance got worse. Is there a way or strategy to improve the performance?
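For reference, the throughput and cost figures in the timing summary can be reproduced from the seconds per model day and the PE count. The sketch below does that arithmetic and estimates how much throughput could be gained if the LND time were removed entirely. It assumes a 365-day model year (which reproduces the reported 2.71 and 10771.50) and assumes that the LND time adds serially to the total, which the shared root_pe of 0 for ATM and LND suggests but which I have not verified. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86400.0
DAYS_PER_YEAR = 365.0  # assumption: 365-day model year


def throughput_sypd(seconds_per_model_day):
    """Simulated years per wall-clock day for a given cost per model day."""
    return SECONDS_PER_DAY / (seconds_per_model_day * DAYS_PER_YEAR)


def cost_pe_hours_per_year(pe_count, seconds_per_model_day):
    """PE-hours needed to simulate one model year."""
    return pe_count * seconds_per_model_day * DAYS_PER_YEAR / 3600.0


def throughput_without(total_s, component_s, model_days):
    """Throughput if one component's time vanished (assumes it runs serially)."""
    return throughput_sypd((total_s - component_s) / model_days)


if __name__ == "__main__":
    # Values taken from the timing summary above (5 model days, 1216 PEs).
    print(f"throughput:        {throughput_sypd(87.368):.2f} years/day")
    print(f"cost:              {cost_pe_hours_per_year(1216, 87.368):.2f} pe-hrs/year")
    print(f"with LND at 0 s:   {throughput_without(436.840, 12.373, 5):.2f} years/day")
```

Under these assumptions, removing the LND time completely would raise throughput from about 2.71 to about 2.79 simulated years per day, so most of the remaining run time is in ATM.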
Cristian