
Problem of increasing processing speed of CESM2.1.3

ykp990521

ykp990521
Member
While running a B1850, f19_g17 case, my throughput stays low no matter how I change the PE layout. First I ran all components sequentially with ROOTPE=0 and NTHRDS=1 for every component, and increased NTASKS from 168 (28 PEs per node) to 420. The throughput remained below 5 simulated years per wall-clock day.
Then I set OCN to run concurrently with the other components. The best performance I found is NTASKS_OCN=168 with NTASKS=336 for the other components, all with NTHRDS=1, which gives almost 9 years/day. However, taking that layout as a reference, no matter how much I increase NTASKS for OCN or for the other components, the throughput stays below 9.5 years/day. I checked the timing tables on the official CESM2 website and found that they reach more than 20 years/day with 1000 or even 2000 PEs for B1850 f19_g17. Does anyone know why I can't get higher throughput by adding PEs? (My total is only a few hundred PEs, and the speed already stops increasing!)
Thanks a lot! Attached is a cpl log for the run with NTASKS_OCN=168 and NTASKS=336 for the other components.
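For comparison with published timing tables, it can help to convert a run into cost as well as throughput. The following is a minimal sketch, not part of the original post: the function name `model_cost` is hypothetical, it assumes the concurrent layout occupies 336 + 168 = 504 PEs in total, and it uses the approximate 9 years/day figure reported above.

```python
def model_cost(total_pes: int, years_per_day: float) -> float:
    """Return the cost in PE-hours per simulated year.

    One wall-clock day on `total_pes` PEs consumes total_pes * 24 PE-hours
    and yields `years_per_day` simulated years.
    """
    return total_pes * 24.0 / years_per_day


# Assumption: OCN on 168 PEs runs concurrently with the other components
# on 336 PEs, so the job occupies 504 PEs in total.
concurrent_cost = model_cost(336 + 168, 9.0)
print(f"concurrent layout: {concurrent_cost:.0f} PE-hours per simulated year")
```

Comparing this number across layouts shows whether extra PEs are buying throughput or mostly adding cost.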
 

Attachments

  • cpl.log.211025-080306.txt
    95.1 KB · Views: 5

dbailey

CSEG and Liaisons
Staff member
Load balancing is always tricky. What machine are you on? I have moved this to the infrastructure subforum.
 

jedwards

CSEG and Liaisons
Staff member
The coupler log isn't very helpful here. Please provide the output of ./pelayout and a cesm_timing file.
 

ykp990521

ykp990521
Member
dbailey said: Load balancing is always tricky. What machine are you on? I have moved this to the infrastructure subforum.
My machine is Tianhe-1A in China. I have also attached my config_machines.xml and config_compilers.xml. Thanks a lot!
 

Attachments

  • config_compilers.txt
    1.6 KB · Views: 9
  • config_machines.txt
    1.4 KB · Views: 6

ykp990521

ykp990521
Member
jedwards said: The coupler log isn't very helpful here. Please provide the output of ./pelayout and a cesm_timing file.
[Attached image: 1635205963124.png]
I could only find these three timing files in the timing directory, but they don't look like the ones on the official NCAR website. Do you know where I can find timing files in the NCAR format?
 

Attachments

  • model_timing.000.txt
    30.7 KB · Views: 5
  • model_timing.336.txt
    6.6 KB · Views: 3
  • model_timing_stats.txt
    32 KB · Views: 4

jedwards

CSEG and Liaisons
Staff member
Look for examples in cesm/cime_config/config_pes.xml. The lnd and ice can run concurrently after atm, that is:
ntasks_lnd=168,ntasks_rof=168,ntasks_ice=168,rootpe_ice=168
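As a rough illustration of this suggestion, the sketch below lays out the PE ranges and prints matching `./xmlchange` settings. It is only a sketch under assumptions: NTASKS_ATM=336 and OCN on its own 168 PEs starting at ROOTPE_OCN=336 are taken from the layout described earlier in the thread, not from this reply. The helper names are hypothetical, and other components (CPL, GLC, WAV) are left out.

```python
# component -> (rootpe, ntasks). The lnd/rof/ice values come from the reply
# above; the atm and ocn values are assumptions based on the earlier posts.
layout = {
    "ATM": (0, 336),
    "LND": (0, 168),
    "ROF": (0, 168),
    "ICE": (168, 168),
    "OCN": (336, 168),
}


def pe_range(rootpe: int, ntasks: int) -> tuple[int, int]:
    """Return the first and last PE (inclusive) used by a component."""
    return rootpe, rootpe + ntasks - 1


def overlaps(a: str, b: str) -> bool:
    """Return True if components a and b share at least one PE."""
    a_lo, a_hi = pe_range(*layout[a])
    b_lo, b_hi = pe_range(*layout[b])
    return a_lo <= b_hi and b_lo <= a_hi


# Total PEs needed is one past the highest PE index used by any component.
total_pes = max(pe_range(*v)[1] for v in layout.values()) + 1

for name, (rootpe, ntasks) in layout.items():
    lo, hi = pe_range(rootpe, ntasks)
    print(f"{name}: PEs {lo}-{hi}")
    print(f"./xmlchange NTASKS_{name}={ntasks},ROOTPE_{name}={rootpe}")
print(f"total PEs: {total_pes}")
```

In this layout lnd and ice do not share PEs, so they can run at the same time, while both sit inside the atm range and therefore take turns with atm.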
 

ykp990521

ykp990521
Member
jedwards said: Look for examples in cesm/cime_config/config_pes.xml. The lnd and ice can run concurrently after atm, that is:
ntasks_lnd=168,ntasks_rof=168,ntasks_ice=168,rootpe_ice=168
Thanks a lot. Changing the PE layout to run components concurrently did increase the speed a little, but what still confuses me is why I can't increase the speed by adding PEs. The runs on the official NCAR website use more than 1000 or 2000 PEs while keeping the throughput high (more than 20 years/day with all components active), but my speed drops when NTASKS is only 200-300.
 

ganbaranaito

takufuu
Member
Hello, I think this may be a common issue. I ran into the same problem on the NUIST supercomputer, also running a B1850 f19_g17 case. When I run all components sequentially with 168 tasks in total, the throughput is about 4 model years/day. If I increase ntasks to 196, it becomes slower than with 168. I also tried running the ocean model concurrently with the other components: with 140 tasks for the ocean and 140 for the other components, the throughput is about 6.5 model years/day. Increasing both to 168 gives a negligible improvement, to about 7 model years/day. (nthrds is set to 1 for all components.)
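To put a number on "negligible", the two concurrent runs above can be compared by scaling efficiency, that is, the speedup divided by the increase in PEs. This is a minimal sketch with a hypothetical helper name, and it assumes the totals are 140 + 140 = 280 and 168 + 168 = 336 PEs with the approximate throughputs quoted above.

```python
def scaling_efficiency(pes_a: int, rate_a: float,
                       pes_b: int, rate_b: float) -> float:
    """Speedup from run a to run b, divided by the increase in PEs.

    A value of 1.0 means perfect scaling; lower values mean the extra
    PEs are only partly converted into throughput.
    """
    speedup = rate_b / rate_a
    pe_ratio = pes_b / pes_a
    return speedup / pe_ratio


# Run a: 140 + 140 PEs at about 6.5 model years/day.
# Run b: 168 + 168 PEs at about 7 model years/day.
eff = scaling_efficiency(280, 6.5, 336, 7.0)
print(f"speedup: {7.0 / 6.5:.2f}x for {336 / 280:.2f}x the PEs")
print(f"scaling efficiency: {eff:.2f}")
```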
 