
CESM Optimization in the distribution of computational resources

Kihang Youn

New Member
Hi All,

I am trying to optimize CESM 1.2.2.1 at resolution ne120_t12 with the B1850C5 component set.

The B1850C5 component set contains many sub-models: OCN, LND, ATM, ICE, CPL, GLC, ROF, and WAV.

I want to find the allocation of computational resources across these sub-models that gives the best performance.

However, my computational resources and time are too limited to test many configurations.

Could I get any documentation on how to estimate the approximate computational cost of each model, or on how best to allocate computational resources?

I do not have much experience with CESM and its sub-models, so to keep the problem as simple as possible, I want to adjust only ntasks and rootpe, with nthrds=1.
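
As a minimal sketch of what "adjusting only ntasks and rootpe" could look like, the Python below computes one commonly used kind of layout: ATM and CPL on the first block of PEs, ICE and LND side by side on those same PEs, and OCN on its own PEs. The function names and the task counts are hypothetical, placing ROF with LND is an assumption, and GLC and WAV are left out for brevity. It only prints candidate NTASKS_*/ROOTPE_* values for env_mach_pes.xml; it does not modify a case.

```python
def simple_layout(atm_tasks, ocn_tasks, ice_tasks, lnd_tasks):
    """Return {component: (ntasks, rootpe)} for one assumed layout.

    Assumptions (not taken from the CESM documentation):
    - ICE and LND share the ATM processors, side by side.
    - ROF is placed on the same processors as LND.
    - OCN runs on its own processors, after the ATM block.
    - nthrds is 1 everywhere, so one task maps to one PE.
    """
    if ice_tasks + lnd_tasks > atm_tasks:
        raise ValueError("ICE + LND tasks must fit inside the ATM block")
    return {
        "ATM": (atm_tasks, 0),
        "CPL": (atm_tasks, 0),
        "ICE": (ice_tasks, 0),
        "LND": (lnd_tasks, ice_tasks),
        "ROF": (lnd_tasks, ice_tasks),
        "OCN": (ocn_tasks, atm_tasks),
    }


def total_pes(layout):
    """Number of PEs the layout needs (highest rootpe + ntasks)."""
    return max(ntasks + rootpe for ntasks, rootpe in layout.values())


# Hypothetical task counts, chosen only to illustrate the arithmetic.
layout = simple_layout(atm_tasks=1024, ocn_tasks=256, ice_tasks=768, lnd_tasks=256)
for comp, (ntasks, rootpe) in layout.items():
    print(f"NTASKS_{comp}={ntasks} ROOTPE_{comp}={rootpe}")
print("total PEs:", total_pes(layout))
```

With a helper like this, a test plan reduces to varying four numbers per run and comparing the resulting timing files.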

Please let me know if you have any suggestions.

Best Regards,
Kihang
 

Kihang Youn

New Member
Hi Chris,


Thank you for your suggestions.
I have a few questions about the timing data provided by CESM.

1. Functions Hierarchy
I found that the timing log file contains elapsed-time data for each significant function, but it is not clear which functions are nested inside others.
Is there any reference that shows this call hierarchy?

2. TOT Time Formula
From the load balancing manual, I understand that ATM must run sequentially with ICE, LND, and ROF.
(https://www.cesm.ucar.edu/models/cesm1.2/cesm/doc/usersguide/x1516.html)
My question is: how does CPL fit in? What is the formula for calculating TOT, shown on the leftmost side of the figure below?
For example, is it TOT = LND + ICE + ATM + CPL (Run) (sea blue cells)?
Note: GLC, ROF, and WAV account for very little elapsed time.
[Attached image: 1650079870396.png]

3. I/O Bottleneck
I have summarized the timings by function as follows. It seems that the I/O bottlenecks (pink cells) should be reduced before load balancing the computational resources. I have not changed PIO_TYPENAME, so I/O currently runs with the default setting. Is there room for improvement if I change it to netcdf or pnetcdf? Could you also point me to documentation on improving parallel I/O performance?
[Attached image: 1650080110228.png]

Last and most importantly, I'm not sure if it's okay to ask you for detailed advice.
If this is outside the scope of the forum policy, please let me know.
Thanks for your help.

Best Regards,
Kihang
 

fischer

CSEG and Liaisons
Staff member
Hi Kihang,

Asking for detailed advice isn't outside the scope of the forum. But since we don't support cesm1.2.2.1 anymore, it'll be difficult to get answers.

1. I'm not sure this is true for cesm1.2.2.1, but the newer versions of the model have timing files in which the functions are indented to show which functions are nested inside others.

2. The CPL handles the communication and regridding of data between the models. It also handles some flux calculations. The calculation of TOT isn't that straightforward, unfortunately, but you can look at scripts/ccsm_utils/Tools/timing/getTiming2.pl to see the calculations.

3. I can't help you with the I/O, but I can point you to the PIO documentation. I'm not sure how much of it applies to the version you're using.
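
On point 2, a very rough first-order model can still help when comparing layouts, as long as it is not mistaken for what getTiming2.pl computes. The sketch below assumes one specific layout: ICE and LND run concurrently on disjoint PEs, ATM then runs sequentially on those same PEs, OCN runs concurrently on separate PEs, and CPL time is simply added on top. The function names and the sample numbers are hypothetical.

```python
def estimate_run_time(atm, ice, lnd, ocn, cpl):
    """First-order wall-clock estimate for one assumed layout.

    Assumptions (a simplification, not the getTiming2.pl calculation):
    - ICE and LND run at the same time, so only the slower one counts.
    - ATM runs after them on the same processors, so its time adds.
    - OCN runs concurrently on other processors.
    - CPL time is treated as fully serial overhead.
    All arguments are elapsed times in the same unit (e.g. seconds).
    """
    atm_side = atm + max(ice, lnd)
    return max(atm_side, ocn) + cpl


def ocn_imbalance(atm, ice, lnd, ocn):
    """Positive: OCN waits for the ATM side. Negative: ATM side waits for OCN."""
    return (atm + max(ice, lnd)) - ocn


# Hypothetical per-component times, for illustration only.
print(estimate_run_time(atm=100, ice=30, lnd=20, ocn=110, cpl=10))
print(ocn_imbalance(atm=100, ice=30, lnd=20, ocn=110))
```

Under these assumptions, a layout is roughly balanced when the imbalance is near zero, which suggests moving tasks between the OCN block and the ATM block until the two sides take about the same time.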


Chris
 

Kihang Youn

New Member
Hi Chris,

Thank you for your warm reply.
I don't really think that cesm1.2.2.1 is outdated, but I'll first check whether the problem I'm having is due to the older version.
We should also check whether the I/O issues are related to compatibility with older versions of netcdf and pnetcdf.
Thanks again for your reply.

Best Regards,
Kihang
 