
Questions about model performance and grid distribution onto processors

kezhoulumelody

Kezhou Lu
New Member
Hi all,

I have some specific questions for model developers:

  1. For high-resolution models (0.25 degree in the atmosphere x 1 degree in the ocean, for example), is it feasible to map all grids onto one node on Cheyenne? If it is technically feasible, what is a rough estimate of the wall time?
  2. Does each grid cell have a similar computational cost in an irregular/unstructured grid partition? If not, how are grid cells distributed to different processors to maximize computational efficiency?
  3. How many FLOPs does one node usually perform within one model year? Do coupled simulations (e.g., compset B1850CN) differ from uncoupled simulations (e.g., compset FHIST)?
Thank you so much for the help!

Best,

Melody
 

jedwards

CSEG and Liaisons
Staff member
Hi Melody,

1. It depends on the compset, but for a fully active compset (one with a prognostic atmosphere, land, sea ice, and ocean) one node of Cheyenne is much too small.

2. Grids are associated with components, and possibly with a configuration within a component. The computational requirements of an ocean model are considerably different from those of an atmosphere, and the computational requirements of a spectral element dycore within the atmosphere are much different from those of a finite volume dycore.

3. FHIST is not an uncoupled simulation; it simply replaces the prognostic ocean with a data component. FLOPs depend on the compset as well as the resolution and are highly variable.
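As context for how CESM assigns components to processors: in a CESM case directory the processor layout is held in `env_mach_pes.xml` and is usually inspected and changed with the CIME case tools rather than edited by hand. A hedged sketch (the task counts below are arbitrary examples, not a recommended layout):

```
# From inside a CESM case directory:
./pelayout                         # show the current processor layout per component
./xmlquery NTASKS                  # query MPI task counts for all components

# Give the atmosphere 128 tasks starting at PE 0 (example numbers only):
./xmlchange NTASKS_ATM=128,ROOTPE_ATM=0

./case.setup --reset               # regenerate the case with the new layout
```

Within a component, the decomposition of the grid across those tasks is then handled by the component's own code (e.g., the block distribution described in the POP documentation).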
 

kezhoulumelody

Kezhou Lu
New Member
Hi jedwards,

Thank you so much for the rapid response! I am sorry my questions were not specific enough.

In terms of question 2, I did find documentation that briefly describes the parallel implementation in POP2 (1. Cover — popdoc documentation), but I can't find the equivalent for the atmosphere. So in the CESM model scripts, where can I find the part controlling the parallel implementation?

In terms of question 3, is it possible to output the FLOPs (or a rough estimate of the FLOPs) for a specific run? I know there is a timing log in the case root with some overall runtime metrics, but I couldn't get more specific information out of it.

Thanks,

Melody
 

jedwards

CSEG and Liaisons
Staff member
Again, I think that your question is too vague. The CAM6.3 User’s Guide — camdoc documentation may help.

We have a metric called Model Cost, given in pe-hours per simulated year. This is the closest thing that we have to FLOPs. FLOPs are a fairly theoretical measure and not really a practical measurement of performance: a model can easily use more FLOPs while achieving worse throughput.
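The arithmetic behind that metric is simple, and CESM's timing file reports it directly. A minimal sketch of the calculation, with made-up example numbers (the function names and the 1800-core/5-year/6-hour run are illustrative assumptions, not values from any real case):

```python
def model_cost_pe_hrs_per_yr(ntasks, wallclock_seconds, simulated_years):
    """Model Cost as CESM reports it: processor-element hours per simulated year."""
    pe_hours = ntasks * wallclock_seconds / 3600.0
    return pe_hours / simulated_years

def throughput_yrs_per_day(wallclock_seconds, simulated_years):
    """The companion throughput metric: simulated years per wallclock day."""
    return simulated_years * 86400.0 / wallclock_seconds

# Hypothetical example: 1800 cores simulate 5 model years in 6 wallclock hours.
cost = model_cost_pe_hrs_per_yr(1800, 6 * 3600, 5)   # 2160.0 pe-hrs/simulated yr
speed = throughput_yrs_per_day(6 * 3600, 5)          # 20.0 simulated yrs/day
```

Note that adding cores raises `ntasks` and may shrink `wallclock_seconds`; if the wall time does not shrink proportionally, the Model Cost goes up even though throughput improves, which is the practical version of "more FLOPs, worse efficiency."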
 