How to evaluate computational performance with modified code

Yuan Sun
Member
Hi all,

I modified the CESM code for scientific research, and I plan to compare the computational performance of the default version and the modified version.

Besides calculating the core-hours that a job costs, are there any other methods or tools? Are any CPU-monitoring tools useful on HPC systems?

Thank you very much :)

Best,
Yuan
 

jedwards

CSEG and Liaisons
Staff member
It depends on what you want to measure. A CAM (F) case or a CLM (I) case is fairly straightforward,
but a fully coupled (B) case is much more complicated.
 

Yuan Sun
Member
Dear Jedwards, thanks for your reply. I modified the urban module and, as you know, the urban fraction is tiny, so I plan to run a land-only compset (I). Will it then be a simple comparison of core-hours?
 

jedwards

CSEG and Liaisons
Staff member
Yes - you can just use the timing files generated by the model to compare. You should compare at small,
medium and large task counts. For an f09 resolution you might try 32, 512, 1024 tasks.
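
As a rough illustration of comparing two timing files, here is a minimal Python sketch. It assumes each timing summary contains "Model Cost" and "Model Throughput" lines in the layout shown in the sample strings; that layout, the function names, and the numbers are assumptions for illustration, so check them against a real file from your own case before relying on it.

```python
import re

# Assumed layout of the "Model Cost" / "Model Throughput" lines in a timing
# summary; verify against a real timing file from your own case.
_METRIC = re.compile(r"Model (Cost|Throughput):\s+([0-9.]+)\s+(\S+)")


def parse_metrics(text):
    """Return a dict such as {'Cost': 100.0, 'Throughput': 20.0}."""
    return {name: float(value) for name, value, _unit in _METRIC.findall(text)}


def relative_change(default_text, modified_text):
    """Fractional change (modified vs. default) for metrics found in both."""
    base = parse_metrics(default_text)
    new = parse_metrics(modified_text)
    return {k: (new[k] - base[k]) / base[k] for k in base if k in new}


# Made-up numbers, for illustration only (not real results).
default_run = """
    Model Cost:             100.00   pe-hrs/simulated_year
    Model Throughput:        20.00   simulated_years/day
"""
modified_run = """
    Model Cost:             110.00   pe-hrs/simulated_year
    Model Throughput:        18.00   simulated_years/day
"""

for metric, change in relative_change(default_run, modified_run).items():
    print(f"{metric}: {change:+.1%}")
```

In practice you would read the text from the timing file of each case and repeat the comparison at each task count, since the overhead of a code change can differ between small and large runs.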
 

Yuan Sun
Member
Dear Jedwards,

Thanks for your suggestions.

Do you mean running './xmlchange NTASKS=32' (or 512, or 1024)?

Best,
Yuan
 

jedwards

CSEG and Liaisons
Staff member
Yes - you should base those numbers on the number of processors on a node on your system.
So if your system has 32 CPUs on a node you might do 1, 16 and 32 nodes. If you have 128 CPUs per node,
maybe 1, 4, and 8 nodes. But ultimately you want to do the performance comparison at several different CPU counts.
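
The arithmetic behind those suggestions is just whole nodes times cores per node. A small Python sketch, with hypothetical helper names, that turns node counts into task counts and the matching xmlchange settings (each setting would be applied in its own case before running):

```python
def task_counts(cores_per_node, node_counts):
    """Total MPI tasks when every core on each node is used."""
    return [cores_per_node * nodes for nodes in node_counts]


def xmlchange_commands(cores_per_node, node_counts):
    """One NTASKS setting per test case, assuming one task per core."""
    return [f"./xmlchange NTASKS={tasks}"
            for tasks in task_counts(cores_per_node, node_counts)]


# The two examples from the post above.
print(task_counts(32, [1, 16, 32]))
print(task_counts(128, [1, 4, 8]))

for command in xmlchange_commands(128, [1, 4, 8]):
    print(command)
```

Using whole nodes keeps the comparison fair, because a partly filled node is usually still charged in full.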
 

Yuan Sun
Member
Thank you, Jedwards. I run jobs on ARCHER2 (the UK national HPC service). The standard queue allows a maximum of 1024 nodes, with 128 cores per node. I will follow your guidance and run on 1, 4, and 8 nodes. Thanks again.

Best,
Yuan
 