
Improve performance, calculation time of CTSM on new supercomputer

adrienD

Adrien Damseaux
Member
Hi everyone,

I am trying to run CTSM on a 446x450 regional grid over the Arctic on the new supercomputer Mistral. So far, I am able to run a one-year simulation with 24 nodes (160 nodes/h) in 6 hours 43 minutes of wall-clock time. I don't have any particular issues, but I'd like to know if anyone has suggestions to improve the performance or reduce the calculation time of the model.

I am not very familiar with CTSM, but I was thinking of increasing the time step, or removing the canopy scheme?

I know it's a long shot but any help would be appreciated.

Adrien
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
Hi Adrien

There are several things you can do. What you do depends on the science you are trying to do with the model. It also depends on whether you just want to increase the throughput of the model (how fast it runs per wall-clock day), whether you care about the total cost on the machine, or some balance of both.

Because you have about 200k gridcells, you can run on up to that many processors. One thing we also often do is run concurrently, with the datm running on different processors than the land model. The datm can't really take advantage of a ton of processors since it's mainly just reading in data, so it runs efficiently on fewer processors. What we'll typically do is give it a single node, and then use as many processors for CTSM as will run at about the same rate. Now, in general the more processors you give the model, the less efficient it will be: the throughput will likely increase, but the cost will start going up. So you have to find an appropriate balance between the two.
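In a CIME-based CTSM case, this kind of concurrent layout is set with `xmlchange` from the case directory. A minimal sketch, assuming a machine with 128 cores per node; the task counts below are illustrative placeholders, not a tuned recommendation:

```shell
# Sketch only: adjust task counts to your machine's node size and your grid.
# Give DATM its own node; run CTSM concurrently on the remaining processors.
./xmlchange NTASKS_ATM=128,ROOTPE_ATM=0       # DATM on one node (assumed 128 cores/node)
./xmlchange NTASKS_LND=2944,ROOTPE_LND=128    # CTSM starts where DATM's tasks end
./case.setup --reset                          # regenerate the PE layout
```

Non-overlapping `ROOTPE` values are what make the two components run concurrently rather than sequentially on the same processors.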

In terms of changing the science, are you running with the full BGC model or the SP version? SP will be faster. But if you want to prognostically predict plant growth, SP (using satellite-observed phenology) wouldn't be appropriate. With the latest development model we also created some performance options for NWP applications, where you can set the env variables CLM_STRUCTURE and CLM_CONFIGURATION to use faster options. Setting CLM_STRUCTURE=fast will speed up the canopy scheme, but whether that's appropriate depends on what you are trying to use the model for. You could probably increase the time step to maybe an hour, which is often done for tower sites with hourly data. You should study whether that works for your case and gives reasonable results, though. Anything that changes the science you'll need to vet to make sure it's really OK for your application.
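The options above can also be set with `xmlchange`. A sketch, assuming a recent CTSM development version that supports these variables; in CIME the land time step is controlled via the coupling frequency (ATM_NCPL), and the hourly value shown is an example, not a recommendation:

```shell
# Assumes a recent CTSM development version with these case variables.
./xmlchange CLM_CONFIGURATION=nwp   # NWP-oriented configuration
./xmlchange CLM_STRUCTURE=fast      # faster (reduced) canopy/snow structure
./xmlchange ATM_NCPL=24             # 24 couplings per day = 1-hour time step
```

As noted above, any of these changes the science, so the results need to be vetted against a control run before being used.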
 