Thank you very much for the information! We will follow up on CESM3 progress and look forward to the new release soon!
Thanks for writing, Xin. I believe GPU-enabled code is a new CESM3 development (see "Community Earth System Model 3 (CESM3): Plans, progress, timelines" on the Community Earth System Model site), so it would be present in the upcoming CESM3 release.
Hi Xin,
Let me add a bit more information here. Haipeng is correct that there will be some GPU-accelerated code within CESM3, as part of various efforts towards running on GPUs, but we are still quite a ways off from a fully performant, GPU-resident CESM. Basically, parts of the atmosphere model (some physics, some dynamical cores) are runnable on GPUs now, but other parts are not. On a lot of current hardware, this results in slow data transfers between the CPU and GPU parts of the code, which effectively negate the benefits of the GPUs. On much newer platforms that offer more tightly coupled GPU/CPU memory systems, I expect that to be a lot better, but I don't have numbers on that yet. There are also ongoing efforts to port other components, like the ocean model, to GPUs, but those are also still early in development.
In short, this is very much a work in progress. If you're doing GPU development of CAM physics, say, then yes, CESM3 will have some code you can look at, use, modify, etc. But if you want to do science runs on a GPU system, that won't be a target of CESM3.
Hope that helps, and if you have specific needs or questions, I'm happy to provide more info!
Cheers,
- Brian
Hi Xin,
Just to be clear, running on many current GPUs (such as the A100s in NCAR's Derecho) results in slower performance than CPU-based runs, due to those data transfers. Some parts of the code run ~4x faster, but overall it's a slowdown, so I'd recommend against running on those GPUs. I haven't yet tried the newer systems, like NVIDIA's Grace Hopper or later, or the newer AMD accelerators with unified memory, which avoid some of the heavy data-transfer costs, so those might fare better... but they're likely still not at the point where they're cost-effective, since a lot of code still runs on CPUs. Note that there are also issues with scaling out on GPUs (depending on how much memory each one has) due to the large state size of CESM.
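To make the "some parts run ~4x faster, but overall it's a slowdown" point concrete, here is a rough back-of-the-envelope sketch in Python. Only the ~4x kernel speedup comes from the numbers above; the ported fraction and the transfer overhead are made-up illustrative values, not measurements from CESM or Derecho, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(gpu_fraction, kernel_speedup, transfer_fraction):
    """Whole-run speedup relative to a CPU-only run (Amdahl-style estimate).

    gpu_fraction      -- share of CPU-only runtime spent in code ported to the GPU
    kernel_speedup    -- how much faster that ported code runs on the GPU
    transfer_fraction -- extra CPU<->GPU transfer time, as a share of CPU-only runtime
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0 or transfer_fraction < 0.0:
        raise ValueError("kernel_speedup must be > 0 and transfer_fraction >= 0")
    # Normalized runtime: unported code + accelerated code + added transfers.
    new_time = (1.0 - gpu_fraction) + gpu_fraction / kernel_speedup + transfer_fraction
    return 1.0 / new_time


# ASSUMED values for illustration only: 40% of the runtime is ported,
# and transfers add 35% of the original runtime back.
no_transfers = overall_speedup(0.4, 4.0, 0.0)
with_transfers = overall_speedup(0.4, 4.0, 0.35)
print(f"without transfer cost: {no_transfers:.2f}x")
print(f"with transfer cost:    {with_transfers:.2f}x")
```

With these assumed inputs the ported code alone would give a modest gain, but once the transfer time is added the result drops below 1x, i.e. slower than the CPU-only run. Tighter CPU/GPU memory coupling would correspond to a smaller `transfer_fraction` in this picture.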
If you have a GPU you're looking to use, I'd recommend running on CPUs (or, using existing datasets?), then doing some analysis on the GPU, which likely does have the potential to speed things up considerably.
As for other models, I think some parts likely don't need to run on GPUs. For example, every GPU system still has a CPU on the server, and some low-cost parts of the model, like the land portion, can likely run concurrently on those CPUs. Typically, our focus is on getting the atmosphere and ocean running on GPUs, since they dominate the cost.
Cheers,
- Brian