
Parallel Scalability of CESM

Hello,
I am testing the parallel scalability of the CESM CAM4 standalone model on the cubed-sphere grid at quarter-degree resolution on the HPC system installed at our institution. 422 nodes are available in total, but the model does not scale beyond 20 nodes. Could you please help me troubleshoot this problem?
Thanks,
Veeramanikandan
 

jedwards

CSEG and Liaisons
Staff member
I can try. What version of the model do you have? What hardware, including nodes and network? What model configuration? I expect scaling for the ne120 cubed-sphere dycore to be close to linear out to 86400 MPI tasks.
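For readers following the thread: the figure of 86400 MPI tasks matches the number of spectral elements on an ne120 cubed sphere (6 faces with 120 x 120 elements each), which is presumably why scaling is expected to hold up to that point. The short Python sketch below just does that arithmetic and shows how many elements each MPI task would own at a few task counts. The function names and the task counts in the loop are illustrative assumptions, not part of CESM.

```python
def se_elements(ne: int) -> int:
    """Spectral elements on a cubed sphere: 6 faces, each ne x ne elements."""
    return 6 * ne * ne


def elements_per_task(ne: int, ntasks: int) -> float:
    """Average number of elements owned by each MPI task."""
    return se_elements(ne) / ntasks


if __name__ == "__main__":
    ne = 120  # ne120, roughly quarter-degree resolution
    print("total elements:", se_elements(ne))
    # Task counts below are arbitrary examples, not measurements.
    for ntasks in (120, 480, 4800, 86400):
        print(ntasks, "tasks ->", elements_per_task(ne, ntasks), "elements per task")
```

With one element per task at 86400 tasks there is no finer decomposition left for the dycore, so a run that stops scaling at a few hundred tasks still has well over a hundred elements per task and is unlikely to be limited by the grid itself.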
 
Hi Jedwards,
The hardware specification is:
Basic configuration:
GPU: 2x NVIDIA K40 (12 GB, 2880 CUDA cores)
Xeon Phi: 2x Intel Xeon Phi 7120P (16 GB, 1.238 GHz, 61 cores)
CPU: 2x E5-2680 v3 2.5 GHz/12-core
RAM: 62 GB
8 CPU, 8 GPU and 4 Xeon Phi nodes have 505 GB RAM each
Total number of compute nodes: 422
CPU nodes: 238
GPU accelerated nodes: 161
Xeon Phi co-processor nodes: 23
I am using CESM1.2.0.
Thanks,
Veeramanikandan
 

jedwards

CSEG and Liaisons
Staff member
Which model configuration and which of these nodes are you using? The CPU nodes are appropriate for CESM, but there is no support for GPU nodes, and CESM has not performed well on Xeon Phi. Which nodes did you use for the scaling study? You also didn't say anything about the network.
 
Model configuration:
$CAMCFG/configure -fc_type pgi -fc mpif90 -cc mpicc -dyn se -hgrid ne120np4 -spmd -ntasks 120 -phys cam4 -chem none -nosmp -test
The CPU, MIC and GPU nodes all have the same architecture here, and I am running on all three node types. I am not using any of the GPU or MIC cards. I also tried running CESM on the CPU nodes only, and the scalability results are the same.
Regarding system and network:
HP ProLiant XL230a Gen9 and XL250a Gen9 based cluster (Intel Xeon E5-2680 v3 @ 2.5 GHz dual twelve-core CPU and dual 2880-core NVIDIA Kepler K40 GPU nodes) with InfiniBand
Rmax = 524.40 TFlops
Rpeak = 861.74 TFlops
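For anyone reproducing a scaling study like the one discussed here, it can help to report speedup and parallel efficiency rather than only saying the model "does not scale". The Python sketch below is one minimal way to compute both from measured wall-clock times. The function name is hypothetical, and the timings in the example are made-up placeholders chosen to show a plateau after 20 nodes; they are not results from this thread.

```python
def strong_scaling(timings):
    """Return (nodes, speedup, efficiency) rows relative to the smallest run.

    timings maps node count -> wall-clock seconds for the same problem size.
    """
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    rows = []
    for nodes in sorted(timings):
        speedup = base_time / timings[nodes]
        # Ideal speedup equals the ratio of node counts.
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured data.
    example = {5: 400.0, 10: 200.0, 20: 100.0, 40: 100.0}
    for nodes, speedup, efficiency in strong_scaling(example):
        print(f"{nodes:4d} nodes  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

Posting a table like this, together with the node type used for each run, makes it easier to see where efficiency falls off and whether the drop coincides with a change in hardware or network path.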
 