
Increasing the n blocks value in ocean code

Hi all, I have found that there is an Nblocks loop in the baroclinic routine, with an OpenMP section on top of it. I have a multicore environment that I want to use to exploit parallelism at the Nblocks level. The problem is that the Nblocks loop never has more than 1 iteration. How do I increase the iteration count of this Nblocks loop? I want to make it as high as possible. Is there a way to do it?
 

klindsay

CSEG and Liaisons
Staff member
Aketh,

The sections on the compile-time options Domain and Blocks in the POP User's Guide (http://www.cesm.ucar.edu/models/cesm1.2/pop2/doc/users/POPusers_main.html) are useful. Briefly, POP decomposes its logically rectangular horizontal grid into rectangular blocks. The number of blocks is NTASKS_OCN*NTHRDS_OCN. (It might differ slightly if this product doesn't divide evenly into the grid size.) Both variables are settable in env_mach_pes.xml. There are NTASKS_OCN MPI tasks. Each MPI task runs on NTHRDS_OCN threads, with 1 block per thread.

To increase the number of blocks per task, increase NTHRDS_OCN, but not to a value larger than the number of cores that share memory. Note that if you do this and don't change NTASKS_OCN, the resulting blocks will be smaller, which increases the total block edge size (the combined perimeter of all blocks). This in turn increases communication cost. It also impacts the performance of the global sums in the barotropic solver (we are changing the barotropic solver in CESM2 to avoid this). So in general, you don't want to use blocks that are very small. I generally do not use blocks smaller than 16x16. The block size below which performance degrades depends on the machine you are running on.

Keith
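To make the trade-off above concrete, here is a rough Python sketch of how block size and total block edge change with NTHRDS_OCN. It is only an illustration under stated assumptions: the function name block_layout, the near-square factorization, and the 320 x 384 grid used in the example are hypothetical choices for this sketch, not POP's actual decomposition code.

```python
import math


def block_layout(nx, ny, ntasks_ocn, nthrds_ocn):
    """Estimate block count, block size and total block edge for an nx x ny grid.

    Assumption: the number of blocks is exactly ntasks_ocn * nthrds_ocn
    (1 block per thread) and the blocks form an nbx x nby lattice.
    This is NOT POP's own decomposition algorithm.
    """
    nblocks = ntasks_ocn * nthrds_ocn
    best = None
    # Try every factorization nbx * nby == nblocks and keep the one
    # with the smallest total perimeter (the most square-like blocks).
    for nbx in range(1, nblocks + 1):
        if nblocks % nbx != 0:
            continue
        nby = nblocks // nbx
        bx = math.ceil(nx / nbx)  # block width in grid points
        by = math.ceil(ny / nby)  # block height in grid points
        total_edge = nblocks * 2 * (bx + by)
        if best is None or total_edge < best["total_edge"]:
            best = {
                "nblocks": nblocks,
                "blocks_per_task": nthrds_ocn,
                "block_size": (bx, by),
                "total_edge": total_edge,
            }
    return best


if __name__ == "__main__":
    # Hypothetical example: 320 x 384 grid, 64 MPI tasks, varying thread count.
    for nthrds in (1, 2, 4, 8):
        print(nthrds, block_layout(320, 384, 64, nthrds))
```

In this simplified model, with a 320 x 384 grid and 64 tasks, going from 1 to 4 threads per task shrinks the blocks from 40x48 to 20x24 and doubles the total block edge from 11264 to 22528 grid points, which is the growth in communication cost described above.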
 
