Increasing the n blocks value in ocean code

aketh_tm@gmail_com · Nov 9, 2015

Hi all,

I have found that there is an Nblocks loop in the baroclinic routine, with an OpenMP section on top of it. I have a multicore environment that I would like to use to exploit parallelism at the Nblocks level. The problem is that the Nblocks loop never has an iteration count greater than 1. How do I increase the iteration count of this Nblocks loop? I want to make it as high as possible. Is there a way to do it?

klindsay · Nov 9, 2015

Aketh,

The sections in the POP User's Guide (http://www.cesm.ucar.edu/models/cesm1.2/pop2/doc/users/POPusers_main.html) on the compile-time options Domain and Blocks are useful.

Briefly, POP decomposes its logically rectangular horizontal grid into rectangular blocks. The number of blocks is NTASKS_OCN*NTHRDS_OCN. (It might differ slightly if this product doesn't divide evenly into the grid size.) Both are settable variables in env_mach_pes.xml. There are NTASKS_OCN MPI tasks. Each MPI task runs on NTHRDS_OCN threads, with 1 block per thread.

To increase the number of blocks per task, increase NTHRDS_OCN, but not to a value larger than the number of cores that share memory. Note that if you do this and don't change NTASKS_OCN, the resulting blocks will be smaller, which increases the total block edge size. This in turn increases communication cost. It also impacts the performance of the global sums in the barotropic solver (we are changing the barotropic solver in CESM2 to avoid this). So in general, you don't want to use blocks that are very small. I generally do not use blocks smaller than 16x16. The block size below which performance degrades depends on the machine you are running on.

Keith
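To make the trade-off between block count and block edge size concrete, here is a rough back-of-the-envelope sketch in Python. It is not POP code and does not reproduce POP's actual decomposition logic: the function `block_stats`, the 320x384 grid, and the idea of splitting the grid into a simple `nblocks_x` by `nblocks_y` Cartesian layout are all assumptions made for illustration. It also ignores ghost cells and land-block elimination.

```python
import math


def block_stats(nx, ny, nblocks_x, nblocks_y):
    """Estimate block size and total block perimeter for a simple split.

    Hypothetical helper for illustration only; POP chooses its own
    block layout from its compile-time Domain and Blocks options.
    """
    # Round up so that the blocks cover the whole grid when the
    # block count does not divide the grid size evenly.
    block_nx = math.ceil(nx / nblocks_x)
    block_ny = math.ceil(ny / nblocks_y)
    nblocks = nblocks_x * nblocks_y
    # Summed perimeter of all blocks, in grid cells. This is a crude
    # proxy for the amount of halo data that has to be exchanged.
    total_edge = nblocks * 2 * (block_nx + block_ny)
    return block_nx, block_ny, nblocks, total_edge


# Assumed example grid size; substitute your own.
NX, NY = 320, 384

for nbx, nby in [(4, 4), (8, 4), (8, 8), (20, 24)]:
    bx, by, n, edge = block_stats(NX, NY, nbx, nby)
    print(f"{n:4d} blocks of {bx}x{by}: total edge = {edge} cells")
```

In this simplified model, every increase in the number of blocks (for example by raising NTHRDS_OCN with NTASKS_OCN fixed) shrinks each block and raises the summed perimeter, which is the effect described above. The last layout corresponds to 16x16 blocks on the assumed grid.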

