aketh_tm@gmail_com
Member
Hi all, I have been running both the threaded version of POP (4 MPI tasks with 16 OpenMP threads each) and the non-threaded version (4 MPI tasks, one per node), and I noticed something strange. The resolution is 1 degree (384 * 320).
In the latter case (just 4 tasks, one per node), the data for a given node is held in a single block of 160 * 192 arrays, for a total of 160 * 192 = 30720 elements. When the same domain is decomposed between 16 threads on that node, each block is 80 * 48 and there are 16 blocks, for a total of 80 * 48 * 16 = 61440 elements.
That is, the data per node with OpenMP switched on is double that of the regular decomposition. I would like to understand how and why the domain is decomposed this way, which module in the ocean model sets the data per block, and how the arrays holding the initial tracer values are populated when the decomposition is performed.
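For reference, the element counts quoted above can be reproduced with a short Python sketch. The function and variable names are my own for illustration and are not taken from the POP source, and the script only multiplies out the block sizes as I observed them; it says nothing about how POP arrives at those sizes.

```python
# Element counts per node, using the block sizes reported in this post.
# All names are illustrative; none of them come from the POP source.

def elements_per_node(block_nx, block_ny, blocks_per_node):
    """Total number of array elements held on one node."""
    return block_nx * block_ny * blocks_per_node

# Non-threaded run: one 160 x 192 block per node.
mpi_only = elements_per_node(160, 192, 1)

# Threaded run: sixteen 80 x 48 blocks per node, as observed.
threaded = elements_per_node(80, 48, 16)

print(mpi_only, threaded, threaded / mpi_only)  # 30720 61440 2.0
```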
Thanks in advance.