otieno_1@osu_edu
New Member
I need to run CCSM3 on bluevista as efficiently as possible. I built the code using the default values in the setup/run scripts, as suggested in the user manual, but the code appears to run 42% slower. I would like to keep the configuration as close to standard as possible.
The steps in the tutorial produce
#BSUB -n 104 and
setenv LSB_PJL_TASK_GEOMETRY "{(0,1,2,3,4,5,6,7)(8,9,10,11,12,13,14,15)(16,17,18,19,20,21,22,23)(24,25,26,27)(28,29,30,31,32,33,34,35)(36,37,38,39,40,41,42,43)(44,45,46,47,48,49,50,51)(52,53,54,55,56,57,58,59)(60,61,62,63,64,65,66,67)(68,69)(70,71)(72,73)(74,75)}"
My limited understanding is that this requests 104 processors on 13 nodes. What I am wondering is whether the last four parenthesized groups, (68,69)(70,71)(72,73)(74,75), mean that those four nodes have only two tasks assigned each. How can that be when bluevista has 8 processors on each node?
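As a sanity check on that reading, here is a minimal Python sketch that counts the groups and tasks in the geometry string. It assumes each parenthesized group is placed on one node, which is my reading of the setting rather than something the tutorial states; the function name parse_task_geometry is made up for this illustration. It says nothing about how the remaining processors in the smaller groups are used.

```python
import re


def parse_task_geometry(geometry):
    """Return one list of task IDs per parenthesized group in the string."""
    # Each "(...)" is treated as one group (assumed here: one group per node).
    groups = re.findall(r"\(([^()]*)\)", geometry)
    return [[int(t) for t in g.split(",") if t.strip()] for g in groups]


# The value produced by the tutorial steps, joined into a single string.
GEOMETRY = (
    "{(0,1,2,3,4,5,6,7)(8,9,10,11,12,13,14,15)(16,17,18,19,20,21,22,23)"
    "(24,25,26,27)(28,29,30,31,32,33,34,35)(36,37,38,39,40,41,42,43)"
    "(44,45,46,47,48,49,50,51)(52,53,54,55,56,57,58,59)(60,61,62,63,64,65,66,67)"
    "(68,69)(70,71)(72,73)(74,75)}"
)

PROCESSORS_PER_NODE = 8  # bluevista, as stated in the question

groups = parse_task_geometry(GEOMETRY)
sizes = [len(g) for g in groups]
n_groups = len(groups)
n_tasks = sum(sizes)

print("groups (assumed nodes):", n_groups)
print("tasks per group:", sizes)
print("total tasks:", n_tasks)
print("processors if every group takes a whole node:", n_groups * PROCESSORS_PER_NODE)
```

For the geometry above this gives 13 groups holding 76 tasks in total, so 13 x 8 = 104 matches #BSUB -n 104 only if every group occupies a whole node, including the groups with 4 or 2 tasks.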
Second, the suggestions for increasing throughput and reducing costs described here (http://www.cisl.ucar.edu/docs/bluevista/ccsm.html) refer to ccsm3_1_beta39 tags, yet the latest code at ESG is beta14.
Finally, where can I find information to help interpret the output of getTiming.csh, so that I can tell when a change to the task geometry has improved performance?