
about ntasks=6 in CAM4.0

Dear eaton,
These days I have been confused by the -ntasks option. I remember that last month -ntasks 6 worked, but now it does not; I must set -ntasks 4 or 2 instead. Is there something wrong with my MPI?
What's more, my computer is an HP PC, and I see the same problem on a Linux cluster.
The message from the output says:
Code:
...
Domain Information

  Horizontal domain: nx =    144
                     ny =     96
  No. of categories: nc =      1
  No. of ice layers: ni =      4
  No. of snow layers:ns =      1
  Processors:  total    =      1
  Processor shape:       square-pop
  Distribution type:      cartesian
  Distribution weight:     latitude
  max_blocks =                 1
  Number of ghost cells:       1

CalcWorkPerBlock: Total blocks:     6 Ice blocks:     6 IceFree blocks:     0 Land blocks:     0
  Processors (X x Y) =    1 x    1
 Active processors:             1
(shr_sys_abort) ERROR: ice: no. blocks exceed max: increase max to 6
(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
p0_28551: p4_error: : -1
p0_28545: p4_error: : -1
p0_28546: p4_error: : -1
p0_28549: p4_error: : -1
p0_28544: p4_error: : -1
p0_28548: p4_error: : -1
p0_28547: p4_error: : -1
p0_31767: p4_error: : -1
p0_31770: p4_error: : -1
p0_31766: p4_error: : -1
p0_31768: p4_error: : -1
p0_31771: p4_error: : -1
p0_31772: p4_error: : -1
p0_31769: p4_error: : -1
p0_27379: p4_error: : -1
p0_27381: p4_error: : -1
p0_27383: p4_error: : -1
p0_27378: p4_error: : -1
p0_27380: p4_error: : -1
p0_27384: p4_error: : -1
p0_18051: p4_error: : -1
p0_18056: p4_error: : -1
p0_18057: p4_error: : -1
p0_18052: p4_error: : -1
p0_18054: p4_error: : -1
p0_18055: p4_error: : -1
p0_18053: p4_error: : -1
p0_18058: p4_error: : -1
p0_28550: p4_error: : -1
p0_31773: p4_error: : -1
p0_27382: p4_error: : -1
p0_27385: p4_error: : -1
cleanup

eaton

CSEG and Liaisons
This configuration works for me. Here is the section from my logfile that matches the section you posted:


Code:
Domain Information

  Horizontal domain: nx =    144
                     ny =     96
  No. of categories: nc =      1
  No. of ice layers: ni =      4
  No. of snow layers:ns =      1
  Processors:  total    =      6
  Processor shape:       square-pop
  Distribution type:      cartesian
  Distribution weight:     latitude
  max_blocks =                 1
  Number of ghost cells:       1

CalcWorkPerBlock: Total blocks:     6 Ice blocks:     6 IceFree blocks:     0 Land blocks:     0
  Processors (X x Y) =    6 x    1
 Active processors:             6
The difference seems to be that your output shows only 1 active processor, not 6. So it appears that the MPI job launcher did not start the job with 6 tasks.
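That also explains the abort message itself. Each MPI task can hold at most max_blocks of the ice decomposition's blocks, so when all 6 ice blocks land on a single task, max_blocks = 1 is too small and CICE asks you to "increase max to 6". Here is a minimal sketch of that arithmetic (my own illustration, not CICE's actual code):

```python
import math

def required_max_blocks(total_blocks: int, nprocs: int) -> int:
    """Smallest max_blocks that lets nprocs tasks hold total_blocks blocks,
    assuming the blocks are spread as evenly as possible across the tasks."""
    return math.ceil(total_blocks / nprocs)

# Your failing run: 6 ice blocks, but only 1 active processor.
print(required_max_blocks(6, 1))  # -> 6, matching "increase max to 6"

# The intended run: 6 blocks across 6 tasks fits within max_blocks = 1.
print(required_max_blocks(6, 6))  # -> 1
```

So the real fix is not to raise max_blocks but to get the launcher to actually start 6 tasks.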

The very top of the logfile contains information about how the components are assigned to tasks. You should see something like this for a 6-task run:


Code:
(seq_comm_setcomm)  initialize ID (  7 GLOBAL ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
(seq_comm_setcomm)  initialize ID (  2   ATM  ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
(seq_comm_setcomm)  initialize ID (  1   LND  ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
(seq_comm_setcomm)  initialize ID (  4   ICE  ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
(seq_comm_setcomm)  initialize ID (  5   GLC  ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
(seq_comm_setcomm)  initialize ID (  3   OCN  ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
(seq_comm_setcomm)  initialize ID (  6   CPL  ) pelist   =     0     5     1 ( npes =     6) ( nthreads =  1)
This output shows that all components are running on all 6 tasks (npes=6 means the same thing as 6 tasks).