Error when running CESM2.1.0

zongrax

New Member
Dear all, I'm trying to run a case on a cluster. The case was set up and built without error, but when I submit it, I get the following errors:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Invalid PIO rearranger comm max pend req (comp2io), 0
Resetting PIO rearranger comm max pend req (comp2io) to 64
PIO rearranger options:
comm type =p2p
comm fcd =2denable
max pend req (comp2io) = 0
enable_hs (comp2io) = T
enable_isend (comp2io) = F
max pend req (io2comp) = 64
enable_hs (io2comp) = F
enable_isend (io2comp) = T
(seq_comm_setcomm) init ID ( 1 GLOBAL ) pelist = 0 0 1 ( npes = 1) ( nthreads = 1)( suffix =)
Fatal error in PMPI_Group_range_incl: Invalid argument, error stack:
PMPI_Group_range_incl(195)........: MPI_Group_range_incl(group=0x88000000, n=1, ranges=0x7ffd7ef7dc54, new_group=0x7ffd7ef7d774) failed
MPIR_Group_check_valid_ranges(326): The 0th element of a range array ends at 191 but must be nonnegative and less than 1

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 49364 RUNNING AT tms05
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

I sincerely thank you for your help!

Best,
Dong
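
For readers hitting the same message: the check that fails here compares each (first, last, stride) range passed to MPI_Group_range_incl against the size of the parent group. "Ends at 191 but must be nonnegative and less than 1" suggests the case expected ranks 0 to 191 (192 tasks) while the process saw a group of size 1, which matches the "npes = 1" on the GLOBAL line. A minimal Python sketch of that check follows; the function name `check_range` is hypothetical, and the code only mimics the validation rather than reproducing MPICH's implementation.

```python
def check_range(first, last, stride, group_size):
    """Mimic the rank-range validation: both endpoints must lie in [0, group_size)."""
    if stride == 0:
        return "stride must be nonzero"
    for label, rank in (("starts", first), ("ends", last)):
        if rank < 0 or rank >= group_size:
            return (f"range {label} at {rank} but must be nonnegative "
                    f"and less than {group_size}")
    return "ok"


# Group of size 1 (as in the log) versus the 192 ranks the case asked for.
print(check_range(0, 191, 1, 1))
print(check_range(0, 191, 1, 192))
```

If the first call fails and the second passes, the launcher is likely starting fewer MPI processes than the case layout requests.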
 

fischer

CSEG and Liaisons
Staff member
Hi Dong,

I need some additional information to be able to help you. What compset and resolution are you using? What processor layout are you using
in env_mach_pes.xml, and how many processors does your machine have on each node? And lastly, can you run "./xmlquery --listall | grep PIO" in
your case directory and attach that info?

Thanks
Chris
 

zongrax

New Member
Hi, Chris,

Thanks for your answer. The compset and resolution of the case are I1850Clm50SpCru and f09_g17, respectively. The machine has a total of four 12-core CPUs with hyperthreading support. Sorry, I don't know what the processor layout in env_mach_pes.xml means, so I have attached the file. Lastly, after running "./xmlquery --listall | grep PIO" in my case directory, I got this output:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PIO_CONFIG_OPTS:
PIO_VERSION: 1
PIO_SPEC_FILE: /home/dongwz/cesm/cime/config/cesm/machines/config_pio.xml
PIO_ASYNC_INTERFACE: FALSE
PIO_BLOCKSIZE: -1
PIO_BUFFER_SIZE_LIMIT: -1
PIO_DEBUG_LEVEL: 0
PIO_NETCDF_FORMAT: ['CPL:64bit_offset', 'ATM:64bit_offset', 'LND:64bit_offset', 'ICE:64bit_offset', 'OCN:64bit_offset', 'ROF:64bit_offset', 'GLC:64bit_offset', 'WAV:64bit_offset', 'ESP:64bit_offset']
PIO_NUMTASKS: ['CPL:-99', 'ATM:-99', 'LND:-99', 'ICE:-99', 'OCN:-99', 'ROF:-99', 'GLC:-99', 'WAV:-99', 'ESP:-99']
PIO_REARRANGER: ['CPL:1', 'ATM:1', 'LND:1', 'ICE:1', 'OCN:1', 'ROF:1', 'GLC:1', 'WAV:1', 'ESP:1']
PIO_REARR_COMM_ENABLE_HS_COMP2IO: TRUE
PIO_REARR_COMM_ENABLE_HS_IO2COMP: FALSE
PIO_REARR_COMM_ENABLE_ISEND_COMP2IO: FALSE
PIO_REARR_COMM_ENABLE_ISEND_IO2COMP: TRUE
PIO_REARR_COMM_FCD: 2denable
PIO_REARR_COMM_MAX_PEND_REQ_COMP2IO: 0
PIO_REARR_COMM_MAX_PEND_REQ_IO2COMP: 64
PIO_REARR_COMM_TYPE: p2p
PIO_ROOT: ['CPL:1', 'ATM:1', 'LND:1', 'ICE:1', 'OCN:1', 'ROF:1', 'GLC:1', 'WAV:1', 'ESP:1']
PIO_STRIDE: ['CPL:48', 'ATM:48', 'LND:48', 'ICE:48', 'OCN:48', 'ROF:48', 'GLC:48', 'WAV:48', 'ESP:48']
PIO_TYPENAME: ['CPL:pnetcdf', 'ATM:pnetcdf', 'LND:pnetcdf', 'ICE:pnetcdf', 'OCN:pnetcdf', 'ROF:pnetcdf', 'GLC:pnetcdf', 'WAV:pnetcdf', 'ESP:pnetcdf']

Many thanks.
Best,
Dong
 

Attachments

  • env_mach_pes.txt
    6.9 KB · Views: 10

fischer

CSEG and Liaisons
Staff member
Hi Dong,

Everything looks fine, and I haven't been able to reproduce the error you're getting. Have you tried running a different compset, or changing the number of NTASKS?

Thanks
Chris
 

zongrax

New Member
Hi Chris,

Thanks very much for your response. I have solved the problem: I changed the max tasks and MPI tasks per node from 48/48 to 6/6, and the case ran successfully. I sincerely thank you for your help!

Best,
Dong
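
One possible explanation for why this change helped, offered as an assumption because the contents of the attached env_mach_pes.xml are not shown in the thread: CIME reads a negative NTASKS value as a number of nodes and multiplies it by the MPI tasks per node. If the layout used NTASKS=-4, then 48 tasks per node would request 192 tasks (consistent with the last rank 191 in the error), while 6 per node would request 24. The helper `resolve_ntasks` below is a hypothetical illustration of that arithmetic, not CIME's own code.

```python
def resolve_ntasks(ntasks, max_mpitasks_per_node):
    """Return the MPI task count; a negative ntasks is read as a number of nodes."""
    if ntasks < 0:
        return -ntasks * max_mpitasks_per_node
    return ntasks


# NTASKS=-4 is an assumed value; the real layout is in the attached file.
for per_node in (48, 6):
    print(per_node, resolve_ntasks(-4, per_node))
```

Under that assumption, lowering the per-node value shrinks the total task request to something the machine can actually launch.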
 

dayon

Member
Hi Dong,
I have run into the same problem. Would you be willing to share your settings? How do I change the max tasks and MPI tasks per node from 48/48 to 6/6? Thanks for your time and attention.
 

zongrax

New Member
Hi Dayon,
Really sorry for the late reply; I haven't logged in for a long time. You can set both the max tasks per node (MAX_TASKS_PER_NODE) and the MPI tasks per node (MAX_MPITASKS_PER_NODE) in config_machines.xml.
 