Hello,
I have done a lot of searching and couldn't find an answer, so I decided to post the issue I am facing here.
I ported CESM 2.1.4 to Ubuntu 22.04.3 LTS.
When I set
<MAX_TASKS_PER_NODE>1</MAX_TASKS_PER_NODE>
<MAX_MPITASKS_PER_NODE>1</MAX_MPITASKS_PER_NODE>
<PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
<mpirun mpilib="default">
  <executable>mpiexec</executable>
  <arguments>
    <arg name="ntasks"> -np 1 </arg>
  </arguments>
</mpirun>
and then run
create_newcase --case /home/jguo/projects/cesm/scratch/testrun --compset QPC4 --res f45_f45_mg37 --run-unsupported
cd testrun
./xmlchange STOP_OPTION=ndays,STOP_N=3
./case.setup
./case.build --clean-all
./case.build
./case.submit
everything works fine.
However, once I set the two max-tasks keys above (MAX_TASKS_PER_NODE and MAX_MPITASKS_PER_NODE) to 2, I get the error below. The full log is attached as QPC4.txt, and the cesm.log is attached as cesm.log.230917-110446.txt.
Invalid PIO rearranger comm max pend req (comp2io), 0
Resetting PIO rearranger comm max pend req (comp2io) to 64
PIO rearranger options:
comm type =p2p
comm fcd =2denable
max pend req (comp2io) = 0
enable_hs (comp2io) = T
enable_isend (comp2io) = F
max pend req (io2comp) = 64
enable_hs (io2comp) = F
enable_isend (io2comp) = T
(seq_comm_setcomm) init ID ( 1 GLOBAL ) pelist = 0 0 1 ( npes = 1) ( nthreads = 1)( suffix =)
[Johnny:818539] *** An error occurred in MPI_Group_range_incl
[Johnny:818539] *** reported by process [4086169601,0]
[Johnny:818539] *** on communicator MPI_COMM_WORLD
[Johnny:818539] *** MPI_ERR_RANK: invalid rank
[Johnny:818539] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[Johnny:818539] *** and potentially your MPI job)
Also, when I run with --compset I2000Clm50SpGs --res f09_g17, I get a similar error even with -np 1. The logs are in I2000Clm50SpGs.txt and cesm.log.230917-111344.txt.
I have also attached my pnetcdf-config and nc-config --all output, as well as config_machines.xml and config_compilers.xml.
When I compiled HDF5, I did enable parallel support:
CC=mpicc CFLAGS=-w ./configure --prefix=/home/jguo/CESM/Library --with-zlib --enable-hl --enable-fortran --enable-parallel
I have been stuck for a few days. Any help is greatly appreciated!
Attachments
- cesm.log.230917-110446.txt (1 KB)
- QPC4.txt (9.8 KB)
- I2000Clm50SpGs.txt (9.4 KB)
- cesm.log.230917-111344.txt (1 KB)
- config_compilers.xml.txt (3.5 KB)
- config_machines.xml.txt (5.3 KB)
- nc-config--all.txt (1.2 KB)
- pnetcdf-config-dump.txt (1.6 KB)