case.submit mpiexec ERROR

RG5

Ir.5NA
New Member
Hello,

I was able to build a simple use case successfully after resolving several porting issues (which I may have listed in a previous post that disappeared, I guess due to a maintenance process): MODEL BUILD HAS FINISHED SUCCESSFULLY

Now, almost at the end, I hit an mpiexec error when running ./case.submit.
I cannot really give more details, as the cesm.log in the run directory is almost empty.

It just says:
mpiexec: Error: unknown option "--prepend-rank"
Type 'mpiexec --help' for usage.

Here is the console output:

Finished creating component namelists
-------------------------------------------------------------------------
- Prestage required restarts into /home/CESM/projects/cesm/scratch/testrun3/run
- Case input data directory (DIN_LOC_ROOT) is /home/CESM/projects/cesm/inputdata
- Checking for required input datasets in DIN_LOC_ROOT
-------------------------------------------------------------------------
2025-07-28 17:29:43 MODEL EXECUTION BEGINS HERE
run command is mpiexec -np 4 --prepend-rank /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe >> cesm.log.$LID 2>&1
ERROR: RUN FAIL: Command 'mpiexec -np 4 --prepend-rank /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /home/CESM/projects/cesm/scratch/testrun3/run/cesm.log.250728-172942

Thanks a lot !
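
The "unknown option" message suggests that the mpiexec found on PATH does not understand --prepend-rank. As far as I know, -prepend-rank is an MPICH/Hydra launcher option, while Open MPI offers --tag-output for a similar purpose. A minimal sketch for checking which launcher is installed follows; the helper name classify_mpi is hypothetical, and the matched strings are assumptions about typical --version banners:

```shell
# Classify an MPI launcher from its version banner (read on stdin).
# The patterns are assumptions about typical banners, not an exhaustive list.
classify_mpi() {
    banner=$(cat)
    case "$banner" in
        *"Open MPI"*|*OpenRTE*) echo "openmpi" ;;
        *HYDRA*|*MPICH*)        echo "mpich" ;;
        *)                      echo "unknown" ;;
    esac
}

# Only query mpiexec when it is actually on PATH.
if command -v mpiexec >/dev/null 2>&1; then
    mpiexec --version 2>&1 | classify_mpi
fi
```

If this reports openmpi, the launcher arguments that CIME passes should come from the mpirun block for the machine in config_machines.xml, which is probably where --prepend-rank needs to be removed or replaced.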
 

RG5

Ir.5NA
New Member
After dropping the unrecognized --prepend-rank argument (and switching to --oversubscribe) and running the .exe file directly, I got a libpnetcdf.so.6 error:

:~/CESM/projects/scratch/testrun3$ mpiexec --oversubscribe /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
/home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe: error while loading shared libraries: libpnetcdf.so.6: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
/home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe: error while loading shared libraries: libpnetcdf.so.6: cannot open shared object file: No such file or directory

I'm running out of ideas.
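
The loader message means that libpnetcdf.so.6 is not on the runtime library search path, even though the link step succeeded. A minimal sketch for listing every unresolved shared library of the executable is shown here; the helper name missing_libs is hypothetical, and ldd is assumed to be available on the system:

```shell
# Print the names of shared libraries that the dynamic loader cannot resolve
# for a given binary (empty output means everything was found).
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

EXE=/home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
# Guarded so the sketch does nothing when the executable is absent.
if [ -x "$EXE" ]; then
    missing_libs "$EXE"
fi
```

Any library printed by this check usually needs its directory added to LD_LIBRARY_PATH (or a matching rpath at link time); which directory that is depends on the local installation.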
 

RG5

Ir.5NA
New Member
After forcing the PnetCDF library path with export LD_LIBRARY_PATH=/home/CESM_dep/Libs/pnetcdf-1.12.3/lib:$LD_LIBRARY_PATH, I get past the loader error but hit a new one:
~/CESM/projects/scratch/testrun3$ mpiexec --oversubscribe /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
ERROR: (cime_cpl_init) :: namelist read returns an end of file or end of record condition
#0 0x78a6f7223e59 in ???
#1 0x5b0228d0d0eb in ???
#2 0x5b0228d0d294 in ???
#3 0x5b02286f5b8d in ???
#4 0x5b02286f8346 in ???
#5 0x78a6f662a1c9 in __libc_start_call_main
at ../sysdeps/nptl/libc_start_call_main.h:58
#6 0x78a6f662a28a in __libc_start_main_impl
at ../csu/libc-start.c:360
#7 0x5b02286dcc54 in ???
#8 0xffffffffffffffff in ???
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1001.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
 

jedwards

CSEG and Liaisons
Staff member
This last error occurs because it cannot find a namelist. You said you are running from the command line; are you in the run directory?
That is, you should be running something like:
cd /home/CESM/projects/cesm/scratch/testrun3/run
mpiexec --oversubscribe /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
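
A slightly more defensive version of the same idea is sketched below. It assumes that drv_in is the driver namelist the coupler reads at startup; the helper name has_namelists is hypothetical:

```shell
RUNDIR=/home/CESM/projects/cesm/scratch/testrun3/run
EXE=/home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe

# Succeeds only if the directory exists and contains the driver namelist.
# drv_in is assumed to be the file the coupler reads at startup.
has_namelists() {
    [ -d "$1" ] && [ -f "$1/drv_in" ]
}

if has_namelists "$RUNDIR"; then
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$RUNDIR" && mpiexec --oversubscribe "$EXE")
else
    echo "no drv_in in $RUNDIR; check the run directory path" >&2
fi
```

Launching from any other directory would make the executable look for its namelists in the wrong place, which matches the end-of-file error above.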
 

RG5

Ir.5NA
New Member
Thanks a lot.

I was able to get past that by modifying some paths, config files, and Macros.make files.
Running ./case.build --skip-provenance-check and ./case.submit --no-batch helped.

Now it seems that all the outputs were created in the run directory (I guess; I don't know what to expect).
The only thing I'm still facing is ERROR: No result from jobs [('case.run', None), ('case.st_archive', 'case.run or case.test')]. It seems related to archiving/copying files into archive/case/atm ...
I guess it's harmless as long as I have the .nc files in the run directory, right?

I would like to thank you both (and the rest) for dealing with issues all along the forum.
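
One way to check the situation described above is to count the netCDF files left in the run directory and to look at the short-term archive settings of the case. This is only a sketch: count_nc is a hypothetical helper, and DOUT_S and DOUT_S_ROOT are assumed to be the case variables that control short-term archiving:

```shell
# Count netCDF files directly inside a directory (non-recursive).
count_nc() {
    find "$1" -maxdepth 1 -type f -name '*.nc' | wc -l | tr -d ' '
}

RUNDIR=/home/CESM/projects/cesm/scratch/testrun3/run
if [ -d "$RUNDIR" ]; then
    echo "netCDF files in run directory: $(count_nc "$RUNDIR")"
fi

# From the case directory, show the short-term archive settings.
# Guarded so the sketch does nothing outside a case directory.
if [ -x ./xmlquery ]; then
    ./xmlquery DOUT_S,DOUT_S_ROOT
fi
```

Whether the st_archive message can be ignored is not something this check decides; it only shows whether the model wrote output and where the archive step was supposed to copy it.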
 


RG5

Ir.5NA
New Member
Actually, switching from CESM 2.1.x to 2.2 brought the pio error back. Please find the log attached.

Here is the config_machines.xml:
<compiler COMPILER="gnu" MACH="NB">
  <!-- LINUX -->
  <CPPDEFS>
    <append>-DFORTRANUNDERSCORE -DNO_R16</append>
  </CPPDEFS>
  <FFLAGS>
    <append> -fallow-argument-mismatch -fallow-invalid-boz -fconvert=big-endian -ffree-line-length-none -ffixed-line-length-none -fallow-argument-mismatch -fallow-argument-mismatch -fallow-invalid-boz </append>
  </FFLAGS>
  <LDFLAGS>
    <append>-L/home/CESM_dep/Libs/pnetcdf-intel/lib</append>
  </LDFLAGS>
  <SFC>ifx</SFC>
  <SCC>gcc</SCC>
  <SCXX>g++</SCXX>
  <MPIFC>mpif90</MPIFC>
  <MPICC>mpicc</MPICC>
  <MPICXX>mpicxx</MPICXX>
  <CXX_LINKER>FORTRAN</CXX_LINKER>
  <SUPPORTS_CXX>TRUE</SUPPORTS_CXX>
  <NETCDF_PATH>/usr/local/</NETCDF_PATH>
  <PNETCDF_PATH>/home/CESM_dep/Libs/pnetcdf-intel</PNETCDF_PATH>
  <SLIBS>
    <append>-L/usr/local/lib/ -lnetcdff -lnetcdf -lm</append>
    <append>-L/usr/local/lib/ -llapack -L/usr/local/lib/ -lblas</append>
    <append>-L/home/CESM_dep/Libs/pnetcdf-intel/lib -lpnetcdf</append>
  </SLIBS>
</compiler>
 


RG5

Ir.5NA
New Member
I think I have a conflict: the CLM-DART ./quickbuild was done with mpiifx, but the ./CLM5_setup_assimilation of CESM 2.2.x somehow needs mpif90.
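
To see whether two MPI compiler wrappers really resolve to different compilers, their underlying command lines can be printed side by side. This is a sketch under assumptions: wrapper_backend is a hypothetical helper, -show is taken to be the MPICH / Intel MPI spelling, and --showme the Open MPI one:

```shell
# Print the command line an MPI compiler wrapper would run, or "not found".
# -show is the MPICH / Intel MPI spelling and --showme the Open MPI one;
# both are tried because the flavor is not known in advance.
wrapper_backend() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "not found"
        return 0
    fi
    "$1" -show 2>/dev/null || "$1" --showme 2>/dev/null || echo "unknown"
}

for w in mpif90 mpiifx; do
    echo "$w: $(wrapper_backend "$w")"
done
```

If the two lines name different Fortran compilers or different MPI installations, objects and libraries built with one may not link or run cleanly with the other.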
 