
case.submit mpiexec ERROR

RG5

Ir.5NA
New Member
Hello,

I was able to build a simple use case successfully, after resolving several porting issues (which I may have listed in a previous post that disappeared, I guess due to the forum maintenance): MODEL BUILD HAS FINISHED SUCCESSFULLY

Now, almost at the end, I am facing an mpiexec error when running ./case.submit.
I cannot really give more details, as the cesm.log in the run directory is almost empty.

It just says:
mpiexec: Error: unknown option "--prepend-rank"
Type 'mpiexec --help' for usage.

Here is the console output:

Finished creating component namelists
-------------------------------------------------------------------------
- Prestage required restarts into /home/CESM/projects/cesm/scratch/testrun3/run
- Case input data directory (DIN_LOC_ROOT) is /home/CESM/projects/cesm/inputdata
- Checking for required input datasets in DIN_LOC_ROOT
-------------------------------------------------------------------------
2025-07-28 17:29:43 MODEL EXECUTION BEGINS HERE
run command is mpiexec -np 4 --prepend-rank /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe >> cesm.log.$LID 2>&1
ERROR: RUN FAIL: Command 'mpiexec -np 4 --prepend-rank /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /home/CESM/projects/cesm/scratch/testrun3/run/cesm.log.250728-172942
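
From what I understand, this run command is assembled from the <mpirun> block for my machine in config_machines.xml, so that is where a flag like --prepend-rank would have been added. A minimal sketch of such a block with the flag left out (the mpilib value and argument name are illustrative assumptions, not my actual entry):

<mpirun mpilib="openmpi">
  <executable>mpiexec</executable>
  <arguments>
    <!-- task count only; no --prepend-rank, which this mpiexec rejects as unknown -->
    <arg name="ntasks">-np {{ total_tasks }}</arg>
  </arguments>
</mpirun>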

Thanks a lot!
 

RG5

Ir.5NA
New Member
After removing the unsupported --prepend-rank argument (and switching to --oversubscribe) and running the .exe file directly, I got a libpnetcdf.so.6 error:

:~/CESM/projects/scratch/testrun3$ mpiexec --oversubscribe /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
/home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe: error while loading shared libraries: libpnetcdf.so.6: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
/home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe: error while loading shared libraries: libpnetcdf.so.6: cannot open shared object file: No such file or directory

I'm running out of ideas.
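
For what it's worth, the dynamic linker can report whether the executable resolves libpnetcdf.so.6 at all (the executable path is the one from the log above):

ldd /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe | grep -i pnetcdf
# if libpnetcdf.so.6 is reported as "not found", the PnetCDF lib directory is not
# on the loader search path and needs to be added to LD_LIBRARY_PATH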
 

RG5

Ir.5NA
New Member
By forcing the Parallel NetCDF library path (export LD_LIBRARY_PATH=/home/CESM_dep/Libs/pnetcdf-1.12.3/lib:$LD_LIBRARY_PATH), I now get a different error:
~/CESM/projects/scratch/testrun3$ mpiexec --oversubscribe /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
ERROR: (cime_cpl_init) :: namelist read returns an end of file or end of record condition
#0 0x78a6f7223e59 in ???
#1 0x5b0228d0d0eb in ???
#2 0x5b0228d0d294 in ???
#3 0x5b02286f5b8d in ???
#4 0x5b02286f8346 in ???
#5 0x78a6f662a1c9 in __libc_start_call_main
at ../sysdeps/nptl/libc_start_call_main.h:58
#6 0x78a6f662a28a in __libc_start_main_impl
at ../csu/libc-start.c:360
#7 0x5b02286dcc54 in ???
#8 0xffffffffffffffff in ???
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1001.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
 

jedwards

CSEG and Liaisons
Staff member
This last error is because it cannot find a namelist. You said you are running from the command line - are you in the run directory?
That is, you should be running something like:
cd /home/CESM/projects/cesm/scratch/testrun3/run
mpiexec --oversubscribe /home/CESM/projects/cesm/scratch/testrun3/bld/cesm.exe
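
To double-check, the namelists staged by case.submit (or by running ./preview_namelists in the case directory) should already be sitting in that run directory. The exact file names depend on the CESM version and compset, but something like:

cd /home/CESM/projects/cesm/scratch/testrun3/run
ls drv_in *_in    # driver and component namelists read by the executable at startup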
 

RG5

Ir.5NA
New Member
Thanks a lot.

There were a few remaining issues that I was able to deal with by modifying some paths in the config and Macros.make files.
Running ./case.build --skip-provenance-check and ./case.submit --no-batch helped.
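
For reference, the two commands that helped, run from the case directory (the placeholder below stands for wherever create_newcase put the case; the comments are my summary of what the flags do):

cd <case directory>
./case.build --skip-provenance-check   # build without the provenance check
./case.submit --no-batch               # run the case directly instead of going through a batch system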

Now, it seems that all the outputs were created in the run directory (I guess; I don't know exactly what to expect).
I'm only facing an ERROR: No result from jobs [('case.run', None), ('case.st_archive', 'case.run or case.test')]. It seems related to archiving/copying files into archive/case/atm ...
I guess it's nothing, as long as I have the .nc files in the run directory, right?
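
In case only the archiving step failed, my understanding is that the short-term archiver can be re-run by hand from the case directory and the result compared against DOUT_S_ROOT (the commands below reflect that understanding, nothing confirmed here):

cd <case directory>
./xmlquery DOUT_S_ROOT    # where the short-term archive is supposed to end up
./case.st_archive         # move completed history/restart files out of the run directory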

I would like to thank you both (and everyone else) for dealing with issues all across the forum.
 

Attachments

  • log.txt
    5.3 KB

RG5

Ir.5NA
New Member
Actually, switching from 2.1.x to 2.2 raised the PIO build error again. Please find the log attached.

Here is my config_machines.xml entry:
<compiler COMPILER="gnu" MACH="NB">
  <!-- LINUX -->
  <CPPDEFS>
    <append>-DFORTRANUNDERSCORE -DNO_R16</append>
  </CPPDEFS>
  <FFLAGS>
    <append> -fallow-argument-mismatch -fallow-invalid-boz -fconvert=big-endian -ffree-line-length-none -ffixed-line-length-none -fallow-argument-mismatch -fallow-argument-mismatch -fallow-invalid-boz </append>
  </FFLAGS>
  <LDFLAGS>
    <append>-L/home/CESM_dep/Libs/pnetcdf-intel/lib</append>
  </LDFLAGS>
  <SFC>ifx</SFC>
  <SCC>gcc</SCC>
  <SCXX>g++</SCXX>
  <MPIFC>mpif90</MPIFC>
  <MPICC>mpicc</MPICC>
  <MPICXX>mpicxx</MPICXX>
  <CXX_LINKER>FORTRAN</CXX_LINKER>
  <SUPPORTS_CXX>TRUE</SUPPORTS_CXX>
  <NETCDF_PATH>/usr/local/</NETCDF_PATH>
  <PNETCDF_PATH>/home/CESM_dep/Libs/pnetcdf-intel</PNETCDF_PATH>
  <SLIBS>
    <append>-L/usr/local/lib/ -lnetcdff -lnetcdf -lm</append>
    <append>-L/usr/local/lib/ -llapack -L/usr/local/lib/ -lblas</append>
    <append>-L/home/CESM_dep/Libs/pnetcdf-intel/lib -lpnetcdf</append>
  </SLIBS>
</compiler>
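
For reference, the actual compile failure is usually easier to spot by grepping the PIO build log in the case bld directory (the path pattern below is a guess based on the attached log's name):

grep -iE 'error|fatal' /home/CESM/projects/cesm/scratch/testrun3/bld/pio.bldlog.*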
 

Attachments

  • piobldlog50801-094443.txt
    51.2 KB