
Error while case.submit: mpirun command not found

changmao

Yufei Wang
New Member
Dear CESM forum,

I got the error "mpirun: command not found" when running ./case.submit (res f19_g17, compset B1850). The existing related posts seemed to be of little help to me.

Details are as follows:
------------------------------------------------------------------------
- Prestage required restarts into /data1/elzd_2023_00031/cesm/scratch/b.day2.1/run
- Case input data directory (DIN_LOC_ROOT) is /data1/elzd_2023_00031/cesm/inputdata
- Checking for required input datasets in DIN_LOC_ROOT
-------------------------------------------------------------------------
run command is mpirun -n 384 /data1/elzd_2023_00031/cesm/scratch/b.day2.1/bld/cesm.exe >> cesm.log.$LID 2>&1
Exception from case_run: ERROR: RUN FAIL: Command 'mpirun -n 384 /data1/elzd_2023_00031/cesm/scratch/b.day2.1/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /data1/elzd_2023_00031/cesm/scratch/b.day2.1/run/cesm.log.231128-115340
Submit job case.st_archive
Starting job script case.st_archive
st_archive starting
moving /data1/elzd_2023_00031/cesm/scratch/b.day2.1/run/cesm.log.231128-115340 to /data1/elzd_2023_00031/cesm/scratch/archive/b.day2.1/logs/cesm.log.231128-115340
Cannot find a b.day2.1.cpl*.r.*.nc file in directory /data1/elzd_2023_00031/cesm/scratch/b.day2.1/run
Archiving history files for cam (atm)
Archiving history files for clm (lnd)
Archiving history files for cice (ice)
Archiving history files for pop (ocn)
Archiving history files for mosart (rof)
Archiving history files for cism (glc)
Archiving history files for ww3 (wav)
Archiving history files for drv (cpl)
Archiving history files for dart (esp)
st_archive completed
Submitted job case.run with id None
Submitted job case.st_archive with id None

[elzd_2023_00031@login01 b.day2.1]$ cat /data1/elzd_2023_00031/cesm/scratch/archive/b.day2.1/logs/cesm.log.23112
/bin/sh: mpirun: command not found

The log also mentioned that a .cpl*.r.*.nc file was missing, but I cannot understand what this specifically refers to. The input-data check had already completed when building the case.

Running fhello_world_mpi.F90 as a standalone MPI test gave the following result:
$ vi /data1/elzd_2023_00031/fhello_world_mpi.F90
$ mpif90 /data1/elzd_2023_00031/fhello_world_mpi.F90 -o hello_world
$ mpirun -n 2 ./hello_world

Process 1 says "Hello, world!" login01

HELLO_MPI - Master process:
FORTRAN90/MPI version

An MPI test program.

The number of processes is 2


Process 0 says "Hello, world!" login01

Another question: the MPI library is Intel MPI, but in compilers.xml the <mpicc>, <mpicxx>, and <mpifc> entries only work with the MPICH path. If I replace it with mpi/intelmpi/bin, the error is "cannot open source file <mpi.h>". I am quite confused about this.
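For illustration, a sketch of what such a block could look like with the Intel MPI wrappers (the machine name and wrapper names below are placeholders, not from my actual files):

<compiler MACH="mymachine" COMPILER="intel">
  <!-- Intel MPI compiler wrappers; adjust to the local install -->
  <MPICC> mpiicc </MPICC>
  <MPICXX> mpiicpc </MPICXX>
  <MPIFC> mpiifort </MPIFC>
</compiler>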

See the attached configuration files.

Looking forward to any help!
 

jedwards

CSEG and Liaisons
Staff member
The path to mpirun or mpiexec is set in config_machines.xml, not in config_compilers.xml; perhaps you should look there for your mistake.
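For example, a minimal mpirun block in a config_machines.xml machine entry can look like this sketch (the executable path and argument name are placeholders, assuming a CIME-style machine definition):

<mpirun mpilib="default">
  <!-- full path to the launcher that actually exists on this machine -->
  <executable>/opt/intel/impi/bin/mpirun</executable>
  <arguments>
    <arg name="num_tasks">-n {{ total_tasks }}</arg>
  </arguments>
</mpirun>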
 

changmao

Yufei Wang
New Member
The path to mpirun or mpiexec is set in config_machines.xml, not in config_compilers.xml; perhaps you should look there for your mistake.
Hi, jedwards. Thank you very much for your reply. The mpirun problem mentioned before has been solved. However, the missing .nc file problem still exists: Cannot find a b.day2.1.cpl*.r.*.nc file in directory /data1/elzd_2023_00031/cesm/scratch/b.day2.1/run. The input data had been checked before the case build, so I really don't know what file is missing.
I have also run a simple case (./create_newcase --case runtestX --res f19_g16 --compset X) and show the results here:
-------------------------------------------------------------------------
- Prestage required restarts into /data1/elzd_2023_00031/cesm/scratch/runtestX/run
- Case input data directory (DIN_LOC_ROOT) is /data1/elzd_2023_00031/cesm/inputdata
- Checking for required input datasets in DIN_LOC_ROOT
-------------------------------------------------------------------------
run command is mpirun -n 64 /data1/elzd_2023_00031/cesm/scratch/runtestX/bld/cesm.exe >> cesm.log.$LID 2>&1
Exception from case_run: ERROR: RUN FAIL: Command 'mpirun -n 64 /data1/elzd_2023_00031/cesm/scratch/runtestX/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /data1/elzd_2023_00031/cesm/scratch/runtestX/run/cesm.log.231205-163015
Submit job case.st_archive
Starting job script case.st_archive
st_archive starting
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/cesm.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/cesm.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/lnd.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/lnd.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/ocn.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/ocn.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/ice.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/ice.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/wav.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/wav.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/cpl.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/cpl.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/atm.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/atm.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/rof.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/rof.log.231205-163015
moving /data1/elzd_2023_00031/cesm/scratch/runtestX/run/glc.log.231205-163015 to /data1/elzd_2023_00031/cesm/scratch/archive/runtestX/logs/glc.log.231205-163015
Cannot find a runtestX.cpl*.r.*.nc file in directory /data1/elzd_2023_00031/cesm/scratch/runtestX/run
Archiving history files for drv (cpl)
Archiving history files for dart (esp)
st_archive completed
Submitted job case.run with id None
Submitted job case.st_archive with id None
 

Attachments

  • csm_share.bldlog.230727-104515.txt
    83.1 KB

jedwards

CSEG and Liaisons
Staff member
What version of CESM are you trying to build? CESM2.3 requires ESMF to be built and installed first.
 

jedwards

CSEG and Liaisons
Staff member
You need to define ESMFMKFILE in the environment; if you are using a module, it should do that for you.
If not, you need to set it in the environment variables section of config_machines.xml.
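As a sketch, either of the following should work (the esmf.mk path is a placeholder for wherever your ESMF installation put it):

In the shell:
export ESMFMKFILE=/path/to/esmf/install/lib/esmf.mk

Or in config_machines.xml:
<environment_variables>
  <env name="ESMFMKFILE">/path/to/esmf/install/lib/esmf.mk</env>
</environment_variables>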
 

xiaoxiaokuishu

Ru Xu
Member
You need to define ESMFMKFILE in the environment; if you are using a module, it should do that for you.
If not, you need to set it in the environment variables section of config_machines.xml.
Hi, Jedwards,


I already set ESMFMKFILE in ccs_config/machines/config_machines.xml, as follows:

</environment_variables>
<environment_variables comp_interface="nuopc" mpilib="impi">
<env name="ESMFMKFILE">/home1/home/n02/n02/ruxu/spack/opt/spack/linux-sles15-zen2/gcc-10.3.0/esmf-8.4.1-phihmm6g6tipktv73yt5oic25efkzpxj/lib/esmf.mk</env>
<env name="ESMF_RUNTIME_PROFILE">ON</env>
<env name="ESMF_RUNTIME_PROFILE_OUTPUT">SUMMARY</env>
<env name="UGCSINPUTPATH">/work/06242/tg855414/stampede2/FV3GFS/benchmark-inputs/2012010100/gfs/fcst</env>
<env name="UGCSFIXEDFILEPATH">/work/06242/tg855414/stampede2/FV3GFS/fix_am</env>
<env name="UGCSADDONPATH">/work/06242/tg855414/stampede2/FV3GFS/addon</env>


But when I run preview_namelist, it still displays: ERROR: ESMFMKFILE not found None

Do you know what the problem is? I use CTSM5.1 and ESMF 8.4.1.

Best
 

jedwards

CSEG and Liaisons
Staff member
Not sure where preview_namelist might need ESMFMKFILE, but you might try defining it in the environment of your shell (outside of the normal build process) and see if that solves the issue.
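For example, a minimal sketch of that test, assuming a bash-like shell and reusing the esmf.mk path from your post above:

export ESMFMKFILE=/home1/home/n02/n02/ruxu/spack/opt/spack/linux-sles15-zen2/gcc-10.3.0/esmf-8.4.1-phihmm6g6tipktv73yt5oic25efkzpxj/lib/esmf.mk
ls -l "$ESMFMKFILE"   # confirm the file actually exists before re-running preview_namelist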
 

xiaoxiaokuishu

Ru Xu
Member
Not sure where preview_namelist might need ESMFMKFILE, but you might try defining it in the environment of your shell (outside of the normal build process) and see if that solves the issue.
Hi, Jedwards,

I have just noticed the problem: the gcc I use is the default on ARCHER2, which is gcc 7.5.0, but the ESMF I installed
was built with spack (spack install esmf%gcc@10.3.0). I think the problem is the compiler difference. But when I change
gcc 7.5.0 to the spack-installed gcc@10.3.0, a new error appears:

create namelist for component clm
Calling /mnt/lustre/a2fs-work2/work/n02/n02/ruxu/cesm/CESM/CTSM5.1/cime_config/buildnml
ERROR: Command /mnt/lustre/a2fs-work2/work/n02/n02/ruxu/cesm/CESM/CTSM5.1/bld/build-namelist failed rc=2
out=
err=Can't load '/work/n02/shared/perl/5.26.2/x86_64-linux-thread-multi/auto/XML/LibXML/LibXML.so' for module XML::LibXML: /work/n02/shared/perl/5.26.2/x86_64-linux-thread-multi/auto/XML/LibXML/LibXML.so: undefined symbol: PL_hash_seed at /home1/home/n02/n02/ruxu/spack/opt/spack/linux-sles15-zen2/gcc-10.3.0/perl-5.36.0-sb5eiu6ea7kzhhhrwyopgfran5v7f5gi/lib/5.36.0/x86_64-linux-thread-multi/DynaLoader.pm line 206.
at /mnt/lustre/a2fs-work2/work/n02/n02/ruxu/cesm/CESM/CTSM5.1/cime/utils/perl5lib/Config/SetupTools.pm line 5.
BEGIN failed--compilation aborted at /work/n02/shared/perl/5.26.2/x86_64-linux-thread-multi/XML/LibXML.pm line 156.
Compilation failed in require at /mnt/lustre/a2fs-work2/work/n02/n02/ruxu/cesm/CESM/CTSM5.1/cime/utils/perl5lib/Config/SetupTools.pm line 5.
BEGIN failed--compilation aborted at /mnt/lustre/a2fs-work2/work/n02/n02/ruxu/cesm/CESM/CTSM5.1/cime/utils/perl5lib/Config/SetupTools.pm line 5.
Compilation failed in require at /mnt/lustre/a2fs-work2/work/n02/n02/ruxu/cesm/CESM/CTSM5.1/bld/CLMBuildNamelist.pm line 419.

I do not know whether I need to spack install ESMF built against gcc 7.5.0, the version provided by ARCHER2.

Best
Ru
 

jedwards

CSEG and Liaisons
Staff member
gcc-7.5.0 is way too old, and even gcc-10 is quite old; the current gcc release is gcc-13. It's easier if you build the entire stack with the same compiler version.
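A rough sketch of what that could look like with spack (the version numbers are illustrative, not a tested recipe):

spack install gcc@13                                # build the newer compiler
spack compiler find $(spack location -i gcc@13)     # register it with spack
spack install esmf%gcc@13                           # rebuild ESMF with that compiler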
 

xiaoxiaokuishu

Ru Xu
Member
gcc-7.5.0 is way too old, and even gcc-10 is quite old; the current gcc release is gcc-13. It's easier if you build the entire stack with the same compiler version.
Hi, Jedwards,

I hope to confirm one point: since the gcc on ARCHER2 is the system default, I can only spack install gcc myself. If I spack install gcc@13 and then install esmf%gcc@13, there should be no problem, right?
 

jedwards

CSEG and Liaisons
Staff member
That should work, but I think that the error above indicates that perl and libXML should also be installed using the same compiler.
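Continuing the sketch above, rebuilding those with the same compiler might look like this (the perl-xml-libxml package name is my assumption; verify it with spack list):

spack install perl%gcc@13              # perl built with the same gcc
spack install perl-xml-libxml%gcc@13   # the XML::LibXML perl module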
 

xiaoxiaokuishu

Ru Xu
Member
That should work but I think that the error above indicates that perl and libXML should also be installed using the same compiler.
Hi, Jedwards,

I have listed my loaded libraries. I use the spack-installed gcc@10.3.0, and all libraries were installed against it.
It seems that libxml2@2.10.3 and perl@5.36.0 were also installed with the same compiler,
if the libxml2 here is the libXML you mentioned.

spack find --loaded
-- linux-sles15-zen2 / gcc@10.3.0 -------------------------------
autoconf@2.69 gawk@5.2.1 libevent@2.1.12 ncurses@6.4 pmix@4.2.3
autoconf-archive@2023.02.20 gcc@10.3.0 libiconv@1.17 netcdf-c@4.9.2 readline@8.2
automake@1.16.5 gdbm@1.23 libpciaccess@0.17 netcdf-fortran@4.6.0 snappy@1.1.10
berkeley-db@18.1.40 gettext@0.21.1 libsigsegv@2.14 numactl@2.0.14 tar@1.34
bison@3.8.2 gmake@4.4.1 libtool@2.4.7 openmpi@4.1.5 texinfo@7.0.3
bzip2@1.0.8 gmp@6.2.1 libxcrypt@4.4.33 openssh@9.3p1 util-macros@1.19.3
c-blosc@1.21.2 hdf5@1.14.1-2 libxml2@2.10.3 openssl@1.1.1t xz@5.4.1
ca-certificates-mozilla@2023-01-10 hwloc@2.9.1 lz4@1.9.4 parallelio@2.5.10 zlib@1.2.13
cmake@3.26.3 krb5@1.20.1 m4@1.4.19 perl@5.36.0 zstd@1.5.5
diffutils@3.9 libaec@1.0.6 mpc@1.3.1 pigz@2.7
esmf@8.4.1 libedit@3.1-20210216 mpfr@4.2.0 pkgconf@1.9.5



Best
Ru
 