I’ve been asked to set up and run CESM on the HPC-AI Advisory Council’s clusters (specifically Iris, in the HPC Advisory Council Cluster Center: a 32-node cluster with two sockets per node and a 28-core Cascade Lake processor in each socket). I’m pretty sure that I have my XML machine, compiler and batch files set up correctly for "iris", but I can’t get even the libraries to build. However, if I execute the failing gmake commands outside of the Python scripts, they work fine. I even modified config_compilers.xml to check that the appropriate compiler and library modules are loaded before gmake executes the failing compilation command.
Specifically, I am trying to follow the verification step in section 6.5 of "Porting and validating CIME on a new platform" in the CIME master documentation, and became stuck at:
cime/scripts/create_test --xml-category prealpha --xml-machine cheyenne --xml-compiler intel --machine iris --compiler intel 2>&1 | tee create.log
The creation and set-up steps work but the first attempt to do a build in each test case that is set up fails. Here is a sample build log output:
[gerardo@login01 cesm]$ cat /global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/bld/gptl.bldlog.220927-134747
gmake -f /global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime/src/share/timing/Makefile install -C /global/scratch/users/gerardo/cesm/sharedlibroot.20220927_134729_lso12z/intel/openmpi/nodebug/nothreads/mct/gptl MACFILE=/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/Macros.make MODEL=gptl COMP_NAME=gptl GPTL_DIR=/global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime/src/share/timing GPTL_LIBDIR=/global/scratch/users/gerardo/cesm/sharedlibroot.20220927_134729_lso12z/intel/openmpi/nodebug/nothreads/mct/gptl SHAREDPATH=/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/bld/intel/openmpi/nodebug/nothreads/mct CIME_MODEL=cesm SMP=FALSE CASEROOT="/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z" CASETOOLS="/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/Tools" CIMEROOT="/global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime" COMP_INTERFACE="mct" COMPILER="intel" DEBUG="FALSE" EXEROOT="/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/bld" INCROOT="/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/bld/lib/include" LIBROOT="/global/scratch/users/gerardo/cesm/ERI.f09_g17.B1850.iris_intel.allactive-defaultio.20220927_134729_lso12z/bld/lib" MACH="iris" MPILIB="openmpi" NINST_VALUE="c1a1l1i1o1r1g1w1i1e1" OS="LINUX" PIO_VERSION="1" SHAREDLIBROOT="/global/scratch/users/gerardo/cesm/sharedlibroot.20220927_134729_lso12z" SMP_PRESENT="FALSE" USE_ESMF_LIB="FALSE" USE_MOAB="FALSE" CAM_CONFIG_OPTS="-phys cam6 -co2_cycle" COMP_LND="clm" COMPARE_TO_NUOPC="FALSE" CISM_USE_TRILINOS="FALSE" USE_TRILINOS="FALSE" USE_ALBANY="FALSE" USE_PETSC="FALSE"
gmake: Entering directory '/global/scratch/users/gerardo/cesm/sharedlibroot.20220927_134729_lso12z/intel/openmpi/nodebug/nothreads/mct/gptl'
module list; mpicc -c -I/global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime/src/share/timing -qno-opt-dynamic-align -fp-model precise -std=gnu99 -xCORE-AVX512 -O2 -debug minimal -DCESMCOUPLED -DFORTRANUNDERSCORE -DCPRINTEL -DMCT_INTERFACE -DHAVE_MPI /global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime/src/share/timing/gptl.c
Currently Loaded Modulefiles:
1) cmake/3.21.4 3) intel/2022.1.2 5) compiler-rt/2022.1.0 7) mkl/2022.0.2 9) hdf5/1.10.4-i201h260
2) python/3.7 4) tbb/2021.6.0 6) compiler/2022.1.0 8) hpcx/2.12.0 10) netcdf/4.6.2-i201h260
gmake: Leaving directory '/global/scratch/users/gerardo/cesm/sharedlibroot.20220927_134729_lso12z/intel/openmpi/nodebug/nothreads/mct/gptl'
mpicc: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
gmake: *** [/global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime/src/share/timing/Makefile:82: gptl.o] Error 127
ERROR: mpicc: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
gmake: *** [/global/home/groups/hpcperf/centos-7/benchmarks/CESM/cesm2_2/cime/src/share/timing/Makefile:82: gptl.o] Error 127
[gerardo@login01 cesm]$
As I wrote above, I modified config_compilers.xml so that instead of just “mpicc …”, the compilation command executes “module list ; mpicc …”, to verify that the expected modules are indeed loaded before 'mpicc' executes.
I don’t understand the error message (the “libimf.so: cannot open shared object file” line), because the mpicc command (the line beginning “module list; mpicc”) has the “-c” option and shouldn’t be looking for any libraries. In addition, if I load the indicated modules and execute by hand the ‘gmake’ command from the first line of the log output, the corresponding library gets built with no errors whatsoever.
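One further check I could run next to “module list” is whether libimf.so is reachable through LD_LIBRARY_PATH at that point in the build. The "error while loading shared libraries" text is what the dynamic loader prints when a program cannot start, so it may be unrelated to any link step. Below is a minimal sketch, assuming a POSIX-style shell; find_in_ld_path is a hypothetical helper name of my own, not part of CIME or the module system.

```shell
# Hypothetical helper (not part of CIME): print the first directory in
# LD_LIBRARY_PATH that contains the named shared library, or fail.
find_in_ld_path() {
    lib="$1"
    old_ifs="$IFS"
    IFS=':'
    # Unquoted expansion is intentional: split the path list on ':'.
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS="$old_ifs"
            echo "$dir/$lib"
            return 0
        fi
    done
    IFS="$old_ifs"
    echo "$lib not found in LD_LIBRARY_PATH" >&2
    return 1
}
```

Calling `find_in_ld_path libimf.so` once from my interactive shell (where the build works) and once from inside the build step should show whether the two environments differ in LD_LIBRARY_PATH even though “module list” reports the same modules.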
Please let me know which files I should attach.