Build Error while building CESM 1.2.2 using threads

I am trying to run a threaded build of CESM 1.2.2, using OpenMP threads alongside MPI tasks on a 16-core node. The component set is B and the resolution is f19_g16.
I set BUILD_THREADED to TRUE and changed the PE layout in env_mach_pes.xml (attached).

But I get an error when I build the model:

[nitin@master B_f19_g16_1node_omp]$ ./B_f19_g16_1node_omp.build
-------------------------------------------------------------------------
 CESM BUILDNML SCRIPT STARTING
 - To prestage restarts, untar a restart.tar file into /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/run
 infile is /storage/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/Buildconf/cplconf/cesm_namelist
CAM writing dry deposition namelist to drv_flds_in
CAM writing namelist to atm_in
CLM configure done.
CLM adding use_case 2000_control defaults for var sim_year with val 2000
CLM adding use_case 2000_control defaults for var sim_year_range with val constant
CLM adding use_case 2000_control defaults for var use_case_desc with val Conditions to simulate 2000 land-use
CICE configure done.
POP2 build-namelist: ocn_grid is gx1v6
POP2 build-namelist: ocn_tracer_modules are  iage
 CESM BUILDNML SCRIPT HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
-------------------------------------------------------------------------
 CESM PRESTAGE SCRIPT STARTING
 - Case input data directory, DIN_LOC_ROOT, is /home/nitin/CESM_NEW/input_data
 - Checking the existence of input datasets in DIN_LOC_ROOT
 CESM PRESTAGE SCRIPT HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
-------------------------------------------------------------------------
 CESM BUILDEXE SCRIPT STARTING
rm: No match.
 COMPILER is intel
 - Build Libraries: mct gptl pio csm_share
Wed Feb 11 14:09:25 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/mct.bldlog.150211-140918
Wed Feb 11 14:09:26 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/gptl.bldlog.150211-140918
Wed Feb 11 14:09:26 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/pio.bldlog.150211-140918
Wed Feb 11 14:09:27 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/csm_share.bldlog.150211-140918
Wed Feb 11 14:09:27 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/atm.bldlog.150211-140918
Wed Feb 11 14:10:39 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lnd.bldlog.150211-140918
Wed Feb 11 14:11:14 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/ice.bldlog.150211-140918
Wed Feb 11 14:11:48 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/ocn.bldlog.150211-140918
Wed Feb 11 14:13:27 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/glc.bldlog.150211-140918
Wed Feb 11 14:13:27 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/wav.bldlog.150211-140918
Wed Feb 11 14:13:28 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/rof.bldlog.150211-140918
Wed Feb 11 14:13:39 IST 2015 /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.bldlog.150211-140918
ERROR: cesm.buildexe.csh failed, see /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.bldlog.150211-140918
ERROR: cat /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.bldlog.150211-140918

The error in the log is as follows:

mpiifort -o /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.exe ccsm_comp_mod.o ccsm_driver.o mrg_mod.o seq_avdata_mod.o seq_diag_mct.o seq_domain_mct.o seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_map_esmf.o seq_map_mod.o seq_mctext_mod.o seq_rest_mod.o  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -latm  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lice  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -llnd  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -locn  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lrof  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lglc  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lwav -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share -lcsm_share -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/lib -lpio -lgptl -lmct -lmpeu -L/storage/softwares/installedsoftware/netcdf_4.4.0/lib -lnetcdff -lnetcdf -lhdf5_hl -lhdf5 -lz -lm -L/opt/intel/impi/4.1.3.048/intel64/lib -lmpich  -L/storage/softwares/installedsoftware/netcdf_4.4.0/lib -lnetcdff -lnetcdf -lhdf5_hl -lhdf5 -lz -lm -openmp
ld: MPIR_Thread: TLS definition in /opt/intel/impi/4.1.3.048/intel64/lib/libmpi_mt.so section .tbss mismatches non-TLS definition in /opt/intel/impi/4.1.3.048/intel64/lib/libmpich.so section .bss
/opt/intel/impi/4.1.3.048/intel64/lib/libmpi_mt.so: could not read symbols: Bad value
gmake: *** [/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.exe] Error 1

Please find attached the env_mach_pes.xml and the error log file.
Thanks.
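
The ld message above reports that MPIR_Thread is thread-local in libmpi_mt.so but not in libmpich.so, which suggests that both libraries reached the linker. One plausible reading (an assumption, not confirmed in this thread) is that the mpiifort wrapper pulls in libmpi_mt.so on its own when -openmp is given, while libmpich.so comes from the explicit -lmpich flag. A minimal Python sketch for spotting explicit MPI library flags on a link line follows; the helper name explicit_mpi_libs, the MPI_LIBS set, and the shortened link line are illustrative and not part of CESM.

```python
import shlex

# Hypothetical set of MPI library names to look for; adjust for the local MPI.
MPI_LIBS = {"mpich", "mpi", "mpi_mt"}

def explicit_mpi_libs(link_line):
    """Return MPI libraries requested through explicit -l flags on a link line."""
    libs = []
    for token in shlex.split(link_line):
        # A flag such as -lmpich names the library libmpich.
        if token.startswith("-l") and token[2:] in MPI_LIBS:
            libs.append(token[2:])
    return libs

# Shortened, illustrative version of the failing link line from the build log.
link_line = (
    "mpiifort -o cesm.exe ccsm_driver.o -lpio -lgptl -lmct -lmpeu "
    "-L/opt/intel/impi/4.1.3.048/intel64/lib -lmpich -lnetcdff -lnetcdf -openmp"
)
found = explicit_mpi_libs(link_line)
print(found)
```

Any library this reports is one the MPI compiler wrapper did not add by itself, so it is a candidate for removal when the wrapper already supplies a threaded MPI library.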
 

jedwards

CSEG and Liaisons
Staff member
This appears to be a system configuration error - please consult with your local system administrators. Have you tried a simple program like the MPI hello world in the user's guide?
 

Yes, I tried running hybrid MPI and OpenMP programs (in addition to pure MPI programs). They compiled and ran without any issues.
 
The logs showed that both -lmpich and -lmpi were being added to the link line. To build a multithreaded version, I had to remove -lmpich and -lmpi from the compiler options. I had not added them myself: -lmpich was being added because the MPI_LIB_NAME variable in the Macros file was set to "mpich", and once I unset that variable -lmpich was no longer added. In addition, -lmpi was being added because MPI_PATH pointed to impi.

Now I get successful builds for certain configurations. However, when I change the configuration to accommodate more threads (1 task for each component, each with threads), I get the following error. I have attached the PE layout file along with the log file.

mpiifort -o /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.exe ccsm_comp_mod.o ccsm_driver.o mrg_mod.o seq_avdata_mod.o seq_diag_mct.o seq_domain_mct.o seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_map_esmf.o seq_map_mod.o seq_mctext_mod.o seq_rest_mod.o  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -latm  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lice  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -llnd  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -locn  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lrof  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lglc  -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/lib/ -lwav -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share -lcsm_share -L/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/lib -lpio -lgptl -lmct -lmpeu -L/storage/softwares/installedsoftware/netcdf_4.4.0/lib -lnetcdff -lnetcdf -lhdf5_hl -lhdf5 -lz -lm  -L/storage/softwares/installedsoftware/netcdf_4.4.0/lib -lnetcdff -lnetcdf -lhdf5_hl -lhdf5 -lz -lm -openmp
ccsm_comp_mod.o: In function `ccsm_comp_mod_mp_ccsm_run_':
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x2a67): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x2d91): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x3cba): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_xao_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x3cd7): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x3e36): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_xao_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x3e5d): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x5046): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x5324): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_xao_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x5340): relocation truncated to fit: R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x7840): relocation truncated to fit: R_X86_64_32S against symbol `seq_comm_mct_mp_cplocnid_' defined in COMMON section in /home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/intel/mpich/nodebug/threads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share/libcsm_share.a(seq_comm_mct.o)
/storage/home/nitin/CESM_NEW/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:(.text+0x7936): additional relocation overflows omitted from the output
gmake: *** [/home/nitin/CESM_NEW/cesm1_2_2/cases/B_f19_g16_1node_omp/cesm.exe] Error 1

Thanks.
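
R_X86_64_PC32 and R_X86_64_32S are 32-bit relocations, so "relocation truncated to fit" means the referenced symbol ended up too far away to be addressed with a signed 32-bit offset. This usually happens when the static data of the executable grows past about 2 GB under the default small code model. A minimal Python sketch for summarizing which symbols overflow is given here; the helper name summarize_overflows and the shortened sample messages are illustrative, and the pattern assumes the GNU ld message format seen in this log.

```python
import re
from collections import Counter

# Matches one GNU ld "relocation truncated to fit" message and captures
# the relocation type and the symbol name.
RELOC_RE = re.compile(
    r"relocation truncated to fit: (\S+) against symbol `([^']+)'"
)

def summarize_overflows(log_text):
    """Count overflowing relocations per (relocation type, symbol) pair."""
    return Counter(RELOC_RE.findall(log_text))

# Shortened messages in the same format as the build log.
sample = (
    "ccsm_comp_mod.F90:(.text+0x2a67): relocation truncated to fit: "
    "R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' "
    "defined in COMMON section in libcsm_share.a(seq_comm_mct.o)\n"
    "ccsm_comp_mod.F90:(.text+0x2d91): relocation truncated to fit: "
    "R_X86_64_PC32 against symbol `seq_comm_mct_mp_num_inst_frc_' "
    "defined in COMMON section in libcsm_share.a(seq_comm_mct.o)\n"
    "ccsm_comp_mod.F90:(.text+0x7840): relocation truncated to fit: "
    "R_X86_64_32S against symbol `seq_comm_mct_mp_cplocnid_' "
    "defined in COMMON section in libcsm_share.a(seq_comm_mct.o)\n"
)
counts = summarize_overflows(sample)
for (reloc, symbol), n in counts.most_common():
    print(n, reloc, symbol)
```

Because the pattern does not depend on line breaks, it also works on log text where several messages have been run together on one line.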
 

jedwards

CSEG and Liaisons
Staff member
I think that in doing multiple builds you may have ended up with some incompatibility in the object files. Do a $CASE.clean_build all and rebuild. You still have a -lmpich on the link line as well.
 

I was able to solve the build error (relocation truncated to fit) by adding the -mcmodel=medium flag to both the linker flags and the compiler flags in the $CASE. Now I am able to build the model successfully.

However, I get a run-time error when I try to run a threaded model with just 2 threads per MPI task (please find attached the env_mach_pes.xml file). The model runs for some time and then gives the following error in the log (please find attached the cesm log file):

MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
(seq_domain_areafactinit) : min/max mdl2drv   0.999841513526222       1.00031732638246    areafact_a_ATM
(seq_domain_areafactinit) : min/max drv2mdl   0.999682774281628       1.00015851159572    areafact_a_ATM
(seq_domain_areafactinit) : min/max mdl2drv   0.999841513526350       1.00076245909423    areafact_l_LND
(seq_domain_areafactinit) : min/max drv2mdl   0.999238121806731       1.00015851159559    areafact_l_LND
(seq_domain_areafactinit) : min/max mdl2drv   0.999996826904345      0.999996826905162    areafact_r_ROF
(seq_domain_areafactinit) : min/max drv2mdl    1.00000317310491       1.00000317310572    areafact_r_ROF
(seq_domain_areafactinit) : min/max mdl2drv   0.999565456406962       1.00000000000000    areafact_o_OCN
(seq_domain_areafactinit) : min/max drv2mdl    1.00000000000000       1.00043473250326    areafact_o_OCN
(seq_domain_areafactinit) : min/max mdl2drv   0.999565456406962       1.00000000000000    areafact_i_ICE
(seq_domain_areafactinit) : min/max drv2mdl    1.00000000000000       1.00043473250326    areafact_i_ICE
(seq_mct_drv) : Initialize atm component phase 2 ATM
[48:node2] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 48
[56:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
[52:node2] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 56
internal ABORT - process 52
[50:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 50
[58:node1] unexpected disconnect completion event from [48:node2]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 58
[63:node1] unexpected disconnect completion event from [48:node2]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 63
[55:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 55
[59:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 59
[54:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 54
[53:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 53
[61:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 61
[60:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 60
[57:node1] unexpected disconnect completion event from [48:node2]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 57
[51:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 51
[62:node1] unexpected disconnect completion event from [78:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 62
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
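
In the log above, most "unexpected disconnect" messages are secondary: a rank aborts because a peer it was connected to has already gone away. Ranks that are named as the disconnecting peer but never print their own abort message (here 77 and 78 on node7) are therefore reasonable first suspects for the original segmentation fault, though the log alone does not prove it. A minimal Python sketch for tallying the peers named in these messages follows; the helper name disconnect_sources is illustrative, and the sample reuses the first five disconnect lines from the log.

```python
import re
from collections import Counter

# Captures the reporting rank/node and the peer rank/node from one message.
DISCONNECT_RE = re.compile(
    r"\[(\d+):(\w+)\] unexpected disconnect completion event from \[(\d+):(\w+)\]"
)

def disconnect_sources(log_text):
    """Count how often each (rank, node) is named as the peer that disconnected."""
    return Counter(
        (int(m.group(3)), m.group(4)) for m in DISCONNECT_RE.finditer(log_text)
    )

# First five disconnect messages from the run log.
sample = (
    "[48:node2] unexpected disconnect completion event from [77:node7]\n"
    "[56:node1] unexpected disconnect completion event from [77:node7]\n"
    "[52:node2] unexpected disconnect completion event from [77:node7]\n"
    "[50:node2] unexpected disconnect completion event from [56:node1]\n"
    "[58:node1] unexpected disconnect completion event from [48:node2]\n"
)
sources = disconnect_sources(sample)
for (rank, node), n in sources.most_common():
    print(f"rank {rank} on {node}: named {n} time(s)")
```

Running the same tally on the complete cesm log, and then checking the per-component logs for the suspect ranks, may narrow down where the crash started.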
 
I was able to solve the build error (relocation fit) by adding the -mcmodel=medium flag to both the compiler flags and the linker flags in the $CASE. The model now builds successfully. However, I get a run-time error when I try to run a threaded model with just 2 threads per MPI task (please find attached the env_mach_pes.xml file). The model runs for some time and then gives the following error in the log (please find attached the cesm log file):

MCT::m_Router::initp_: RGSMap indices not increasing...Will correct
MCT::m_Router::initp_: GSMap indices not increasing...Will correct
(seq_domain_areafactinit) : min/max mdl2drv   0.999841513526222       1.00031732638246    areafact_a_ATM
(seq_domain_areafactinit) : min/max drv2mdl   0.999682774281628       1.00015851159572    areafact_a_ATM
(seq_domain_areafactinit) : min/max mdl2drv   0.999841513526350       1.00076245909423    areafact_l_LND
(seq_domain_areafactinit) : min/max drv2mdl   0.999238121806731       1.00015851159559    areafact_l_LND
(seq_domain_areafactinit) : min/max mdl2drv   0.999996826904345      0.999996826905162    areafact_r_ROF
(seq_domain_areafactinit) : min/max drv2mdl    1.00000317310491       1.00000317310572    areafact_r_ROF
(seq_domain_areafactinit) : min/max mdl2drv   0.999565456406962       1.00000000000000    areafact_o_OCN
(seq_domain_areafactinit) : min/max drv2mdl    1.00000000000000       1.00043473250326    areafact_o_OCN
(seq_domain_areafactinit) : min/max mdl2drv   0.999565456406962       1.00000000000000    areafact_i_ICE
(seq_domain_areafactinit) : min/max drv2mdl    1.00000000000000       1.00043473250326    areafact_i_ICE
(seq_mct_drv) : Initialize atm component phase 2 ATM
[48:node2] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 48
[56:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
[52:node2] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 56
internal ABORT - process 52
[50:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 50
[58:node1] unexpected disconnect completion event from [48:node2]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 58
[63:node1] unexpected disconnect completion event from [48:node2]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 63
[55:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 55
[59:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 59
[54:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 54
[53:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 53
[61:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 61
[60:node1] unexpected disconnect completion event from [77:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 60
[57:node1] unexpected disconnect completion event from [48:node2]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 57
[51:node2] unexpected disconnect completion event from [56:node1]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 51
[62:node1] unexpected disconnect completion event from [78:node7]
Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
internal ABORT - process 62
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
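
One way to read the abort messages above is to count which peer each rank reports as having disconnected. A rank that is named often but never prints its own message may be the one that failed first. The following is only a minimal Python sketch under that assumption. The function name tally_disconnect_peers and the regular expression are illustrative, not part of CESM or the MPI library, and the pattern is written only against the message format shown in this log.

```python
import re
from collections import Counter

# Matches "[48:node2] unexpected disconnect completion event from [77:node7]".
# \s+ before the peer also tolerates a line break between "from" and the peer.
DISCONNECT = re.compile(
    r"\[(\d+):(\w+)\]\s+unexpected disconnect completion event from\s+"
    r"\[(\d+):(\w+)\]"
)


def tally_disconnect_peers(log_text):
    """Count how often each (rank, node) is named as the disconnected peer."""
    peers = Counter()
    for _rank, _node, peer_rank, peer_node in DISCONNECT.findall(log_text):
        peers[(int(peer_rank), peer_node)] += 1
    return peers


if __name__ == "__main__":
    # Three messages copied from the log in this post.
    sample = (
        "[48:node2] unexpected disconnect completion event from [77:node7]\n"
        "[50:node2] unexpected disconnect completion event from [56:node1]\n"
        "[59:node1] unexpected disconnect completion event from [77:node7]\n"
    )
    for (rank, node), count in tally_disconnect_peers(sample).most_common():
        print(f"rank {rank} on {node} named {count} time(s)")
```

To use it on a real run, read the whole cesm log file into a string and pass it to tally_disconnect_peers. The tally only points at where to look next; it does not by itself explain the segmentation fault.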
 

santos

Member
Your env_mach_pes.xml looks somewhat strange to me, since GLC and WAV are not on in most runs (what compset are you using?). However, that's probably not related to the error. The error looks very much like a system message to me. My best guess is that either you ran out of wall time (which would be strange, since the run should have gotten farther as long as you gave it at least a few minutes), or there was an error on your system. You might want to simply try again, or check the atm.log to see where the run stops in the atm initialization.
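
To see where the atm initialization stops, something like the following could help. It is a minimal Python sketch that assumes the component logs are plain-text files in the case run directory whose names begin with atm.log; the exact naming and location are an assumption and may differ on your machine. tail_latest_log is a hypothetical helper, not part of CESM.

```python
from pathlib import Path


def tail_latest_log(run_dir, prefix="atm.log", n=20):
    """Return (path, last n lines) of the most recently modified log file.

    Assumes plain-text logs named "<prefix>*" directly inside run_dir.
    Returns (None, []) when no such file exists.
    """
    candidates = [p for p in Path(run_dir).glob(prefix + "*") if p.is_file()]
    if not candidates:
        return None, []
    latest = max(candidates, key=lambda p: p.stat().st_mtime)
    lines = latest.read_text(errors="replace").splitlines()
    return latest, lines[-n:]


if __name__ == "__main__":
    import sys

    # The run directory is given on the command line; default to the
    # current directory.
    run_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    path, lines = tail_latest_log(run_dir)
    if path is None:
        print(f"no atm.log* files found in {run_dir}")
    else:
        print(f"last lines of {path}:")
        for line in lines:
            print(line)
```

The last lines printed before the abort usually show which initialization step was in progress. The same helper can be pointed at other logs by changing the prefix argument.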
 
