
Failed to build cesm.exe because of unsuccessfully linking the netcdf and mpi libs

xgsun

New Member
I'm trying to build CESM2.1.2 on my own Linux platform by prescribing the machine name "homebrew" to create_newcase, and case.setup runs successfully. However, case.build fails at the final step of building the executable cesm.exe, reporting that it could not link the netCDF and MPI libraries.

I specified the environment variable for the netCDF library in config_machines.xml as follows:

<environment_variables>
<env name="NETCDF_PATH">/opt/netcdf/netcdf-c4.6.2-f4.4.4-intel2019</env>
</environment_variables>

In cesm/cases/*/Tools/Makefile, I can see that netCDF and MPI should be linked when building cesm.exe:

$(EXEC_SE): $(OBJS) $(ULIBDEP) $(CSMSHARELIB) $(MCTLIBS) $(PIOLIB) $(GPTLLIB)
$(LD) -o $(EXEC_SE) $(OBJS) $(CLIBS) $(ULIBS) $(SLIBS) $(MLIBS) $(LDFLAGS)

I'm not sure whether the netCDF and MPI settings are needed anywhere else; any suggestions for solving this problem? Attached please find the config_machines.xml and Macros.make.

Thank you.

Sun
 

Attachments

  • config_machines.xml.txt (106.4 KB)
  • Macros.make.txt (1.7 KB)

jedwards

CSEG and Liaisons
Staff member
The machine name homebrew is meant as a generic base for an Apple macOS system. The analogous generic Linux platform is centos7-linux.
Without seeing the error I can't determine why you are unable to link the netCDF and MPI libraries.
 

xgsun

New Member
Though homebrew is used as the machine name, I have made the corresponding settings for my Linux platform, so I think it should be OK.
Attached is the log file for case.build; the config_machines.xml was attached earlier. Please help me solve the problem. Thanks.
 

Attachments

  • cesm.bldlog.200608-230607.txt (222.3 KB)

jedwards

CSEG and Liaisons
Staff member
Still not sure what is going on here; can you attach the pio.bldlog? You might try defining NETCDF instead of NETCDF_PATH.
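For reference, a minimal sketch of that change in config_machines.xml, reusing the install prefix quoted earlier in the thread (whether this variable name is what your Makefile checks for is exactly what is being tested here):

```xml
<environment_variables>
  <!-- Assumption: same netCDF install prefix as in the earlier post -->
  <env name="NETCDF">/opt/netcdf/netcdf-c4.6.2-f4.4.4-intel2019</env>
</environment_variables>
```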
 

xgsun

New Member
I tried defining NETCDF instead of NETCDF_PATH in config_machines.xml, but then case.build errors out in mct.bldlog, indicating the netCDF library cannot be found:

gmake -f /data1/xgsun/cesm/cases/B1850cmip6.f09_g17/Tools/Makefile -C /home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/intel/impi/nodebug/nothreads/mct CASEROOT=/data1/xgsun/cesm/cases/B1850cmip6.f09_g17 MODEL=mct /home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/intel/impi/nodebug/nothreads/mct/Makefile.conf
gmake: Entering directory `/data1/xgsun/cesm/scratch/B1850cmip6.f09_g17/bld/intel/impi/nodebug/nothreads/mct'
gmake: Leaving directory `/data1/xgsun/cesm/scratch/B1850cmip6.f09_g17/bld/intel/impi/nodebug/nothreads/mct'
cat: Filepath: No such file or directory
/data1/xgsun/cesm/cases/B1850cmip6.f09_g17/Tools/Makefile:199: *** NETCDF not found: Define NETCDF_PATH or NETCDF_C_PATH and NETCDF_FORTRAN_PATH in config_machines.xml or config_compilers.xml. Stop.
ERROR: cat: Filepath: No such file or directory
/data1/xgsun/cesm/cases/B1850cmip6.f09_g17/Tools/Makefile:199: *** NETCDF not found: Define NETCDF_PATH or NETCDF_C_PATH and NETCDF_FORTRAN_PATH in config_machines.xml or config_compilers.xml. Stop.

So I think the definition of NETCDF_PATH should be correct, and I have gone back to using NETCDF_PATH.

I attached the pio.bldlog and cesm.bldlog. From the errors in cesm.bldlog, I understand that the final link step for cesm.exe should include the netCDF and MPI libraries, but I cannot see those flags in the command line:

mpiifort -o /home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/cesm.exe cime_comp_mod.o cime_driver.o component_mod.o component_type_mod.o cplcomp_exchange_mod.o map_glc2lnd_mod.o map_lnd2glc_mod.o map_lnd2rof_irrig_mod.o mrg_mod.o prep_aoflux_mod.o prep_atm_mod.o prep_glc_mod.o prep_ice_mod.o prep_lnd_mod.o prep_ocn_mod.o prep_rof_mod.o prep_wav_mod.o seq_diag_mct.o seq_domain_mct.o seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_io_mod.o seq_map_mod.o seq_map_type_mod.o seq_rest_mod.o t_driver_timers_mod.o -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -latm -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -lice -L../../intel/impi/nodebug/nothreads/mct/noesmf/lib/ -lclm -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -locn -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -lrof -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -lglc -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -lwav -L/home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/lib/ -lesp -L../../glc/lib/ -lglimmercismfortran -L../../intel/impi/nodebug/nothreads/mct/noesmf/c1a1l1i1o1r1g1w1e1/lib -lcsm_share -L../../intel/impi/nodebug/nothreads/lib -lpio -lgptl -lmct -lmpeu -mkl=cluster

I'm wondering where I can add the link flags for netCDF and MPI to the "mpiifort -o ..." command; in other words, in which file can I specify the link flags manually?
Any suggestions? Thanks.
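For context on where those flags normally come from: in CIME-based builds the extra system libraries reach the link line through the SLIBS variable shown in the Makefile rule quoted above. A sketch of what could be appended in Macros.make, assuming NETCDF_PATH points at the install prefix from the earlier posts and that this install uses the standard split C/Fortran library names:

```makefile
# Assumption: NETCDF_PATH is the netCDF install prefix set in config_machines.xml.
# SLIBS feeds the $(SLIBS) term in the cesm.exe link rule in Tools/Makefile.
SLIBS += -L$(NETCDF_PATH)/lib -lnetcdff -lnetcdf
```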
 

Attachments

  • cesm.bldlog.200609-084601.txt (248.5 KB)
  • pio.bldlog.200609-084601.txt (65.6 KB)

jedwards

CSEG and Liaisons
Staff member
Try setting NETCDF_PATH in the config_compilers.xml file instead of in config_machines.xml; you will find several examples in that file of how to do that.
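A minimal sketch of such an entry, modeled on the existing blocks in config_compilers.xml; the MACH/COMPILER attribute values and the netCDF install prefix are taken from this thread, and the -lnetcdff/-lnetcdf names assume a standard split C/Fortran netCDF build:

```xml
<compiler MACH="homebrew" COMPILER="intel">
  <!-- Assumption: same netCDF install prefix as in the earlier posts -->
  <NETCDF_PATH>/opt/netcdf/netcdf-c4.6.2-f4.4.4-intel2019</NETCDF_PATH>
  <SLIBS>
    <append>-L$(NETCDF_PATH)/lib -lnetcdff -lnetcdf</append>
  </SLIBS>
</compiler>
```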
 

xgsun

New Member
The model is finally built successfully when I set NETCDF_PATH in the config_compilers.xml file. Thanks.
But another problem comes up when I run case.submit; here is the error information from the cesm.log.xxx:

[proxy:0:0@intel1] HYD_spawn (../../../../../src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:57): pipe error (Too many open files)
[proxy:0:0@intel1] launch_processes (../../../../../src/pm/i_hydra/proxy/proxy.c:550): error creating process
[proxy:0:0@intel1] main (../../../../../src/pm/i_hydra/proxy/proxy.c:892): error launching_processes
[mpiexec@intel1] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:528): downstream from host intel1 exited with status 3
[mpiexec@intel1] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:2077): assert (exitcodes != NULL) failed

I think it may be because too many nodes are requested when running cesm.exe. I checked the run configuration with ./preview_run, and it shows:

CASE INFO:
nodes: 9
total tasks: 864
tasks per node: 96
thread count: 1

BATCH INFO:
FOR JOB: case.run

ENV:
Setting Environment NETCDF_PATH=/opt/netcdf/netcdf-c4.6.2-f4.4.4-intel2019
Setting Environment OMP_NUM_THREADS=1

SUBMIT CMD:
None

MPIRUN (job=case.run):
mpirun -np 864 -prepend-rank /home/xgsun/work/cesm/scratch/B1850cmip6.f09_g17/bld/cesm.exe >> cesm.log.$LID 2>&1

FOR JOB: case.st_archive
ENV:
Setting Environment NETCDF_PATH=/opt/netcdf/netcdf-c4.6.2-f4.4.4-intel2019
Setting Environment OMP_NUM_THREADS=1

SUBMIT CMD:
None


I did set tasks per node to 96 in config_machines.xml, but I didn't prescribe 9 nodes, since my platform has only one node. I'm wondering where that comes from. Any suggestions about the number of nodes and the run error? Thanks.
 

jedwards

CSEG and Liaisons
Staff member
You are getting the default PE layout for this compset and resolution. You can change it in the case with
./xmlchange NTASKS=96
./xmlchange ROOTPE=0

You can probably tune that for better performance, but these settings should at least run.
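A typical follow-up sequence after changing the PE layout, sketched as commands run from the case directory; the reset and rebuild steps are standard CIME practice rather than something stated in this thread:

```
./xmlchange NTASKS=96,ROOTPE=0   # all components on 96 tasks, starting at PE 0
./case.setup --reset             # regenerate the run scripts for the new layout
./case.build                     # rebuild; a clean rebuild may be needed after a layout change
./preview_run                    # confirm nodes/tasks now fit the single-node machine
```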
 