CESM 1.2.2 Error at csm_share library build stage

leanne.wake@...

Hi-

I'm trying to build a new case on our cluster and can't get past the stage of building the required libraries. The terminal displays this error:

ERROR: buildlib.csm_share failed, see /home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/intel/openmpi/nodebug/nothreads/csm_share.bldlog.170112-095700

the csm_share buildlog ends with the following error:

cesm1_2_2/models/drv/shr/seq_io_mod.F90(1170): error #6404: This name does not have a type, and must have an explicit type.   [PIO_UNLIMITED]

      rcode = pio_def_dim(cpl_io_file,'time',PIO_UNLIMITED,dimid(1))

I've seen this error listed on the forum before, but have not found a solution that works on our setup. I've attached all the build logs and config.log. Any help to move forward is appreciated.

 

jedwards

Since you do not have a pnetcdf library, PIO_UNLIMITED is defined in pio.F90 as NF90_UNLIMITED. I don't see any reason that it wouldn't be defined at the point of the error, but there are a couple of things that you might try to resolve the problem. First, try building with the pnetcdf library. Even if you don't use that library at runtime, its presence may solve this problem.

The other option may be more difficult: we don't have a lot of experience with the openmpi library, so you might try an mpich library if that's a possibility for you.

CESM Software Engineer

leanne.wake@...

Thanks for the reply. I will go down the pnetcdf route and report back.

With regard to mpich - we would first need to install it on our system. Which version do you recommend to minimise conflicts?

 

jedwards

The latest stable (3.2) should be fine.   But I would wait and see if the pnetcdf install solves the problem.  

CESM Software Engineer

leanne.wake@...

Thank you @jedwards!

Changing to pnetcdf has solved the problem, but another has arisen (much later in the build, I might add!) and I suspect it is a library linkage problem related to netcdf. I attach the Macros file and the build log. The crux of the error is (from cesm.bldlog):

mpif90  -o /home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/cesm.exe ccsm_comp_mod.o ccsm_driver.o mrg_mod.o seq_avdata_mod.o seq_diag_mct.o seq_domain_mct.o seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_map_esmf.o seq_map_mod.o seq_mctext_mod.o seq_rest_mod.o  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -latm  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -lice  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -llnd  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -locn  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -lrof  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -lglc  -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/lib/ -lwav -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/intel/openmpi/nodebug/nothreads/MCT/noesmf/a1l1r1i1o1g1w1/csm_share -lcsm_share -L/home/leanneW/CESM_Work/cesm1_2_2/scripts/test2/execdir/intel/openmpi/nodebug/nothreads/lib -lpio -lgptl -lmct -lmpeu -L/cm/shared/oswald-apps/netcdf-fortran/intel/4.4.1//lib -lnetcdf  -L/cm/shared/oswald-apps/pnetcdf/intel/1.7.0//lib -lpnetcdf -L/cm/shared/apps/intel/compilers_and_libraries/2016.4.258/mpi/intel64//lib -lmpi

ld: cannot find -lnetcdf


When I look in /cm/shared/oswald-apps/netcdf-fortran/intel/4.4.1/lib, all the objects in there (*.a, *.so) are prefixed libnetcdff, not libnetcdf. Is this the issue, and if so, is there an easy fix I can make somewhere else in the setup?

Thanks!

 

 

 

jedwards

Netcdf has two library files and cesm needs them both: libnetcdf.a and libnetcdff.a. It looks like your netcdf install has them in different directories, and this version of cesm doesn't handle that very well. The easiest solution is to create links from your netcdf C install to your netcdf Fortran install so that all the libraries and include files appear to be in the same path. Then you will need to have -L/cm/shared/oswald-apps/netcdf-fortran/intel/4.4.1//lib -lnetcdf -lnetcdff
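A minimal sketch of the linking step, in case it helps. The function name link_netcdf_c and the NETCDF_C_PREFIX variable are made up for illustration; I don't know where the netcdf C install lives on your cluster, so that path is a placeholder you would have to fill in. It assumes both prefixes are absolute paths with the usual lib/ and include/ subdirectories, and it never overwrites anything already present in the Fortran prefix.

```shell
#!/bin/sh
# Sketch: make a netcdf C install visible inside the netcdf Fortran prefix
# by symlinking its libraries and headers. Pass absolute paths so the
# links stay valid from any working directory.
# Usage: link_netcdf_c <netcdf-c prefix> <netcdf-fortran prefix>
link_netcdf_c() {
    c_prefix=$1
    f_prefix=$2
    for sub in lib include; do
        # Skip a subdirectory the C install does not have.
        [ -d "$c_prefix/$sub" ] || continue
        mkdir -p "$f_prefix/$sub"
        for src in "$c_prefix/$sub"/*; do
            # An unmatched glob stays literal; ignore it.
            [ -e "$src" ] || continue
            dest="$f_prefix/$sub/$(basename "$src")"
            # Leave alone anything the Fortran install already provides.
            if [ -e "$dest" ] || [ -L "$dest" ]; then
                continue
            fi
            ln -s "$src" "$dest"
        done
    done
}

# Example call (NETCDF_C_PREFIX is a placeholder, not a real path from this thread):
# link_netcdf_c "$NETCDF_C_PREFIX" /cm/shared/oswald-apps/netcdf-fortran/intel/4.4.1
```

After running it, listing the Fortran lib directory should show both libnetcdf.* and libnetcdff.*, which is what the -lnetcdf -lnetcdff link line needs.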

CESM Software Engineer

leanne.wake@...

Noted - I will be back with an update. Thanks again.

 

Edit:

 

"Then you will need to have -L/cm/shared/oswald-apps/netcdf-fortran/intel/4.4.1//lib -lnetcdf -lnetcdff"

In which particular file (or Makefile) is this linkage generated? I'm not sure where to find and then amend this line. Is it in config_compilers.xml, or somewhere similar? 

leanne.wake@...

I've managed to build CESM - thanks so much for your help so far.

 

However - I used mpi to complete the build, rather than mpich, openmpi, etc., and it isn't listed in the 'supported' mpi compilers list in the Macros file. Will it fail on execution, or does 'not supported' just mean that you aren't able to offer help with mpi-related problems?

As I said, cesm.exe has built - but I want to make sure it isn't a false positive!

Cheers,

jedwards

You'll just have to try it and see.  But note that "mpi" doesn't tell me anything about what mpi library you used to build.   Do you mean impi?   

CESM Software Engineer

leanne.wake@...

This is the path to the library on our cluster. If this doesn't tell you what you want to know, I will ask our technician and get back to you tomorrow:

 

openmpi/open64/64/1.10.1/lib64

leanne.wake@...

Hi again,

The executable has now built, but the model run does not complete. The error at the end of the attached logfile is as follows:

mpiexec noticed that process rank 1 with PID 46333 on node compute004 exited on signal 11 (Segmentation fault).

The model also produced a core dump file, so I ran the following:

gdb execdir/cesm.exe core.46333

info stack

and got the following output:

#0  0x00002aaabc9f4060 in ?? ()
#1  <signal handler called>
#2  PMPI_Info_get (info=0x2, key=0x112be98 "nc_header_read_chunk_size", valuelen=1023, value=0x7ffffffedb30 "\001",
    flag=0x7ffffffedf34) at pinfo_get.c:75
#3  0x0000000000d2b257 in ncmpi_open ()
#4  0x0000000000d279d7 in nfmpi_open_ ()
#5  0x0000000000b13f7a in ionf_mod::open_nf (file=...,
    fname='/home/leanneW/CESM_Work/cesm1_2_2/test_input/cpl/cpl6/map_gx3v7_to_fv4x5_aave_da_091218.nc', ' ' <repeats 270 times>, mode=0, .tmp.FNAME.len_V$16b=360) at /home/leanneW/CESM_Work/cesm1_2_2/models/utils/pio/ionf_mod.F90:182
#6  0x0000000000977d2f in piolib_mod::pio_openfile (iosystem=..., file=..., iotype=5,
    fname='/home/leanneW/CESM_Work/cesm1_2_2/test_input/cpl/cpl6/map_gx3v7_to_fv4x5_aave_da_091218.nc', ' ' <repeats 166 times>, mode=<error reading variable: Cannot access memory at address 0x0>,
    checkmpi=<error reading variable: Cannot access memory at address 0x0>, .tmp.FNAME.len_V$2801=256)
    at /home/leanneW/CESM_Work/cesm1_2_2/models/utils/pio/piolib_mod.F90:2669
#7  0x0000000000636600 in seq_map_mod::seq_map_readdata (maprcfile='seq_maps.rc', maprcname='ocn2atm_fmapname:', mpicom=4,
    id=2, ni_s=<error reading variable: Cannot access memory at address 0x0>,
    nj_s=<error reading variable: Cannot access memory at address 0x0>, av_s=..., gsmap_s=..., avfld_s='aream',
    filefld_s='area_a', ni_d=<error reading variable: Cannot access memory at address 0x0>,
    nj_d=<error reading variable: Cannot access memory at address 0x0>, av_d=..., gsmap_d=..., avfld_d='aream',
    filefld_d='area_b', string='ocn2atm aream initialization', .tmp.MAPRCFILE.len_V$3652=11, .tmp.MAPRCNAME.len_V$3655=17,
    .tmp.AVFLD_S.len_V$365e=5, .tmp.FILEFLD_S.len_V$3661=6, .tmp.AVFLD_D.len_V$3668=5, .tmp.FILEFLD_D.len_V$366b=6,
    .tmp.STRING.len_V$366e=28) at /home/leanneW/CESM_Work/cesm1_2_2/models/drv/driver/seq_map_mod.F90:908
#8  0x000000000045ed69 in ccsm_comp_mod::ccsm_init ()
    at /home/leanneW/CESM_Work/cesm1_2_2/models/drv/driver/ccsm_comp_mod.F90:2012
#9  0x00000000004bec5b in ccsm_driver () at /home/leanneW/CESM_Work/cesm1_2_2/models/drv/driver/ccsm_driver.F90:90
#10 0x000000000041186e in main ()
#11 0x00002aaaac3d1b15 in __libc_start_main () from /usr/lib64/libc.so.6

There seems to be some sort of memory issue at #6. Any ideas? I realise that I have thrown a smorgasbord of errors at you in this thread... Thank you for your help so far. Let me know if you need any more info.

 
