mpif90 option error

Hello,

I am attempting to get CCSM3.0 running on a Linux Intel Xeon cluster using Gigabit Ethernet and mpich-1.2.5-ch_p4. I am just starting out and trying to build the model. When I attempt to run ./$CASE.generic_linux.build, I get an error saying that the mph library build failed because of the following:

------------------------------------------------------------------------
mpif90 -I. -I/home/ccm33/netcdf-3.5.1/include -I/home/ccm33/ccsmrun1/lib/include -I/usr/local/cluster/mpi/include -c -r8 -i4 -Kieee -Mrecursive -Mdalign -Mextend -DLINUX -DPGF90 -DNO_SHR_VMATH -DLINUX mph.F
/home/ccm33/bin/mpif90: line 332: eval: -I: invalid option
eval: usage: eval [arg ...]
gmake: *** [mph.o] Error 2
----------------------------------------------------------------------

What is the -I option? Is it necessary for the build?

Thanks,
Cathy
 

pjr

Member
The -I option identifies a directory to be searched for files to be
included in the source. So, for example, -I/home/ccm33/netcdf-3.5.1/include

would mean, "search the directory /home/ccm33/netcdf-3.5.1/include for any file
requested in the source code". The requested files may be modules or source code. The syntax used for specifying the directories varies by compiler, so if you are trying to get a new compiler going you may need
to revise the syntax.

Phil
 
Hello Cathryn Meyer and list

I would suggest that you try using the compilers directly (say, gcc and pgf90) in the
Macros.Linux file, rather than the corresponding MPI wrapper scripts (mpicc and mpif90).
The latter may have built-in directory search paths for include files and libraries,
which may conflict with the ones required by CCSM3.0.

I was able to install and run CCSM3.0 (and CCSM2.0.1) on "fats",
our LDEO beowulf cluster, this way.

Good luck.

Gus Correa
 
Hi Phil and list,

I am still unsure exactly what may be going wrong with mpif90. In my mpif90 file I have the lines:

--------------------------------------------------------------------------------
# Directory locations: Can change for each Fortran version
f90includedir=${includedir}
f90libdir=${libdir}
#
F90BASE=""
F90LINKERBASE="pgf90"
LDFLAGSBASE=""
BASE_FFLAGS=" "
F90INC=""
# f90modinc specifies how to add a directory to the search path for modules.
# Some compilers (Intel ifc) do not support this concept, and instead need
# a specific list of files that contain module names and directories.
# The f90modincspec is a more general approach that uses <dir> and <file>
# for the directory and file respectively.
F90MODINC=""
F90MODINCSPEC=""
USER_FFLAGS=""
#
# Linker flags
F90_LDFLAGS=""
BASE_LIB_LIST=" "
FLIB_LIST=""
F90LIB_PATH_LEADER="-L"
F90LIB_PATH="${libdir}"
MPILIBNAME="mpich"
MPIVERSION="1.2.5 (release) of : 2003/01/13 16:21:53"
FWRAPNAME="fmpich"
FLIBNAME="mpich"
-------------------------------------------------------------------------------------

The F90LIB_PATH_LEADER="-L" line looks to me like the kind of setting I'd want to change, but -L already seems correct. What I actually want to change is the -I (uppercase i) that keeps giving me the error shown in my original post. Does anybody know where mpif90 is reading the -I from, if not from the mpif90 file itself? Even if I change the -L in the line above, it does not affect the error I get. I'm just not sure where to look to change this option. Any thoughts?

Thanks,
Cathy
 

gcarr@ucar_edu

New Member
The way we run and test is with mpif90 and mpicc. You may be able to make it work the other way but you are on your own.
 
I am trying to get CCSM to work with mpif90 - I haven't given up on it yet. I'm just stumped as to what I need to change -I to. I have determined that the line with the -I that gives the error is in the Macros.Linux file, but none of the changes I've made to it have worked so far.
---------------------------
INCLDIR := -I. -I$(INC_NETCDF) -I$(INCROOT) -I$(INC_MPI)
---------------------------

Has nobody else seen or heard of this problem before ... ?

Cathy
 
Hello Cathryn Meyer and list

Cathryn: I post the answer to your email here on the CCSM3 list,
in case it may interest other subscribers.

Here at Lamont, I did compile and run CCSM3 (and CCSM2.0.1) using
gcc and pgf90 directly in the Macros.Linux setup, instead of mpicc and mpif90.
Besides those changes, you need to add to Macros.Linux the paths to
the include files and libraries, as well as library link flags such as "-lmpich",
appropriate to your system, particularly those of NetCDF and MPI.
To do this, just modify "INCLDIR" and "SLIBS" in Macros.Linux.

The errors you show below happen because the linker can't find the
MPI libraries, which are not listed in the default "SLIBS"
macro on Macros.Linux.
But they are present for the alternative machine "jet",
which you may use as an example.
The details depend on your system, though.

I hope this helps.

Gus Correa


Cathryn Meyer wrote:

>Hi Gus,
>
>You responded to one of my posts on the CCSM message board when I was
>having issues using mpif90. You told me to specify gcc and pgf90 and to
>use them instead of mpif90. I tried this, and it got me past the errors
>I was getting using mpif90, however now when I try to build CCSM by
>running generic_linux.build, cpl.buildexe fails and I get many errors in
>the cpl.buildexe file that look like:
>
>--------------------------------------------------------
>cpl_map_mod.o(.text+0xc622): In function `cpl_map_mod_cpl_map_npfixnew3_':
>: undefined reference to `mpi_allgather_'
>shr_mpi_mod.o(.text+0x20b): In function `shr_mpi_mod_shr_mpi_sendi0_':
>: undefined reference to `mpi_send_'
>shr_mpi_mod.o(.text+0x3f8): In function `shr_mpi_mod_shr_mpi_sendi1_':
>: undefined reference to `mpi_send_'
>shr_mpi_mod.o(.text+0x59b): In function `shr_mpi_mod_shr_mpi_sendr0_':
>--------------------------------------------------------
>
>Have you seen errors like this, or do you have any idea of how to fix
>this issue? If not, then I might have to go back to trying to use mpif90.
>
>Also, were there any other code changes you had to make to get CCSM to
>run using pgf90 and pgcc? Or did it just work the first time you tried it?
>
>Thanks,
>Cathy
>
 
Gus,

Thanks for your reply. By adding the -lmpich flag, all those errors went away. However there were more errors that I'm pretty sure are caused by double underscore issues. They look like:
--------------------------------------------------------------------
/home/ccm33/ccsmrun1/lib/libmct.a(m_GlobalSegMap.o)(.text+0xbe8): In function `m_globalsegmap_initd__':
: undefined reference to `mpi_waitall_'
/home/ccm33/ccsmrun1/lib/libmct.a(m_GlobalSegMap.o)(.text+0x14a6): In function `m_globalsegmap_initd__':
: undefined reference to `mpi_irecv_'
-------------------------------------------------------------------

I tried the command nm $LIB_MPI/libmpich.a | grep mpi_waitall_

and got the response:
00000000 W mpi_waitall_
00000000 T pmpi_waitall_

The underscores match, but I'm not sure if that's a good thing or not.

This is using mpich-1.2.5.2, a version I installed that works fine to run CAM3.0 on the linux cluster here. If I use a different mpich version that is installed on the cluster, and I issue the same command as above, I get a double underscore response. I'm not sure which issues would be easier to work with.

Any thoughts? Is there a quick fix for a double underscore issue that does not require reconfiguring and reinstalling mpich?

Cathy
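
[Editor's sketch] In nm output, the letter in front of the symbol name says whether the archive defines that symbol: U marks an undefined reference, while T (text) and W (weak) both mark definitions. The helper below is a hypothetical function, defines_symbol, that is not part of CCSM or MPICH. It reads nm output on standard input and reports whether a given name is defined. It is only a sketch and assumes the column layout shown in the post above.

```shell
# Succeed if the nm output on stdin defines the symbol named in $1.
# A line counts as a definition when its last field is the symbol and the
# field before it (the type letter) is anything other than "U" (undefined).
defines_symbol() {
    awk -v sym="$1" '
        $NF == sym && $(NF-1) != "U" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# The two lines of nm output quoted in the post above.
sample='00000000 W mpi_waitall_
00000000 T pmpi_waitall_'

# Check the single-underscore and the double-underscore spelling.
for name in mpi_waitall_ mpi_waitall__; do
    if printf '%s\n' "$sample" | defines_symbol "$name"; then
        echo "$name: defined"
    else
        echo "$name: not defined"
    fi
done
```

On a real system one would pipe "nm $LIB_MPI/libmpich.a" into the function instead of the sample text. If the single-underscore name is defined and the linker asks for the single-underscore name, the name mangling probably agrees, and the more likely explanation is that the library is not reaching the link line.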
 

jacob@mcs_anl_gov

Rob Jacob
New Member
This is just a guess but try going back to using mpif90 and change INCLDIR to INCLDIR := -I./ -I$(INC_NETCDF) -I$(INCROOT) -I$(INC_MPI)

This is changing -I. to -I./ to make it clear the current directory should be used in the include path.

Rob Jacob
 
Rob,

I tried that and it worked to build the mph library, but then I still get the same error when trying to build the ice model:
---------------------------------------------------------------------------
mpif90 -I./ -I/home/ccm33/netcdf-3.5.1/include -I/home/ccm33/ccsmrun1/lib/include -I/usr/local/cluster/mpi/include -I./ -I/home/ccm33/CCSM/ccsm3_0/scripts/ccsmrun1/SourceMods/src.csim/ -I/home/ccm33/CCSM/ccsm3_0/models/ice/csim4/src/source/ -I/home/ccm33/CCSM/ccsm3_0/models/csm_share/shr/ -I/home/ccm33/CCSM/ccsm3_0/models/csm_share/cpl/ -c -r8 -i4 -Kieee -Mrecursive -Mdalign -Mextend -Mfree
-DLINUX -DPGF90 -DNO_SHR_VMATH -DLINUX -Dcoupled -DNPROC_X=8 -DNPROC_Y=1 -D_MPI /home/ccm33/CCSM/ccsm3_0/models/csm_share/shr/shr_kind_mod.F90
/home/ccm33/bin/mpif90: line 332: eval: -I: invalid option
eval: usage: eval [arg ...]
gmake: *** [shr_kind_mod.o] Error 2
--------------------------------------------------------------------------

It's funny that adding the "/" worked for one build and then is invalid later in another part of the build.

Cathy
 

jacob@mcs_anl_gov

Rob Jacob
New Member
From your mpif90 file posted above, it looks like MPICH wasn't built correctly. F90BASE should be pgf90. You could try editing mpif90 directly to fix it but you may need to rebuild MPICH.

Rob
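
[Editor's sketch] The "eval: -I: invalid option" message is consistent with an empty compiler variable. The snippet below assumes the wrapper builds its compile command roughly as "eval $F90BASE <flags> <file>"; that form is inferred from the error text, since line 332 of the real script is not shown in this thread. "echo pgf90" stands in for the real compiler so nothing has to be installed to try it.

```shell
# Hypothetical reduction of the wrapper's compile step; the real mpif90 is longer.
# Each case runs in its own bash process so a failure cannot stop this script.

# Case 1: F90BASE is empty, as in the posted mpif90. The unquoted expansion
# vanishes, so the first word handed to eval is "-I." and bash's eval builtin
# rejects it as an unknown option.
broken_output=$(bash -c 'F90BASE=""; eval $F90BASE -I. -c mph.F' 2>&1 || true)

# Case 2: F90BASE names a compiler. "echo pgf90" stands in for pgf90 here, so
# the command line is printed instead of executed.
fixed_output=$(bash -c 'F90BASE="echo pgf90"; eval $F90BASE -I. -c mph.F' 2>&1 || true)

printf 'empty F90BASE:\n%s\n\n' "$broken_output"
printf 'F90BASE set:\n%s\n' "$fixed_output"
```

If the first case reproduces the message from the build log, that would also explain why rewriting -I. as -I./ in Macros.Linux does not remove the error: the flag itself is fine, and the word that should come before it is missing.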
 

gcarr@ucar_edu

New Member
We are aware of some code and script changes that are needed in CCSM to be able to work with the MPICH P4 driver used with Ethernet clusters. We hope to be able to get these into a formal tag at a future date. At this time, this configuration is officially unsupported.
 
Hello Cathryn Meyer and list

Sorry for the delay. I couldn't access my ccsm3 files for a while, due to a failing RAID array.
Not sure you are still interested in this, but here is what I could find out,
and some suggestions.

1. As opposed to other components of ccsm3,
mct doesn't use Macros.Linux.
Following the style of most ANL products,
it uses configure, automake, etc, to find out directly from your environment
which compilers you have, the flags to use, which paths to search for libraries, etc.

2. The place to look for the results of this configuration is "Makefile.conf"
(under your $EXEROOT/mct).
Apparently mct will use mpif90 as the fortran compiler (FC), if it finds mpif90
(this is what I have on my Makefile.conf).

3. Therefore, the search path for the MPI libraries will be the one built in to your
mpif90 command (established when MPI was installed).
For some reason, this path appears to be wrong or lost on your machine.
As a result, mpif90 cannot find the library members it needs to link to (e.g.
mpi_waitall_ and mpi_irecv_), which were the errors you reported.

4. Just in case, make sure your PATH environment variable is not
somehow pointing to an old version of mpif90,
that may have wrong library search paths.
(We have tens of old versions here!)
You can check this out with "which mpif90", then look at the contents of that particular
mpif90 script shown by the shell.
Look for a variable called BASE_LIB_LIST,
which tells MPI the library paths it should search.
This variable has to contain the same path where your "nm" command actually
found mpi_waitall_.
Otherwise, your mpif90 is broken.

5. If you are simply pointing to an out of date mpif90, the required fix is just to force your
PATH to point to the right mpif90.
However, if the problem happens because an up-to-date mpif90 points to a wrong path,
you may choose either to edit mpif90 (risky, may break other things,
but worth trying as a quick-and-dirty fix),
or to reinstall MPI (takes more effort, but may save you future headaches).

Good luck.

Gus Correa
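
[Editor's sketch] The PATH check in item 4 can be taken one step further by listing every mpif90 on PATH rather than only the first. The function name find_all_on_path is hypothetical and not part of MPICH or CCSM. This is a minimal sketch that assumes a POSIX-style shell and a colon-separated PATH.

```shell
# Print every executable file called "$1" found on PATH, in search order.
# The first line printed is the one the shell would actually run.
find_all_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# Prints nothing if no mpif90 is installed.
find_all_on_path mpif90
```

If more than one path is printed, the copies further down the list are shadowed by the first, and it is the first one whose contents are worth inspecting.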

 
Hello Cathryn Meyer and list

Please ignore my reference to the variable BASE_LIB_LIST in the mpif90 script
in item 4 of my previous message. That refers to additional libraries which may
support MPICH. Since you use P4, no additional libraries should be required.
(We use Myrinet, not Ethernet or P4, hence we need the support of the
Myrinet "GM" libraries, and our BASE_LIB_LIST is not empty.)

Sorry for the confusion.

What seems to matter in your case are several lines at the beginning of
the mpif90 script.
Something that should look more or less like this:

...

# Default compiler configuration
#
# Directory locations: Fixed for any MPI implementation
prefix=/usr/local/mpich
exec_prefix=${prefix}
sysconfdir=${exec_prefix}/etc
includedir=${prefix}/include
libdir=${exec_prefix}/lib
#
# Directory locations: Can change for each Fortran version
f90includedir=${includedir}
f90libdir=${libdir}
#

...

Note that the variable "prefix" should be the very root of your mpich directory
(not necessarily /usr/local/mpich).
Note also that "prefix" sets everything else, including the search paths for
the MPI libraries (i.e. libdir and f90libdir), when mpif90 works as a linker.
"prefix" also sets the include directory paths, which were another
source of trouble for you.

This variable (prefix) is the one you may want to check on your mpif90.
The actual value of the derived variables libdir and f90libdir
should match the library path you used on
the "nm" command that successfully retrieved mpi_waitall_.
Otherwise your mpif90 is broken.

However, a simple edit of "prefix" may fix it! ... if you don't mind a pun ... :)

Good luck.
Gus Correa
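
[Editor's sketch] The comparison described above, between the prefix set in the wrapper and the place where libmpich.a really lives, could be scripted roughly as follows. The function names wrapper_prefix and check_wrapper_libdir are hypothetical. The sketch assumes the layout of the excerpt (exec_prefix=${prefix}, libdir=${exec_prefix}/lib) and an unquoted "prefix=" line; a wrapper that differs on either point would need a different check.

```shell
# Print the value of the first "prefix=" assignment in a wrapper script.
wrapper_prefix() {
    sed -n 's/^prefix=//p' "$1" | head -n 1
}

# Report whether libmpich.a exists under the libdir implied by that prefix.
# Assumes exec_prefix=${prefix} and libdir=${exec_prefix}/lib, as in the excerpt.
check_wrapper_libdir() {
    prefix_value=$(wrapper_prefix "$1")
    lib="$prefix_value/lib/libmpich.a"
    if [ -f "$lib" ]; then
        echo "found $lib"
    else
        echo "missing $lib"
    fi
}

# Example: inspect whichever mpif90 comes first on PATH, if there is one.
wrapper=$(command -v mpif90 || true)
if [ -n "$wrapper" ]; then
    check_wrapper_libdir "$wrapper"
fi
```

A "missing" result would point at the wrapper's prefix as the thing to correct, while a "found" result would suggest looking elsewhere, for example at which mpif90 the build actually picks up.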
