
Problem with Intel compilers on Xeon cluster

I am trying to get CCSM3 running on our cluster with Intel's compilers (version 8.0). Building seems to work fine, but when I start a run it always ends with this output:

...
(cpl_map_init) done initializing map: map_So2a
(cpl_map_init) initialize map: map_Xr2o
map areasrc
(cpl_mct_aVect_info) local size = 129600
(cpl_mct_aVect_info) rList = aream
(cpl_mct_aVect_info) l min/max 3.322E-07 7.615E-05 1 aream
map areadst
(cpl_mct_aVect_info) local size = 5800
(cpl_mct_aVect_info) rList = aream
(cpl_mct_aVect_info) l min/max 2.691E-05 3.093E-03 1 aream
(cpl_map_init) scatter matrix by column...
--- mpimon --- Aborting run after process-0 terminated abnormally Childprocess 6797 got signal SIGSEGV(11): segmentation violation ---
Fri Feb 11 10:43:08 CET 2005 -- CSM EXECUTION HAS FINISHED
Model did not complete - see cpl.log.050211-104152

I tried different optimization levels (-O0, -O, or no optimization flag at all), but nothing helped. I thought it would be worth testing with -g, but then cpl_domain_mod.F90 would no longer compile:

fortcom: Severe: cpl_domain_mod.F90, line 21: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report. Note: File and line given may not be explicit cause of this error.

Is there anybody out there who has managed to get CCSM running with Intel compilers? I would like to know how to do it.

Cheers,
Klaus
 

gcarr@ucar_edu

We do not support the Intel compiler at this time on any platform. We are getting close to having CCSM working on the SGI Altix (Itanium processor) with the Intel compiler. For the Xeon clusters we have always used the PGI compiler. Validations have been performed with PGI 5.1-3 and 5.1-6.
 
Thanks for the reply. I am well aware that Intel compilers on Linux clusters are not officially supported. However, I was hoping (and am still hoping) that somebody outside NCAR has worked with Intel compilers and CCSM, and is willing to share his or her experience.

BTW: I have solved the problem with the sudden crash when running a CCSM executable built with the Intel compiler, thanks to help from our tech support. Nevertheless, I am still interested in getting in touch with other groups that are using Intel compilers to discuss issues related to this configuration.

Klaus
 
gcarr said:
If someone has the magic formula for using the Intel compiler with CCSM3 we would like to know.

I have put some configuration files on our ftp server (ftp://ftp.smhi.se/pub/klaus/ccsm3_intel). Our PC cluster has dual-CPU 2.2 GHz Xeon boxes, with 2 GB RAM on each node. The Intel compilers are version 8.1 (it also worked with 8.0). I call the setup mono_ifort; look for lines containing that name in the files to see my changes.

I have installed CCSM3 with Intel compilers and run it for 1 month without problems, at both T31 and T42 resolution. The setup has passed the installation tests (TER, TBR, THY). I am working on the load balancing now before starting a longer run.

Here is the list of modified files:
ccsm3_0/models/bld/Macros.Linux
ccsm3_0/models/utils/esmf/build/linux_intel/base_variables
ccsm3_0/scripts/ccsm_utils/Components/esmf.buildlib
ccsm3_0/scripts/ccsm_utils/Components/mct.buildlib
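
For anyone picking up these files, a quick way to find the changes is to search for the mono_ifort marker. Below is a minimal sketch in Bourne-style shell, assuming the four files sit under a local copy of the ccsm3_0 tree. The function name show_mono_ifort_changes and its root-directory argument are made up for illustration.

```shell
#!/bin/sh
# Print every line that mentions the mono_ifort marker in the four modified files.
# $1 is the directory that contains ccsm3_0/ (defaults to the current directory).
# The function name is hypothetical, not part of CCSM3.
show_mono_ifort_changes() {
    root=${1:-.}
    for f in \
        ccsm3_0/models/bld/Macros.Linux \
        ccsm3_0/models/utils/esmf/build/linux_intel/base_variables \
        ccsm3_0/scripts/ccsm_utils/Components/esmf.buildlib \
        ccsm3_0/scripts/ccsm_utils/Components/mct.buildlib
    do
        if [ -f "$root/$f" ]; then
            # -n adds line numbers; the extra /dev/null operand makes grep
            # print the file name in front of each match.
            grep -n 'mono_ifort' "$root/$f" /dev/null \
                || echo "no mono_ifort lines in: $root/$f" >&2
        else
            echo "missing: $root/$f" >&2
        fi
    done
}
```

Calling show_mono_ifort_changes with the path to the unpacked tree lists each match as file:line:text, which can then be compared against an unmodified CCSM3 checkout.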

Note that Intel's MKL library is a special case: it is not necessary to link it, but it speeds up the code slightly.

The runtime error that made me start this topic turned out to be caused by a limited stack size on our PC cluster. Putting "unlimit" at the beginning of the $CASE.mono_ifort.run script did the trick.
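
For reference, the same stack limit fix might look like this in a Bourne-style shell. This is only a sketch: as far as I know the CCSM3 run scripts are csh, where the "unlimit" builtin mentioned above (or "limit stacksize unlimited") does the job. The function name raise_stack_limit is hypothetical.

```shell
#!/bin/sh
# Raise the soft stack limit as far as the hard limit allows, then report it.
# The function name is hypothetical; csh scripts would use "unlimit" instead.
raise_stack_limit() {
    # Try "unlimited" first; if the hard limit forbids that,
    # fall back to the hard limit itself, which is always permitted.
    ulimit -S -s unlimited 2>/dev/null || ulimit -S -s "$(ulimit -H -s)"
    echo "soft stack limit now: $(ulimit -S -s)"
}
```

Resource limits are inherited by child processes, so the limit has to be raised in the shell that launches the model, which is why the change goes at the beginning of the run script. Call the function directly rather than inside $(...), otherwise the new limit only applies to the subshell.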

I would be happy if someone else tries the Intel compilers and lets us know about his or her experience, good or bad. Good luck!

Klaus
 
I have been attempting to get CCSM3 to run using the Intel 8.1 compilers on our Linux cluster and have had the same problem: the run stops at the same place with a SIGSEGV. It is interesting that the Xr2o mapping is much larger than the Sa2o etc. mappings. I am using LAM/MPI and thought it might be a thread stack size limit in LAM, as I was already using an unlimited stack size. However, increasing it does not seem to have fixed things. Are you using MPICH?
 