CESM 1.2.2.1 cesm_setup module issue on Cheyenne

swang54@...

Hello,

I had some issues with cesm_setup on Cheyenne (CESM 1.2.2.1). After creating my new case, I tried to do ./cesm_setup but I got the following error messages:

Please do not use this module, load 2.19 instead 2.19 is loaded effectively ... just to help old scripts

This module will be deprecated soon!!!!

 

Lmod has detected the following error:  These module(s) exist but cannot be loaded as requested: "netcdf-mpi/4.4.1.1"

Try: "module spider netcdf-mpi/4.4.1.1" to see how to load the module(s). 

 

Lmod has detected the following error:  These module(s) exist but cannot be loaded as requested: "pnetcdf/1.8.0"

Try: "module spider pnetcdf/1.8.0" to see how to load the module(s).

 

NETCDF: Undefined variable.

ERROR: /gpfs/u/home/swang54/run_cheyenne/RCP_runs/RCP4.5run/preview_namelists failed: 65280

 

I think the issue is the mpt module. I checked the required modules for loading netcdf-mpi/4.4.1.1 and pnetcdf/1.8.0, and both require mpt/2.16. But the system didn't allow me to load mpt/2.16 (it will force me to load mpt/2.19 instead).

Has anyone had this issue before? How can I fix it?

Thank you!!

Regards,
Sally

jedwards

Due to the Cheyenne maintenance this week, you need to update some modules:

intel/17.0.1

pnetcdf/1.11.0

 

If you are using CESM2, also update netcdf:

netcdf/4.6.1

netcdf-mpi/4.6.1

 

If you are using an older CESM version, use:

netcdf-mpi/4.4.1.1

CESM Software Engineer

swang54@...

Hello jedwards.

Thank you for your prompt reply!! I fixed it and set up the case successfully!!

Here are the detailed steps for anyone who runs into the same issue:

1. Go to the Machines directory in your code tree: ~/cesm1_2_2_1/scripts/ccsm_utils/Machines

2. Open the machine file: vi env_mach_specific.cheyenne

3. Update the listed modules to the versions jedwards suggested.

4. Re-create your cases from the modified code so that the changes are picked up by the new cases.


Thanks again for your help!! I really appreciate it!

Regards,
Sally
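
Steps 2 and 3 above can also be scripted. This is only a minimal sketch, not something shipped with CESM: set_module_version is a hypothetical helper that rewrites the version on a matching "module load" line with sed and keeps the previous contents in a .bak file. The path and versions in the usage comment are the ones mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of CESM): change the version on a
# "module load <name>/<version>" line in an env_mach_specific file.
# Usage: set_module_version <file> <module_name> <new_version>
set_module_version() {
    file=$1
    name=$2
    version=$3
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    # -i.bak edits in place and keeps the previous contents as <file>.bak.
    # Requiring whitespace before the name keeps "netcdf" from matching
    # "pnetcdf"; requiring "/" after it keeps it from matching "netcdf-mpi".
    sed -i.bak \
        "s|\(module load[[:space:]][[:space:]]*\)${name}/[^[:space:]]*|\1${name}/${version}|" \
        "$file"
}

# Example (path and versions as discussed in this thread):
#   f=~/cesm1_2_2_1/scripts/ccsm_utils/Machines/env_mach_specific.cheyenne
#   set_module_version "$f" intel 17.0.1
#   set_module_version "$f" pnetcdf 1.11.0
```

Existing cases keep their own copy of the machine settings, which is why step 4 (re-creating the case) is still needed after the edit.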

sungduk@...

I modified env_mach_specific.cheyenne (for CESM1.2.2.1) as suggested: intel/17.0.1, mpt/2.19, netcdf/4.6.1, netcdf-mpi/4.6.1, pnetcdf/1.11.0, ncarenv/1.0, mkl, ncarcompilers/0.3.5.

However, I still get an error during the build stage. Could you let me know how to fix this problem?

Thank you,

Sungduk Yu

 

In "cesm.bldlog",

ccsm_comp_mod.o: In function `ccsm_comp_mod_mp_ccsm_run_':

ccsm_comp_mod.F90:(.text+0x26): relocation truncated to fit: R_X86_64_32 against symbol `seq_avdata_mod_mp_infodata_' defined in COMMON section in seq_avdata_mod.o

ccsm_comp_mod.F90:(.text+0x467): relocation truncated to fit: R_X86_64_32S against symbol `ccsm_comp_mod_mp_begstep_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x495): relocation truncated to fit: R_X86_64_32S against symbol `ccsm_comp_mod_mp_dtime_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4c8): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_dtime_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4e7): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_ncpl_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4ee): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4f5): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4fc): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x503): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x50a): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x512): additional relocation overflows omitted from the output

/glade/work/sungduk/ENSO/cases/_EXP/SungdukSST_STEP1_test1_env1/Tools/Makefile:629: recipe for target '/glade/scratch/sungduk/ENSO/_EXP/SungdukSST_STEP1_test1_env1/bld/cesm.exe' failed

gmake: *** [/glade/scratch/sungduk/ENSO/_EXP/SungdukSST_STEP1_test1_env1/bld/cesm.exe] Error 1 

 

 

 

swang54@...

Actually, I got the same error messages as you did when building the case. I was wondering whether it is a bug caused by the different software versions or whether something on my side did not update correctly.

jedwards

Some older model versions, specifically those with older versions of pio (earlier than pio1.10.0), are not compatible with netcdf-mpi/4.6.2. CISL user support has now reinstalled netcdf-mpi/4.4.1.1. Please try changing that module back and see if it solves the problem.

CESM Software Engineer

sungduk@...

Thank you for the suggestion.

I tried building CESM1.2.2.1 with netcdf-mpi/4.4.1.1, but it failed with the same problem.

swang54@...

Same here. I got the same error messages too. Here is the module list in my env_mach_specific.cheyenne file for reference:

 

source  /etc/profile.d/modules.csh

module purge

module load intel/17.0.1
module load ncarenv/1.0
module load mkl
module load ncarcompilers/0.3.5
module load mpt/2.19
if( $MPILIB == "mpi-serial" ) then
  module load netcdf/4.6.1
else
  module load netcdf-mpi/4.4.1.1
  module load pnetcdf/1.11.0
endif

jedwards

Please be clear about which error you mean. Is it this one: "relocation truncated to fit: R_X86_64_32"? If so, make sure that you completely clean and rebuild after you make the suggested change.

CESM Software Engineer

jedwards

For 1.2.2.1 you should have the following in env_mach_specific


module purge

module load intel/17.0.1
module load ncarenv/1.2
module load mkl
module load ncarcompilers/0.4.1
module load mpt/2.19
if( $MPILIB == "mpi-serial" ) then
  module load netcdf/4.4.1.1
else
  module load netcdf-mpi/4.4.1.1
  module load pnetcdf/1.11.0
endif

CESM Software Engineer
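
Small differences between two module lists (for example ncarenv/1.0 versus ncarenv/1.2, or ncarcompilers/0.3.5 versus ncarcompilers/0.4.1) are easy to miss by eye. A minimal sketch for comparing two copies of the file follows; requested_modules is a hypothetical helper, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the sorted, de-duplicated arguments of every
# "module load" line in an env_mach_specific file, one per line.
# Usage: requested_modules <file>
requested_modules() {
    # Fields 1 and 2 are "module" and "load"; print everything after them.
    awk '$1 == "module" && $2 == "load" { for (i = 3; i <= NF; i++) print $i }' "$1" | sort -u
}

# Example: show which module versions differ between two copies.
#   requested_modules env_mach_specific.mine      > mine.txt
#   requested_modules env_mach_specific.reference > reference.txt
#   diff mine.txt reference.txt
```

The helper ignores the if/else structure, so it lists both the mpi-serial and the MPI branch of the file.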

swang54@...

Hi jedwards,


Thank you for your reply. I did the following steps and still got the error messages when building the case: "ccsm_comp_mod.F90:(.text+0x50a): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o".

1. I changed the module list in the env_mach_specific.cheyenne as you suggested above

2. create a new case

3. ./cesm_setup

4. ./case.build

 

Did I miss any step?

Thank you!

Regards,
Sally

sungduk@...

I also get the same error, even after updating env_mach_specific and running clean_build. At the end of cesm.bldlog:

ccsm_comp_mod.o: In function `ccsm_comp_mod_mp_ccsm_run_':

ccsm_comp_mod.F90:(.text+0x26): relocation truncated to fit: R_X86_64_32 against symbol `seq_avdata_mod_mp_infodata_' defined in COMMON section in seq_avdata_mod.o

ccsm_comp_mod.F90:(.text+0x467): relocation truncated to fit: R_X86_64_32S against symbol `ccsm_comp_mod_mp_begstep_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x495): relocation truncated to fit: R_X86_64_32S against symbol `ccsm_comp_mod_mp_dtime_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4c8): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_dtime_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4e7): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_ncpl_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4ee): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4f5): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x4fc): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x503): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x50a): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o

ccsm_comp_mod.F90:(.text+0x512): additional relocation overflows omitted from the output

/glade/work/sungduk/ENSO/cases/_EXP/SungdukSST_STEP1_test1_env4/Tools/Makefile:629: recipe for target '/glade/scratch/sungduk/ENSO/_EXP/SungdukSST_STEP1_test1_env4/bld/cesm.exe' failed

gmake: *** [/glade/scratch/sungduk/ENSO/_EXP/SungdukSST_STEP1_test1_env4/bld/cesm.exe] Error 1

jedwards

So the next suggestion is a little more complicated.  

In your source tree:

cd models/utils

mv pio pio.old

(Correction: the tag originally posted here, pio1_8_12, was wrong. Use pio1_8_14.)

svn co https://github.com/NCAR/ParallelIO.git/tags/pio1_8_14/pio

Then clean and rebuild your case. You should be able to use either netcdf version with this change; I recommend netcdf-mpi/4.6.1.

CESM Software Engineer

swang54@...

Hi Jedwards.

Thanks for continuing to update the solutions! I really appreciate your help.

I tried the method you suggested (using the new pio), but it still returns the error: "ccsm_comp_mod.F90:(.text+0x50a): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o".

I tried both netcdf-mpi/4.6.1 and netcdf-mpi/4.4.1.1, but neither worked; both still returned the above error.

Regards,
Sally

jedwards

This appears to be a different problem from the pio one. The problem is in pop: I can run an F case without any issues. From the symptoms, I think pop is trying to declare some array with an unrealistically large size, but I have not been able to pinpoint it.

CESM Software Engineer
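
One way to look for an oversized static array, sketched here under stated assumptions: GNU binutils size is available, its --common option folds COMMON symbols into the bss column of the Berkeley-format output, and the object files sit somewhere under the case's bld directory. rank_by_bss is a hypothetical helper that sorts that output by the bss column.

```shell
#!/bin/sh
# Hypothetical helper: read Berkeley-format `size` output on stdin
# (columns: text data bss dec hex filename) and print "bss filename",
# largest first, for the top N entries (default 10).
# Usage: size ... | rank_by_bss [N]
rank_by_bss() {
    # Header lines have the literal word "bss" in column 3 and are skipped.
    awk '$3 ~ /^[0-9]+$/ { print $3, $6 }' | sort -k1,1nr | head -n "${1:-10}"
}

# Example (the bld path is a placeholder for your own build directory):
#   find /glade/scratch/$USER/mycase/bld -name '*.o' \
#       -exec size --common {} + | rank_by_bss 5
```

Objects whose bss is in the gigabyte range are likely contributors, since the default x86-64 code model expects code and static data to fit within 2 GB in total.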

swang54@...

Hi Jedwards,


Thank you for continuing to update us on the debugging progress! Yes, I ran a B case (BRCP45C5CN). I will try to help diagnose the issue, though probably not as efficiently as you can. If I find something, I'll post it here.

Many thanks!
Sally

sungduk@...

Thank you, jedwards and Sally!

I checked out the newer pio, but I still can't run CESM. I tested both netcdf-mpi versions (4.4.1.1 and 4.6.1) with B and F compsets (B1850C5 and F1850C5), and every combination returned the same error. (jedwards: would you share your env_mach_specific.cheyenne? I may have set something wrong, because I could not run an F compset.)

- Sungduk

jedwards

My F-case is in /glade/scratch/jedwards/cesm1_2_2_1.F

CESM Software Engineer

zke@...

Dear Jedwards:

After copying your env_mach_specific settings and following your pio advice, I still cannot build a case like your cesm1_2_2_1.F.

the error is

zke3@cheyenne4:$ ./cesm_setup
Creating Macros file for cheyenne
/gpfs/u/home/zke3/cesm1_2_2_1/scripts/ccsm_utils/Machines/config_compilers.xml intel cheyenne
Creating batch script cesm1_2_2_1.F.run
PHASE is set_batch
mppsize is 128 4
PHASE is set_exe
Locking file env_mach_pes.xml
Creating user_nl_xxx files for components and cpl
Running preview_namelist script
 infile is /glade/u/home/zke3/cases/cesm1_2_2_1.F/Buildconf/cplconf/cesm_namelist
** build-namelist - CCSM inputdata root is not a directory: "/inputdata" **
ERROR: cpl.buildnml.csh failed
ERROR: /gpfs/u/home/zke3/cases/cesm1_2_2_1.F/preview_namelists failed: 25344


---------------------------

Thanks!


Ziming Ke

jedwards

It looks like you messed something up in your login window. Log out and back in, and then try again with a new case.

CESM Software Engineer

zke@...

Hi, Jedwards:

I redid the process. You were right, I may have messed something up at login. Now it gets to the final build step ('CESM'), but it fails with the same error as reported yesterday.

Here is the error

----------------------------

Mon Mar 11 11:24:22 MDT 2019 /glade/scratch/zke3/cesm1_2_2_1.F/bld/rof.bldlog.190311-111849
Mon Mar 11 11:24:31 MDT 2019 /glade/scratch/zke3/cesm1_2_2_1.F/bld/cesm.bldlog.190311-111849
ERROR: cesm.buildexe.csh failed, see /glade/scratch/zke3/cesm1_2_2_1.F/bld/cesm.bldlog.190311-111849
ERROR: cat /glade/scratch/zke3/cesm1_2_2_1.F/bld/cesm.bldlog.190311-111849

-------------------------

ccsm_comp_mod.F90:(.text+0x26): relocation truncated to fit: R_X86_64_32 against symbol `seq_avdata_mod_mp_infodata_' defined in COMMON section in seq_avdata_mod.o
ccsm_comp_mod.F90:(.text+0x467): relocation truncated to fit: R_X86_64_32S against symbol `ccsm_comp_mod_mp_begstep_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x495): relocation truncated to fit: R_X86_64_32S against symbol `ccsm_comp_mod_mp_dtime_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x4c8): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_dtime_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x4e7): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_ncpl_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x4ee): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x4f5): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x4fc): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x503): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x50a): relocation truncated to fit: R_X86_64_PC32 against symbol `ccsm_comp_mod_mp_cktime_acc_' defined in COMMON section in ccsm_comp_mod.o
ccsm_comp_mod.F90:(.text+0x512): additional relocation overflows omitted from the output
/glade/u/home/zke3/cases/cesm1_2_2_1.F/Tools/Makefile:629: recipe for target '/glade/scratch/zke3/cesm1_2_2_1.F/bld/cesm.exe' failed
gmake: *** [/glade/scratch/zke3/cesm1_2_2_1.F/bld/cesm.exe] Error 1

-----------------

Ziming Ke

 

kangwanying1992@...

Hi,

I have run into the same problem. It seems that the model tries to allocate an array that is too large (greater than the 2 GB limit). I searched online and found one solution, which is to add "-mcmodel=medium" to the compile flags. You can add this flag in Machines/config_compilers.xml in the intel section.

After this fix, the compile goes through, but the executable is not runnable. Any comments or advice on that?

Best,

Wanying

wanying kang

jedwards

I posted this earlier and gave the wrong revision number - corrected instructions are here:

So the next suggestion is a little more complicated.  

In your source tree:

cd models/utils

mv pio pio.old

svn co https://github.com/NCAR/ParallelIO.git/tags/pio1_8_14/pio

 

then clean and rebuild your case.  You should be able to use either netcdf version with this change, I recommend netcdf-mpi/4.6.1

CESM Software Engineer
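
GitHub has since retired its Subversion bridge, so the svn co command above may no longer work. Below is a minimal sketch of the same swap using git. It assumes the pio1_8_14 tag is still published in the NCAR/ParallelIO repository and still contains a top-level pio directory, as the svn path implies. fetch_pio_tag and swap_pio are hypothetical helper names, not part of CESM.

```shell
#!/bin/sh
# Hypothetical helpers, not part of CESM.

# Clone a single tag of ParallelIO into <dest>.
# Usage: fetch_pio_tag <tag> <dest>
fetch_pio_tag() {
    git clone --depth 1 --branch "$1" https://github.com/NCAR/ParallelIO.git "$2"
}

# Replace <src_root>/models/utils/pio with <new_pio_dir>, keeping the old
# copy as pio.old (mirrors the "mv pio pio.old" step above).
# Usage: swap_pio <src_root> <new_pio_dir>
swap_pio() {
    utils=$1/models/utils
    new_pio=$2
    [ -d "$utils/pio" ] || { echo "missing $utils/pio" >&2; return 1; }
    [ -d "$new_pio" ] || { echo "missing $new_pio" >&2; return 1; }
    # Never overwrite an earlier backup.
    [ ! -e "$utils/pio.old" ] || { echo "$utils/pio.old already exists" >&2; return 1; }
    mv "$utils/pio" "$utils/pio.old" && cp -R "$new_pio" "$utils/pio"
}

# Example:
#   fetch_pio_tag pio1_8_14 /tmp/ParallelIO
#   swap_pio ~/cesm1_2_2_1 /tmp/ParallelIO/pio
# Then clean and rebuild the case.
```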

christopher.mal...

Hi Jedwards,

I appreciate all of the work that you have put into helping fix this issue. I have updated my env_mach_specific.cheyenne to match yours, downloaded the new 'pio' library per your instructions, and performed a clean build; however, my case is still failing to build. The error message has changed from the ones previously mentioned, though. Any suggestions for fixes would be greatly appreciated. I have put the error message below:

shr_pio_mod.F90:(.text+0x21b6): undefined reference to `piodarray_mp_pio_set_buffer_size_limit_'

/gpfs/u/home/cmaloney/cesm1_2_2_1/scripts/testing_new_mpt_v2/Tools/Makefile:629: recipe for target '/glade/scratch/cmaloney/testing_new_mpt_v2/bld/cesm.exe' failed

 

gmake: *** [/glade/scratch/cmaloney/testing_new_mpt_v2/bld/cesm.exe] Error 1

 

Thanks,

Chris Maloney

 

jedwards

I don't see why you would get that error - everything looks correct to me.   Try removing the bld directory

rm -fr /glade/scratch/cmaloney/testing_new_mpt_v2/bld

and building again.

CESM Software Engineer
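
The "remove the bld directory and build again" advice can be wrapped in a small function. This is a sketch only: full_rebuild is a hypothetical name, and it assumes the CESM1.x layout in which the case directory contains a <casename>.build script that re-creates the build directory.

```shell
#!/bin/sh
# Hypothetical helper: delete a case's build directory and rebuild from scratch.
# Usage: full_rebuild <case_dir> <bld_dir>
full_rebuild() {
    case_dir=$1
    bld_dir=$2
    # Guard against running rm -rf on an empty path or on "/".
    if [ -z "$bld_dir" ] || [ "$bld_dir" = "/" ]; then
        echo "refusing to remove '$bld_dir'" >&2
        return 1
    fi
    # Pick up <casename>.build; "*.build" does not match <casename>.clean_build.
    set -- "$case_dir"/*.build
    build_script=$1
    [ -x "$build_script" ] || { echo "no build script in $case_dir" >&2; return 1; }
    rm -rf "$bld_dir"
    # Run the build script from inside the case directory.
    ( cd "$case_dir" && "./$(basename "$build_script")" )
}

# Example, using the paths from the post above:
#   full_rebuild /gpfs/u/home/cmaloney/cesm1_2_2_1/scripts/testing_new_mpt_v2 \
#                /glade/scratch/cmaloney/testing_new_mpt_v2/bld
```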

nusbaume@...

Hi Jedwards and all,

 

I am also getting the "relocation truncated to fit" error during compilation of a B compset using CESM1.2.0.1. I have modified the libraries (e.g. mpt/2.19) as requested, and also copied the new PIO library into my model source code (pio1_8_14). However, it looks like I am still getting the same "relocation" error and am unable to build the model. I also tried adding the "-mcmodel=medium" flag as suggested by Wanying Kang, but wound up with the same result. At least that probably means that I am having the same (or a similar) problem as Wanying.

 

Anyways, should I use a different PIO version, given that my CESM version is different? Any advice or suggestions for this particular issue would certainly be appreciated.  I'll also keep digging around to see if I can find any solutions, but not sure how successful I'll be.

 

Thanks, and have a great day!

 

Jesse Nusbaumer

sungduk@...

Thank you, jedwards and all who investigated this issue. CESM1.2.2.1 is finally working (both F and B compsets) after updating pio version to 1_8_14 (as shown in #23). 

Just for reference to others, my env_mach_specific.cheyenne uses:

intel/17.0.1, ncarenv/1.0, mkl, ncarcompilers/0.4.1, mpt/2.19, netcdf/4.6.1, netcdf-mpi/4.6.1, and pnetcdf/1.11.0.

- Sungduk

nusbaume@...

Hi Sungduk, Jedwards et al.,

 

Thanks for sharing! I found out that the build scripts (*.build and *.clean_build) I had were skipping the re-compilation of the PIO library for my particular case. However, when I started from scratch with a new case using the updated libraries (including the updated PIO), it compiled and ran without an issue. So it looks like this fix works for CESM1.2.0.1 as well, which I figured I would share here in case any future user runs across this same issue or thread.

 

Thanks again for the help, and have a great day!

 

Jesse Nusbaumer

kmkodama314@...

I'm currently trying to run the CESM 1.1.2 LENS configuration. I made the changes to env_mach_specific from comment #10, but I wasn't able to build with the new pio (it threw an error saying there was no Makefile), so I left the old one in place. The case sets up and builds fine, but it aborts shortly after starting with the following error.

0:256  r7i3n22
0: ... list truncated at 256
1: Opened existing file b.e11.B1850C5CN.f09_g16.005.cam.i.0402-01-01-00000.nc
1:           0
1: Opened existing file
1: /glade/p/cesmdata/cseg/inputdata/atm/cam/topo/USGS-gtopo30_0.9x1.25_remap_c0510
1: 27.nc           1
MPT: shepherd terminated: r7i2n11.ib0.cheyenne.ucar.edu - job aborting

It seems like there is still an incompatibility with the new libraries, but I'm not sure where, since I'm running an older version of the model. Is it because I am still using the old pio utilities? If so, how do I build this version with the new pio?

jedwards

For the 1.1.2 version of cesm you need to update pio to:

https://github.com/NCAR/ParallelIO.git/tags/pio1_7_3/pio
 

CESM Software Engineer

yihsuan@...

Hi,

I am using CESM1.1.1 and am encountering a similar "relocation truncated to fit" issue. I updated env_mach_specific following comment #10 and updated pio to pio1_7_3, but CESM still cannot be built. The error message is:

---

Can't open perl script "../bin/genf90.pl": No such file or directory 

Makefile:125: recipe for target 'pionfatt_mod.F90' failed

---

It seems that pio1_7_3 does not work for CESM1.1.1. Does anyone know which version of pio I should use for CESM1.1.1? Thanks.


jedwards

Users of cesm1.1.x should look here:

https://bb.cgd.ucar.edu/cesm111-build-pio-question

CESM Software Engineer
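
For quick reference, the pio tags reported in this thread can be collected into a small lookup. This is a sketch that only restates what posters said above, not an official compatibility table; pio_tag_for_cesm is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical lookup: print the ParallelIO tag that posters in this thread
# reported for a given CESM version. Returns non-zero when the thread gives
# no working tag for that version.
# Usage: pio_tag_for_cesm <cesm_version>
pio_tag_for_cesm() {
    case $1 in
        1.2.2.1|1.2.0.1) echo pio1_8_14 ;;  # reported working by users above
        1.1.2)           echo pio1_7_3 ;;   # suggested by jedwards
        1.1.*)           echo "see https://bb.cgd.ucar.edu/cesm111-build-pio-question" >&2
                         return 1 ;;
        *)               echo "no tag reported in this thread for CESM $1" >&2
                         return 1 ;;
    esac
}

# Example:
#   tag=$(pio_tag_for_cesm 1.2.2.1) && echo "use $tag"
```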
