porting CCSM3 code from yellowstone to cheyenne

Hi CESM Software Working Group,

I'm trying to port CCSM3 from yellowstone to cheyenne. I can get the model configured, but it fails to build. I documented what I did below. It seems that additional changes need to be made in the file modules.cheyenne, and I would like some advice on this. The version of CCSM3 I'm trying to port is at /glade/p/cesm/cseg/releases/ccsm3_0_1_beta33. I was following the instructions in "6.10 Adding a new machine to $CCSMROOT" from the CCSM guide:
http://www.cesm.ucar.edu/models/ccsm3.0/ccsm/doc/UsersGuide/UsersGuide/node8.html#SECTION000810000000000000000

Here is what I did:
  • edit $CCSMROOT/ccsm_utils/Tools/check_machine and add cheyenne to the list named ``resok''.
  • cd to $CCSMROOT/scripts/ccsm_utils/Machines/ and copy the yellowstone-specific files to cheyenne-specific files:
    • cd $CCSMROOT/scripts/ccsm_utils/Machines/
      cp env.linux.yellowstone env.linux.cheyenne
      cp run.linux.yellowstone run.linux.cheyenne
      cp batch.linux.yellowstone batch.linux.cheyenne          (and "set mach = cheyenne")
      cp l_archive.linux.yellowstone l_archive.linux.cheyenne  (and "set mach = cheyenne")
    • For the modules, I copied the file from cesm1_2_2_1 (the cesm1_2_2 port for cheyenne):
      cp $CCSMROOT_of_cesm1_2_2_1/scripts/ccsm_utils/Machines/env_mach_specific.cheyenne  modules.cheyenne
    • I also needed to revise modules.cheyenne to get it working as much as possible (I attached the revised modules.cheyenne below; see also the sketch after this list).
    • I also copied the file "Macros" from one of my cheyenne cesm1_2_2_1 simulations to $CCSMROOT/models/bld/Macros.Linux (also attached below):
      cp $My_cesm1_2_2_1_CASEROOT/Macros  $CCSMROOT/models/bld/Macros.Linux
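For readers without access to the attachment: a modules.cheyenne for the Intel toolchain typically looks roughly like the sketch below. The module names and versions are assumptions based on cheyenne's software stack of that era, not the poster's actual file, so check them against `module avail` on cheyenne.

#! /bin/csh -f
# Hypothetical sketch of modules.cheyenne (Intel toolchain); module names
# and versions are assumptions -- verify with `module avail` on cheyenne.
module purge
module load ncarenv/1.2
module load intel/17.0.1
module load mpt/2.15
module load netcdf/4.4.1.1
module load ncarcompilers/0.4.1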
With all these modifications, I can configure CCSM3, but when I build the model, it passes esmf and then fails building mph:

> b30.CHE.cheyenne.build
sourcing modules.cheyenne
-------------------------------------------------------------------------
 Preparing component models for execution
-------------------------------------------------------------------------
 - Create execution directories for atm,cpl,lnd,ice,ocn
 - If a restart run then copy restart files into executable directory
ccsm_getrestart: get /glade/scratch/fenghe/b30.CHE restarts from /glade/scratch/fenghe/archive/b30.CHE/restart
 - Check validity of configuration
 - Determine if build must happen (env variable BLDTYPE)
 - Build flag (BLDTYPE) is TRUE
 - Build Libraries: esmf, mph, mct
Thu May 25 19:00:28 MDT 2017 esmf.buildlib.170525-190028
Thu May 25 19:00:33 MDT 2017 mph.buildlib.170525-190028
ERROR: mph.buildlib failed, see mph.buildlib.170525-190028
ERROR: cat /glade/scratch/fenghe/b30.CHE/mph/mph.buildlib.170525-190028

The error message for mph is copied below:

Thu May 25 19:00:33 MDT 2017 mph.buildlib.170525-190028
icc -E -DFORTRANUNDERSCORE -DNO_R16 -DLINUX -DCPRINTEL mph.F > mph.f
ifort -no-opt-dynamic-align -fp-model source -convert big_endian -assume byterecl -ftz -traceback -assume realloc_lhs -O2 -fixed -132 mph.f
ifort: command line remark #10411: option '-no-opt-dynamic-align' is deprecated and will be removed in a future release. Please use the replacement option '-qno-opt-dynamic-align'
/glade/u/apps/opt/intel/2017u1/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64_lin/for_main.o: In function `main':
for_main.c:(.text+0x2a): undefined reference to `MAIN__'
/glade/scratch/fenghe/ifortrjZdEe.o: In function `mph_module_mp_mph_components_':
mph.f:(.text+0xa06): undefined reference to `mpi_init_'
mph.f:(.text+0xa1c): undefined reference to `mpi_comm_dup_'
mph.f:(.text+0xa32): undefined reference to `mpi_comm_rank_'
mph.f:(.text+0xa48): undefined reference to `mpi_comm_size_'
mph.f:(.text+0x1119): undefined reference to `mpi_type_struct_'
mph.f:(.text+0x112a): undefined reference to `mpi_type_commit_'
mph.f:(.text+0x11be): undefined reference to `mpi_comm_split_'
mph.f:(.text+0x13c6): undefined reference to `mpi_allgatherv_'
mph.f:(.text+0x13fa): undefined reference to `mpi_bcast_'
/glade/scratch/fenghe/ifortrjZdEe.o: In function `mph_module_mp_mph_local_':
mph.f:(.text+0x1da7): undefined reference to `mpi_comm_split_'
mph.f:(.text+0x1dca): undefined reference to `mpi_comm_rank_'
mph.f:(.text+0x1ded): undefined reference to `mpi_comm_size_'
mph.f:(.text+0x1f36): undefined reference to `mpi_gather_'
mph.f:(.text+0x222e): undefined reference to `mpi_comm_split_'
mph.f:(.text+0x2245): undefined reference to `mpi_comm_rank_'
mph.f:(.text+0x2287): undefined reference to `mpi_comm_dup_'
mph.f:(.text+0x22aa): undefined reference to `mpi_comm_rank_'
mph.f:(.text+0x22cd): undefined reference to `mpi_comm_size_'
/glade/scratch/fenghe/ifortrjZdEe.o: In function `mph_module_mp_mph_global_':
mph.f:(.text+0x298f): undefined reference to `mpi_comm_split_'
mph.f:(.text+0x2b92): undefined reference to `mpi_allgatherv_'
mph.f:(.text+0x2bc6): undefined reference to `mpi_bcast_'
/glade/scratch/fenghe/ifortrjZdEe.o: In function `mph_module_mp_mph_comm_join_':
mph.f:(.text+0x3952): undefined reference to `mpi_comm_split_'
/glade/scratch/fenghe/ifortrjZdEe.o: In function `mph_module_mp_mph_timer_':
mph.f:(.text+0x4dc5): undefined reference to `mpi_wtime_'
/glade/scratch/fenghe/ifortrjZdEe.o: In function `mph_module_mp_mph_init_':
mph.f:(.text+0x72d6): undefined reference to `mpi_init_'
mph.f:(.text+0x72ec): undefined reference to `mpi_comm_dup_'
mph.f:(.text+0x7302): undefined reference to `mpi_comm_rank_'
mph.f:(.text+0x7318): undefined reference to `mpi_comm_size_'
mph.f:(.text+0x79f8): undefined reference to `mpi_type_struct_'
mph.f:(.text+0x7a09): undefined reference to `mpi_type_commit_'
Makefile:35: recipe for target 'mph.o' failed
gmake: *** [mph.o] Error 1
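For anyone debugging the same failure: the undefined references to `MAIN__' and the mpi_* routines at link time indicate that ifort is trying to link mph.f into an executable (there is no -c in the ifort line, even though the make target is mph.o) and that no MPI library appears on the link line. The fragment below sketches the kinds of things to check in the mph Makefile and Macros.Linux; it is an inference from the log above, not a confirmed fix:

# Hypothetical checks for the mph build -- assumptions, not a confirmed fix.
# 1) The target is mph.o, so the rule should compile only, not link:
#        $(FC) -c $(FFLAGS) $(FIXEDFLAGS) mph.f
# 2) Anything that does link should go through the MPT compiler wrapper,
#    e.g. FC := mpif90, or carry -lmpi in SLIBS so mpi_* symbols resolve.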
 
Thanks, jedwards. I want to give you an update on porting the CCSM3 code from yellowstone (/glade/p/cesm/cseg/releases/ccsm3_0_1_beta33) to cheyenne. In short, the good news is that I can get CCSM3 to build on cheyenne; the bad news is that when I submit the job to cheyenne, it hangs and I don't get any model output.

Here is how I got the CCSM3 code to build on cheyenne. I needed to change the following three lines in models/bld/Macros.Linux. I made the 1st change because the CCSM3 configuration I use needs both the LAPACK and BLAS libraries. I made the 2nd and 3rd changes because some lines of the ocean and sea ice code are longer than 72 characters.

27c27
< SLIBS      := -L$(LIB_NETCDF) -lnetcdf
> SLIBS      := -L$(LIB_NETCDF) -lnetcdf -llapack -lblas
47c47
    FIXEDFLAGS := -byteswapio
52c52
    FIXEDFLAGS := -byteswapio

Here is how I changed the CCSM3 batch run script for cheyenne. On yellowstone, the code is submitted through a command file (poe.cmdfile) in the run script:

mpirun.lsf -cmdfile poe.cmdfile

On cheyenne, I followed the PBS Pro job script examples below on using a command file (cmdfile):
https://www2.cisl.ucar.edu/resources/computational-systems/cheyenne/running-jobs/pbs-pro-job-script-examples#cmdfile

I changed the yellowstone run script with the following two lines:

setenv MPI_SHEPHERD true
mpiexec_mpt launch_cf.sh poe.cmdfile >&! ccsm3.log.$LID

But when I submit the run script on cheyenne, I don't get any model output. It seems that I didn't add the command file to the run script correctly. Would you please give me some suggestions on what I should change to get the CCSM3 job running on cheyenne? Thanks!
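For context, the CISL cmdfile examples linked above pair a PBS script shaped roughly like the sketch below with a command file containing one command per MPI rank. The job name, project code, and resource line here are placeholders, and (as the next reply points out) MPI_SHEPHERD is intended for serial commands, not MPI executables:

#!/bin/tcsh
#PBS -N ccsm3_cmdfile
#PBS -A PROJECT0001                      # placeholder project code
#PBS -q regular
#PBS -l walltime=01:00:00
#PBS -l select=1:ncpus=36:mpiprocs=36
#PBS -j oe

# Each line of the command file is executed by one MPI rank; this is
# suitable for serial commands only (see the reply below).
setenv MPI_SHEPHERD true
mpiexec_mpt launch_cf.sh poe.cmdfile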
 

jedwards

CSEG and Liaisons
Staff member
First, you do not need the MPI_SHEPHERD line; that is only used for running serial codes on cheyenne. Second, poe.cmdfile is a format specific to the old IBM POE environment, so you'll have to find a PBS equivalent. I have no idea what launch_cf.sh is, but I bet it's also specific to yellowstone and will need to be reworked to run on cheyenne.
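For reference, the closest MPT-side analogue of an MPMD poe.cmdfile is mpiexec_mpt's colon syntax, in which each -np group launches one executable. The component binaries and task counts in this sketch are placeholders, not a tested configuration:

# Hypothetical MPMD launch under PBS with SGI MPT; names and counts are placeholders.
mpiexec_mpt -np 8 ./cpl : -np 32 ./cam : -np 16 ./clm : -np 16 ./csim : -np 40 ./pop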
 


heavens

Member
Hi,

I'm having the same problem figuring out how to run once I've built the CCSM3 executables. This isn't a trivial issue. launch_cf.sh is something mentioned in the Cheyenne documentation. It has nothing to do with Yellowstone.

Nicholas Heavens
Research Assistant Professor of Planetary Science
Hampton University
 


heavens

Member
I have made some progress by noticing that the settings for tempest are effectively SGI PBS settings using mpirun. I'm still not sure of the proper translation into mpiexec_mpt, though.

# -------------------------------------------------------------------------
# Create processor count input files
# -------------------------------------------------------------------------

cd $EXEROOT/all
@ PROC = 0      # counts total number of tasks
foreach n (1 2 3 4 5)
   set comp  = $COMPONENTS[$n]
   set model = $MODELS[$n]
   set nthrd = $NTHRDS[$n]
   set ntask = $NTASKS[$n]
   @ M = 0
   while ( $M < $ntask )
      @ M++
      @ PROC++
   end
   ln -s $EXEROOT/$model/$comp  $EXEROOT/all/.  # link binaries into all dir
end

# -------------------------------------------------------------------------
# Run the model
# -------------------------------------------------------------------------

env | egrep '(MP_|LOADL|XLS|FPE|DSM|OMP|MPC)'  # document env vars

cd $EXEROOT
echo "`date` -- CSM EXECUTION BEGINS HERE"

mpirun -v -d $EXEROOT/all \
   -np $NTASKS[1] "env OMP_NUM_THREADS=$NTHRDS[1] $COMPONENTS[1]" : \
   -np $NTASKS[2] "env OMP_NUM_THREADS=$NTHRDS[2] $COMPONENTS[2]" : \
   -np $NTASKS[3] "env OMP_NUM_THREADS=$NTHRDS[3] $COMPONENTS[3]" : \
   -np $NTASKS[4] "env OMP_NUM_THREADS=$NTHRDS[4] $COMPONENTS[4]" : \
   -np $NTASKS[5] "env OMP_NUM_THREADS=$NTHRDS[5] $COMPONENTS[5]" &
wait
echo "`date` -- CSM EXECUTION HAS FINISHED"
 

Dear CESM Software Working Group,

Given the popularity and large user base of CCSM3, I'm wondering whether the CESM Software Working Group could port the CCSM3 code from yellowstone to cheyenne for all users, as the working group did during the transition from bluefire to yellowstone. The large CCSM3 user base would really appreciate this effort. Otherwise, I expect many users will make the same effort to try (and fail?) to port the code from yellowstone to cheyenne again and again for years to come. Thank you very much for considering my suggestion.
 

heavens

Member
The most progress I have been able to make is to use something like this in the run script:

mpiexec_mpt -v \
   -np $NTASKS[1] omplace $EXEROOT/all/$COMPONENTS[1] : \
   -np $NTASKS[2] omplace $EXEROOT/all/$COMPONENTS[2] : \
   -np $NTASKS[3] omplace $EXEROOT/all/$COMPONENTS[3] : \
   -np $NTASKS[4] omplace $EXEROOT/all/$COMPONENTS[4] : \
   -np $NTASKS[5] omplace $EXEROOT/all/$COMPONENTS[5] &

The challenge is that you may encounter the error, "MPT ERROR: could not run executable. If this is a non-MPT application, you may need to set MPI_SHEPHERD=true." This is deceptive. It is not caused here by "a bad node", as you may find suggested by searching these forums. The issue is that MPT does not recognize the various CCSM3 executables as valid MPI programs. I have found that this can be partly solved by ensuring that the code is compiled with the MPT versions of the MPI compilers, but I still end up with segmentation faults:

"MPT ERROR: Rank 0(g:0) received signal SIGSEGV(11).
Process ID: 25767, Host: r6i4n5, Program: /glade2/scratch2/heavens/Isabel1_mapgenerator/cpl/cpl
MPT Version: SGI MPT 2.15  12/18/16 02:58:06"

I'm trying to see if there are any useful hints in the tracebacks.

Nicholas Heavens
Research Assistant Professor of Planetary Science
Hampton University
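One way to get more out of those tracebacks is to rebuild with debugging flags. The fragment below is a sketch using standard ifort options, placed in Macros.Linux as an assumption rather than a confirmed fix:

# Hypothetical debug additions to Macros.Linux (standard ifort options).
# -g keeps symbols, -traceback prints a symbolic stack on SIGSEGV,
# -check bounds traps out-of-range array accesses (slow; debug runs only).
FFLAGS  += -g -traceback -check bounds
LDFLAGS += -g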
 