
Building CCSM 3.0.1 beta 14 in preparation

Hi folks.

I've been tasked with getting CCSM 3.0.1 beta 14 (probably just the most recent version, really) building on lightning in preparation for the arrival of a similar compute cluster here at UCI.

It builds, but I'm not sure it's running as it should. If I'm not mistaken, it seems to think it should run on only one node (note GMPI_NP=1 in the launch line below). Is that what it should be doing? The final output is:

Shared memory file: /tmp/gmpi_shmem-9504250:[0-9]*.tmp

/usr/bin/ssh -l strombrg ln0310en "cd /ptmp/strombrg/T31x3/all && exec env GMPI_MASTER=ln0310en GMPI_PORT=47854 GMPI_SHMEM=1 GMPI_SHMEM_PREFIX=/tmp/gmpi_shmem- GMPI_VERBOSE=1 LD_LIBRARY_PATH=/contrib/2.6/pathscale/2.2.1/lib/2.2.1:/contrib/2.6/mpich-gm/1.2.6..14a-pathscale-2.2.1-64/lib:/opt/gm/lib64 DISPLAY=ln0127en:16.0 GMPI_MAGIC=9504250 GMPI_ID=0 GMPI_NP=1 GMPI_BOARD=-1 GMPI_SLAVE=192.168.150.80 /usr/local/lsf/6.2/linux2.6-glibc2.3-x86_64/bin/TaskStarter -p ln0310en:47853 -c /usr/local/lsf/conf -a X86_64 /ptmp/strombrg/T31x3/all/cpl "
All processes have been spawned
Warning: Permanently added the RSA host key for IP address '192.168.150.80' to the list of known hosts.
(main) =========================================================================
(main) CCSM Coupler, version 6 (cpl6)
(main) CVS tag $Name: ccsm3_0_1_beta14 $
(main) date & time: 2006-05-18 16:11:13
(main) =========================================================================
(cpl_comm_init) setting up communicators, name = cpl
===================================
warning: global processor 0 is overlapped
(cpl_comm_init) cpl_comm_comp, size: 137 1
User defined signal 2
Job /usr/local/lsf/6.2/linux2.6-glibc2.3-x86_64/bin/gmmpirun_wrapper -v -pg mpirun.pgfile ./cpl

TID HOST_NAME COMMAND_LINE STATUS TERMINATION_TIME
===== ========== ================ ======================= ===================
00000 ln0310en /ptmp/strombrg/T Exit (status unknown)
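
One quick way to confirm how many MPI processes were started is to pull the GMPI_NP value out of the launch line in the job output. This is only a minimal sketch: gmpi_np is a hypothetical helper name, and it assumes the log contains an mpich-gm launch line like the one above.

```bash
#!/usr/bin/env bash
# Sketch: report the process count recorded on an mpich-gm launch line.
# gmpi_np is a hypothetical helper, not part of CCSM, LSF, or mpich-gm.
gmpi_np() {
    # Print only the digits that follow "GMPI_NP=" on lines that contain it.
    sed -n 's/.*GMPI_NP=\([0-9][0-9]*\).*/\1/p' "$1"
}
```

Run against a file holding the log above, this prints 1, which is consistent with the job having been started as a single process.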



Thanks!
 
I got some really useful hints from Juli Rew.

I wound up making the following changes, based on this advice, to the script I use to coordinate the build+run:

$ diff -u local-script.2006-05-24 local-script
--- local-script.2006-05-24 2006-05-24 15:06:02.254994367 -0600
+++ local-script 2006-05-24 16:45:41.290047124 -0600
@@ -1,8 +1,7 @@
#!/usr/local/bin/bash

# 1. cd'd to scripts directory
-# 2. created a case using:
-# create_newcase -case Lightpath -mach lightning -res T31_gx3v5 -compset B -ccsmroot ~juliana/ccsm3_0_1_beta12
+# 2. created a case using: create_newcase -case Lightpath -mach lightning -res T31_gx3v5 -compset B -ccsmroot ~juliana/ccsm3_0_1_beta12
# 3. cd'd to the directory it created, called Lightpath
# 4. configured, using: configure -mach lightning
# 5. This created build and run scripts. I first ran the build script.
@@ -242,6 +241,9 @@
# and generate the configure script for this experiment. #cd
# ~/ccsm3_0/scripts

+#CASEID='T31_gx3v5'
+#export CASEID
+
CASEID='T31x3'
export CASEID

@@ -332,6 +334,12 @@
;;
esac

+if ! ./"$CASEID"."$machine".build
+then
+ echo Sorry, build failed 1>&2
+ exit 1
+fi
+
#3.2.4 Run Geometry
#
#3.2.5 Testing CCSM
@@ -354,7 +362,7 @@
# This bsub is for LSF - Many other linux systems will likely use
# qsub. Note: bsub likes the ./ and I doubt anything else will
# mind having it there
- if ! bsub -q regular -o output ./"$CASEID"."$machine".run
+ if ! bsub -q regular -o output < "$CASEID"."$machine".run
then
echo qsub failed 1>&2
exit 1
ln0127en-strombrg:~/CCSM3 x86_64-suse-linux 14270 - above cmd done Wed May 24 06:10 PM
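
A note on why the last hunk probably matters: LSF only parses embedded "#BSUB" option lines when the job script is fed to bsub on standard input. When the script is given as a command argument (the old "./..." form), bsub just runs it and any "#BSUB" lines are treated as ordinary comments, so a processor-count request inside the script would be ignored. Whether the generated run script actually carries such lines is an assumption here; the sketch below is one way to check. count_bsub_directives is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: count the embedded LSF directives in a job script.
# count_bsub_directives is a hypothetical helper, not an LSF command.
count_bsub_directives() {
    # grep -c prints how many lines begin with "#BSUB" (0 if none).
    grep -c '^#BSUB' "$1"
}
```

For example, count_bsub_directives "$CASEID"."$machine".run reporting a non-zero count would suggest the script depends on being submitted with "bsub < script".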
 