jeff_blasius@yale_edu
New Member
Hello Everyone,
I'm trying to run CCSM on a Linux/Intel cluster, but it's been quite a struggle. At this point the cpl process is dying about 10 minutes into the run with the error p2_3729: p4_error: interrupt SIGSEGV: 11. Below is the log.
Any ideas?
Thanks,
jeff
node033 0
node035 1 /home/jb723/testcase/all/cpl
node035 1 /home/jb723/testcase/all/cpl
node062 1 /home/jb723/testcase/all/csim
node036 1 /home/jb723/testcase/all/csim
node036 1 /home/jb723/testcase/all/csim
node037 1 /home/jb723/testcase/all/csim
node037 1 /home/jb723/testcase/all/csim
node038 1 /home/jb723/testcase/all/csim
node038 1 /home/jb723/testcase/all/csim
node039 1 /home/jb723/testcase/all/csim
node039 1 /home/jb723/testcase/all/clm
node040 1 /home/jb723/testcase/all/clm
node040 1 /home/jb723/testcase/all/clm
node041 1 /home/jb723/testcase/all/clm
node041 1 /home/jb723/testcase/all/clm
node042 1 /home/jb723/testcase/all/clm
node042 1 /home/jb723/testcase/all/pop
node043 1 /home/jb723/testcase/all/pop
node043 1 /home/jb723/testcase/all/pop
node044 1 /home/jb723/testcase/all/pop
node044 1 /home/jb723/testcase/all/pop
node045 1 /home/jb723/testcase/all/pop
node045 1 /home/jb723/testcase/all/pop
node046 1 /home/jb723/testcase/all/pop
node046 1 /home/jb723/testcase/all/pop
node047 1 /home/jb723/testcase/all/pop
node047 1 /home/jb723/testcase/all/pop
node048 1 /home/jb723/testcase/all/pop
node048 1 /home/jb723/testcase/all/pop
node049 1 /home/jb723/testcase/all/pop
node049 1 /home/jb723/testcase/all/pop
node050 1 /home/jb723/testcase/all/pop
node050 1 /home/jb723/testcase/all/pop
node051 1 /home/jb723/testcase/all/pop
node051 1 /home/jb723/testcase/all/pop
node052 1 /home/jb723/testcase/all/pop
node052 1 /home/jb723/testcase/all/pop
node053 1 /home/jb723/testcase/all/pop
node053 1 /home/jb723/testcase/all/pop
node054 1 /home/jb723/testcase/all/pop
node054 1 /home/jb723/testcase/all/cam
node055 1 /home/jb723/testcase/all/cam
node055 1 /home/jb723/testcase/all/cam
node056 1 /home/jb723/testcase/all/cam
node056 1 /home/jb723/testcase/all/cam
node057 1 /home/jb723/testcase/all/cam
node057 1 /home/jb723/testcase/all/cam
node058 1 /home/jb723/testcase/all/cam
node058 1 /home/jb723/testcase/all/cam
node059 1 /home/jb723/testcase/all/cam
node059 1 /home/jb723/testcase/all/cam
node060 1 /home/jb723/testcase/all/cam
node060 1 /home/jb723/testcase/all/cam
node061 1 /home/jb723/testcase/all/cam
node061 1 /home/jb723/testcase/all/cam
node062 1 /home/jb723/testcase/all/cam
(main) -------------------------------------------------------------------------
(main) start of main integration loop
(main) -------------------------------------------------------------------------
(tStamp_write) cpl model date 0001-01-01 00000s wall clock 2006-03-16 10:11:37 avg dt 0s dt 0s
(main) -------------------------------------------------------------------------
(main) process IC data
(main) -------------------------------------------------------------------------
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(frac_set) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(cpl_bundle_mult) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(cpl_bundle_mult) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(cpl_bundle_mult) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(cpl_bundle_mult) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
(cpl_map_bun) WARNING: bundle aoflux_o has accum count = 0
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(cpl_bundle_mult) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
(main) -------------------------------------------------------------------------
(main) optional atm initialization: send albedos & recv new solar?
(main) -------------------------------------------------------------------------
(main) * atm component requests recalculation of initial solar
(main) send albedos to atm, recv new atm IC's
(main) -------------------------------------------------------------------------
(main) create data as necessary for 1st iteration of main event loop
(main) -------------------------------------------------------------------------
MCT::m_AttrVect::indexRA_:: ERROR--attribute not found: "afrac" Traceback:
(cpl_bundle_mult) ->MCT::m_AttrVect::indexRA_
MCT(MPEU)::m_List::clean_: deallocate(aList%...) error, stat =1
(main) -------------------------------------------------------------------------
(main) start of main integration loop
(main) -------------------------------------------------------------------------
(tStamp_write) cpl model date 0001-01-01 00000s wall clock 2006-03-16 10:11:37 avg dt 0s dt 0s
p2_3729: p4_error: interrupt SIGSEGV: 11
rm_l_2_3730: (115.379166) net_send: could not write to fd=5, errno = 32
rm_l_52_2892: (156.207844) net_send: could not write to fd=5, errno = 32
p51_2923: p4_error: net_recv read: probable EOF on socket: 1
p10_3371: (226.677444) net_send: could not write to fd=5, errno = 32
p0_4309: (228.147000) net_send: could not write to fd=4, errno = 32
p23_3553: (225.420471) net_send: could not write to fd=5, errno = 32
P4 procgroup file is /home/jb723/pgfile_play.
Thu Mar 16 10:13:30 EST 2006 -- CSM EXECUTION HAS FINISHED
Model did not complete - see cpl.log.060203-142420