xuexh@ustc_edu_cn
New Member
I am running WACCM on bluefire for a multi-year simulation. The first 4 years ran fine, but the run failed when it reached Jan 30 of year 5. A "segmentation fault" appeared in the run_??.out file, and a new folder named "coredir.6" was created at the same time. The file "core_lite" in that folder looks like this:
+++PARALLEL TOOLS CONSORTIUM LIGHTWEIGHT COREFILE FORMAT version 1.0
+++LCB 1.0 Sun Feb 28 12:56:09 2010 Generated by IBM AIX 5.3
#
+++ID Node 6 Process 381354 Thread 1
***FAULT "SIGSEGV - Segmentation violation"
+++STACK
__gw_drag_NMOD_gw_drag_prof : 0x00003720
__gw_drag_NMOD_gw_intr : 0x00000e70
*__gw_drag_NMOD_gw_intr_stub_in_tphysac : 0x000000e8
tphysac : 0x00000e14
....................
The run_...out file looks like this:
...skipping...
0:
0:Global flash freq (/s), lightning NOx (TgN/y) = 46.7682 2.2125
ERROR: 0031-250 task 6: Segmentation fault
ERROR: 0031-250 task 9: Terminated
ERROR: 0031-250 task 5: Terminated
ERROR: 0031-250 task 3: Terminated
ERROR: 0031-250 task 2: Terminated
Can you tell me how to fix this problem? Thank you!