A customer installing 11gR2 RAC on AIX 6100-09 on a Power9 platform hit the following error:
# kfod
Segmentation fault(coredump)
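kfod is the utility ASM uses to discover disks. A search of Oracle Support turned up one similar case, which was caused by an incorrect LD_LIBRARY_PATH/LIBPATH setting, so we advised the customer to adjust the environment: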
unset LD_LIBRARY_PATH
export LIBPATH=${ORACLE_HOME}/lib:/usr/lib:$LIBPATH
After retrying, the problem persisted.
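The environment adjustment above can be sketched as a small script. This is only an illustration: `prepend_path` is a hypothetical helper, and the `/u01/grid` default for `ORACLE_HOME` is an assumption taken from the core file path seen later, not a value the customer confirmed. The helper avoids leaving a trailing colon when `LIBPATH` was previously empty.

```shell
# Hypothetical helper: prepend directories to a colon-separated search path
# without leaving a trailing ':' when the existing value is empty.
prepend_path() {
    new=$1
    old=$2
    echo "${new}${old:+:$old}"
}

# Assumed example value; replace with the real Grid home.
ORACLE_HOME=${ORACLE_HOME:-/u01/grid}

unset LD_LIBRARY_PATH
LIBPATH=$(prepend_path "${ORACLE_HOME}/lib:/usr/lib" "${LIBPATH:-}")
export LIBPATH
echo "LIBPATH=$LIBPATH"
```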
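We then asked the customer to collect snapcore data.
For the procedure, see "A Brief Introduction to Locating Core Dumps on AIX" (AIX下core dump定位方法简介).
Analysis with dbx showed that the faulting instruction stores through a null pointer (r6 is nil in `stb r0,0x1(r6)`):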
# dbx ./kfod /u01/grid/core
Type 'help' for help.
[using memory image in /u01/grid/core]
reading symbolic information ...
Segmentation fault in dbgc_free_sga at 0x1000d9d24 ($t1)
0x1000d9d24 (dbgc_free_sga+0x24) 98060001 stb r0,0x1(r6)
(dbx) print $r0
0x0000000000000001
(dbx) print $r6
(nil)
(dbx) where
dbgc_free_sga(0x5, 0x11029ec30, 0x11029ec30) at 0x1000d9d24
dbgc_free_all(??) at 0x1000d978c
dbgc_rls_diagctx_i(??, ??) at 0x1000d6edc
dbgc_rls_diagctx(??, ??) at 0x1000d9694
kpeDbgInitDBGC(??, ??) at 0x1006e2e24
kpeDbgGetInitFileParmsAndInitDBGC(??, ??, ??) at 0x1006e2f40
nlstddd_do_alter_diag(??, ??, ??, ??, ??, ??) at 0x1000c97fc
nlstddt_do_alter_trace(??, ??, ??) at 0x1000c310c
nlstdggo(??, ??, ??, ??, ??, ??, ??) at 0x1000c2050
nlstdgg(??, ??, ??, ??, ??) at 0x1000c1b54
nigini2(??, ??, ??, ??, ??) at 0x100f00004
kpeDbgGetNPDGlobal(??, ??, ??) at 0x1006e36c4
kpeDbgTLSInit(??, ??) at 0x1006e35b8
kpummTLSGET1(??, ??) at 0x1006e8294
kpeDbgProcessInit(??, ??) at 0x1006e27c0
kpummpin(??, ??, ??, ??, ??, ??, ??, ??) at 0x1006e5ddc
kpuenvcr(0x10001bbd4, 0x2300000023, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0) at 0x1008af104
OCIEnvCreate(??, ??, ??, ??, ??, ??, ??, ??) at 0x100f01a7c
kfodgpGetDscString(??, ??, ??, ??) at 0x10222e478
kfodParseProfileString(??, ??) at 0x100055d4c
kfodParse(??, ??, ??) at 0x100054260
kfod_main(??, ??, ??, ??, ??) at 0x100053ed0
lpmcall(??, ??, ??, ??, ??) at 0x10223ed08
lpmpmai(??, ??, ??, ??) at 0x10223e83c
main(??, ??) at 0x100000638
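To make the dbx output above easier to read, the faulting instruction word `98060001` can be decoded by hand. The sketch below assumes the standard PowerPC D-form layout (6-bit opcode, 5-bit RS, 5-bit RA, 16-bit displacement); the function names are illustrative only. Opcode 38 is `stb`, so the instruction stores the low byte of r0 at address r6 + 1, and with r6 equal to nil that address is 0x1.

```shell
# Decode a PowerPC D-form instruction word into its fields.
# Layout assumed: opcode (6 bits), RS (5 bits), RA (5 bits), D (16 bits).
decode_dform() {
    word=$(( $1 ))
    opcode=$(( (word >> 26) & 0x3f ))
    rs=$(( (word >> 21) & 0x1f ))
    ra=$(( (word >> 16) & 0x1f ))
    d=$(( word & 0xffff ))
    echo "opcode=$opcode rs=$rs ra=$ra d=$d"
}

# Effective address of a D-form store when RA is not r0:
# contents of RA plus the displacement D.
# D is treated as non-negative here; a full decoder would sign-extend it.
effective_address() {
    printf '0x%x\n' $(( $1 + $2 ))
}

decode_dform 0x98060001      # opcode=38 rs=0 ra=6 d=1
effective_address 0x0 0x1    # 0x1
```

The decoded fields match what dbx printed: the source register is r0, the base register is r6, and the displacement is 1.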
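By contrast, debugging with dbx in an environment where the installation works showed that kfod never enters dbgc_rls_diagctx there, so the faulting logic is not reached. We therefore concluded that this is a defect in the Oracle software that is triggered under specific conditions, and that it can be worked around once the trigger is identified.
Further observation showed that the core dumps were all timestamped January 9, 1970, which suggested that an incorrect system time was the trigger. After the date was set to 2023, the problem disappeared: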
# date 081721281970
Mon Aug 17 21:28:25 CST 1970
# kfod
Segmentation fault(coredump)
# date 0817213623
Thu Aug 17 21:36:41 CST 2023
# kfod
--------------------------------------------------------------------------------
Disk Size Path User Group
================================================================================
1: 190773 Mb /dev/rhdisk0 root system
...
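Based on the finding above, a simple pre-flight check of the system clock could be run before kfod. This is a minimal sketch: `clock_year_ok` and `MIN_YEAR` are hypothetical names, and the cutoff of 2000 is an assumption. The tests above only show a crash with the clock in 1970 and normal behavior in 2023; the exact boundary was not determined.

```shell
# Hypothetical pre-flight check before running kfod.
# MIN_YEAR is an assumed cutoff, not an Oracle-documented limit.
MIN_YEAR=2000

clock_year_ok() {
    # Use the supplied year if given, otherwise ask the system clock.
    year=${1:-$(date +%Y)}
    [ "$year" -ge "$MIN_YEAR" ]
}

if clock_year_ok; then
    echo "system clock year looks plausible"
else
    echo "system clock year is before $MIN_YEAR; fix the date before running kfod" >&2
fi
```

Passing a year explicitly, as in `clock_year_ok 1970`, makes the check easy to try without touching the system clock.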