ORA-04021导致oracle11gADG备库宕机问题处理
admin
2023-04-20 22:03:45
0

发现数据库告警,查看alert日志,发现如下报错
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_26383.trc:
ORA-04021: timeout occurred while waiting to lock object
LGWR (ospid: 26383): terminating the instance due to error 4021
Sun Mar 25 03:29:07 2018
System state dump requested by (instance=1, osid=26383 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcle1_diag_26321_20180325032907.trc
Instance terminated by LGWR, pid = 26383
Sun Mar 25 03:29:20 2018
Starting ORACLE instance (normal)

先处理DG备库问题,查看状态发现库是MOUNT状态,先将数据库启动。
SQL> alter database open;
SQL> alter database recover managed standby database using current logfile disconnect;
SQL> select open_mode from v$database;

OPEN_MODE

READ ONLY WITH APPLY

再查看问题,MOS对该问题解释如下:
APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
Information in this document applies to any platform.
SYMPTOMS

DR database crashed with below errors..

Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=XX.XXX.XXX.XX)(PORT=54537))
WARNING: inbound connection timed out (ORA-3136)
Wed Jul 13 13:43:24 2016
Errors in file /u01/app/oracle/diag/rdbms/rxeprr_dr/RXEPRR1/trace/RXEPRR1_lgwr_31312.trc:
ORA-04021: timeout occurred while waiting to lock object
LGWR (ospid: 31312): terminating the instance due to error 4021
Wed Jul 13 13:43:24 2016
System state dump requested by (instance=1, osid=31312 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/rxeprr_dr/RXEPRR1/trace/RXEPRR1_diag_31221.trc
Wed Jul 13 13:43:25 2016
License high water mark = 318
Instance terminated by LGWR, pid = 31312
USER (ospid: 20898): terminating the instance
Instance terminated by USER, pid = 20898
Wed Jul 13 13:43:39 2016
Starting ORACLE instance (normal)

CHANGES

No changes

CAUSE

Bug 16717701 - ADG SHOULD GET THE INSTANCE PARSE LOCK WITH A TIMEOUT

Bug 11712267 - ACTIVE DATA GUARD DATABASE HUNG ON 'LIBRARY CACHE: MUTEX X' WAIT EVENT

LGWR trace file (RXEPRR1_lgwr_31312.trc)

2016-07-13 13:43:24.498
SESSION ID:(6709.1) 2016-07-13 13:43:24.498
CLIENT ID:() 2016-07-13 13:43:24.498
SERVICE NAME:(SYS$BACKGROUND) 2016-07-13 13:43:24.498
MODULE NAME:() 2016-07-13 13:43:24.498
ACTION NAME:() 2016-07-13 13:43:24.498

error 4021 detected in background process
ORA-04021: timeout occurred while waiting to lock object
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+1296<-kjzdicrshnfy()+364<-ksuitm()+1688<-ksbrdp()+4296<-opirip()+1680<-opidrv()+748<-sou2o()+88<-opimai_real()+276<-ssthrdmain()+316<-main()+316<-_start()+380
----- End of Abridged Call Stack Trace -----

SOLUTION

Issue matches with bug 11712267 and bug 16717701

Since two bugs are matching with the case,

You can try with option (1) . As per Bug 11712267

change the cursor_sharing to force on Active dataguard (ADG).

Monitor your environment for sometime.

If it crashes again then follow with the option (2)
Option (2):

As per bug description

LGWR can request DBINSTANCE lock in X mode without any timeout which can lead to a hang / deadlock.

Both fixes are already included in 11.2.0.4 but the fix is DISABLED by default.
== > To ENABLE the fix one has to set == > "_adg_parselock_timeout" > to the number of centi-seconds == > LGWR should wait before backing off and retrying the request.

Value should be in centi seconds. == > I Don't think there is really any hard fast rule for a value - at default (0) it will not timeout.
A value representing a few seconds seems reasonable - if LGWR has been stuck for say 5 seconds waiting it seems reasonable guess it is not going to get the lock.

The param just causes it to abort the current attempt and retry If you want to play safe can start with a higher value then decrease later.
A higher value will just mean more sessions blocked for longer in case of the deadlock situation.
500 Seems reasonable , but I have no data to base it on.

There should be a statistic "ADG parselock X get attempts" If it gets set too small that value would likely increase a lot due to keep timing out and retrying.

This is a dynamic parameter

Follow option (1) .

change the cursor_sharing to force on ADG

If issue re-appears then follow option (2) as below

Please set "_adg_parselock_timeout" to 500 == >

SQL > alter system set "_adg_parselock_timeout"=500 scope=both sid='*';

简单翻译如下:
1、将cursor_sharing 参数改成FORCE
2、将 "_adg_parselock_timeout" 设置为500
SQL > alter system set "_adg_parselock_timeout"=500 scope=both sid='*';

相关内容

热门资讯

终于明白“七彩丹霞到底有挂吗?... 家人们!今天小编来为大家解答七彩丹霞透视挂怎么安装这个问题咨询软件客服徽4282891的挂在哪里买很...
今日重大消息“道游联盟到底有挂... 您好:道游联盟这款游戏可以开挂,确实是有挂的,需要了解加客服微信【9752949】很多玩家在这款游戏...
金源智能取得电池焊接压装装置专... 国家知识产权局信息显示,惠州金源智能机器人有限公司取得一项名为“电池焊接压装装置及焊接设备”的专利,...
今日重磅消息“娱网皮球开挂器?... 网上科普关于“娱网皮球有没有挂”话题很是火热,小编也是针对娱网皮球作*弊开挂的方法以及开挂对应的知识...
广州在全国率先设立区级人工智能... 【环球网科技综合报道】12月29日消息,记者从广州海珠发布平台获悉,日前广州市海珠区人工智能发展局成...
玩家分享攻略“麦穗推筒子开挂器... 网上科普关于“麦穗推筒子有没有挂”话题很是火热,小编也是针对麦穗推筒子作*弊开挂的方法以及开挂对应的...
【第一资讯】“新好游炸/金/花... 有 亲,根据资深记者爆料新好游炸/金/花是可以开挂的,确实有挂(咨询软件...
今日重大消息“情怀娱乐究竟有挂... 有 亲,根据资深记者爆料情怀娱乐是可以开挂的,确实有挂(咨询软件无需打开...
今日重大发现“新版九哥开挂神器... 今日重大发现“新版九哥开挂神器?”(原来真的有挂)您好,新版九哥这个游戏其实有挂的,确实是有挂的,需...
今日重磅消息“海贝大厅牛牛有挂... 网上科普关于“海贝大厅牛牛有没有挂”话题很是火热,小编也是针对海贝大厅牛牛作*弊开挂的方法以及开挂对...