Oracle数据库掉电或掉线导致数据库故障恢复

时间：2022-06-29 09:46:00 编辑：袖梨来源：一聚教程网

xx医院存储突然掉线,导致数据库异常,现场工程师折腾了一天,问题依旧没有解决,无奈之下找到我们,希望我们能够帮忙恢复数据库.

启动报ORA-00600[2131]错误
Fri Nov 06 14:50:59 2015
ALTER DATABASE MOUNT
This instance was first to mount
Fri Nov 06 14:50:59 2015
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.4.4)(PORT=1521))' SCOPE=MEMORY SID='xifenfei1';
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
NOTE: dependency between database xifenfei and diskgroup resource ora.DATA.dg is established
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13221.trc (incident=191085):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
Incident details in: /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_191085/xifenfei1_ora_13221_i191085.trc
Fri Nov 06 14:51:10 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE MOUNT...
出现该错误的原因是由于：We are attempting to write a controlfile checkpoint progress record, but find we do not have the progress record generating this exception.由于控制文件异常导致,出现此类情况,我们一般使用单个控制文件一次尝试,如果都不可以考虑重建控制文件

由于坏块(逻辑/物理)导致数据库实例恢复无法进行

Beginning crash recovery of 2 threads
Started redo scan
kcrfr_rnenq: use log nab 393216
kcrfr_rnenq: use log nab 2
Completed redo scan
read 4427 KB redo, 500 data blocks need recovery
Started redo application at
Thread 1: logseq 5731, block 391398
Thread 2: logseq 4252, block 520815
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5731 Reading mem 0
Mem# 0: +DATA/xifenfei/onlinelog/group_2.266.835331047
Recovery of Online Redo Log: Thread 2 Group 8 Seq 4252 Reading mem 0
Mem# 0: +DATA/xifenfei/onlinelog/group_8.331.835330421
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc (incident=197486):
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197486/xifenfei1_ora_13770_i197486.trc
Fri Nov 06 15:03:09 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc (incident=197487):
ORA-01578: ORACLE data block corrupted (file # 2, block # 65207)
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197487/xifenfei1_ora_13770_i197487.trc
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01578: ORACLE data block corrupted (file # 2, block # 65207)
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Recovery of Online Redo Log: Thread 2 Group 3 Seq 4253 Reading mem 0
Mem# 0: +DATA/xifenfei/onlinelog/group_3.332.835330505
Hex dump of (file 14, block 62536) in trace file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc
Reading datafile '+DATA/xifenfei/datafile/ht01.dbf' for corruption at rdba: 0x0380f448 (file 14, block 62536)
Reread (file 14, block 62536) found same corrupt data (logically corrupt)
RECOVERY OF THREAD 1 STUCK AT BLOCK 62536 OF FILE 14
Fri Nov 06 15:03:13 2015
Abort recovery for domain 0
Aborting crash recovery due to error 1172
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01172: recovery of thread 1 stuck at block 62536 of file 14
ORA-01151: use media recovery to recover block, restore backup if needed
Abort recovery for domain 0
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01172: recovery of thread 1 stuck at block 62536 of file 14
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...
查看资料发现和Bug 14301592 – Several errors by corrupt blocks shifted by 2 bytes in buffer cache during recovery caused by INDEX redo apply，可以通过ALLOW 1 CORRUPTION临时解决

使用ALLOW 1 CORRUPTION进行恢复,出现ORA-07445[kdxlin]错误
Specify log: {=suggested | filename | AUTO | CANCEL}
+DATA/xifenfei/onlinelog/group_3.332.835330505
ORA-00279: change 700860458 generated at 11/05/2015 21:20:15 needed for thread
1
ORA-00289: suggestion : +ARCHIVE/xifenfei/xifenfei_1_5731_835324843.arc
ORA-00280: change 700860458 for thread 1 is in sequence #5731

Specify log: {=suggested | filename | AUTO | CANCEL}
+DATA/xifenfei/onlinelog/group_2.266.835331047
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
70104)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 82289
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB572] [Address not mapped to object] []

ORA-01112: media recovery not started
ORA-07445[kdxlin()+4088]未找到类似说明,到了这一步,无法简单的恢复成功,只能通过设置隐含参数跳过实例恢复,尝试resetlog库

通过设置_allow_resetlogs_corruption参数继续恢复
SQL> startup pfile='/tmp/pfile.ora' mount;
ORACLE instance started.

Total System Global Area 7315603456 bytes
Fixed Size                  2267384 bytes
Variable Size            2566915848 bytes
Database Buffers         4731174912 bytes
Redo Buffers               15245312 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927],
[0], [700860464], [], [], [], [], [], [], []
Process ID: 13563
Session ID: 157 Serial number: 3
alert日志报错
Fri Nov 06 19:26:39 2015
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc (incident=319140):
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_319140/xifenfei1_ora_13563_i319140.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Redo thread 2 internally disabled at seq 1 (CKPT)
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 9956 added for thread 2 sequence 1 ID 0x0 dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 13563): terminating the instance due to error 600
Fri Nov 06 19:26:42 2015
Instance terminated by USER, pid = 13563
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (13563) as a result of ORA-1092
Fri Nov 06 19:26:42 2015
ORA-1092 : opitsk aborting process
这里是比较熟悉的ora-600[kclchkblk_4]错误,和ora-600[2662]错误类似,需要调整scn,由于数据库版本为11.2.0.4,无法使用常规方法调整scn,在修改控制文件,oradebug,bbed方法可供选择

使用oradebug方法处理
因为是asm环境,其他方法处理起来都相对麻烦

[oracle@wisetop1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Fri Nov 6 19:30:59 2015

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.ora' mount;
ORACLE instance started.

Total System Global Area 7315603456 bytes
Fixed Size                  2267384 bytes
Variable Size            2566915848 bytes
Database Buffers         4731174912 bytes
Redo Buffers               15245312 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug poke 0x06001AE70 4 0x2FAF0800
BEFORE: [06001AE70, 06001AE74) = 00000000
AFTER: [06001AE70, 06001AE74) = 2FAF0800
SQL> alter database open;

Database altered.

至此数据库open成功,后续就是处理一些坏块的工作,并建议客户逻辑重建库.

推荐专题

最新下载

热门教程

Oracle数据库掉电或掉线导致数据库故障恢复

相关文章

热门栏目

php教程

asp.net教程

手机开发

css教程

网页制作

办公数码

jsp教程