1. Problem description: the standby database is not applying redo

Tue Jul 22 09:05:07 2014
RFS[8852]: Assigned to RFS process 12956
RFS[8852]: Identified database type as 'physical standby': Client is ARCH pid 16028
Tue Jul 22 09:05:09 2014
RFS[8853]: Assigned to RFS process 12958
RFS[8853]: Identified database type as 'physical standby': Client is LGWR SYNC pid 15950
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[8853]: No standby redo logfiles selected (reason:7)
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_rfs_12958.trc:
ORA-16086: Redo data cannot be written to the standby redo log
Tue Jul 22 09:11:07 2014
RFS[8854]: Assigned to RFS process 12976
RFS[8854]: Identified database type as 'physical standby': Client is ARCH pid 16028
Tue Jul 22 09:11:07 2014
RFS[8855]: Assigned to RFS process 12978
RFS[8855]: Identified database type as 'physical standby': Client is LGWR SYNC pid 15950
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[8855]: No standby redo logfiles selected (reason:7)
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_rfs_12978.trc:
ORA-16086: Redo data cannot be written to the standby redo log
2. Check the redo log information on the standby

SQL> show parameter log_file_name_convert;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_file_name_convert                string      /home/oradata/powerdes, /home/
                                                 oradata/powerdes

SQL> select group#,member from v$logfile;
    GROUP#
----------
MEMBER
--------------------------------------------------------------------------------
         3
/home/oradata/powerdes/redo03.log
         2
/home/oradata/powerdes/redo02.log
         1
/home/oradata/powerdes/redo01.log
SQL> select GROUP#,FIRST_CHANGE#,SEQUENCE#,STATUS from v$log;
SQL> select group#,bytes/1024/1024,members,status from v$log;
    GROUP# BYTES/1024/1024    MEMBERS STATUS
---------- --------------- ---------- ----------------
         1              50          1 CLEARING
         2              50          1 CLEARING_CURRENT
         3              50          1 CLEARING
SQL>
First stop redo apply:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Then clear the log groups:
alter database clear logfile group 1;
alter database clear logfile group 2;
alter database clear logfile group 3;
SQL> alter database clear logfile group 1;
alter database clear logfile group 1
*
ERROR at line 1:
ORA-01156: recovery or flashback in progress may need access to files

The error occurs because the MRP (managed recovery) process is holding the log files, so redo apply has to be stopped before the logs can be cleared:
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter database clear logfile group 1;
Database altered.
SQL> alter database clear logfile group 2;
Database altered.
SQL> alter database clear logfile group 3;
Database altered.
SQL>
Then start redo apply again:
SQL> alter database recover managed standby database disconnect from session;
Database altered.
SQL>
Check the redo apply status:
SQL> select name,creator,sequence#,applied,completion_time from v$archived_log;
Every row shows APPLIED = NO.
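The APPLIED column alone does not say whether the apply process is actually running. A minimal sketch of an extra check on the standby, assuming an 11g-era instance where V$MANAGED_STANDBY is available (this query is an illustration, not part of the original session):

```sql
-- Run on the standby.
-- RFS rows are the processes receiving redo; an MRP0 row is the apply process.
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby
 ORDER BY process;
```

If no MRP0 row is returned, redo apply is not running. An MRP0 row in status WAIT_FOR_LOG or APPLYING_LOG shows which sequence it is waiting for or applying.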
3. Rebuild the redo logs

Check the current redo log status again:
SQL> select group#,bytes/1024/1024,members,status from v$log;
    GROUP# BYTES/1024/1024    MEMBERS STATUS
---------- --------------- ---------- ----------------
         1              50          1 UNUSED
         2              50          1 CLEARING
         3              50          1 CLEARING_CURRENT
Add three new log groups:
alter database add logfile group 4 ('/home/oradata/powerdes/redo04.log') size 50M;
alter database add logfile group 5 ('/home/oradata/powerdes/redo05.log') size 50M;
alter database add logfile group 6 ('/home/oradata/powerdes/redo06.log') size 50M;

While standby_file_management is AUTO, log files cannot be added by hand, so switch it to MANUAL first:
SQL> alter system set standby_file_management='manual';
System altered.
SQL> alter database add logfile group 4 ('/home/oradata/powerdes/redo04.log') size 50M;
Database altered.
SQL> alter database add logfile group 5 ('/home/oradata/powerdes/redo05.log') size 50M;
Database altered.
SQL> alter database add logfile group 6 ('/home/oradata/powerdes/redo06.log') size 50M;
Database altered.
SQL>
Clear the redo log groups:
SQL> alter database clear logfile group 1;
Database altered.
SQL> alter database clear logfile group 2;
Database altered.
SQL> alter database clear logfile group 3;
Database altered.
SQL>
Check the redo log groups:
SQL> select group#,bytes/1024/1024,members,status from v$log;
    GROUP# BYTES/1024/1024    MEMBERS STATUS
---------- --------------- ---------- ----------------
         1              50          1 CURRENT
         2              50          1 UNUSED
         3              50          1 UNUSED
         4              50          1 UNUSED
         5              50          1 UNUSED
         6              50          1 UNUSED
6 rows selected.
Drop the old groups:
SQL> alter database drop logfile group 1;
Database altered.
SQL> alter database drop logfile group 2;
Database altered.
SQL> alter database drop logfile group 3;
alter database drop logfile group 3
*
ERROR at line 1:
ORA-01623: log 3 is current log for instance powerdes (thread 1) - cannot drop
ORA-00312: online log 3 thread 1: '/home/oradata/powerdes/redo03.log'
SQL>
Groups 1 and 2 are dropped; group 3 is the current log and cannot be dropped.
4. Check that the archived logs are complete

If the standby's redo logs are damaged, the standby can still be repaired as long as its archived logs are intact; there is no need to rebuild the standby.

Check on the standby:
SQL> SELECT DISTINCT THREAD#,max(SEQUENCE#) OVER(PARTITION BY THREAD#) A FROM V$ARCHIVED_LOG;
   THREAD#          A
---------- ----------
         1      23826
SQL>

Check on the primary:
SQL> SELECT DISTINCT THREAD#,max(SEQUENCE#) OVER(PARTITION BY THREAD#) A FROM V$ARCHIVED_LOG;
   THREAD#          A
---------- ----------
         1      24022
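The two maxima (23826 on the standby, 24022 on the primary) show that the standby is behind, but not how far apply lags behind what has been received. A sketch that puts both numbers side by side on the standby and asks Oracle for any gap it has detected; the column aliases are illustrative, and the first query assumes a single incarnation (no RESETLOGS in the history):

```sql
-- Run on the standby: highest sequence received vs. highest sequence applied, per thread.
SELECT thread#,
       MAX(sequence#)                                    AS last_received,
       MAX(CASE WHEN applied = 'YES' THEN sequence# END) AS last_applied
  FROM v$archived_log
 GROUP BY thread#;

-- Run on the standby: range of sequences Oracle knows to be missing, if any.
SELECT thread#, low_sequence#, high_sequence#
  FROM v$archive_gap;
```

A NULL in last_applied means no registered archived log has been applied yet, which matches the all-NO output seen in this case.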
Next, see where shipped redo is supposed to land. In maximum availability mode Data Guard makes a best effort to get redo to the standby, so check the standby redo logs with V$STANDBY_LOG.

On the standby:
SQL> SELECT GROUP#,THREAD#,SEQUENCE#,ARCHIVED,STATUS FROM V$STANDBY_LOG;
no rows selected
SQL>

On the primary:
SQL> SELECT GROUP#,THREAD#,SEQUENCE#,ARCHIVED,STATUS FROM V$STANDBY_LOG;
no rows selected
SQL>

This shows that no standby redo logs exist, so they need to be recreated.
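5. Confirm whether the archived logs reach the standby

This configuration ships redo with the LGWR process (the alert log shows "Client is LGWR SYNC"), whereas ARCH-based transport ships redo only when a log is archived. It is also worth confirming that the protection mode has not been changed recently.

Check the redo log size on the primary:
SQL> select GROUP#,BYTES/1024/1024,STATUS from v$log;
    GROUP# BYTES/1024/1024 STATUS
---------- --------------- ----------------
         1              50 CURRENT
         2              50 INACTIVE
         3              50 ACTIVE
SQL>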
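Whether the protection mode or the transport settings have changed can be read directly from the dictionary instead of inferred from the alert log. A small sketch, assuming the standard V$DATABASE and V$ARCHIVE_DEST views (these queries are an illustration, not part of the original session):

```sql
-- Run on primary and standby: configured mode vs. the level currently achieved.
SELECT database_role, protection_mode, protection_level
  FROM v$database;

-- Run on the primary: how redo is shipped to each standby destination.
SELECT dest_id, transmit_mode, affirm, status, error
  FROM v$archive_dest
 WHERE target = 'STANDBY';
```

If protection_level differs from protection_mode, the configuration is not currently meeting its configured mode, which would be consistent with the ORA-16086 errors.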
Confirm whether the logs have been written to the standby by running:
select name,sequence#,applied from v$archived_log;

NAME
--------------------------------------------------------------------------------
 SEQUENCE# APPLIED
---------- ---------
/oracle/app/oracle/flash_recovery_area/archivelog/1_24161_821708334.dbf
     24161 NO
/oracle/app/oracle/flash_recovery_area/archivelog/1_24162_821708334.dbf
     24162 NO

Archived logs on the standby:
NAME
--------------------------------------------------------------------------------
 SEQUENCE# APPLIED
---------- ---------
/data/oracle/oradgdata/standby_archive/1_24072_821708334.dbf
     24072 NO
Check the standby_file_management setting on both sides.

Standby:
SQL> show parameter standby_file_management;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_file_management              string      MANUAL
SQL>

Primary:
SQL> show parameter standby_file_management;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_file_management              string      AUTO
SQL>

The standby differs from the primary, so change the standby from MANUAL back to AUTO:
SQL> alter system set standby_file_management='AUTO';
System altered.
SQL> show parameter standby_file_management;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_file_management              string      AUTO
SQL>
6. Check the alert logs

The primary archives a log at 01:00, so look at the alert logs on both the primary and the standby around that time.

Primary:
ll /oracle/app/oracle/diag/rdbms/pdunq/powerdes/trace/alert_powerdes.log
Tue Jul 22 01:00:02 2014
ALTER SYSTEM ARCHIVE LOG
Tue Jul 22 01:00:02 2014
Thread 1 advanced to log sequence 24073 (LGWR switch)
  Current log# 1 seq# 24073 mem# 0: /home/oradata/powerdes/redo01.log
Archived Log entry 46639 added for thread 1 sequence 24072 ID 0xca2ab4eb dest 1:
Tue Jul 22 01:00:17 2014
Errors in file /oracle/app/oracle/diag/rdbms/pdunq/powerdes/trace/powerdes_lgwr_15950.trc:
ORA-16086: Redo data cannot be written to the standby redo log
LGWR: Failed to archive log 2 thread 1 sequence 24074 (16086)
Thread 1 advanced to log sequence 24074 (LGWR switch)
  Current log# 2 seq# 24074 mem# 0: /home/oradata/powerdes/redo02.log
Tue Jul 22 01:00:19 2014
Archived Log entry 46641 added for thread 1 sequence 24073 ID 0xca2ab4eb dest 1:
Tue Jul 22 01:02:28 2014
backup piece header validation failure for handle /data/oracle/backup/data/ctl_auto/c-3391761643-20140721-00
backup piece header validation failure for handle /data/oracle/backup/data/ctl_auto/c-3391761643-20140721-01
Standby:
ll /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/alert_powerdes.log
Tue Jul 22 01:04:48 2014
RFS[8692]: Assigned to RFS process 10970
RFS[8692]: Identified database type as 'physical standby': Client is ARCH pid 16028
Tue Jul 22 01:04:49 2014
RFS[8693]: Assigned to RFS process 10972
RFS[8693]: Identified database type as 'physical standby': Client is LGWR SYNC pid 15950
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[8693]: No standby redo logfiles selected (reason:7)
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_rfs_10972.trc:
ORA-16086: Redo data cannot be written to the standby redo log
Tue Jul 22 01:10:48 2014
RFS[8694]: Assigned to RFS process 10989
RFS[8694]: Identified database type as 'physical standby': Client is ARCH pid 16028
Tue Jul 22 01:10:51 2014
RFS[8695]: Assigned to RFS process 10992
RFS[8695]: Identified database type as 'physical standby': Client is LGWR SYNC pid 15950
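The same Data Guard messages can be pulled from inside the database instead of scrolling through the alert log. A hedged sketch using V$DATAGUARD_STATUS and V$ARCHIVE_DEST_STATUS; the severity filter and column selection are my choice, not something the original session ran:

```sql
-- Run on either side: recent Data Guard errors recorded by the instance.
SELECT timestamp, facility, severity, error_code, message
  FROM v$dataguard_status
 WHERE severity IN ('Error', 'Fatal')
 ORDER BY timestamp;

-- Run on the primary: per-destination status, including the last error raised.
SELECT dest_id, status, archived_seq#, applied_seq#, error
  FROM v$archive_dest_status
 WHERE status <> 'INACTIVE';
```

V$DATAGUARD_STATUS only holds messages since instance startup, so the alert log remains the source for older events.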
7. On the standby, restore the original log file groups

First stop redo apply:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Then switch to manual file management:
alter system set standby_file_management='MANUAL';
Recreate the redo log files:
alter database add logfile group 1 ('/home/oradata/powerdes/redo01.log') size 50M;
alter database add logfile group 2 ('/home/oradata/powerdes/redo02.log') size 50M;
alter database drop logfile group 4;
alter database drop logfile group 5;
alter database drop logfile group 6;

SQL> alter database add logfile group 1 ('/home/oradata/powerdes/redo01.log') size 50M;
alter database add logfile group 1 ('/home/oradata/powerdes/redo01.log') size 50M
*
ERROR at line 1:
ORA-01275: Operation ADD LOGFILE is not allowed if standby file management is automatic.
SQL>

After this error, the existing redo01.log and redo02.log files on the standby were moved out of the way. (ORA-01275 itself indicates that standby_file_management was still AUTO when the statement ran, so the MANUAL setting also has to be in effect.)
[oracle@localhost dbs]$ mv /home/oradata/powerdes/redo01.log /home/oradata/powerdes/bak.redo01.log.20140722
[oracle@localhost dbs]$ mv /home/oradata/powerdes/redo02.log /home/oradata/powerdes/bak.redo02.log.20140722

Running the same statements on the standby again then succeeded:
alter database add logfile group 1 ('/home/oradata/powerdes/redo01.log') size 50M;
alter database add logfile group 2 ('/home/oradata/powerdes/redo02.log') size 50M;
alter database drop logfile group 4;
alter database drop logfile group 5;
alter database drop logfile group 6;
Run the following on both the primary and the standby, in either order:
alter database add standby logfile group 4 ('/home/oradata/powerdes/redo_dg_01.log') size 50m;
alter database add standby logfile group 5 ('/home/oradata/powerdes/redo_dg_02.log') size 50m;
alter database add standby logfile group 6 ('/home/oradata/powerdes/redo_dg_03.log') size 50m;
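After adding standby redo logs it is worth confirming that they exist and that their size matches the online redo logs, since RFS can only use a standby log that is at least as large as the log being shipped. A minimal verification sketch (the log_type and size_mb aliases are illustrative):

```sql
-- Run on primary and standby: online and standby redo log groups side by side.
SELECT 'ONLINE' AS log_type, thread#, group#, bytes/1024/1024 AS size_mb, status
  FROM v$log
UNION ALL
SELECT 'STANDBY', thread#, group#, bytes/1024/1024, status
  FROM v$standby_log
 ORDER BY 1, 3;
```

Here both kinds are 50 MB, which matches. Oracle's documentation recommends one more standby group per thread than there are online groups, so with three online groups a fourth standby group would follow that guideline.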
Start redo apply:
SQL> alter database recover managed standby database disconnect from session;
Database altered.
SQL>
select name,sequence#,applied from v$archived_log;
APPLIED is still NO; nothing has changed to YES.

Set file management back to AUTO:
SQL> alter system set standby_file_management='AUTO';
System altered.
SQL> show parameter standby_file_management;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_file_management              string      AUTO
SQL>
Then start redo apply on the standby again:
alter database recover managed standby database disconnect from session;
8. Try shutting down and restarting the standby

SHUTDOWN IMMEDIATE;
The alert log shows:
Tue Jul 22 11:04:42 2014
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_pr00_13338.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_pr00_13338.trc:
ORA-16037: user requested cancel of managed recovery operation
Waiting for MRP0 pid 13335 to terminate
Tue Jul 22 11:04:44 2014
MRP0: Background Media Recovery process shutdown (powerdes)
Start the standby again:
STARTUP MOUNT;
Start redo apply:
alter database recover managed standby database disconnect from session;
Check whether any log shows YES:
select name,sequence#,applied from v$archived_log;
select sequence#,applied from v$archived_log;
Still NO; nothing has been applied.

Finally run:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
Checking again with select sequence#,applied from v$archived_log; still shows NO: the logs are not being applied.
Check the archived log paths:
select name from v$archived_log;
Check the backup retention policy on the primary:
RMAN> show retention policy;
RMAN configuration parameters for database with db_unique_name PDUNQ are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 15 DAYS;
9. Recover from the archived logs on the standby. This step is slow and takes a long time.

SQL> recover automatic standby database ;
ORA-00279: change 10533608939 generated at 07/22/2014 14:00:38 needed for thread 1
ORA-00289: suggestion : /data/oracle/oradgdata/standby_archive/1_24178_821708334.dbf
ORA-00280: change 10533608939 for thread 1 is in sequence #24178
ORA-00278: log file '/data/oracle/oradgdata/standby_archive/1_24178_821708334.dbf' no longer needed for this recovery
ORA-16145: archival for thread# 1 sequence# 24178 in progress

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-16145: archival for thread# 1 sequence# 24178 in progress

SQL>
The alert log shows:
Media Recovery Log /data/oracle/oradgdata/standby_archive/1_24177_821708334.dbf
Tue Jul 22 14:16:58 2014
Media Recovery Log /data/oracle/oradgdata/standby_archive/1_24178_821708334.dbf
Tue Jul 22 14:16:59 2014
MRP: Archival for thread 1 sequence 24178 in progress
Standby Managed Recovery operation not detected
Errors with log /data/oracle/oradgdata/standby_archive/1_24178_821708334.dbf
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_pr00_13910.trc:
ORA-16145: archival for thread# 1 sequence# 24178 in progress
Tue Jul 22 14:17:13 2014
ORA-279 signalled during: ALTER DATABASE RECOVER automatic standby database ...
ALTER DATABASE RECOVER CONTINUE DEFAULT
Media Recovery Log /data/oracle/oradgdata/standby_archive/1_24178_821708334.dbf
Tue Jul 22 14:17:13 2014
MRP: Archival for thread 1 sequence 24178 in progress
Standby Managed Recovery operation not detected
Errors with log /data/oracle/oradgdata/standby_archive/1_24178_821708334.dbf
Errors in file /oracle/app/oracle/diag/rdbms/pddgunq/powerdes/trace/powerdes_pr00_13910.trc:
ORA-16145: archival for thread# 1 sequence# 24178 in progress
ORA-16145 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ...
ALTER DATABASE RECOVER CANCEL
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Then start redo apply:
SQL> alter database recover managed standby database disconnect from session;
Database altered.
Stop redo apply:
alter database recover managed standby database cancel;
Open the standby read-only so that users can query it:
alter database open read only;
Then start redo apply again:
alter database recover managed standby database disconnect from session;
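Once the standby is open and apply has been restarted, a quick check can confirm the resulting state and how far the standby trails the primary. A sketch assuming an 11g standby, where V$DATAGUARD_STATS reports lag figures (not part of the original session):

```sql
-- Run on the standby: role and open mode after the restart sequence.
SELECT database_role, open_mode
  FROM v$database;

-- Run on the standby: transport and apply lag as computed by Data Guard.
SELECT name, value, time_computed
  FROM v$dataguard_stats
 WHERE name IN ('transport lag', 'apply lag');
```

On 11g, open_mode reads READ ONLY WITH APPLY when redo apply is running on an open standby. Running apply on a read-only standby is the Active Data Guard feature, which is a separately licensed option.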
10. Summary of the two operations

First, the redo logs were rebuilt and three standby log groups were added:
alter database add logfile group 1 ('/home/oradata/powerdes/redo01.log') size 50M;
alter database add logfile group 2 ('/home/oradata/powerdes/redo02.log') size 50M;
alter database drop logfile group 4;
alter database drop logfile group 5;
alter database drop logfile group 6;
alter database add standby logfile group 4 ('/home/oradata/powerdes/redo_dg_01.log') size 50m;
alter database add standby logfile group 5 ('/home/oradata/powerdes/redo_dg_02.log') size 50m;
alter database add standby logfile group 6 ('/home/oradata/powerdes/redo_dg_03.log') size 50m;