11.2 Data Guard Physical Standby Switchover Best Practices using SQL*Plus (Doc ID 1304939.1)
Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
Perform a trouble-free Data Guard switchover.
Note: For Data Guard switchover using the Broker, refer to Note 1305019.1 - "11.2 Data Guard Physical Standby Switchover Best Practices using the Broker".
For the purposes of this document, the following fictitious environment is used as an example to describe the procedure:
Primary Database: DB_NAME: SFO
Standby Database: DB_UNIQUE_NAME: NYC
ALERT: If you upgraded the database from a previous version (e.g. 10.2, 11.1, 11.2.0.1) to 11.2.0.2, you must refer to Note 1288640.1 "Managed Recovery (MRP) Fails w/ ORA-328 After Upgrade to 11.2.0.2 and Switchover".
Also ensure that 'compatible' is set correctly and to the same value on the primary and standby sites.
These steps should be completed before the planned switchover maintenance window begins. Our recommendation is to complete them a couple of days in advance.
The following query at the standby verifies that managed recovery is running:
SQL> SELECT PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
The following query at the primary verifies that recovery is running with the "REAL TIME APPLY" option. In the example below, LOG_ARCHIVE_DEST_2 is established to ship redo to the target standby (dest_id=2):
SQL> SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;
RECOVERY_MODE
-----------------------
MANAGED REAL TIME APPLY
If managed standby recovery is not running, or was not started with real-time apply, restart managed recovery with real-time apply enabled:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
Note: If you previously defined a delay for this standby, the delay is ignored when you start real-time apply.
For more information see Section 3.2.7 Verify the Physical Standby Database Is Performing Properly.
Ensure that LOG_ARCHIVE_MAX_PROCESSES is set to 4 or higher on every primary and standby database in the Data Guard configuration. Take care not to set it too high, because additional archive processes can increase the time needed to shut down the database. This parameter can be set dynamically with ALTER SYSTEM.
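As a hedged illustration of the dynamic change described above, the following SQL*Plus sketch checks the current setting and raises it to 4, the minimum named in this note. SCOPE=BOTH and SID='*' are assumptions: they presume the instance uses an spfile and, on RAC, that every instance should receive the same value.

```sql
-- Show the current setting; run on each primary and standby database.
SHOW PARAMETER log_archive_max_processes

-- Raise the value to the minimum suggested in this note.
-- No restart is required because the parameter is dynamic.
-- SCOPE=BOTH assumes an spfile; SID='*' targets all RAC instances.
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
```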
Online redo logs on the target physical standby must be cleared before that standby database can become a primary database. Although this happens automatically as part of the SWITCHOVER TO PRIMARY command, it is recommended that the logs be cleared before the switchover.
Setting the LOG_FILE_NAME_CONVERT parameter at the physical standby causes the online redo logs to be cleared automatically when managed recovery is started on the standby.
If your databases use Oracle Managed Files (OMF), or you have already set the LOG_FILE_NAME_CONVERT parameter, you can skip this step because the online log files will always be cleared automatically.
Clearing online redo logs as part of the SWITCHOVER TO PRIMARY command can make the switchover command susceptible to termination by another process that is waiting for access to the CONTROLFILE. The CONTROLFILE waiter will attempt to kill the switchover after a timeout of 15 minutes.
Oracle recommends setting LOG_FILE_NAME_CONVERT to automatically clear online redo logs on the physical standby database. If the primary database and the physical standby database have exactly the same directory path to the online redo logs, it is acceptable to set LOG_FILE_NAME_CONVERT so that both entries of the pair have the same value.
As an example, if the online redo logs are stored in /oradata/order_db/redo for both the primary and physical standby databases on their respective servers, you can set the parameter value as:
LOG_FILE_NAME_CONVERT='/oradata/order_db/redo/','/oradata/order_db/redo/'
This initiates automatic clearing of the online redo logs on the physical standby database when managed recovery is started.
Because the LOG_FILE_NAME_CONVERT parameter is not dynamic, you must restart the standby database for the change to take effect.
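A minimal sketch of recording that setting in the server parameter file, assuming the standby uses an spfile and the identical-path layout from the example above. SCOPE=SPFILE is used because the parameter is static, so the new value is only picked up at the next restart of the standby.

```sql
-- Run on the physical standby. The paths are the example paths from this note.
-- SCOPE=SPFILE: the parameter is not dynamic, so it cannot be changed in memory.
ALTER SYSTEM SET log_file_name_convert=
  '/oradata/order_db/redo/','/oradata/order_db/redo/'
  SCOPE=SPFILE SID='*';

-- After the standby has been restarted, confirm the value in effect.
SHOW PARAMETER log_file_name_convert
```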
If you have not set your environment to automatically clear the online redo logs and you do not want to restart the standby database, clear them manually at some point before the switchover. This can be done at any time.
On the target physical standby, run the following query to determine whether any online redo logs have not been cleared:
SQL> SELECT DISTINCT L.GROUP# FROM V$LOG L, V$LOGFILE LF WHERE L.GROUP# = LF.GROUP# AND L.STATUS NOT IN ('UNUSED', 'CLEARING', 'CLEARING_CURRENT');
If the above query returns rows, stop Redo Apply on the target physical standby, issue the following statement for each GROUP# returned, and then restart Redo Apply:
SQL> ALTER DATABASE CLEAR LOGFILE GROUP <ORL GROUP# from the query above>;
Note: Redo Apply can be stopped in SQL*Plus with ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; and restarted with ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT; If the configuration is managed by the Data Guard Broker, use the DGMGRL commands EDIT DATABASE 'StandbyName' SET STATE='APPLY-OFF'; and EDIT DATABASE 'StandbyName' SET STATE='APPLY-ON'; instead.
Note that if the actual switchover is later terminated by a CONTROLFILE waiter timeout, simply re-issue the SWITCHOVER TO PRIMARY command until it completes successfully.
Monitor the alert log to confirm that the online redo logs are being cleared and that you are not experiencing some other issue.
Identify the current sequence number for each thread on the primary database:
SQL> SELECT THREAD#, SEQUENCE# FROM V$THREAD;
Verify that the target physical standby database has applied all logs up to, but not including, the current sequence reported by the primary query. On the standby, the result of the following query should be within 1 or 2 of the primary query result.
SQL> SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG WHERE APPLIED = 'YES' AND RESETLOGS_CHANGE# = (SELECT RESETLOGS_CHANGE# FROM V$DATABASE_INCARNATION WHERE STATUS = 'CURRENT') GROUP BY THREAD#;
If large gaps exist (more than 3 logs), see Section 6.4.3 Redo Gap Detection and Resolution.
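As a supplementary check that is not part of the original note, the standard V$ARCHIVE_GAP view can be queried on the standby to see whether Oracle itself currently reports a gap that is blocking recovery. This is only a sketch; an empty result does not replace the sequence comparison above.

```sql
-- Run on the target physical standby.
-- Each row describes a range of missing archived logs for one thread;
-- no rows returned means no archive gap is currently detected.
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE#
  FROM V$ARCHIVE_GAP;
```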
Run the following query on both the primary and the target standby and compare the temporary files reported:
SQL> SELECT TMP.NAME FILENAME, BYTES, TS.NAME TABLESPACE FROM V$TEMPFILE TMP, V$TABLESPACE TS WHERE TMP.TS#=TS.TS#;
If the query results do not match, you can correct the mismatch now or immediately after the new primary database is opened.
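One possible way to correct a missing tempfile is sketched below. The tablespace name TEMP, the file path, and the size are all hypothetical placeholders; substitute the values reported by the V$TEMPFILE query on the primary. The statement needs an open database, so this sketch assumes it is run on the new primary after it has been opened.

```sql
-- Hypothetical tablespace name, path, and size, for illustration only.
ALTER TABLESPACE TEMP
  ADD TEMPFILE '/oradata/order_db/temp02.dbf' SIZE 1G;

-- Re-run the comparison query to confirm the new file is listed.
SELECT TMP.NAME FILENAME, BYTES, TS.NAME TABLESPACE
  FROM V$TEMPFILE TMP, V$TABLESPACE TS
 WHERE TMP.TS# = TS.TS#;
```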
Prior to switchover, verify on the target standby that all datafiles needed for updates after the role transition to primary are ONLINE.
On the target standby:
SQL> SELECT NAME FROM V$DATAFILE WHERE STATUS='OFFLINE';
If there are any OFFLINE datafiles that will be needed after the switchover, bring them ONLINE:
SQL> ALTER DATABASE DATAFILE 'datafile-name' ONLINE;
These steps are completed as part of the switchover process on the day of the planned outage.
Remove any delay in applying redo that may be in effect on the standby database that will become the new primary database. If there is a delay, execute the following command on the target standby database:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE NODELAY DISCONNECT FROM SESSION;
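To find out whether a delay is configured in the first place, one option (a sketch, not taken from the original note) is to look at the DELAY_MINS column of V$ARCHIVE_DEST on the current primary, where the DELAY attribute of the redo transport destination is defined.

```sql
-- Run on the current primary.
-- DELAY_MINS greater than 0 indicates an apply delay was specified
-- for that standby destination.
SELECT DEST_ID, DESTINATION, DELAY_MINS
  FROM V$ARCHIVE_DEST
 WHERE TARGET = 'STANDBY'
   AND STATUS = 'VALID';
```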
Capture the current job state on the primary:
SQL> SELECT * FROM DBA_JOBS_RUNNING;
Depending on what the running job is, be ready to terminate it if necessary.
SQL> SELECT OWNER, JOB_NAME, START_DATE, END_DATE, ENABLED FROM DBA_SCHEDULER_JOBS WHERE ENABLED='TRUE' AND OWNER <> 'SYS';
SQL> SHOW PARAMETER job_queue_processes
Note: Candidate jobs to disable include, among others: Oracle Text sync and optimizer jobs, RMAN backups, application garbage collectors, and application background agents.
Block further job submission:
SQL> ALTER SYSTEM SET job_queue_processes=0 SCOPE=BOTH SID='*';
Disable any jobs that may interfere:
SQL> EXECUTE DBMS_SCHEDULER.DISABLE( <job_name> );
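For concreteness, here is a sketch of the disable call with an actual argument. APP_OWNER.NIGHTLY_PURGE is a hypothetical job name used only for illustration; in practice, use the OWNER and JOB_NAME values captured from DBA_SCHEDULER_JOBS and keep a list of what you disabled so the jobs can be re-enabled after the switchover.

```sql
-- APP_OWNER.NIGHTLY_PURGE is a hypothetical job, not one from this note.
BEGIN
  DBMS_SCHEDULER.DISABLE(name => 'APP_OWNER.NIGHTLY_PURGE');
END;
/

-- Confirm that the job is now disabled.
SELECT OWNER, JOB_NAME, ENABLED
  FROM DBA_SCHEDULER_JOBS
 WHERE OWNER = 'APP_OWNER'
   AND JOB_NAME = 'NIGHTLY_PURGE';
```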
Optionally, shut down all mid-tiers. This can be done in parallel with the switchover.
$ opmnctl stopall
Tracing is turned on so that diagnostic information is available in case any issues arise. Turning on tracing has no noticeable impact on switchover time but does require space for the trace output.
Capture the current value on both the primary and the target physical standby databases:
SQL> SHOW PARAMETER log_archive_trace
Set the Data Guard trace level to 8191 on both the primary and the target physical standby databases:
SQL> ALTER SYSTEM SET log_archive_trace=8191;
Trace output will appear under the destination pointed to by the database parameter BACKGROUND_DUMP_DEST, with "mrp" in the file name.
Locate the alert logs by showing the database parameter background_dump_dest:
SQL> SHOW PARAMETER background_dump_dest
Tail the alert logs:
$ tail -f <background_dump_dest location>/alert*
The standard switchover fallback options should suffice for backing out of a switchover. However, if you want an additional fallback option, you can create a guaranteed restore point on the primary and standby databases participating in the switchover.
On the standby
Stop the apply process
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Create a guaranteed restore point
SQL> CREATE RESTORE POINT SWITCHOVER_START_GRP GUARANTEE FLASHBACK DATABASE;
Start the apply process
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
On the primary
Create a guaranteed restore point
SQL> CREATE RESTORE POINT SWITCHOVER_START_GRP GUARANTEE FLASHBACK DATABASE;
Note: If guaranteed restore points are created, make sure they are dropped after the switchover!
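To confirm that the restore point exists after creating it, and later that it has been dropped, the standard V$RESTORE_POINT view can be queried on each database. This check is a sketch added for illustration and is not part of the original note.

```sql
-- Run on each database where the restore point was created.
-- GUARANTEE_FLASHBACK_DATABASE should be YES for a guaranteed restore point;
-- no rows returned after the switchover means it has been dropped.
SELECT NAME, SCN, TIME, GUARANTEE_FLASHBACK_DATABASE
  FROM V$RESTORE_POINT
 WHERE NAME = 'SWITCHOVER_START_GRP';
```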
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the primary database:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO STANDBY
A value of TO STANDBY or SESSIONS ACTIVE (which requires the WITH SESSION SHUTDOWN clause on the switchover command) indicates that the primary database can be switched to the standby role. If neither of these values is returned, a switchover is not possible because redo transport is either misconfigured or not functioning properly. See Appendix A.4 Problems Switching Over to a Physical Standby Database.
If the primary is a RAC database, shut down all secondary instances on the primary cluster, leaving only one primary instance up. A normal or immediate shutdown can be used, but issuing SHUTDOWN ABORT on the secondary instances expedites the shutdown. Before continuing, wait until the remaining primary instance has completed cluster reconfiguration (and has performed recovery, if you chose to abort the secondary instances).
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
If an ORA-16139 error is encountered, you can proceed as long as V$DATABASE.DATABASE_ROLE='PHYSICAL STANDBY'. A common case where this can occur is when there are a large number of data files. Once managed recovery is started on the new standby, the database will recover.
If the role was not changed, you need to cancel the switchover and review the alert logs and trace files further.
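The role check mentioned above can be done with a query such as the following sketch, run on the database that was just switched to the standby role. SWITCHOVER_STATUS is included only as additional context.

```sql
-- Run on the former primary after the switchover command returns.
-- DATABASE_ROLE should read PHYSICAL STANDBY if the role change succeeded.
SELECT DATABASE_ROLE, SWITCHOVER_STATUS
  FROM V$DATABASE;
```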
In the primary alert log you will see messages like these:
Switchover: Primary controlfile converted to standby controlfile succesfully.
Tue Mar 15 16:12:15 2011
MRP0 started with pid=17, OS id=2717
MRP0: Background Managed Standby Recovery process started (SFO)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Online logfile pre-clearing operation disabled by switchover
Media Recovery Log /u01/app/flash_recovery_area/SFO/archivelog/2011_03_15/o1_mf_1_133_6qzl0yvd_.arc
Identified End-Of-Redo for thread 1 sequence 133
Resetting standby activation ID 0 (0x0)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Applied until change 4314801
MRP0: Media Recovery Complete: End-Of-REDO (SFO)
MRP0: Background Media Recovery process shutdown (SFO)
Tue Mar 15 16:12:21 2011
Switchover: Complete - Database shutdown required (SFO)
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
Correspondingly, in the standby alert log file you should see messages like these:
Tue Mar 15 16:12:15 2011
RFS[8]: Assigned to RFS process 2715
RFS[8]: Identified database type as 'physical standby': Client is Foreground pid 2568
Media Recovery Log /u01/app/flash_recovery_area/NYC/archivelog/2011_03_15/o1_mf_1_133_6qzl0yjp_.arc
Identified End-Of-Redo for thread 1 sequence 133
Resetting standby activation ID 2680651518 (0x9fc77efe)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Resetting standby activation ID 2680651518 (0x9fc77efe)
Media Recovery Waiting for thread 1 sequence 134
In versions prior to Oracle Database 11g Release 2, the MRP (Redo Apply coordinator) would stop automatically after processing the End-of-Redo marker. With Oracle Database 11g Release 2, it no longer stops, which leaves all bystander standby databases ready to apply redo from the new primary database without having to be restarted. The MRP process is shut down automatically by the switchover command when it is executed at the target standby database.
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the standby database:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO PRIMARY
A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is returned, verify that Redo Apply is active and that redo transport is configured and working properly. Continue to query this column until the value returned is either TO PRIMARY or SESSIONS ACTIVE.
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
In the standby alert log file you should see messages like these:
Tue Mar 15 16:16:44 2011
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN
ALTER DATABASE SWITCHOVER TO PRIMARY (NYC)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Tue Mar 15 16:16:45 2011
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Waiting for MRP0 pid 2460 to terminate
Errors in file /u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:
ORA-16037: user requested cancel of managed recovery operation
Tue Mar 15 16:16:45 2011
MRP0: Background Media Recovery process shutdown (NYC)
Role Change: Canceled MRP
SQL> ALTER DATABASE OPEN;
Note: There will be an increase in I/O activity while the new primary's standby redo logs are cleared.
If a tempfile mismatch was not corrected during the pre-switchover check, correct it now on the new primary.
If the new standby database (the former primary database) has not been shut down since it was switched to the standby role, restart it in the mount state and start managed recovery. This can be done in parallel with opening the new primary.
SQL> SHUTDOWN ABORT;
Note: If you use IMMEDIATE, an ABORT will be performed anyway as of 11.2.0.2, and you will see the following in the alert log:
Performing implicit shutdown abort due to switchover to physical standby
Shutting down instance (abort)
License high water mark = 15
USER (ospid: 14665): terminating the instance
Instance terminated by USER, pid = 14665
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
Note: If you were using a delay for your standby then you would restart the apply without real time apply:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
Finally, if the database is a RAC database, start all secondary instances on the new standby.
If the switchover fails, see Appendix A.4.5 Roll Back After Unsuccessful Switchover and Start Over in the Data Guard Concepts and Administration manual.
For each instance on the primary and standby, reset tracing to its prior value:
SQL> ALTER SYSTEM SET log_archive_trace=<prior value>;
Set job_queue_processes back to its original value on the new standby.
SQL> ALTER SYSTEM SET job_queue_processes=<value saved> SCOPE=BOTH SID='*';
Enable any jobs that were disabled.
SQL> EXECUTE DBMS_SCHEDULER.ENABLE(<for each job name captured>);
On all databases where a Guaranteed Restore point was created
SQL> DROP RESTORE POINT SWITCHOVER_START_GRP;
Original article: https://www.cnblogs.com/zylong-sys/p/12040447.html