I. View the fundamental components of Replication
1. View the Publisher and the Publisher database
    select pdb.publisher_id,
        srv.name as publisher_name,
        pdb.id as publisher_db_id,
        pdb.publisher_db as publisher_db_name
    from dbo.MSpublisher_databases pdb
    inner join sys.servers srv
        on pdb.publisher_id = srv.server_id
2. View the Publications
    select p.publisher_id,
        p.publisher_db,
        p.publication_id,
        p.publication as publication_name,
        case p.publication_type
            when 0 then 'Transactional'
            when 1 then 'Snapshot'
            when 2 then 'Merge'
            else ''
        end as publication_type,
        p.immediate_sync,
        p.retention,
        p.sync_method,
        p.description
    from dbo.MSpublications p
The sync_method column records the synchronization method of the publication.
3. View the relationship between Articles and Publications
    select a.publisher_id,
        a.publisher_db,
        a.publication_id,
        a.article_id,
        a.article,
        a.source_owner,
        a.source_object,
        a.destination_object
    from dbo.MSarticles a with(nolock)
4. View the relationship between Subscriptions and Publications
    select s.publisher_id,
        s.publisher_database_id,
        s.publisher_db,
        s.publication_id,
        s.article_id,
        s.subscriber_id,
        s.subscriber_db,
        case s.status
            when 0 then 'Inactive'
            when 1 then 'Subscribed'
            when 2 then 'Active'
            else ''
        end as status,
        s.agent_id,
        case s.subscription_type
            when 0 then 'Push'
            when 1 then 'Pull'
            else 'Anonymous'
        end as subscription_type,
        case when s.sync_type = 1 then 'Automatic'
            else 'No synchronization'
        end as sync_type
    from dbo.MSsubscriptions s
    where s.subscriber_id <> -1
II. View the Log Reader history
    ;with cte as
    (
        select lh.agent_id,
            case lh.runstatus
                when 1 then 'Start'
                when 2 then 'Succeed'
                when 3 then 'In progress'
                when 4 then 'Idle'
                when 5 then 'Retry'
                when 6 then 'Fail'
                else ''
            end as runstatus,
            lh.time as LogTime,
            lh.duration/1000 as duration_s,
            lh.comments,
            lh.xact_seqno,
            lh.delivery_latency,
            cast(lh.delivery_rate as int) as [avg_cmd_s],
            lh.delivered_transactions,
            lh.delivered_commands,
            lh.average_commands as [avg_cmd/tran],
            row_number() over(partition by lh.agent_id order by lh.time desc) as rid
        from dbo.MSlogreader_history lh with(nolock)
    )
    select lra.publisher_db,
        c.runstatus,
        c.LogTime,
        c.duration_s,
        c.comments,
        c.delivery_latency/1000 as latency_s,
        c.[avg_cmd_s],
        c.[avg_cmd/tran],
        c.delivered_transactions,
        c.delivered_commands,
        c.xact_seqno,
        --lra.name as LogReaderAgentName,
        c.agent_id
    from cte as c
    inner join dbo.MSlogreader_agents lra with(nolock)
        on c.agent_id = lra.id
    where c.rid <= 11
    order by c.agent_id, c.LogTime desc
The comments column carries a lot of information, for example:
<stats state="1" work="4222" idle="180724" > <reader fetch="3806" wait="35"/> <writer write="46" wait="112801"/> <sincelaststats elapsedtime="301" work="0" cmds="0" cmdspersec="0.000000"> <reader fetch="0" wait="0"/> <writer write="0" wait="0"/> </sincelaststats> <message>Normal events that describe both the reader and writer thread performance.</message> </stats>
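The statistics in the comments column are XML, so they can be pulled apart programmatically once the rows have been fetched. The following is a minimal Python sketch that parses the sample value shown above. The function name parse_stats and the keys of the returned dictionary are hypothetical, chosen only for illustration, and the sketch assumes that comments which are not XML should simply be skipped.

```python
import xml.etree.ElementTree as ET

# Sample value copied from the comments column shown above.
SAMPLE = (
    '<stats state="1" work="4222" idle="180724" >'
    '<reader fetch="3806" wait="35"/>'
    '<writer write="46" wait="112801"/>'
    '<sincelaststats elapsedtime="301" work="0" cmds="0" cmdspersec="0.000000">'
    '<reader fetch="0" wait="0"/><writer write="0" wait="0"/>'
    '</sincelaststats>'
    '<message>Normal events that describe both the reader and writer '
    'thread performance.</message>'
    '</stats>'
)


def parse_stats(comment):
    """Flatten a <stats> comment into a dict; return None for non-XML text."""
    try:
        root = ET.fromstring(comment)
    except ET.ParseError:
        return None  # plain-text comment, nothing to parse
    if root.tag != "stats":
        return None
    reader = root.find("reader")          # top-level reader thread counters
    writer = root.find("writer")          # top-level writer thread counters
    since = root.find("sincelaststats")   # counters since the previous stats row
    return {
        "state": int(root.get("state")),
        "work": int(root.get("work")),
        "idle": int(root.get("idle")),
        "reader_fetch": int(reader.get("fetch")),
        "reader_wait": int(reader.get("wait")),
        "writer_write": int(writer.get("write")),
        "writer_wait": int(writer.get("wait")),
        "since_elapsed": int(since.get("elapsedtime")),
        "since_cmds": int(since.get("cmds")),
        "since_cmds_per_sec": float(since.get("cmdspersec")),
        "message": root.findtext("message"),
    }


if __name__ == "__main__":
    for key, value in parse_stats(SAMPLE).items():
        print(key, "=", value)
```

In this sample the writer wait (112801) is far larger than the reader wait (35), which is the kind of comparison the flattened values make easy.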
III. View Transactions and Commands
The Log Reader Agent reads the transactions and commands from the Publisher database into the MSrepl_transactions and MSrepl_commands tables in the distribution database. The two tables are linked by the xact_seqno column (the sequence number of the transaction). Use sp_browsereplcmds to view the SQL statements stored in the MSrepl_commands table.
See "Transactional Replication5: Transaction and Command".
IV. View the Distribution history
1. View the distribution status
In the distribution database, SQL Server provides a view, MSdistribution_status, that shows how many commands of each article have been delivered and how many are still pending.
UndelivCmdsInDistDB: The number of commands pending delivery to Subscribers.
DelivCmdsInDistDB: The number of commands delivered to Subscribers.
    select *
    from dbo.MSdistribution_status
    order by UndelivCmdsInDistDB desc
Use sp_helptext to view the definition of this view:
    exec sp_helptext 'MSdistribution_status'
The body of MSdistribution_status is defined as:
    SELECT t.article_id,
        s.agent_id,
        'UndelivCmdsInDistDB' = SUM(CASE WHEN xact_seqno > h.maxseq THEN 1 ELSE 0 END),
        'DelivCmdsInDistDB' = SUM(CASE WHEN xact_seqno <= h.maxseq THEN 1 ELSE 0 END)
    FROM (
        SELECT article_id, publisher_database_id, xact_seqno
        FROM MSrepl_commands with (NOLOCK)
    ) as t
    JOIN (
        SELECT agent_id, article_id, publisher_database_id
        FROM MSsubscriptions with (NOLOCK)
    ) AS s
        ON (t.article_id = s.article_id AND t.publisher_database_id = s.publisher_database_id)
    JOIN (
        SELECT agent_id, 'maxseq' = isnull(max(xact_seqno), 0x0)
        FROM MSdistribution_history with (NOLOCK)
        GROUP BY agent_id
    ) as h
        ON (h.agent_id = s.agent_id)
    GROUP BY t.article_id, s.agent_id
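The counting logic of the view can be easier to follow outside SQL. The sketch below is a small Python model of it, not the server's implementation: the function name distribution_status and the tuple layouts are hypothetical, and xact_seqno is modelled as a plain integer instead of a binary value. Each command is counted as delivered for an agent when its sequence number is not greater than the highest xact_seqno found in that agent's history.

```python
from collections import defaultdict


def distribution_status(commands, subscriptions, history):
    """
    Model of the MSdistribution_status view.

    commands:      iterable of (publisher_database_id, article_id, xact_seqno)
    subscriptions: iterable of (agent_id, publisher_database_id, article_id)
    history:       iterable of (agent_id, xact_seqno)

    Returns {(article_id, agent_id): (undelivered, delivered)}.
    """
    # Highest sequence number logged by each agent (the "maxseq" subquery).
    max_seq = {}
    for agent_id, seq in history:
        if agent_id not in max_seq or seq > max_seq[agent_id]:
            max_seq[agent_id] = seq

    counts = defaultdict(lambda: [0, 0])
    for agent_id, sub_db, sub_article in subscriptions:
        if agent_id not in max_seq:
            continue  # inner join: agents without history rows drop out
        for cmd_db, cmd_article, seq in commands:
            if cmd_db != sub_db or cmd_article != sub_article:
                continue
            if seq > max_seq[agent_id]:
                counts[(cmd_article, agent_id)][0] += 1  # UndelivCmdsInDistDB
            else:
                counts[(cmd_article, agent_id)][1] += 1  # DelivCmdsInDistDB
    return {key: tuple(value) for key, value in counts.items()}


if __name__ == "__main__":
    commands = [(1, 10, 5), (1, 10, 6), (1, 10, 7), (1, 11, 7)]
    subscriptions = [(100, 1, 10), (101, 1, 10), (100, 1, 11)]
    history = [(100, 5), (100, 6), (101, 7)]
    for key, value in sorted(distribution_status(commands, subscriptions, history).items()):
        print(key, value)
```

In the sample data agent 100 has logged up to sequence 6, so the command with sequence 7 is still pending for it, while agent 101 has nothing pending.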
2. View the distribution history
    ;with cte as
    (
        select dh.agent_id,
            case dh.runstatus
                when 1 then 'Start'
                when 2 then 'Succeed'
                when 3 then 'In progress'
                when 4 then 'Idle'
                when 5 then 'Retry'
                when 6 then 'Fail'
                else ''
            end as runstatus,
            dh.time as LogTime,
            dh.duration/1000 as duration_s,
            dh.comments,
            dh.xact_seqno,
            dh.current_delivery_rate,
            dh.current_delivery_latency/1000 as current_delivery_latency_s,
            dh.average_commands as [avg_cmd/tran],
            dh.delivered_transactions,
            dh.delivered_commands,
            cast(dh.delivery_rate as bigint) as avg_cmd_s,
            dh.delivery_latency/1000 as latency_s,
            row_number() over(partition by dh.agent_id order by dh.time desc) as rid
        from dbo.MSdistribution_history dh with(nolock)
    )
    select da.publication,
        c.runstatus,
        c.LogTime,
        c.duration_s,
        c.comments,
        c.current_delivery_latency_s,
        c.current_delivery_rate,
        c.[avg_cmd/tran],
        c.avg_cmd_s,
        c.latency_s,
        c.delivered_transactions,
        c.delivered_commands,
        da.name as DistributionAgentName,
        da.publisher_db,
        da.subscriber_db,
        c.xact_seqno,
        c.agent_id
    from cte c
    inner join dbo.MSdistribution_agents da with(nolock)
        on c.agent_id = da.id
    where da.subscriber_id <> -1
        and c.rid <= 4
    order by c.agent_id, c.LogTime desc
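V. Maintaining the Distribution history
1. The MSdistribution_history table has a key column, xact_seqno, which records the ID of the last transaction processed by the agent.
xact_seqno: The last processed transaction sequence number.
If an article has N subscriptions, each subscription has its own agent ID. The distribution history of each of these N agents contains a maximum xact_seqno, the ID of the last transaction that agent delivered successfully. The smallest of these per-agent maximums is the largest transaction ID that has been delivered to every subscription. Transactions with a smaller ID can be deleted from the MSrepl_transactions table, and their commands can be deleted from the MSrepl_commands table.
The system maintenance agent "Distribution clean up: distribution" deletes transactions and commands on this principle. In addition, SQL Server applies a default retention to transactions and commands: the minimum retention is 0 hours and the maximum is 120 hours. Commands older than 120 hours are deleted automatically even if they have not been delivered to a subscription.
    use distribution
    go
    exec dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 120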
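The "smallest of the per-agent maximums" rule can be sketched in a few lines of Python. This is only a model of the principle described above, not what sp_MSdistribution_cleanup actually executes: the function names are hypothetical, xact_seqno is modelled as a plain integer, and the retention window is left out.

```python
def cleanup_watermark(history):
    """
    history: iterable of (agent_id, xact_seqno) rows.
    Returns the largest sequence number delivered to every agent,
    i.e. the minimum of the per-agent maximums, or None if there is no history.
    """
    max_per_agent = {}
    for agent_id, seq in history:
        if agent_id not in max_per_agent or seq > max_per_agent[agent_id]:
            max_per_agent[agent_id] = seq
    if not max_per_agent:
        return None
    return min(max_per_agent.values())


def removable(transaction_seqnos, watermark):
    """Sequence numbers below the watermark, which the rule allows to be deleted."""
    if watermark is None:
        return []
    return [seq for seq in transaction_seqnos if seq < watermark]


if __name__ == "__main__":
    history = [(1, 5), (1, 9), (2, 7), (3, 12)]
    watermark = cleanup_watermark(history)
    print("watermark:", watermark)
    print("removable:", removable([3, 6, 7, 8], watermark))
```

With the sample rows, the three agents have reached 9, 7 and 12, so the slowest agent (7) decides what can be removed.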
2. Cleaning up agent history
When the Distributor is created, SQL Server automatically creates the agent "Agent history clean up: distribution", which "Removes replication agent history from the distribution database." In this setup, agent history is retained for 120 hours (5 days):
    use distribution
    go
    exec dbo.sp_MShistory_cleanup @history_retention = 120
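3. Check whether agents are logging history
Because the distribution history is so important, SQL Server creates a maintenance agent, "Replication agents checkup", that checks whether agents log their history in time. The default heartbeat_interval is 10 minutes. If an agent logs no progress within the heartbeat_interval, sp_replication_agent_checkup raises error 14151 for each such agent and inserts a failure record into that agent's history table.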
sp_replication_agent_checkup (Transact-SQL)
Checks each distribution database for replication agents that are running but have not logged history within the specified heartbeat interval. This stored procedure is executed at the Distributor on any database.
sp_replication_agent_checkup raises error 14151 for each agent it detects as suspect. It also logs a failure history message about the agents.
exec sys.sp_replication_agent_checkup @heartbeat_interval = 35
Viewing the procedure with sp_helptext shows that it ultimately inserts failure messages into the tables that store history, such as MSlogreader_history and MSdistribution_history.
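The heartbeat rule itself is simple: an agent that is running but whose latest history row is older than the heartbeat interval is treated as suspect. The Python sketch below models only that rule. The function name suspect_agents and the input layout are hypothetical, and the real procedure reads its data from the history tables of each distribution database.

```python
from datetime import datetime, timedelta


def suspect_agents(agents, now, heartbeat_minutes=10):
    """
    agents: dict mapping agent name -> (is_running, last_history_time)
    Returns the sorted names of running agents that have not logged
    history within the heartbeat interval.
    """
    cutoff = now - timedelta(minutes=heartbeat_minutes)
    return sorted(
        name
        for name, (is_running, last_logged) in agents.items()
        if is_running and last_logged < cutoff
    )


if __name__ == "__main__":
    now = datetime(2016, 7, 29, 12, 0, 0)
    agents = {
        "logreader-1": (True, now - timedelta(minutes=3)),     # healthy
        "distribution-7": (True, now - timedelta(minutes=25)),  # silent too long
        "distribution-9": (False, now - timedelta(hours=4)),    # not running
    }
    print(suspect_agents(agents, now))
```

Only the running agent that has been silent for 25 minutes is reported. The stopped agent is ignored because the check applies to running agents only.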
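VI. Controlling how history is logged
SQL Server Replication uses .exe applications to read the log and to distribute and synchronize data. Each executable accepts parameters that control its behavior. Changing the parameter values through the agent profiles on the Distributor controls how verbosely and how often an agent writes history.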
Replication uses a number of standalone programs, called agents, to carry out the tasks associated with tracking changes and distributing data.
1. Control the verbosity and interval of Log Reader Agent history logging
Relevant Replication Log Reader Agent parameters:
-HistoryVerboseLevel [ 0| 1| 2]
Specifies the amount of history logged during a log reader operation. You can minimize the performance effect of history logging by selecting 1.
1:Default. Always update a previous history message of the same status (startup, progress, success, and so on). If no previous record with the same status exists, insert a new record.
2:Insert new history records unless the record is for such things as idle messages or long-running job messages, in which case update the previous records.
-KeepAliveMessageInterval keep_alive_message_interval_seconds
Is the number of seconds before the history thread checks if any of the existing connections is waiting for a response from the server. This value can be decreased to avoid having the checkup agent mark the Log Reader Agent as suspect when executing a long-running batch. The default is 300 seconds.
-MessageInterval message_interval
Is the time interval used for history logging. A history event is logged when the MessageInterval value is reached after the last history event is logged.
If there is no replicated transaction available at the source, the agent reports a no-transaction message to the Distributor. This option specifies how long the agent waits before reporting another no-transaction message. Agents always report a no-transaction message when they detect that there are no transactions available at the source after previously processing replicated transactions. The default is 60 seconds.
2. Control the verbosity and interval of Distribution Agent history logging
Relevant Replication Distribution Agent parameters:
-HistoryVerboseLevel [ 0 | 1 | 2 | 3 ]
Specifies the amount of history logged during a distribution operation. You can minimize the performance effect of history logging by selecting 1.
HistoryVerboseLevel value | Description |
---|---|
0 | Progress messages are written either to the console or to an output file. History records are not logged in the distribution database. |
1 | Default. Always update a previous history message of the same status (startup, progress, success, and so on). If no previous record with the same status exists, insert a new record. |
2 | Insert new history records unless the record is for such things as idle messages or long-running job messages, in which case update the previous records. |
3 | Always insert new records, unless it is for idle messages. |
-KeepAliveMessageInterval keep_alive_message_interval_seconds
Is the number of seconds before the history thread checks if any of the existing connections is waiting for a response from the server. This value can be decreased to avoid having the checkup agent mark the Distribution Agent as suspect when executing a long-running batch. The default is 300 seconds.
-MessageInterval message_interval
Is the time interval used for history logging. A history event is logged when one of these parameters is reached:
The TransactionsPerHistory value is reached after the last history event is logged.
The MessageInterval value is reached after the last history event is logged.
If there is no replicated transaction available at the source, the agent reports a no-transaction message to the Distributor. This option specifies how long the agent waits before reporting another no-transaction message. Agents always report a no-transaction message when they detect that there are no transactions available at the source after previously processing replicated transactions. The default is 60 seconds.
-TransactionsPerHistory [ 0| 1|... 10000]
Specifies the transaction interval for history logging. If the number of committed transactions after the last instance of history logging is greater than this option, a history message is logged. The default is 100. A value of 0 indicates infinite TransactionsPerHistory. See the preceding -MessageInterval parameter.
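How -MessageInterval and -TransactionsPerHistory combine can be sketched as a single decision function. This is a reading of the parameter descriptions above, not the agent's actual code: the function name should_log_history is hypothetical, and the exact boundary behavior (for example, whether the interval check uses "at least" or "more than") is an assumption.

```python
def should_log_history(seconds_since_last, transactions_since_last,
                       message_interval=60, transactions_per_history=100):
    """
    Return True when a new history event would be logged.

    seconds_since_last:       seconds elapsed since the last history event
    transactions_since_last:  transactions committed since the last history event
    message_interval:         -MessageInterval value in seconds (default 60)
    transactions_per_history: -TransactionsPerHistory value (default 100, 0 = infinite)
    """
    # Time-based trigger: the MessageInterval has been reached.
    if seconds_since_last >= message_interval:
        return True
    # A value of 0 means the transaction count never triggers logging.
    if transactions_per_history == 0:
        return False
    # Count-based trigger: more transactions than TransactionsPerHistory.
    return transactions_since_last > transactions_per_history


if __name__ == "__main__":
    print(should_log_history(10, 50))    # neither threshold reached
    print(should_log_history(61, 0))     # interval reached
    print(should_log_history(10, 101))   # transaction count exceeded
    print(should_log_history(10, 100000, transactions_per_history=0))
```

Under these assumptions, setting TransactionsPerHistory to 0 leaves the MessageInterval as the only trigger, which keeps the history tables small on a busy publication.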
3. View each agent's profile and its parameter settings through msdb.dbo.MSagent_profiles and msdb.dbo.MSagent_parameters
    select case prof.agent_type
            when 1 then 'Snapshot Agent'
            when 2 then 'Log Reader Agent'
            when 3 then 'Distribution Agent'
            when 4 then 'Merge Agent'
            when 9 then 'Queue Reader Agent'
            else ''
        end as agent_type,
        prof.profile_name,
        prof.type as IsSystemProfile,
        prof.def_profile as isDefaultProfile,
        para.parameter_name,
        para.value as parameter_value,
        prof.description
    from msdb.dbo.MSagent_profiles prof
    inner join msdb.dbo.MSagent_parameters para
        on prof.profile_id = para.profile_id
    order by agent_type
References:
MSdistribution_status (Transact-SQL)
Replication Tables (Transact-SQL)
Replication Stored Procedures (Transact-SQL)
Original post: http://www.cnblogs.com/ljhdo/p/5718985.html