1. Check the MHA status:
# masterha_check_status --conf=/etc/mha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
The MHA manager is not running.
2. Check the replication status as seen by MHA:
masterha_check_repl --conf=/etc/mha/app1.cnf
Wed Mar 28 11:38:22 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Mar 28 11:38:22 2018 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Wed Mar 28 11:38:22 2018 - [info] Reading server configuration from /etc/mha/app1.cnf..
Wed Mar 28 11:38:22 2018 - [info] MHA::MasterMonitor version 0.57.
Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/Server.pm, ln935] SQL Thread is stopped(error) on 10.1.46.214(10.1.46.214:3306)! Errno:1032, Error:Could not execute Delete_rows event on table shop.t_ncl_agent_openid; Can't find record in 't_ncl', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 878001439
Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln703] Server 10.1.46.214(10.1.46.214:3306) is alive, but does not work as a slave!
Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/local/share/perl5/MHA/MasterMonitor.pm line 329
Wed Mar 28 11:38:22 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Mar 28 11:38:22 2018 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
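The [error] entries are the part of this output that matters. As a rough sketch, the following Python splits pasted masterha_check_repl output into entries and collects the MySQL error numbers reported in them. It assumes the timestamp and "[level]" layout shown above; the names mha_log_entries and sql_errnos are made up for this illustration and are not part of MHA.

```python
import re

# Assumed entry prefix, as in the output above:
# "Wed Mar 28 11:38:22 2018 - [error]..."
ENTRY = re.compile(
    r"\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4} - \[(?P<level>\w+)\]"
)


def mha_log_entries(log_text):
    """Split MHA log text into (level, message) tuples."""
    matches = list(ENTRY.finditer(log_text))
    entries = []
    for i, match in enumerate(matches):
        # A message runs until the next timestamp or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        message = " ".join(log_text[match.end():end].split())
        entries.append((match.group("level"), message))
    return entries


def sql_errnos(log_text):
    """Return the sorted error numbers found in [error] entries."""
    numbers = set()
    for level, message in mha_log_entries(log_text):
        if level == "error":
            numbers.update(int(n) for n in re.findall(r"Errno:(\d+)", message))
    return sorted(numbers)
```

Because entries are split on the timestamp prefix rather than on newlines, the sketch works whether the log is pasted with or without line breaks.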
3. Check replication on the slaves. On one of the slaves:
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host:
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000070
Read_Master_Log_Pos: 232233295
Relay_Log_File: weixintbdb02-relay-bin.000024
Relay_Log_Pos: 877993376
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table shop.t_ncl; Can't find record in 't_ncl', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 878001439
Skip_Counter: 0
Exec_Master_Log_Pos: 877993203
Relay_Log_Space: 64262720902
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table shop.t_ncl_agent_openid; Can't find record in 't_ncl_agent_openid', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 878001439
Replicate_Ignore_Server_Ids:
Master_Server_Id: 462131
Master_UUID: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9
Master_Info_File: /data/mysql/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 180327 23:55:13
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:9-123314222
Executed_Gtid_Set: 98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-8,
9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652530
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
Replication status on the other slave:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Both threads are running normally.
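With more than one slave to inspect, the same check can be applied to each status dump. The sketch below parses the pasted text of a vertical-format SHOW SLAVE STATUS result into a dictionary and lists what is wrong. It works on text only and does not connect to MySQL; parse_slave_status and replication_problems are hypothetical helper names, not MySQL or MHA functions.

```python
import re

# A field line looks like "Slave_IO_Running: Yes"; the value may be empty.
FIELD = re.compile(r"^\s*([A-Za-z_]+):\s*(.*)$")


def parse_slave_status(text):
    """Parse vertical-format SHOW SLAVE STATUS text into a dict."""
    status = {}
    last_key = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            last_key = match.group(1)
            status[last_key] = match.group(2).strip()
        elif last_key and status[last_key].endswith(","):
            # A GTID set continues on the next line after a trailing comma.
            status[last_key] += line.strip()
    return status


def replication_problems(status):
    """Return a list of human-readable problems; empty means healthy."""
    problems = []
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread)!r}")
    if status.get("Last_SQL_Errno", "0") != "0":
        problems.append(f"SQL error {status['Last_SQL_Errno']}")
    return problems
```

For the first slave above this reports the stopped SQL thread and error 1032; for the second slave the list is empty.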
4. Decode the relay log:
mysqlbinlog -v --base64-output=decode-rows weixintbdb02-relay-bin.000024 > /tmp/weixintbdb02.sql
SET @@SESSION.GTID_NEXT= '9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652530'/*!*/;
#180320 13:30:50 server id 462131 end_log_pos 877993268 CRC32 0x8c10dea2 GTID last_committed=1536936 sequence_number=1536937 rbr_only=yes
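The slave status shows that the SQL thread has executed GTIDs 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652530. In the decoded relay log, the next transaction, GTID 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531, consists entirely of deletes on table shop.t_ncl. That table exists on the slave but contains no rows, so the rows were either deleted manually on the slave or were never there.
The fix is to skip the transaction with GTID 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531.
Procedure: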
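The GTID to skip can be derived from the status output: it is the transaction after the highest one already executed for the master's UUID. The following Python is a minimal sketch of that arithmetic. It assumes the executed set for the master UUID is a single contiguous range, as it is here; with gaps in the set the answer would need to be checked by hand. parse_gtid_set and next_gtid are names invented for this example.

```python
def parse_gtid_set(gtid_set):
    """Return {uuid: [(start, end), ...]} for a GTID set string."""
    result = {}
    for part in gtid_set.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        ranges = []
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single transaction such as "uuid:5" has no "-end" part.
            ranges.append((int(start), int(end or start)))
        result[uuid] = ranges
    return result


def next_gtid(executed_gtid_set, master_uuid):
    """Return the GTID following the highest one executed for master_uuid."""
    ranges = parse_gtid_set(executed_gtid_set).get(master_uuid, [])
    highest = max((end for _, end in ranges), default=0)
    return f"{master_uuid}:{highest + 1}"
```

Feeding it the Executed_Gtid_Set and Master_UUID from the failing slave yields the single GTID ending in 13652531, which is the value used for gtid_next.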
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> set gtid_next='9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531';
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.1.46.213
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000070
Read_Master_Log_Pos: 232703867
Relay_Log_File: weixintbdb02-relay-bin.000024
Relay_Log_Pos: 877993376
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table shop.t_ncl; Can't find record in 't_ncl', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 878001439
Skip_Counter: 0
Exec_Master_Log_Pos: 877993203
Relay_Log_Space: 64263191474
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table shop.t_ncl; Can't find record in 't_ncl', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 878001439
Replicate_Ignore_Server_Ids:
Master_Server_Id: 462131
Master_UUID: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9
Master_Info_File: /data/mysql/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 180327 23:55:13
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:9-123314722
Executed_Gtid_Set: 98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-8,
9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652530
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (10.00 sec)
mysql> show master status\G;
*************************** 1. row ***************************
File: mysql-bin.000012
Position: 436
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-8,
9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-13652531
mysql> set gtid_next='automatic';
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
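Committing an empty transaction with gtid_next set to 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531 records that GTID as executed on the slave, so the failing transaction is skipped. After that, gtid_next is reset to automatic and the slave is started again.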
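The statement sequence is the same for any GTID that has to be skipped, so it can be generated rather than typed. This Python sketch only builds the statements as text, in the order used above, for review before they are pasted into the mysql client. skip_gtid_statements is a hypothetical helper; the check that the input is a single uuid:number GTID is included because gtid_next takes one GTID, not a range.

```python
import re

# One GTID: a UUID, a colon, and a single transaction number.
SINGLE_GTID = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+$"
)


def skip_gtid_statements(gtid):
    """Return the SQL statements that skip one GTID with an empty transaction."""
    if not SINGLE_GTID.match(gtid):
        raise ValueError(f"expected a single GTID like uuid:123, got {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET GTID_NEXT='{gtid}';",
        "BEGIN;",   # the empty transaction that
        "COMMIT;",  # consumes the GTID
        "SET GTID_NEXT='AUTOMATIC';",
        "START SLAVE;",
    ]


if __name__ == "__main__":
    for statement in skip_gtid_statements(
        "9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:13652531"
    ):
        print(statement)
```

Skipping a transaction leaves the slave's data as it is, so it is only appropriate when, as in this case, the skipped events would have had no effect on the slave.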
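5. Follow-up
MHA only runs properly while MySQL master-slave replication is healthy. The slave 10.1.46.214 is still catching up on its backlog; once it has caught up, start MHA again.
Check replication on the slave: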
mysql> show slave status\G;
What to look at:
Thread status:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replication progress:
Retrieved_Gtid_Set: 9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:9-123315286
Executed_Gtid_Set: 45cb77c3-1cfa-11e8-a55f-0425c555a5f3:1-3,
98ef4e90-1cff-11e8-ab2a-0425c58fa8d7:1-3,
9ab25963-1cf8-11e8-b9b9-0425c58fa8e9:1-123315286
When the highest GTID in Retrieved_Gtid_Set equals the highest GTID executed for the same master UUID in Executed_Gtid_Set, the slave has caught up.
Start MHA:
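While waiting, the comparison can be reduced to one number: how many retrieved transactions are still unapplied. Below is a small Python sketch of that comparison on the two strings copied from the status output. It compares only the highest transaction number per UUID, which is the rule described here, and ignores possible gaps; highest_txn and remaining_txns are illustrative names.

```python
def highest_txn(gtid_set, uuid):
    """Return the highest transaction number for uuid in a GTID set string."""
    highest = 0
    for part in gtid_set.replace("\n", "").split(","):
        fields = part.strip().split(":")
        if fields[0].lower() != uuid.lower():
            continue
        for interval in fields[1:]:
            # "9-123315286" -> 123315286, "5" -> 5
            highest = max(highest, int(interval.split("-")[-1]))
    return highest


def remaining_txns(retrieved, executed, master_uuid):
    """Return how many retrieved transactions have not been executed yet."""
    return highest_txn(retrieved, master_uuid) - highest_txn(executed, master_uuid)
```

A result of 0 corresponds to the caught-up state shown above; any positive value means the SQL thread is still working through the relay log.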
nohup masterha_manager --conf=/etc/mha/app1.cnf --ignore_last_failover > /etc/mha/app1/manager.log < /dev/null 2>&1 &