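Tags: sla progress read_only failover number mode pid epo priority
After the master server goes down unexpectedly, the two original slaves, slave1 and slave2, take over the service: for example, slave1 becomes the new master and slave2 becomes a slave of slave1. The main configuration file parameters are explained below: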
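sentinel monitor mymaster 127.0.0.1 6379 1 Quorum: how many sentinels must report the master as down before it is treated as really (objectively) down
sentinel down-after-milliseconds mymaster 30000 How many milliseconds the master must be unreachable before this sentinel considers it down
sentinel parallel-syncs mymaster 1 How many slaves are repointed to the new master at the same time after a failover
sentinel failover-timeout mymaster 180000 Failover timeout in milliseconds; a failover that takes longer than this is considered failed
Start the sentinel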
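As a minimal sketch of how the four directives above fit together, the Python snippet below parses them into a dictionary and computes the majority needed to authorize a failover. The function names `parse_sentinel_conf` and `failover_majority` are made up for this illustration and are not part of Redis. The majority rule reflects Sentinel's documented behaviour as I understand it: the quorum only decides when the master is marked objectively down, while the failover itself must be authorized by a majority of the known sentinels.

```python
# Hypothetical helpers for illustration only; not part of Redis.
MS_DIRECTIVES = {"down-after-milliseconds", "parallel-syncs", "failover-timeout"}


def parse_sentinel_conf(text):
    """Collect `sentinel <directive> <master-name> <args...>` lines per master."""
    masters = {}
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) < 4 or parts[0] != "sentinel":
            continue  # skip blanks, comments and unrelated directives
        directive, name, args = parts[1], parts[2], parts[3:]
        entry = masters.setdefault(name, {})
        if directive == "monitor" and len(args) >= 3:
            entry["host"] = args[0]
            entry["port"] = int(args[1])
            entry["quorum"] = int(args[2])
        elif directive in MS_DIRECTIVES:
            entry[directive] = int(args[0])
    return masters


def failover_majority(sentinel_count):
    """Number of sentinels whose vote is needed to authorize a failover."""
    return sentinel_count // 2 + 1


conf = """\
sentinel monitor mymaster 127.0.0.1 6379 1
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
"""
settings = parse_sentinel_conf(conf)["mymaster"]
print(settings["quorum"], failover_majority(1))
```

With a single sentinel, as in this test setup, quorum and majority are both 1, so one process can detect the failure and perform the failover by itself. In a three-sentinel deployment the majority would be 2.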
[root@ZFRC-YW-YJF-TEST-370123 redis]# ./bin/redis-server ./sentinel.conf --sentinel
17400:X 28 Jun 17:17:32.853 # Not listening to IPv6: unsupproted
(Redis ASCII-art startup banner: Redis 3.2.13 (00000000/0) 64 bit, Running in sentinel mode, Port: 26379, PID: 17400, http://redis.io)
17400:X 28 Jun 17:17:32.854 # Sentinel ID is b81b851b02fec76bcfc7144b0a675fdedecf7188
17400:X 28 Jun 17:17:32.854 # +monitor master mymaster 127.0.0.1 6379 quorum 1
17400:X 28 Jun 17:17:32.854 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:17:32.855 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
Test: take the master down and check whether the sentinel performs a failover
[root@ZFRC-YW-YJF-TEST-370123 ~]# cd /usr/local/redis/
[root@ZFRC-YW-YJF-TEST-370123 redis]# ./bin/redis-cli
127.0.0.1:6379> shutdown
not connected>
The log shows the election and failover steps; the key line is +switch-master, which gives the address of the new master
17400:X 28 Jun 17:19:03.363 # +sdown master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.363 # +odown master mymaster 127.0.0.1 6379 #quorum 1/1
17400:X 28 Jun 17:19:03.363 # +new-epoch 1
17400:X 28 Jun 17:19:03.363 # +try-failover master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.364 # +vote-for-leader b81b851b02fec76bcfc7144b0a675fdedecf7188 1
17400:X 28 Jun 17:19:03.364 # +elected-leader master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.364 # +failover-state-select-slave master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.464 # +selected-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.464 * +failover-state-send-slaveof-noone slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.564 * +failover-state-wait-promotion slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.917 # +promoted-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.917 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:04.006 * +slave-reconf-sent slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:04.982 * +slave-reconf-inprog slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:04.982 * +slave-reconf-done slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:05.064 # +failover-end master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:05.064 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
17400:X 28 Jun 17:19:05.064 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
17400:X 28 Jun 17:19:05.064 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
17400:X 28 Jun 17:19:35.080 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
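To pick the outcome out of a long failover log like the one above, a small script can search for the +switch-master event. This is only a sketch: `find_switch_master` is a hypothetical helper name, and the regular expression assumes the event layout seen in this log (master name, old IP, old port, new IP, new port).

```python
import re

# Layout assumed from the log above:
# +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_RE = re.compile(r"\+switch-master (\S+) (\S+) (\d+) (\S+) (\d+)")


def find_switch_master(log_text):
    """Return (name, old_addr, new_addr) of the last +switch-master event, or None."""
    result = None
    for line in log_text.splitlines():
        match = SWITCH_RE.search(line)
        if match:
            name, old_ip, old_port, new_ip, new_port = match.groups()
            result = (name, f"{old_ip}:{old_port}", f"{new_ip}:{new_port}")
    return result


sample = (
    "17400:X 28 Jun 17:19:05.064 # +failover-end master mymaster 127.0.0.1 6379\n"
    "17400:X 28 Jun 17:19:05.064 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380\n"
)
print(find_switch_master(sample))
```

For the sample lines taken from this log, the function reports that mymaster moved from 127.0.0.1:6379 to 127.0.0.1:6380.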
Then log in to the former slave on port 6380 and check whether it is now the master node
127.0.0.1:6380> info replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6381,state=online,offset=22858,lag=0
master_repl_offset:22858
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:22857
127.0.0.1:6380>
127.0.0.1:6381> info replication
role:slave
master_host:127.0.0.1
master_port:6380
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:35773
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
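The same role check can be scripted. The sketch below parses the `key:value` text returned by `info replication` and reports whether the node is a master; `parse_info` and `is_master` are hypothetical helper names, and the input is assumed to be the reply text only, without the redis-cli prompt line.

```python
def parse_info(text):
    """Turn the `key:value` lines of an INFO reply into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blank lines and section headers such as "# Replication"
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def is_master(info_text):
    """True if the reply reports role:master."""
    return parse_info(info_text).get("role") == "master"


reply_6380 = """\
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6381,state=online,offset=22858,lag=0
"""
reply_6381 = """\
role:slave
master_host:127.0.0.1
master_port:6380
master_link_status:up
"""
print(is_master(reply_6380), is_master(reply_6381))
```

For the slave, `parse_info(reply_6381)["master_port"]` gives the port of the master it now follows, which should be 6380 after the failover shown here.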
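Sometimes, when the master goes down, the slaves are not promoted in port order; for example, after the master fails, 6381 may become the new master. This is controlled by a parameter in the Redis configuration file (redis.conf), not in the sentinel configuration file.
slave-priority 100 The smaller the number, the higher the priority. A value of 0 means the slave is never promoted to master.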
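The effect of slave-priority on the choice of the new master can be sketched as follows. This is a simplified model, not Sentinel's actual code: it assumes the documented ordering (lower non-zero priority first, then the larger replication offset, then the lexicographically smaller run ID) and ignores the checks Sentinel makes on how long a slave has been disconnected. The function name `pick_replica`, the offsets and the run IDs are invented for the example.

```python
def pick_replica(replicas):
    """Pick the promotion candidate from a list of replica descriptions.

    Each replica is a dict with "port", "priority", "offset" and "run_id".
    Replicas with priority 0 are never eligible.
    """
    eligible = [r for r in replicas if r["priority"] > 0]
    if not eligible:
        return None
    # Lower priority wins; ties go to the larger offset, then the smaller run ID.
    return min(eligible, key=lambda r: (r["priority"], -r["offset"], r["run_id"]))


# Hypothetical values for illustration only.
replicas = [
    {"port": 6380, "priority": 100, "offset": 22858, "run_id": "aaa"},
    {"port": 6381, "priority": 50, "offset": 22000, "run_id": "bbb"},
]
print(pick_replica(replicas)["port"])
```

In this made-up case 6381 is chosen even though 6380 has replicated more data, because its slave-priority is lower. With equal priorities, the slave with the larger offset would be preferred.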
Original article: https://blog.51cto.com/yangjunfeng/2415069