I. Environment

$ cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)

node1: 192.168.111.128
node2: 192.168.111.129
vip-master: 192.168.111.228
vip-slave: 192.168.111.229

Set the hostnames (one per node):

hostnamectl set-hostname postgres128
hostnamectl set-hostname postgres129

Add both names to /etc/hosts:

[root@postgres128 ~]# vi /etc/hosts
192.168.111.128 postgres128
192.168.111.129 postgres129

II. Configuring the Linux Cluster Environment

1. Install Pacemaker and Corosync

Run on all nodes:

[root@postgres128 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python

2. Disable the firewall

Run on all nodes:

[root@postgres128 ~]# systemctl disable firewalld.service
[root@postgres128 ~]# systemctl stop firewalld.service

3. Enable pcsd

Run on all nodes:

[root@postgres128 ~]# systemctl start pcsd.service
[root@postgres128 ~]# systemctl enable pcsd.service
[root@postgres128 ~]# echo hacluster | sudo passwd hacluster --stdin

4. Cluster authentication

Run on any single node; node1 is used here:

[root@postgres128 ~]# pcs cluster auth -u hacluster -p hacluster 192.168.111.128 192.168.111.129

5. Synchronize the configuration

Run on node1:

[root@postgres128 ~]# pcs cluster setup --last_man_standing=1 --name pgcluster 192.168.111.128 192.168.111.129
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop pacemaker.service
Redirecting to /bin/systemctl stop corosync.service
Killing any remaining services...
Removing all cluster configuration files...
192.168.111.128: Succeeded
192.168.111.129: Succeeded

6. Start the cluster

Run on node1:

[root@postgres128 ~]# pcs cluster start --all
192.168.111.128: Starting Cluster...
192.168.111.129: Starting Cluster...
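A note on the --last_man_standing option used in step 5: quorum requires a strict majority of node votes, which a two-node cluster can never retain after one node fails. The arithmetic can be sketched as follows (illustration only, not a pcs command):

```shell
# Quorum requires a strict majority of votes: floor(n/2) + 1.
quorum_votes() {
  echo $(( $1 / 2 + 1 ))
}

quorum_votes 2   # 2 votes needed: losing either node loses quorum
quorum_votes 3   # 2 votes needed: a 3-node cluster survives one failure
```

This is also why the failover configuration later sets no-quorum-policy="ignore"; otherwise Pacemaker would stop all resources as soon as one of the two nodes went down.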
7. Verification

1) Verify corosync

Run on node1:

$ sudo pcs status corosync

Membership information
----------------------
    Nodeid      Votes Name
         1          1 192.168.111.128 (local)
         2          1 192.168.111.129

2) Verify pacemaker

[root@postgres128 ~]# pcs status
Cluster name: pgcluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Wed Sep 6 02:20:28 2017
Last change:
Current DC: NONE
0 Nodes configured
0 Resources configured

Full list of resources:

PCSD Status:
  192.168.111.128: Online
  192.168.111.129: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

III. Installing and Configuring PostgreSQL

For the PostgreSQL installation and streaming-replication setup, see the references at the end of this article. Check the streaming-replication state:

[postgres@localhost ~]$ psql
psql (9.6.0)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 2111
usesysid         | 16384
usename          | replica
application_name | standby01
client_addr      | 192.168.111.129
client_hostname  |
client_port      | 45608
backend_start    | 2017-09-06 05:13:29.766227-04
backend_xmin     | 1756
state            | streaming
sent_location    | 0/50000D0
write_location   | 0/50000D0
flush_location   | 0/50000D0
replay_location  | 0/5000098
sync_priority    | 1
sync_state       | sync

Stop the PostgreSQL service so that Pacemaker can manage it from here on:

$ pg_stop
waiting for server to shut down...... done
server stopped
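The sent/write/flush/replay locations in the pg_stat_replication record above are WAL positions (LSNs) written as two hex halves. A small helper, purely illustrative and not part of PostgreSQL, converts an LSN to a byte offset so replication lag can be computed from that view:

```shell
# Convert an LSN such as 0/50000D0 into an absolute byte position.
# The part before the slash is the high 32 bits, the part after it the low 32 bits.
lsn_to_bytes() {
  local hi=${1%%/*} lo=${1##*/}
  echo $(( $(printf '%d' "0x$hi") * 4294967296 + $(printf '%d' "0x$lo") ))
}

# Replay lag of the standby in the record above (sent minus replayed):
echo $(( $(lsn_to_bytes 0/50000D0) - $(lsn_to_bytes 0/5000098) ))   # 56 bytes
```

On PostgreSQL 9.6 the same difference is available in SQL as pg_xlog_location_diff(sent_location, replay_location).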
IV. Configuring Automatic Failover

1. Configuration

Run on node1. Write the configuration steps into a script first:

[root@postgres128 postgres]# vi cluster_setup.sh

# Save the CIB configuration to a file
pcs cluster cib pgsql_cfg

# Ignore loss of quorum at the pacemaker level (two-node cluster)
pcs -f pgsql_cfg property set no-quorum-policy="ignore"

# Disable STONITH
pcs -f pgsql_cfg property set stonith-enabled="false"

# Set resource stickiness so resources do not migrate back after a node recovers
pcs -f pgsql_cfg resource defaults resource-stickiness="INFINITY"

# Number of failures before a resource is migrated
pcs -f pgsql_cfg resource defaults migration-threshold="3"

# Master virtual IP
pcs -f pgsql_cfg resource create vip-master IPaddr2 ip="192.168.111.228" cidr_netmask="24" \
   op start timeout="60s" interval="0s" on-fail="restart" \
   op monitor timeout="60s" interval="10s" on-fail="restart" \
   op stop timeout="60s" interval="0s" on-fail="block"

# Slave virtual IP
pcs -f pgsql_cfg resource create vip-slave IPaddr2 ip="192.168.111.229" cidr_netmask="24" \
   op start timeout="60s" interval="0s" on-fail="restart" \
   op monitor timeout="60s" interval="10s" on-fail="restart" \
   op stop timeout="60s" interval="0s" on-fail="block"

# pgsql cluster resource
# Adjust pgctl, psql, pgdata, config, etc. for your own environment; node_list takes
# the node hostnames and master_ip takes the master virtual IP
pcs -f pgsql_cfg resource create pgsql pgsql pgctl="/opt/pgsql96/bin/pg_ctl" psql="/opt/pgsql96/bin/psql" \
   pgdata="/home/postgres/data" config="/home/postgres/data/postgresql.conf" rep_mode="sync" \
   node_list="postgres128 postgres129" master_ip="192.168.111.228" repuser="replica" \
   primary_conninfo_opt="password=replica keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
   restore_command="cp /home/postgres/arch/%f %p" restart_on_promote="true" \
   op start timeout="60s" interval="0s" on-fail="restart" \
   op monitor timeout="60s" interval="4s" on-fail="restart" \
   op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
   op promote timeout="60s" interval="0s" on-fail="restart" \
   op demote timeout="60s" interval="0s" on-fail="stop" \
   op stop timeout="60s" interval="0s" on-fail="block"

# Master/slave set; clone-max=2 for the two nodes
pcs -f pgsql_cfg resource master pgsql-cluster pgsql master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

# Master IP group
pcs -f pgsql_cfg resource group add master-group vip-master

# Slave IP group
pcs -f pgsql_cfg resource group add slave-group vip-slave

# Bind the master IP group to the master node
pcs -f pgsql_cfg constraint colocation add master-group with master pgsql-cluster INFINITY

# Start the master group after promotion
pcs -f pgsql_cfg constraint order promote pgsql-cluster then start master-group symmetrical=false score=INFINITY

# Stop the master group after demotion
pcs -f pgsql_cfg constraint order demote pgsql-cluster then stop master-group symmetrical=false score=0

# Bind the slave IP group to the slave node
pcs -f pgsql_cfg constraint colocation add slave-group with slave pgsql-cluster INFINITY

# Start the slave group after promotion
pcs -f pgsql_cfg constraint order promote pgsql-cluster then start slave-group symmetrical=false score=INFINITY

# Stop the slave group after demotion
pcs -f pgsql_cfg constraint order demote pgsql-cluster then stop slave-group symmetrical=false score=0

# Push the configuration file to the CIB
pcs cluster cib-push pgsql_cfg

2. Run the script

[root@postgres128 postgres]# chmod +x cluster_setup.sh
[root@postgres128 postgres]# ./cluster_setup.sh
Adding pgsql-cluster master-group (score: INFINITY) (Options: symmetrical=false score=INFINITY first-action=promote then-action=start)
Adding pgsql-cluster master-group (score: 0) (Options: symmetrical=false score=0 first-action=demote then-action=stop)
Adding pgsql-cluster slave-group (score: INFINITY) (Options: symmetrical=false score=INFINITY first-action=promote then-action=start)
Adding pgsql-cluster slave-group (score: 0) (Options: symmetrical=false score=0 first-action=demote then-action=stop)
CIB updated
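The vip-master and vip-slave resources in the script differ only in name and address. If more virtual IPs were ever needed, the shared operation options could be factored into a helper like the hypothetical sketch below (make_vip_cmd is not a pcs command; it only prints the command line to run):

```shell
# Print the pcs command for a virtual-IP resource using the
# operation options shared by vip-master and vip-slave above.
make_vip_cmd() {
  local name=$1 ip=$2
  echo "pcs -f pgsql_cfg resource create $name IPaddr2 ip=\"$ip\" cidr_netmask=\"24\"" \
       "op start timeout=\"60s\" interval=\"0s\" on-fail=\"restart\"" \
       "op monitor timeout=\"60s\" interval=\"10s\" on-fail=\"restart\"" \
       "op stop timeout=\"60s\" interval=\"0s\" on-fail=\"block\""
}

make_vip_cmd vip-master 192.168.111.228
make_vip_cmd vip-slave 192.168.111.229
```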
3. Start the database

Check the cluster status:

[root@postgres129 ~]# pcs status
Cluster name: pgcluster
Last updated: Wed Sep 6 22:13:39 2017
Last change: Wed Sep 6 06:00:26 2017 via cibadmin on postgres128
Stack: corosync
Current DC: postgres128 (1) - partition with quorum
Version: 1.1.10-29.el7-368c726
2 Nodes configured
4 Resources configured

Online: [ postgres128 postgres129 ]

Full list of resources:

 Master/Slave Set: pgsql-cluster [pgsql]
     Stopped: [ postgres128 postgres129 ]
 Resource Group: master-group
     vip-master (ocf::heartbeat:IPaddr2): Started postgres128
 Resource Group: slave-group
     vip-slave (ocf::heartbeat:IPaddr2): Started postgres129

PCSD Status:
  192.168.111.128: Unable to authenticate
  192.168.111.129: Unable to authenticate

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

The "Unable to authenticate" lines under PCSD Status affect only pcsd communication, and are typically cleared by re-running pcs cluster auth on the nodes; the cluster itself is online.
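For monitoring scripts it is convenient to assert node state directly from pcs status text. The helper below is a hypothetical sketch (not part of pcs) that checks whether every expected node appears on the Online: line:

```shell
# Succeed only if every listed node name appears on the "Online:" line
# of the given pcs status output.
check_online() {
  local status=$1; shift
  local node
  for node in "$@"; do
    printf '%s\n' "$status" | grep -q "Online:.*\b$node\b" || return 1
  done
}

# Example against the status output shown above:
status_text='Online: [ postgres128 postgres129 ]'
if check_online "$status_text" postgres128 postgres129; then
  echo "all nodes online"
fi
```

In a real check the text would come from running pcs status itself, e.g. check_online "$(pcs status)" postgres128 postgres129.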
References (setup and maintenance):

https://my.oschina.net/aven92/blog/518928
https://my.oschina.net/aven92/blog/519458
Original article: http://www.cnblogs.com/lottu/p/7490777.html