pacemaker+mysql+drbd
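This walkthrough uses pacemaker to build an active/passive (primary/standby) cluster and adds replicated storage with drbd.
The following software is used:
corosync: the messaging layer; it provides membership management and acts as the heartbeat engine that detects whether nodes are alive
Pacemaker: resource management
DRBD: an inexpensive alternative to shared storage
crm shell: for viewing and changing the cluster configuration
I Configure pacemaker
pacemaker can be downloaded from its website. It supports two heartbeat engines, heartbeat and corosync; this experiment uses corosync.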
server1:
1 yum install -y pacemaker corosync
2 cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
3 vim /etc/corosync/corosync.conf
Contents:
compatibility: whitetank
totem {
version: 2
secauth: off
threads: 0
interface {
ringnumber: 0
bindnetaddr: 172.25.38.0
mcastaddr: 226.94.1.138###multicast address; the address and port together determine which hosts belong to the same group###
mcastport: 5405###multicast port###
ttl: 1
}
}
logging {
fileline: off
to_stderr: no
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log###log file###
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
amf {
mode: disabled
}
service {
name: pacemaker###start pacemaker when corosync starts###
ver: 0###plugin version 0; with ver: 1, starting corosync does not start pacemaker automatically###
}
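Both nodes have to agree on the multicast address and port, so it can be handy to pull those values out of the file and compare them between server1 and server2. The following is only a minimal sketch and not part of the original setup: the helper name totem_value and the sample file are assumptions made for illustration, and the parser only understands the simple `key: value` layout shown above (anything after a `#` is ignored, which also covers the annotations in this listing).

```shell
#!/bin/sh
# Hypothetical helper: print the value of a `key: value` line from a
# corosync.conf-style file. Only the first match is printed.
totem_value() {
    # $1 = config file, $2 = key (for example mcastaddr)
    awk -v key="$2" '
        {
            line = $0
            sub(/#.*/, "", line)                 # drop trailing comments
            n = index(line, ":")
            if (n == 0) next                     # not a key: value line
            k = substr(line, 1, n - 1)
            v = substr(line, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)       # trim whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }' "$1"
}

# Demo on a throwaway copy of the interface block from this article.
sample=$(mktemp)
cat > "$sample" <<'EOF'
totem {
    version: 2
    interface {
        ringnumber: 0
        bindnetaddr: 172.25.38.0
        mcastaddr: 226.94.1.138
        mcastport: 5405
    }
}
EOF

echo "mcastaddr=$(totem_value "$sample" mcastaddr)"
echo "mcastport=$(totem_value "$sample" mcastport)"
rm -f "$sample"
```

On the real nodes the same function could be pointed at /etc/corosync/corosync.conf on each host and the two results compared by eye.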
4 scp /etc/corosync/corosync.conf server2:/etc/corosync/###server2 must use the same configuration file###
5 /etc/init.d/corosync start###start it on both nodes###
6 tail -f /var/log/cluster/corosync.log###watch the log###
7 crm_verify -VL###check that the configuration is valid###
**************************************************************************
If the following errors appear:
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
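This happens because STONITH (fencing) is enabled by default but no fence device has been configured yet. Until a fence device is added later in this walkthrough, turn STONITH off so that the check passes.
Fix: use the crm command.
crm can be used in two ways (the crm shell has to be installed first):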
yum install -y crmsh-1.2.6-0.rc2.2.1.x86_64.rpm pssh-2.3.1-2.1.x86_64.rpm
1) Interactive: supports tab completion
[root@server1 ~]# crm
crm(live)# configure
crm(live)configure# show
node server1
node server2
property $id="cib-bootstrap-options" \
dc-version="1.1.10-14.el6-368c726" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2"
crm(live)configure# property
batch-limit= node-health-yellow=
cluster-delay= pe-error-series-max=
cluster-recheck-interval= pe-input-series-max=
crmd-transition-delay= pe-warn-series-max=
dc-deadtime= placement-strategy=
default-action-timeout= remove-after-stop=
default-resource-stickiness= shutdown-escalation=
election-timeout= start-failure-is-fatal=
enable-acl= startup-fencing=
enable-startup-probes= stonith-action=
is-managed-default= stonith-enabled=
maintenance-mode= stonith-timeout=
migration-limit= stop-all-resources=
no-quorum-policy= stop-orphan-actions=
node-health-green= stop-orphan-resources=
node-health-red= symmetric-cluster=
node-health-strategy=
crm(live)configure# property stonith-enabled=false###disable fencing###
crm(live)configure# commit###commit the change###
2) Non-interactive: type the whole command on one line; no tab completion
[root@server1 ~]# crm configure property stonith-enabled=false
[root@server1 ~]# crm_verify -VL
*************************************************************************
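8 Create a resource
crm(live)configure# primitive vip ocf:heartbeat:IPaddr2 params ip=172.25.38.100 cidr_netmask=24 op monitor interval=30s###primitive defines a resource, here a virtual IP named vip; params introduces the agent's parameters; op introduces an operation; monitor is the health check, and interval=30s runs it every 30 seconds###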
crm(live)configure# commit
crm(live)configure# bye ###leave the interactive shell###
***********************************************************************
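If you made a mistake while defining the resource, you cannot delete it right away: crm reports that it is still running, so stop the resource first and then delete it.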
[root@server1 ~]# crm
crm(live)# configure
crm(live)configure# delete vip
ERROR: resource vip is running, can't delete it
crm(live)configure# cd
crm(live)# resource
crm(live)resource# stop vip
Or, if you only want to change it rather than delete it:
[root@server1 ~]# crm
crm(live)# configure
crm(live)configure# edit
***********************************************************************
server2:
1 yum install -y pacemaker corosync
2 yum install -y crmsh-1.2.6-0.rc2.2.1.x86_64.rpm pssh-2.3.1-2.1.x86_64.rpm
3 /etc/init.d/corosync start
4 crm_mon###monitor changes in node and resource state###
Last updated: Sun Jul 30 21:40:24 2017
Last change: Sun Jul 30 21:40:24 2017 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ server1 server2 ]
vip (ocf::heartbeat:IPaddr2):Started server1
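Besides the cluster monitor, the address list of a node shows whether it currently holds the virtual IP. A minimal sketch, assuming the one-line output format of `ip -o -4 addr show`; the function name holds_ip and the sample line are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IPv4 address appears as an
# "inet" entry in `ip -o -4 addr show` style text read from stdin.
holds_ip() {
    # $1 = address to look for, without the /prefix part
    awk -v want="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), a, "/")      # "addr/prefix" -> a[1] = addr
                    if (a[1] == want) found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }'
}

# Demo with a made-up sample line in the `ip -o -4 addr show` format.
sample='2: eth0    inet 172.25.38.100/24 brd 172.25.38.255 scope global secondary eth0'
if printf '%s\n' "$sample" | holds_ip 172.25.38.100; then
    echo "vip is configured on this node"
fi
# On a live node one might run: ip -o -4 addr show | holds_ip 172.25.38.100
```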
Test:
1 Put server1 into standby
[root@server1 corosync]# crm node standby
Monitor on server2:
Last updated: Sun Jul 30 21:47:48 2017
Last change: Sun Jul 30 21:47:49 2017 via crm_attribute on server1
Stack: classic openais (with plugin)
Current DC: server1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
1 Resources configured
Node server1: standby
Online: [ server2 ]
vip (ocf::heartbeat:IPaddr2):Started server2 ###server2 has taken over the resource###
2 Bring server1 back online; the monitor on server2 shows that the resource does not fail back
[root@server1 corosync]# crm node online
Monitor on server2:
Last updated: Sun Jul 30 21:49:10 2017
Last change: Sun Jul 30 21:49:10 2017 via crm_attribute on server1
Stack: classic openais (with plugin)
Current DC: server1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ server1 server2 ]
vip (ocf::heartbeat:IPaddr2):Started server2
3 Stop corosync on server1 and watch the monitor on server2
[root@server1 ~]# /etc/init.d/corosync stop
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:. [ OK ]
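Monitor on server2: server2 does not take over the resource. With only one of the two nodes left, the partition has lost quorum (note "partition WITHOUT quorum" below), and the default no-quorum-policy stops all resources in a partition without quorum.
Last updated: Sun Jul 30 22:00:11 2017
Last change: Sun Jul 30 21:49:11 2017 via crm_attribute on server1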
Stack: classic openais (with plugin)
Current DC: server2 - partition WITHOUT quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ server2 ]
OFFLINE: [ server1 ]
Fix:
[root@server1 ~]# crm
crm(live)# configure
crm(live)configure# property no-quorum-policy=ignore###ignore loss of quorum so that the surviving node of a two-node cluster keeps running resources###
crm(live)configure# commit
Test again: stop corosync on server1
[root@server1 ~]# /etc/init.d/corosync stop
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:. [ OK ]
Monitor on server2: server2 takes over the resource
Last updated: Sun Jul 30 22:06:46 2017
Last change: Sun Jul 30 22:04:52 2017 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server2 - partition WITHOUT quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ server2 ]
OFFLINE: [ server1 ]
vip (ocf::heartbeat:IPaddr2):Started server2
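The line to watch in these monitor outputs is "Current DC", which says whether the partition has quorum. As a rough sketch of checking that in a script, the helper below reads monitor text on stdin; the name quorum_state is an assumption for illustration, and it only looks at the wording shown in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper: read crm_mon-style text on stdin and print
# "quorate", "inquorate" or "unknown" based on the "Current DC" line.
quorum_state() {
    awk '
        /^Current DC:/ {
            if ($0 ~ /WITHOUT quorum/)   { print "inquorate"; found = 1; exit }
            else if ($0 ~ /with quorum/) { print "quorate";   found = 1; exit }
        }
        END { if (!found) print "unknown" }'
}

# Demo with the line from the output above.
printf '%s\n' 'Current DC: server2 - partition WITHOUT quorum' | quorum_state
# On a live node one might run: crm_mon -1 | quorum_state
```

With no-quorum-policy=ignore the resources keep running even when this reports inquorate, which is exactly what the second test run shows.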
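II Configure drbd
1 Concepts:
DRBD is a software-based, shared-nothing storage replication solution that mirrors the contents of block devices (hard disks, partitions, logical volumes) between servers. When an application completes a write, the data is stored on the local block device and DRBD also sends a copy over the network to the block device on the other node, so the block devices on both nodes stay identical. That is the mirroring function.
DRBD features:
1) Real time: replication happens immediately when an application modifies the data
2) Transparent: applications do not need to know that their data is mirrored. A copy is kept on both nodes, so the failure of either server does not prevent the application from reading its data.
3) Synchronous and asynchronous mirroring: with synchronous mirroring, a write is reported to the application as complete only after it has been written on both nodes; with asynchronous mirroring, the write is reported as complete as soon as the local node has written it, and the other node completes its write afterwards. (DRBD calls these protocol C and protocol A; the C in the /proc/drbd status lines further down is protocol C.)
First add a 4 GB disk to each node. (The larger the disk, the longer the initial synchronization takes, so 4 GB keeps the experiment quick.) The storage is synchronized over Ethernet at the block level, so both disks end up with identical contents.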
2 Download drbd
http://oss.linbit.com/drbd
drbd-8.4.2.tar.gz
server1:
Build drbd from source:
1 tar zxf drbd-8.4.2.tar.gz
2 cd drbd-8.4.2
3 ./configure --enable-spec --with-km###--enable-spec generates the spec files for building rpm packages; --with-km builds the kernel module###
*****************************************************************
Error during configure:
configure: error: Cannot build utils without flex, either install flex or pass the --without-utils option.
Fix:
yum provides */flex###find out which package provides flex###
Loaded plugins: product-id, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
HighAvailability/filelists_db | 38 kB 00:00
LoadBalancer/filelists_db | 3.9 kB 00:00
ResilientStorage/filelists_db | 39 kB 00:00
ScalableFileSystem/filelists_db | 3.0 kB 00:00
flex-2.5.35-8.el6.x86_64 : A tool for creating scanners (text pattern
: recognizers)
Repo : rhel-source
Matched from:
Filename : /usr/bin/flex
[root@server1 drbd-8.4.2]# yum install -y flex-2.5.35-8.el6.x86_64
*******************************************************************
4 rpmbuild -bb drbd.spec###-bb builds binary rpm packages only; if the rpmbuild command is missing, install rpm-build first###
********************************************************************
Possible errors:
[root@server1 drbd-8.4.2]# rpmbuild -bb drbd.spec
error: File /root/rpmbuild/SOURCES/drbd-8.4.2.tar.gz: No such file or directory
Fix:
cp ~/drbd-8.4.2.tar.gz /root/rpmbuild/SOURCES/
Then run rpmbuild -bb drbd.spec again.
********************************************************************
6 cd ~/rpmbuild/RPMS/x86_64/###list the generated rpm packages; the drbd-km package is missing###
[root@server1 x86_64]# ls
drbd-8.4.2-2.el6.x86_64.rpm
drbd-bash-completion-8.4.2-2.el6.x86_64.rpm
drbd-debuginfo-8.4.2-2.el6.x86_64.rpm
drbd-heartbeat-8.4.2-2.el6.x86_64.rpm
drbd-pacemaker-8.4.2-2.el6.x86_64.rpm
drbd-udev-8.4.2-2.el6.x86_64.rpm
drbd-utils-8.4.2-2.el6.x86_64.rpm
drbd-xen-8.4.2-2.el6.x86_64.rpm
7 rpmbuild -bb drbd-8.4.2/drbd-km.spec
**************************************************************
This can fail as well: drbd-km is a kernel module, so the kernel development package is required
error: Failed build dependencies:
kernel-devel is needed by drbd-km-8.4.2-2.el6.x86_64
Fix:
yum install -y kernel-devel
Then run rpmbuild -bb drbd-km.spec again.
**************************************************************
8 cd ~/rpmbuild/RPMS/x86_64/###list the rpm packages again###
[root@server1 x86_64]# ls
drbd-8.4.2-2.el6.x86_64.rpm
drbd-bash-completion-8.4.2-2.el6.x86_64.rpm
drbd-debuginfo-8.4.2-2.el6.x86_64.rpm
drbd-heartbeat-8.4.2-2.el6.x86_64.rpm
drbd-km-2.6.32_431.el6.x86_64-8.4.2-2.el6.x86_64.rpm
drbd-km-debuginfo-8.4.2-2.el6.x86_64.rpm
drbd-pacemaker-8.4.2-2.el6.x86_64.rpm
drbd-udev-8.4.2-2.el6.x86_64.rpm
drbd-utils-8.4.2-2.el6.x86_64.rpm
drbd-xen-8.4.2-2.el6.x86_64.rpm
9 scp * server2:/root/###copy the packages to server2###
10 rpm -ivh *###install them; do the same on server2###
11 cd /etc/drbd.d/
12 vim sqldata.res
**********************************************************************
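The main drbd configuration file is /etc/drbd.conf. For easier management the configuration is usually split into several files under /etc/drbd.d, and the main file only pulls them in with "include" directives. The files in /etc/drbd.d are normally global_common.conf plus any number of files ending in .res. global_common.conf holds the global and common sections, and each .res file defines one resource.
A resource section defines a drbd resource, usually in its own .res file under /etc/drbd.d. Every resource must be given a name, which may consist of non-whitespace ASCII characters. Each resource section must contain at least two host ("on") subsections naming the nodes the resource belongs to; all other parameters can be inherited from the common section or from drbd's defaults and need not be set.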
[root@server1 x86_64]# cat /etc/drbd.conf ###main configuration file###
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
[root@server1 x86_64]# cd /etc/drbd.d/
[root@server1 drbd.d]# ls
global_common.conf
[root@server1 drbd.d]# vim sqldata.res
Contents:
resource sqldata {
meta-disk internal;
device /dev/drbd1;###the DRBD device that will be created###
syncer {
verify-alg sha1;
}
on server1 {###must be the node's host name###
disk /dev/vdb;###backing disk on server1###
address 172.25.78.1:7789;
}
on server2 {
disk /dev/vdb;
address 172.25.78.2:7789;
}
}
***********************************************************************
13 scp sqldata.res server2:/etc/drbd.d/
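14 drbdadm create-md sqldata###sqldata is the resource name; this initializes the resource's metadata and has to be run on both nodes###
15 /etc/init.d/drbd start###start the service on both nodes at about the same time; once it is running, /dev/drbd1 exists###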
[root@server1 drbd.d]# /etc/init.d/drbd start
Starting DRBD resources: [
create res: sqldata
prepare disk: sqldata
adjust disk: sqldata
adjust net: sqldata
]
.
[root@server1 drbd.d]# ll /dev/drbd1
brw-rw---- 1 root disk 147, 1 Jul 31 01:30 /dev/drbd1
[root@server1 drbd.d]# cat /proc/drbd ###ds:Inconsistent means the backing data has not been synchronized yet###
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@server1, 2017-07-30 23:21:09
1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4194140
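16 drbdadm primary sqldata --force###force this node to become primary; the initial full sync then runs from this node to the peer###
###the sync progress can be watched on the other node###
watch -n 1 cat /proc/drbd
###when the sync has finished, Inconsistent has changed to UpToDate###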
[root@server1 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@server1, 2017-07-30 23:21:09
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4194140 nr:0 dw:0 dr:4194804 al:0 bm:255 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
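The three fields that matter in the status line are cs (connection state), ro (roles, local/peer) and ds (disk states, local/peer). A small sketch of reading them in a script follows; the helper names drbd_field and drbd_in_sync are assumptions for illustration, and the parser only handles the /proc/drbd layout of DRBD 8.4 shown above.

```shell
#!/bin/sh
# Hypothetical helper: print one status field (cs, ro or ds) for a DRBD
# minor number, reading text in the /proc/drbd format from stdin.
drbd_field() {
    # $1 = minor number (1 for /dev/drbd1), $2 = field name
    awk -v minor="$1" -v field="$2" '
        $1 == (minor ":") {
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")
                if (n > 0 && substr($i, 1, n - 1) == field) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }'
}

# Succeeds only when both sides report UpToDate for the given minor.
drbd_in_sync() {
    [ "$(drbd_field "$1" ds)" = "UpToDate/UpToDate" ]
}

# Demo with the status line from the output above.
status=' 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----'
echo "roles: $(printf '%s\n' "$status" | drbd_field 1 ro)"
if printf '%s\n' "$status" | drbd_in_sync 1; then
    echo "minor 1 is in sync"
fi
# On a live node one might run: drbd_in_sync 1 < /proc/drbd
```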
Test:
Format /dev/drbd1 and mount it on /var/lib/mysql.
Note: the drbd device can only be used on the primary node
server1:
[root@server1 ~]# cat /proc/drbd ###first check that this node is primary###
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@server1, 2017-07-30 23:21:09
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4194140 nr:0 dw:0 dr:4194804 al:0 bm:255 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@server1 ~]# mkfs.ext4 /dev/drbd1 ###create the file system###
(output omitted...)
[root@server1 ~]# mount /dev/drbd1 /var/lib/mysql/###mount it on the mysql data directory###
[root@server1 ~]# chown mysql:mysql /var/lib/mysql/
[root@server1 ~]# /etc/init.d/mysqld start
[root@server1 ~]# mysql
[root@server1 mysql]# /etc/init.d/mysqld stop
[root@server1 ~]# umount /var/lib/mysql/
[root@server1 ~]# drbdadm secondary sqldata###demote to secondary, because server2 is about to become primary###
[root@server1 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@server1, 2017-07-30 23:21:09
1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:4352812 nr:0 dw:158672 dr:4195917 al:50 bm:255 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
server2:
[root@server2 mysql]# drbdadm primary sqldata###only the primary node can use the drbd device###
[root@server2 mysql]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@server1, 2017-07-30 23:21:09
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:4352812 dw:4352812 dr:664 al:0 bm:255 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@server2 ~]# mount /dev/drbd1 /var/lib/mysql/###mount the replicated device before starting mysqld###
[root@server2 ~]# /etc/init.d/mysqld start
[root@server2 ~]# mysql
mysql> quit
[root@server2 ~]# cd /var/lib/mysql/
[root@server2 mysql]# ls
ibdata1 ib_logfile0 ib_logfile1 lost+found mysql mysql.sock test
[root@server2 ~]# /etc/init.d/mysqld stop
[root@server2 ~]# umount /var/lib/mysql/
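The manual test above follows a fixed order: the node giving up the data stops mysqld, unmounts and demotes; the node taking over promotes, mounts and starts mysqld. The sketch below writes that order down as two functions. It is a dry run by default and only prints the commands; DRY_RUN, run, release_primary and take_primary are illustrative names, not part of drbd or pacemaker.

```shell
#!/bin/sh
# Sketch of the manual handover order used in the test above.
DRY_RUN=${DRY_RUN:-1}      # default: only print what would be executed
RES=sqldata
DEV=/dev/drbd1
MNT=/var/lib/mysql

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the node that currently holds the data: stop the consumer first,
# then unmount, then demote. DRBD refuses to demote a device that is
# still mounted.
release_primary() {
    run /etc/init.d/mysqld stop &&
    run umount "$MNT" &&
    run drbdadm secondary "$RES"
}

# On the node taking over: promote, mount, then start MySQL.
take_primary() {
    run drbdadm primary "$RES" &&
    run mount "$DEV" "$MNT" &&
    run /etc/init.d/mysqld start
}

# Demo: print both halves. In practice release_primary runs on the old
# primary and take_primary on the other node.
release_primary
take_primary
```

This ordering is what the cluster takes over once the resources are handed to pacemaker.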
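III Configure fence
Fence configuration was covered earlier when working with the RHCS suite, so it is not repeated in detail here.
systemctl start fence_virtd.service ###on the physical host###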
[root@foundation78 Desktop]# cd /etc/cluster/
[root@foundation78 cluster]# ls
fence_xvm.key
server1:
1 ll /etc/cluster/fence_xvm.key
2 which fence_xvm
/usr/sbin/fence_xvm
3 rpm -qf /usr/sbin/fence_xvm
fence-virt-0.2.3-15.el6.x86_64
4 stonith_admin -M -a fence_xvm
5 crm configure primitive vmfence stonith:fence_xvm params pcmk_host_map="server1:vm1;server2:vm2" op monitor interval=1min###add the fence resource; pcmk_host_map maps each cluster node name to its virtual machine name###
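The pcmk_host_map value is a semicolon-separated list of node:port pairs; for fence_xvm the port is the name of the virtual machine as the physical host knows it. As a small illustration of how that string is read, here is a sketch with a made-up helper name, fence_port_for, applied to the map from the command above:

```shell
#!/bin/sh
# Hypothetical helper: look up the fence port (VM name) for a cluster
# node in a pcmk_host_map style string "node1:port1;node2:port2".
fence_port_for() {
    # $1 = cluster node name, $2 = host map string
    printf '%s\n' "$2" | tr ';' '\n' |
        awk -F: -v node="$1" '$1 == node { print $2; exit }'
}

map="server1:vm1;server2:vm2"
echo "server2 is fenced as: $(fence_port_for server2 "$map")"
```

If the names on the right-hand side do not match the VM names on the physical host, the fence agent has nothing to act on, so it is worth comparing them with the list of virtual machines there.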
server2:
1 ll /etc/cluster/fence_xvm.key
2 crm_mon
Last updated: Mon Jul 31 03:06:58 2017
Last change: Mon Jul 31 03:02:13 2017 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ server1 server2 ]
vip (ocf::heartbeat:IPaddr2):Started server2
vmfence (stonith:fence_xvm): Started server1
Test:
Crash server2's kernel, or take its eth0 down (ifdown eth0)
[root@server2 ~]# echo c > /proc/sysrq-trigger
Monitor on server1:
Last updated: Mon Jul 31 03:12:11 2017
Last change: Mon Jul 31 03:02:13 2017 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ server1 server2 ]
vip (ocf::heartbeat:IPaddr2):Started server1###the resource moves to server1###
vmfence (stonith:fence_xvm): Started server1###once server2 has been power-cycled and is back up, this moves to server2###
IV Integrate pacemaker+drbd+mysql
Configure drbd in the cluster
server1:
crm
configure
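primitive DBdata ocf:linbit:drbd params drbd_resource=sqldata op monitor interval=1min###add the drbd resource so that the cluster manages it###
ms DBdataclone DBdata meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true###ms defines a master/slave resource: since only the primary node can use the drbd device, drbd runs on both nodes with at most one master; notify=true enables notifications###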
commit
**********************************************************************
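After the commit the following warnings appear. No timeouts were set for these operations, so the cluster default of 20s applies, which is shorter than the values the resource agent advises. This is harmless in this experiment.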
WARNING: DBdata: default timeout 20s for start is smaller than the advised 240
WARNING: DBdata: default timeout 20s for stop is smaller than the advised 100
WARNING: DBdata: action monitor not advertised in meta-data, it may not be supported by the RA
**********************************************************************
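For anything beyond a lab it may be worth setting the advised timeouts explicitly. The sketch below only writes a crm configuration fragment to a temporary file so that it can be reviewed; the 240s and 100s values come from the warnings above, while the role-specific monitor operations and their 29s/31s intervals are illustrative choices, not taken from the original setup.

```shell
#!/bin/sh
# Write a crm configuration fragment with explicit operation timeouts.
# The snippet path and the monitor intervals are illustrative choices.
snippet=$(mktemp)
cat > "$snippet" <<'EOF'
primitive DBdata ocf:linbit:drbd \
    params drbd_resource=sqldata \
    op start timeout=240s \
    op stop timeout=100s \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave
EOF
echo "wrote $snippet"
cat "$snippet"
# Assumption: a fragment like this could then be loaded with something
# like `crm configure load update <file>`; check the crm shell help on
# your version before relying on that.
```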
primitive DBfs ocf:heartbeat:Filesystem params device=/dev/drbd1 directory=/var/lib/mysql fstype=ext4###mount the file system###
colocation fs_on_debd inf: DBfs DBdataclone:Master###colocation ties resources together: the file system must always run on the node where drbd is master###
order DBfs-after-DBdata inf: DBdataclone:promote DBfs:start###ordering: the file system may be mounted only after drbd has been promoted###
primitive mysqlDB lsb:mysqld op monitor interval=30s
group mysqlservice vip DBfs mysqlDB
commit
Monitor on server2:
Last updated: Mon Jul 31 04:15:04 2017
Last change: Mon Jul 31 04:15:01 2017 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
6 Resources configured
Online: [ server1 server2 ]
vmfence (stonith:fence_xvm): Started server2
Master/Slave Set: DBdataclone [DBdata]
Masters: [ server1 ]
Slaves: [ server2 ]
Resource Group: mysqlservice
vip (ocf::heartbeat:IPaddr2):Started server1
DBfs(ocf::heartbeat:Filesystem): Started server1
mysqlDB (lsb:mysqld): Started server1
Test:
server1: put the active node into standby
crm(live)# node
crm(live)node# standby
Monitor on server2:
Node server1: standby
Online: [ server2 ]
vmfence (stonith:fence_xvm): Started server2
Master/Slave Set: DBdataclone [DBdata]
Masters: [ server2 ]
Stopped: [ server1 ]
Resource Group: mysqlservice
vip (ocf::heartbeat:IPaddr2):Started server2
DBfs(ocf::heartbeat:Filesystem): Started server2
mysqlDB (lsb:mysqld): Started server2
Original article: http://12774272.blog.51cto.com/12764272/1952194