V 8 nfs+drbd+heartbeat

Date: 2016-08-12 22:15:37

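nfs+drbd+heartbeat: this approach removes the single point of failure from NFS, or from a distributed store such as MFS, wherever one exists.

In real production environments, NFS is one of the storage architectures most commonly used by small and medium-sized companies. It is simple to deploy and easy to maintain. With inotify+rsync, a simple and efficient way to synchronize data, an NFS storage system can be replicated from a master to slaves on other hosts, which also allows read/write splitting similar to MySQL. Multiple slaves (serving reads) can additionally be load-balanced with LVS or haproxy, which both spreads the load of highly concurrent reads and removes the slaves as a single point of failure.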

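Storage on the web servers:

Option 1: reads and writes may go to any web server, and inotify+rsync synchronizes each web server's data to the others, for example web1-->web2-->web3-->web2-->web1.

Option 2: configured on the load balancer. Writes (file uploads) go only to web3 and reads go to web{1,2}; inotify+rsync synchronizes web3-->web2 and web3-->web1.

Option 3: use shared NFS storage. If a single server handles all reads and writes it is a single point of failure, so add a second one, as one master and one backup, synchronized with inotify+rsync. Both reads and writes can go to the master with the other server only backing up data, or reads can go to the backup and writes to the master. Since reads usually far outnumber writes, another backup can be added (one master, several slaves) to relieve read pressure, synchronizing master-->backup1 and master-->backup2. If one storage server fails, web{1,2,3} must remount. Losing one backup has no impact, but if the master fails, writes are impossible. The master is therefore made highly available as master-active and master-inactive: only one of the two serves clients at any time, and master-inactive stays idle until a switchover.

Option 4: stop mounting NFS shared storage on the web servers. With master-active and master-inactive in place, data is only written to the shared storage and is then synchronized back to the local disks of web{1,2,3}; reads are served directly from local disk.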

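In the one-master, multiple-slave model, to keep writes working and replication to the slaves running when the master fails, use nfs+drbd+heartbeat to make the master highly available and eliminate it as a single point of failure. When the master-active NFS fails, service switches to the master-inactive NFS. The two masters hold identical data, and the master-inactive NFS automatically synchronizes with all the slave NFS servers, giving the NFS storage system a hot standby.

When master-active fails over to the standby master-inactive, the standby node must still be able to push data to the NFS slaves. At that point it should not run a full synchronization but only synchronize the data changed after the switchover. sersync can be used in place of inotify here, via sersync's -r option (alternatively, do not start inotify at first; start the inotify service only after heartbeat on the standby node has started and mounted the filesystem).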
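The alternative mentioned in parentheses above (start the watcher only once the replicated filesystem is mounted) could be sketched roughly as follows. This is a minimal sketch, not the author's script: the slave address 10.96.20.120, the rsync module name nfs_backup and the user rsync_user are hypothetical, and it assumes inotify-tools and rsync are installed.

```shell
#!/bin/bash
# Assumed values, not taken from the article.
SRC_DIR="${SRC_DIR:-/drbd/}"
DEST="${DEST:-rsync_user@10.96.20.120::nfs_backup}"   # hypothetical rsync daemon module

# Return 0 once $1 is a mount point; poll once per second, at most $2 times.
wait_for_mount() {
    local dir="$1" tries="${2:-60}"
    while [ "$tries" -gt 0 ]; do
        mountpoint -q "$dir" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Watch SRC_DIR and run an incremental rsync after every change event.
# rsync only transfers files that differ, so each run stays small.
watch_and_push() {
    inotifywait -mrq -e close_write,create,delete,move --format '%w%f' "$SRC_DIR" |
    while read -r _changed; do
        rsync -az --delete "$SRC_DIR" "$DEST"
    done
}

# Typical use on the node that has just taken over:
#   wait_for_mount /drbd 120 && watch_and_push
```

Running one rsync per event is the simplest possible design; under heavy write load a batching or debouncing layer would probably be needed.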

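Note: compared with file systems such as MFS, FastDFS and GFS, this approach is simple to deploy and easier to maintain and control, in keeping with the principle of being simple and efficient. It has drawbacks, though: every node holds the full data set (as with MySQL replication), and synchronizing large volumes of files is likely to lag. Synchronization can be split by data directory (similar to MySQL database and table sharding). To deal with lag, multiple synchronization instances can be run with the application controlling the read/write logic, and the synchronization state must be monitored.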

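For NFS high availability, to keep the NFS slaves from hanging, unable to read data, while the two master nodes switch over, work on the following:

The rpcbind service must always be running (on the master node, the standby node and the NFS clients).

Each NFS client (NFS slave) monitors its locally mounted NFS share and remounts it if the share turns out to be unreadable.

Each NFS client monitors whether the VIP has appeared on the master-inactive node or whether DRBD there has become Primary, and remounts if so (during an NFS switchover the clients can be made to remount through SSH or a similar mechanism; or, with Nagios monitoring, when the VIP appears on the master-inactive node a designated script is run to remount the NFS clients).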
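The second item above (a client that probes its mount and remounts when it cannot be read) might look like the sketch below. The mount point /mnt/nfs and the 5-second probe timeout are assumptions; the server address and export path reuse the VIP and /drbd directory set up later in this article.

```shell
#!/bin/bash
# Assumed values: adjust to the real client layout.
NFS_SERVER="${NFS_SERVER:-10.96.20.8}"     # the VIP
EXPORT="${EXPORT:-/drbd}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/nfs}"     # hypothetical client mount point
PROBE_TIMEOUT="${PROBE_TIMEOUT:-5}"

# A stale NFS mount makes stat block, so bound the probe with timeout(1).
share_is_readable() {
    timeout "$PROBE_TIMEOUT" stat -t "$1" >/dev/null 2>&1
}

# Force a lazy unmount, then mount the export again.
remount_share() {
    umount -f -l "$MOUNT_POINT" 2>/dev/null
    mount -t nfs "$NFS_SERVER:$EXPORT" "$MOUNT_POINT"
}

# One monitoring pass: report the state and remount if needed.
check_once() {
    if share_is_readable "$MOUNT_POINT"; then
        echo "ok"
    else
        echo "remounting"
        remount_share
    fi
}

# Example: run every 10 seconds from a loop or a cron wrapper.
#   while true; do check_once; sleep 10; done
```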

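As shown in the figure, the ellipse marks what this section covers.

Note: a single server needs no dedicated file storage; its data stays on local disk. Dedicated storage is only needed when running a cluster.

Note: the problems are the single point of failure and poor performance when all reads and writes hit one server. Operations staff in a company must consider data protection and continuous 24x7 service.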

Note:

web1 and web2 typically run LNMP.

IMG1 and IMG2 typically run nginx or lighttpd.

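This approach removes the NFS master as a single point of failure and also addresses concurrent read performance, but if concurrent writes keep growing, the following problems arise:

It suits upload rates of 200-300 images/s, where concurrent synchronization is still acceptably efficient. Above 300 images/s, synchronization between master and slaves may lag. Remedies: enable multi-threaded synchronization and tune event monitoring, disk I/O and network I/O.

With many IMG servers and only one master, the master handles writes and also pushes data to many servers, so it comes under heavy load.

When the volume of images is very large, every node still holds the complete data set; above roughly 3 TB in total, a single server may run out of storage space. Remedies: (1) borrow the MySQL database-splitting idea to address capacity, write performance and synchronization lag. For example, start with img1--img5, five directories mapped to five domain names, and mount those five directories; each imgNUM becomes its own NFS cluster with master/slave high availability and read/write splitting, where the read/write splitting can be done through POST or WebDAV. (2) Extend to a multi-master architecture through DNS, bearing in mind that every newly added service is a new single point of failure. (3) Use the built-in features of databases such as MySQL, Oracle, MongoDB or Cassandra to synchronize file data; iQiyi uses MongoDB's GridFS for image storage.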

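Note: MongoDB's GridFS for image storage (supports distribution; design ideas: each image is stored exactly once; only the original image is stored; a thumbnail is generated on first request and saved as a static file; URLs are fixed, and different URLs yield different thumbnails; see "Abusing Amazon images").

Note: Facebook's image management architecture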

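Note:

Making NFS highly available removes the single point of failure at the cost of one idle server.

The two NFS masters use heartbeat+drbd and replicate in real time with DRBD protocol C.

nfs(M) and nfs(S) synchronize asynchronously through inotify+rsync, and nfs(S) synchronizes with nfs(M) through the VIP. The nfs slaves serve reads and the nfs master serves writes, which addresses concurrent read performance. Alternatively, the nfs master can take writes only and then push the data to the app servers (dropping the NFS mount).

Choose RAID 10 or RAID 0 for the physical disks according to performance and redundancy needs. Links between servers, and between servers and switches, use bonded dual gigabit NICs. Application servers (including but not limited to web servers) reach nfs(M) through a VIP and reach the load-balanced nfs(S) storage pool through a different VIP. The nfs(M) data lives on the DRBD partition.

When the data volume is small, data can be synchronized directly from nfs(M) to the app servers' local disks; all reads are then served locally, while writes still go to nfs(M).

When inotify+rsync handles master-->slave synchronization, heavy concurrent writes cause lag or inconsistent data.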

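Note:

In real-world practice, reworking the DB and file storage is a last resort. Normally, adjust the site architecture so that user requests reach the DB and storage systems as rarely as possible, for example through file caching and data caching (the core principle of high concurrency: push user requests as far forward as possible), rather than jumping straight to a distributed storage system. For small and medium-sized companies, distributed storage is like using a cannon to kill a mosquito. In 2012, when Facebook was already very large, it was still using an NFS storage system (distributed systems are not a cure-all: they consume a great deal of manpower and resources, and handled badly they can end in disaster).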

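Note:

To relieve the load on the site, push the content users access as far forward as possible: whatever can be kept on the user's machine should not go to the CDN, and whatever can be served from the CDN should not hit the local servers. Make full use of the cache at every layer, and let users reach the back-end DB only as a last resort. If that still cannot cope, use SSD+SATA, and if that is still not enough, use distributed storage.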

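1. Install and configure heartbeat

Environment:

VIP: 10.96.20.8

master: eth0 (10.96.20.113), eth1 (172.16.1.113, no gateway or DNS configured), hostname (test-master)

backup: eth0 (10.96.20.114), eth1 (172.16.1.114, no gateway or DNS configured), hostname (test-backup)

Two NICs and two disks per host.

Note: eth0 carries the management IP; eth1 carries the heartbeat link and the DRBD replication channel. If heartbeat and data traffic share one NIC in production, limit the data traffic so that bandwidth is reserved for the heartbeat.

Note: keep labels consistent in VMware and in Xshell. Every host in the company's production environment should have an entry in /etc/hosts, which eases distribution, management and maintenance.

test-master (configure each of: the hostname in /etc/sysconfig/network, which must match the output of uname -n; /etc/hosts; mutual SSH trust between the two hosts; time synchronization; iptables; SELinux):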

[root@test-master ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 6.5 (Santiago)

[root@test-master ~]# uname -rm

2.6.32-431.el6.x86_64 x86_64

[root@test-master ~]# uname -n

test-master

[root@test-master ~]# ifconfig | grep eth0 -A 1

eth0 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:AC

inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-master ~]# ifconfig | grep eth1 -A 1

eth1 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:B6

inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0

[root@test-master ~]# route add -host 172.16.1.114 dev eth1 # (adds a host route so that heartbeat traffic leaves through the chosen NIC; this line can be appended to /etc/rc.local, or a static route can be configured instead: vim /etc/sysconfig/network-scripts/route-eth1 and add 172.16.1.114/24 via 172.16.1.113)

[root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in ./.ssh/id_rsa.

Your public key has been saved in ./.ssh/id_rsa.pub.

The key fingerprint is:

29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:12 root@test-master

The key's randomart image is:

+--[ RSA 2048]----+

| E o.. |

| .+ + |

|.+.* . |

|oo* o. . |

|+o.. =S |

|+. o . + |

|o o . |

| . |

| |

+-----------------+

[root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup

The authenticity of host 'test-backup (10.96.20.114)' can't be established.

RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'test-backup' (RSA) to the list of known hosts.

root@test-backup's password:

Now try logging into the machine, with "ssh 'root@test-backup'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[root@test-master ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[root@test-master ~]# service crond restart

Stopping crond: [ OK ]

Starting crond: [ OK ]

[root@test-master ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[root@test-master ~]# rpm -ivh epel-release-6-8.noarch.rpm

warning: epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY

Preparing... ########################################### [100%]

1:epel-release ########################################### [100%]

[root@test-master ~]# yum search heartbeat

……

heartbeat-devel.i686 : Heartbeat development package

heartbeat-devel.x86_64 : Heartbeat development package

heartbeat-libs.i686 : Heartbeat libraries

heartbeat-libs.x86_64 : Heartbeat libraries

heartbeat.x86_64 : Messaging and membership subsystem for High-Availability Linux

[root@test-master ~]# yum -y install heartbeat

[root@test-master ~]# chkconfig heartbeat off

[root@test-master ~]# chkconfig --list heartbeat

heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off

test-backup:

[root@test-backup ~]# uname -n

test-backup

[root@test-backup ~]# ifconfig | grep eth0 -A 1

eth0 Link encap:Ethernet HWaddr 00:0C:29:15:E6:BB

inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-backup ~]# ifconfig | grep eth1 -A 1

eth1 Link encap:Ethernet HWaddr 00:0C:29:15:E6:C5

inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0

[root@test-backup ~]# route add -host 172.16.1.113 dev eth1

[root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in ./.ssh/id_rsa.

Your public key has been saved in ./.ssh/id_rsa.pub.

The key fingerprint is:

08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:b8 root@test-backup

The key's randomart image is:

+--[ RSA 2048]----+

| . |

| =. |

| . = * |

| . . . .. + + |

|. + . ..SE . |

| o = . . |

|. . = . |

| o . . . |

|o .o... |

+-----------------+

[root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master

The authenticity of host 'test-master (10.96.20.113)' can't be established.

RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'test-master' (RSA) to the list of known hosts.

root@test-master's password:

Now try logging into the machine, with "ssh 'root@test-master'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[root@test-backup ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[root@test-backup ~]# service crond restart

Stopping crond: [ OK ]

Starting crond: [ OK ]

[root@test-backup ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[root@test-backup ~]# rpm -ivh epel-release-6-8.noarch.rpm

[root@test-backup ~]# yum -y install heartbeat

[root@test-backup ~]# chkconfig heartbeat off

[root@test-backup ~]# chkconfig --list heartbeat

heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off

test-master:

[root@test-master ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

[root@test-master ~]# cd /etc/ha.d

[root@test-master ha.d]# ls

authkeys ha.cf harc haresources rc.d README.config resource.d shellfuncs

[root@test-master ha.d]# vim authkeys # (generate a random string with dd if=/dev/random count=1 bs=512 | md5sum; the random string goes after sha1)

auth 1

1 sha1 912d6402295ac8d47109e56b177073b9

[root@test-master ha.d]# chmod 600 authkeys # (this file must have mode 600, otherwise the service reports an error on startup)

[root@test-master ha.d]# ll !$

ll authkeys

-rw-------. 1 root root 692 Aug 7 21:51 authkeys
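The manual authkeys steps above can be wrapped in a small helper. This is only a sketch: the function name make_authkeys is made up for illustration, and it reads from /dev/urandom rather than the /dev/random used above so that it never blocks waiting for entropy.

```shell
#!/bin/bash
# Write a heartbeat authkeys file to the path given as $1.
make_authkeys() {
    local out="$1" key
    # 512 random bytes hashed to a 32-character hex string
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{ print $1 }')
    # create the file with a restrictive umask, then enforce mode 600
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$out" )
    chmod 600 "$out"
}

# Example: make_authkeys /etc/ha.d/authkeys
```

Both nodes need the same file, so generate it once and copy it to the peer rather than running the helper on each node.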

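[root@test-master ha.d]# vim ha.cf

debugfile /var/log/ha-debug # (debug log)

logfile /var/log/ha-log

logfacility local1 # (configure rsyslog to receive these logs through local1)

keepalive 2 # (heartbeat interval: one heartbeat is sent every 2 s)

deadtime 30 # (if the standby node receives no heartbeat from the primary node within 30 s, it immediately takes over the peer's resources)

warntime 10 # (heartbeat delay warning threshold of 10 s: if the standby node receives no heartbeat from the primary within 10 s, a warning is written to the log, but no failover happens)

initdead 120 # (after heartbeat first starts, wait 120 s before starting the primary node's resources; this gives the peer's heartbeat service time to come up first; the value must be at least twice deadtime)

udpport 694

#bcast eth0 # (send heartbeats as Ethernet broadcasts on eth0; to carry heartbeats over two physical networks use bcast eth0 eth1)

mcast eth0 225.0.0.11 694 1 0 # (multicast parameters; the multicast address must be unique within the LAN because several heartbeat clusters may coexist; multicast uses class D addresses (224.0.0.0--239.255.255.255); the format is mcast dev mcast_group port ttl loop)

auto_failback on # (fail back to the primary node once it recovers)

node test-master # (primary node hostname, as reported by uname -n)

node test-backup # (standby node hostname)

crm no # (whether to enable CRM)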
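The timing directives above depend on each other, so a quick sanity check before starting the service can catch mistakes. The sketch below is an illustration, not part of heartbeat: the function names are made up, it only understands whole-second values, and the ordering keepalive < warntime < deadtime is an assumption derived from the comments above (only the "initdead at least twice deadtime" rule is stated explicitly there).

```shell
#!/bin/bash
# Print the value of directive $2 from the ha.cf-style file $1.
ha_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Check the relationships between the four timing directives.
check_ha_timing() {
    local cf="$1" keepalive warntime deadtime initdead v
    keepalive=$(ha_value "$cf" keepalive)
    warntime=$(ha_value "$cf" warntime)
    deadtime=$(ha_value "$cf" deadtime)
    initdead=$(ha_value "$cf" initdead)
    for v in "$keepalive" "$warntime" "$deadtime" "$initdead"; do
        case "$v" in
            ''|*[!0-9]*) echo "missing or non-integer timing value"; return 2 ;;
        esac
    done
    if [ "$keepalive" -ge "$warntime" ] || [ "$warntime" -ge "$deadtime" ]; then
        echo "expected keepalive < warntime < deadtime"
        return 1
    fi
    if [ "$initdead" -lt $((deadtime * 2)) ]; then
        echo "initdead should be at least 2 x deadtime"
        return 1
    fi
    echo "timing ok"
}

# Example: check_ha_timing /etc/ha.d/ha.cf
```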

[root@test-master ha.d]# vim haresources

test-master IPaddr::10.96.20.8/24/eth0 # (equivalent to running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start; IPaddr is a script under /etc/ha.d/resource.d/)

[root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/

authkeys 100% 692 0.7KB/s 00:00

ha.cf 100% 10KB 10.3KB/s 00:00

haresources 100% 5944 5.8KB/s 00:00

[root@test-master ha.d]# service heartbeat start

Starting High-Availability services: INFO: Resource is stopped

Done.

[root@test-master ha.d]# ssh test-backup 'service heartbeat start'

Starting High-Availability services: 2016/08/07_22:39:00 INFO: Resource is stopped

Done.

[root@test-master ha.d]# ps aux | grep heartbeat

root 63089 0.0 3.1 50124 7164 ? SLs 22:38 0:00 heartbeat: master control process

root 63093 0.0 3.1 50076 7116 ? SL 22:38 0:00 heartbeat: FIFO reader

root 63094 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: write: mcast eth0

root 63095 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: read: mcast eth0

root 63136 0.0 0.3 103264 836 pts/0 S+ 22:39 0:00 grep heartbeat

[root@test-master ha.d]# ssh test-backup 'ps aux | grep heartbeat'

root 3050 0.0 3.1 50124 7164 ? SLs 22:39 0:00 heartbeat: master control process

root 3054 0.0 3.1 50076 7116 ? SL 22:39 0:00 heartbeat: FIFO reader

root 3055 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: write: mcast eth0

root 3056 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: read: mcast eth0

root 3094 0.0 0.5 106104 1368 ? Ss 22:39 0:00 bash -c ps aux | grep heartbeat

root 3108 0.0 0.3 103264 832 ? S 22:39 0:00 grep heartbeat

[root@test-master ha.d]# netstat -tnulp | grep heartbeat

udp 0 0 225.0.0.11:694 0.0.0.0:* 63094/heartbeat:wr

udp 0 0 0.0.0.0:50268 0.0.0.0:* 63094/heartbeat:wr

[root@test-master ha.d]# ssh test-backup 'netstat -tnulp | grep heartbeat'

udp 0 0 0.0.0.0:58019 0.0.0.0:* 3055/heartbeat: wri

udp 0 0 225.0.0.11:694 0.0.0.0:* 3055/heartbeat: wri

[root@test-master ha.d]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ha.d]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# service heartbeat start

Starting High-Availability services: INFO: Resource is stopped

Done.

[root@test-master ha.d]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ~]# ssh test-backup 'service heartbeat stop'

Stopping High-Availability services: Done.
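The failover checks above look for the VIP by eye in the output of ip addr. For monitoring scripts, the same check could be automated roughly as follows. The helper name has_vip is hypothetical; it matches the exact address rather than a prefix, which a plain grep 10.96.20 does not.

```shell
#!/bin/bash
# Succeed if the `ip addr` output on stdin contains exactly the address $1.
has_vip() {
    local pattern
    # escape the dots so they only match literal dots
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -Eq "inet ${pattern}/[0-9]+ "
}

# Example, run locally and against the peer:
#   ip addr | has_vip 10.96.20.8 && echo "VIP is here"
#   ssh test-backup 'ip addr' | has_vip 10.96.20.8 && echo "VIP is on test-backup"
```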

2. Install and configure drbd

test-master:

[root@test-master ~]# fdisk -l

……

Disk /dev/sdb: 2147 MB, 2147483648 bytes

255 heads, 63 sectors/track, 261 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000

[root@test-master ~]# parted /dev/sdb # (parted supports disks larger than 2 TB; split the new disk into two partitions, one for data and one for DRBD meta data)

GNU Parted 2.1

Using /dev/sdb

Welcome to GNU Parted! Type 'help' to view a list of commands.

(parted) h

align-check TYPE N check partition N for TYPE(min|opt) alignment

check NUMBER do a simple check on the file system

cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER copy file system to another partition

help [COMMAND] print general help, or help on COMMAND

mklabel,mktable LABEL-TYPE create a new disklabel (partition table)

mkfs NUMBER FS-TYPE make a FS-TYPE file system on partition NUMBER

mkpart PART-TYPE [FS-TYPE] START END make a partition

mkpartfs PART-TYPE FS-TYPE START END make a partition with a file system

move NUMBER START END move partition NUMBER

name NUMBER NAME name partition NUMBER as NAME

print [devices|free|list,all|NUMBER] display the partition table, available devices, free space, all found partitions, or a particular partition

quit exit program

rescue START END rescue a lost partition near START and END

resize NUMBER START END resize partition NUMBER and its file system

rm NUMBER delete partition NUMBER

select DEVICE choose the device to edit

set NUMBER FLAG STATE change the FLAG on partition NUMBER

toggle [NUMBER [FLAG]] toggle the state of FLAG on partition NUMBER

unit UNIT set the default unit to UNIT

version display the version number and copyright information of GNU Parted

(parted) mklabel gpt

(parted) mkpart primary 0 1024

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) mkpart primary 1025 2147

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) p

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 2147MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

Number Start End Size File system Name Flags

1 17.4kB 1024MB 1024MB primary

2 1025MB 2147MB 1122MB primary

[root@test-master ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-master ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

warning: elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY

Preparing... ########################################### [100%]

1:elrepo-release ########################################### [100%]

[root@test-master ~]# yum -y install drbd kmod-drbd84

[root@test-master ~]# modprobe drbd

FATAL: Module drbd not found.

[root@test-master ~]# yum -y install kernel* # (reboot the system after updating the kernel)

[root@test-master ~]# uname -r

2.6.32-642.3.1.el6.x86_64

[root@test-master ~]# depmod

[root@test-master ~]# lsmod | grep drbd

drbd 372759 0

libcrc32c 1246 1 drbd

[root@test-master ~]# ll /usr/src/kernels/

total 12

drwxr-xr-x. 22 root root 4096 Mar 31 06:46 2.6.32-431.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64.debug

[root@test-master ~]# echo "modprobe drbd >/dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[root@test-master ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1

test-backup:

[root@test-backup ~]# parted /dev/sdb

(parted) mklabel gpt

(parted) mkpart primary 0 4096

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) mkpart primary 4097 5368

(parted) p

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 5369MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

Number Start End Size File system Name Flags

1 17.4kB 4096MB 4096MB primary

2 4097MB 5368MB 1271MB primary

[root@test-backup ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-backup ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-backup ~]# ll /etc/yum.repos.d/

total 20

-rw-r--r--. 1 root root 1856 Jul 19 00:28 CentOS6-Base-163.repo

-rw-r--r--. 1 root root 2150 Feb 9 2014 elrepo.repo

-rw-r--r--. 1 root root 957 Nov 4 2012 epel.repo

-rw-r--r--. 1 root root 1056 Nov 4 2012 epel-testing.repo

-rw-r--r--. 1 root root 529 Mar 30 23:00 rhel-source.repo.bak

[root@test-backup ~]# yum -y install drbd kmod-drbd84

[root@test-backup ~]# yum -y install kernel*

[root@test-backup ~]# depmod

[root@test-backup ~]# lsmod | grep drbd

drbd 372759 0

libcrc32c 1246 1 drbd

[root@test-backup ~]# chkconfig drbd off

[root@test-backup ~]# chkconfig --list drbd

drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

[root@test-backup ~]# echo "modprobe drbd >/dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[root@test-backup ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1

test-master:

[root@test-master ~]# vim /etc/drbd.d/global_common.conf

[root@test-master ~]# egrep -v "#|^$" /etc/drbd.d/global_common.conf

global {

usage-count no;

}

common {

handlers {

}

startup {

}

options {

}

disk {

on-io-error detach;

}

net {

}

syncer {

rate 50M;

verify-alg crc32c;

}

}

[root@test-master ~]# vim /etc/drbd.d/data.res

resource data {

protocol C;

on test-master {

device /dev/drbd0;

disk /dev/sdb1;

address 172.16.1.113:7788;

meta-disk /dev/sdb2[0];

}

on test-backup {

device /dev/drbd0;

disk /dev/sdb1;

address 172.16.1.114:7788;

meta-disk /dev/sdb2[0];

}

}

[root@test-master ~]# cd /etc/drbd.d

[root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/

global_common.conf 100% 2144 2.1KB/s 00:00

data.res 100% 251 0.3KB/s 00:00

[root@test-master drbd.d]# drbdadm --help

USAGE: drbdadm COMMAND [OPTION...] {all|RESOURCE...}

GENERAL OPTIONS:

--stacked, -S

--dry-run, -d

--verbose, -v

--config-file=..., -c ...

--config-to-test=..., -t ...

--drbdsetup=..., -s ...

--drbdmeta=..., -m ...

--drbd-proxy-ctl=..., -p ...

--sh-varname=..., -n ...

--peer=..., -P ...

--version, -V

--setup-option=..., -W ...

--help, -h

COMMANDS:

attach disk-options

detach connect

net-options disconnect

up resource-options

down primary

secondary invalidate

invalidate-remote outdate

resize verify

pause-sync resume-sync

adjust adjust-with-progress

wait-connect wait-con-int

role cstate

dstate dump

dump-xml create-md

show-gi get-gi

dump-md wipe-md

apply-al hidden-commands

[root@test-master drbd.d]# drbdadm create-md data

initializing activity log

NOT initializing bitmap

Writing meta data...

New drbd meta data block successfully created.

[root@test-master drbd.d]# ssh test-backup 'drbdadm create-md data'

NOT initializing bitmap

initializing activity log

Writing meta data...

New drbd meta data block successfully created.

[root@test-master drbd.d]# drbdadm up data

[root@test-master drbd.d]# ssh test-backup 'drbdadm up data'

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data # (run on the primary node only)

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:660016

[=====>..............] sync'ed: 34.3% (660016/999984)K

finish: 0:00:15 speed: 42,496 (42,496) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:369200

[===========>........] sync'ed: 63.3% (369200/999984)K

finish: 0:00:09 speed: 39,424 (39,424) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:57904

[=================>..] sync'ed: 94.3% (57904/999984)K

finish: 0:00:01 speed: 39,196 (39,252) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
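Instead of re-running cat /proc/drbd until the initial sync finishes, the status line can be parsed in a script. The following is a rough sketch with made-up function names; it assumes the DRBD 8.4 /proc/drbd layout shown in the output above (cs:, ro: and ds: fields on the line that starts with the minor number).

```shell
#!/bin/bash
# Print "connection-state roles disk-states" for minor $1,
# reading /proc/drbd-style text from stdin.
drbd_fields() {
    local minor="$1"
    sed -n "s/^ *${minor}: *cs:\([^ ]*\) *ro:\([^ ]*\) *ds:\([^ ]*\).*/\1 \2 \3/p"
}

# Succeed only when minor $1 is connected and both disks are UpToDate.
drbd_is_synced() {
    local fields cs ds
    fields=$(drbd_fields "$1")
    cs=${fields%% *}     # first field: connection state
    ds=${fields##* }     # last field: local/peer disk state
    [ "$cs" = "Connected" ] && [ "$ds" = "UpToDate/UpToDate" ]
}

# Example: wait for the initial sync of /dev/drbd0 to complete.
#   until drbd_is_synced 0 < /proc/drbd; do sleep 5; done
```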

[root@test-master drbd.d]# mkdir /drbd

[root@test-master drbd.d]# ssh test-backup 'mkdir /drbd'

[root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0 # (run on the primary node only; do not format the meta data partition)

Writing superblocks and filesystem accounting information: done

[root@test-master drbd.d]# tune2fs -c -1 /dev/drbd0

tune2fs 1.41.12 (17-May-2010)

Setting maximal mount count to -1

[root@test-master drbd.d]# mount /dev/drbd0 /drbd

[root@test-master drbd.d]# cd /drbd

[root@test-master drbd]# for i in `seq 1 10`; do touch test$i; done

[root@test-master drbd]# ls

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

[root@test-master drbd]# cd

[root@test-master ~]# umount /dev/drbd0

[root@test-master ~]# drbdadm secondary data

[root@test-master ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

test-backup:

[root@test-backup ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-backup ~]# drbdadm primary data

[root@test-backup ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-backup ~]# mount /dev/drbd0 /drbd

[root@test-backup ~]# ls /drbd

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

3. Test heartbeat+drbd

[root@test-master ~]# ssh test-backup 'umount /drbd'

[root@test-master ~]# ssh test-backup 'drbdadm secondary data'

[root@test-master ~]# service drbd stop

Stopping all DRBD resources: .

[root@test-master ~]# ssh test-backup 'service drbd stop'

Stopping all DRBD resources: .

[root@test-master ~]# service heartbeat status

heartbeat is stopped. No process

[root@test-master ~]# ssh test-backup 'service heartbeat status'

heartbeat is stopped. No process

[root@test-master ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}

-rwxr-xr-x. 1 root root 3162 Jan 12 2016 /etc/ha.d/resource.d/drbddisk

-rwxr-xr-x. 1 root root 1903 Dec 2 2013 /etc/ha.d/resource.d/Filesystem

[root@test-master ~]# vim /etc/ha.d/haresources # (each entry on this line corresponds to running a script with arguments, for example /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop, /etc/ha.d/resource.d/drbddisk data start|stop and /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop; heartbeat manages the resources in the order they are listed, so if heartbeat misbehaves, check the logs and run these commands one at a time to troubleshoot)

test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4

[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/

haresources 100% 5996 5.9KB/s 00:00

[root@test-master ~]# service drbd start # (run on the primary node)

Starting DRBD resources: [

create res: data

prepare disk: data

adjust disk: data

adjust net: data

]

..........

***************************************************************

DRBD's startup script waits for the peer node(s) to appear.

- If this node was already a degraded cluster before the

reboot, the timeout is 0 seconds. [degr-wfc-timeout]

- If the peer was available before the reboot, the timeout

is 0 seconds. [wfc-timeout]

(These values are for resource 'data'; 0 sec -> wait forever)

To abort waiting enter 'yes' [ 23]:

[root@test-backup ~]# service drbd start # (run on the standby node)

Starting DRBD resources: [

create res: data

prepare disk: data

adjust disk: data

adjust net: data

]

[root@test-master ~]# drbdadm role data

Secondary/Secondary

[root@test-master ~]# ssh test-backup 'drbdadm role data'

Secondary/Secondary

[root@test-master ~]# drbdadm -- --overwrite-data-of-peer primary data

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# service heartbeat start

Starting High-Availability services: INFO: Resource is stopped

Done.

[root@test-master ~]# ssh test-backup 'service heartbeat start'

Starting High-Availability services: 2016/08/09_03:08:11 INFO: Resource is stopped

Done.

[root@test-master ~]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 6.3G 11G 38% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-master ~]# ls /drbd

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ~]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# ssh test-backup 'df -h'

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 3.9G 13G 24% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-master ~]# ssh test-backup 'ls /drbd'

lost+found

test1

test10

test2

test3

test4

test5

test6

test7

test8

test9

[root@test-master ~]# drbdadm role data

Secondary/Primary

[root@test-master ~]# service heartbeat start # (after the primary node recovers, first make sure DRBD is sorted out and healthy, then start the heartbeat service)

Starting High-Availability services: INFO: Resource is stopped

Done.

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 6.3G 11G 38% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-master ~]# ls /drbd

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

Note: if the two ends show Primary/Unknown or Secondary/Unknown, recover as follows:

# service heartbeat stop # (stop the heartbeat service on both nodes)

# drbdadm secondary data # (demote DRBD on the standby node)

# drbdadm disconnect data

# drbdadm -- --discard-my-data connect data

# drbdadm role data

# drbdadm connect data # (run on the primary node)
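Because --discard-my-data throws away one node's changes, it helps to print the plan for each node before running anything. The dry-run helper below only echoes the commands from the note above and executes nothing; the role labels victim (the node whose changes are discarded, the standby node in the note) and survivor are names chosen for this sketch. Stop heartbeat on both nodes first, as the note says.

```shell
#!/bin/bash
# Print the DRBD recovery commands for one node.
#   $1: victim | survivor    $2: DRBD resource name
split_brain_plan() {
    local role="$1" res="$2"
    case "$role" in
        victim)
            echo "drbdadm secondary $res"
            echo "drbdadm disconnect $res"
            echo "drbdadm -- --discard-my-data connect $res"
            ;;
        survivor)
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: split_brain_plan victim|survivor RESOURCE" >&2
            return 2
            ;;
    esac
}

# Example:
#   split_brain_plan victim data      # review, then run on the standby node
#   split_brain_plan survivor data    # review, then run on the primary node
```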

4. Install and configure nfs

Perform the following on both master nodes and on nfs slave1:

[root@test-master ~]# yum -y groupinstall 'NFS file server'

[root@test-master ~]# rpm -qa nfs-utils rpcbind

nfs-utils-1.2.3-70.el6_8.1.x86_64

rpcbind-0.2.0-12.el6.x86_64

[root@test-master ~]# service rpcbind start

[root@test-master ~]# service nfs start

Starting NFS services: [ OK ]

Starting NFS quotas: [ OK ]

Starting NFS mountd: [ OK ]

Starting NFS daemon: [ OK ]

Starting RPC idmapd: [ OK ]

[root@test-master ~]# chkconfig rpcbind on

[root@test-master ~]# chkconfig nfs on

[root@test-master ~]# chkconfig --list rpcbind

rpcbind 0:off 1:off 2:on 3:on 4:on 5:on 6:off

[root@test-master ~]# chkconfig --list nfs

nfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off

Perform the following on both master nodes:

[root@test-master ~]# vim /etc/exports

/drbd 10.96.20.*(rw,sync,all_squash,anonuid=65534,anongid=65534,mp,fsid=2)

[root@test-master ~]# chmod 777 -R /drbd

[root@test-master ~]# service nfs reload # (equivalent to exportfs -r)

5. Testing:

Start heartbeat on both master nodes.

Test from the nfs slave: everything works.

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services:

/sbin/service: line 66: 17235 Killed env -i PATH="$PATH" TERM="$TERM" "${SERVICEDIR}/${SERVICE}" ${OPTIONS}

[root@test-master ~]# tail -f /var/log/ha-log # (when heartbeat is stopped in this test, the mounted partition cannot be unmounted during the switchover, and the server is eventually forced to reboot)

Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:21 INFO: No processes on /drbd were signalled. force_unmount is

Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:22 ERROR: Couldn't unmount /drbd; trying cleanup with KILL

Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:22 INFO: No processes on /drbd were signalled. force_unmount is

Filesystem(Filesystem_/dev/drbd0)[19791]: 2016/08/09_04:36:23 ERROR: Couldn't unmount /drbd, giving up!

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[19783]: 2016/08/09_04:36:23 ERROR: Generic error

ResourceManager(default)[17256]: 2016/08/09_04:36:23 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[20014]: 2016/08/09_04:36:23 INFO: Running OK

ResourceManager(default)[17256]: 2016/08/09_04:36:23 CRIT: Resource STOP failure. Reboot required!

ResourceManager(default)[17256]: 2016/08/09_04:36:23 CRIT: Killing heartbeat ungracefully!

[root@test-backup ~]# drbdadm role data # (after the primary node's server rebooted, the standby node shows that it has taken over)

Primary/Unknown

[root@test-backup ~]# ip addr

……

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

inet6 fe80::20c:29ff:fe15:e6bb/64 scope link

valid_lft forever preferred_lft forever

[root@test-backup ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 3.9G 13G 24% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-backup ~]# ls /drbd

lost+found test111 test2 test222.txt test3 test4 test5 test6 test7 test8 test9

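Hot standby between the two master nodes now works, but the mount on the nfs slave kept hanging and never completed, because the server side (the active nfs master) keeps the mount state of its NFS clients. The NFS server has to be restarted at this point, so a script is added to heartbeat's haresources file to restart NFS on every switchover.

Stop drbd and heartbeat on both master nodes.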

[root@test-master ~]# vim /etc/ha.d/haresources

test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 killnfs

[root@test-master ~]# cd /etc/ha.d/resource.d/

[root@test-master resource.d]# vim killnfs

---------------script start-------------

#!/bin/bash

for i in {1..10};do

killall nfsd

done

service nfs start

exit 0

----------------script end--------------

[root@test-master resource.d]# chmod 755 killnfs

[root@test-master resource.d]# ll killnfs

-rwxr-xr-x. 1 root root 79 Aug 9 21:02 killnfs

[root@test-master resource.d]# scp killnfs root@test-backup:/etc/ha.d/resource.d/

killnfs 100% 79 0.1KB/s 00:00

[root@test-master resource.d]# cd ..

[root@test-master ha.d]# scp haresources root@test-backup:/etc/ha.d/

haresources 100% 6003 5.9KB/s 00:00

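With DRBD in a healthy state, start heartbeat again and retest: the nfs slave now behaves normally during a master switchover, with no failed or hung mounts.

Note: the key precondition when testing is to confirm that DRBD is healthy before starting heartbeat; that avoids these problems.

Note: evolution of the ganji image architecture

Note: after a user uploads an image to a web server, the web server POSTs it to the image server with the configured ID. The PHP code on the image server receives the image, writes it to local disk and returns a success status code. On receiving that code, the front-end web server writes the image server's ID and the image path to the DB server. When a user views a page, the image server ID and image URL are read from the DB and the image is fetched from the corresponding image server.

This article comes from the blog "Linux运维重难点学习笔记"; please retain this attribution: http://jowin.blog.51cto.com/10090021/1837154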
