A Troubleshooting Case: Deploying an LVS Cluster on KVM

Date: 2016-09-20 11:42:15

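I. Symptom

After deploying an LVS (Linux Virtual Server) cluster on KVM, each RS (Real Server) can be reached over HTTP through its real IP, but the service cannot be reached through the VIP (Virtual IP).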

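II. Analysis

1. Simplify the architecture

The original deployment used LVS in DR (Direct Routing) mode, as shown in the figure below:

[Figure: original architecture]

To make troubleshooting easier, we simplified it to:

[Figure: simplified architecture]

That is, each of the two hosts keeps a single virtual machine: one acts as the LVS Director (scheduler) and the other as the RS.

The IP and MAC addresses of the servers (and virtual machines) in this architecture are: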

Role	IP	MAC	Network layout
Host 1	x.y.z.70	a0:d3:c1:f4:66:ac	eth0 of Host 1 and eth0 of Director1 (vnet0 on Host 1) are bridged to br0
Director1	x.y.z.200	02:00:73:b6:53:c8	eth0 of Host 1 and eth0 of Director1 (vnet0 on Host 1) are bridged to br0
Host 2	x.y.z.73	a0:d3:c1:f9:f3:fc	eth0 of Host 2 and eth0 of RS2 (vnet0 on Host 2) are bridged to br0
RS2	x.y.z.226	02:00:73:b6:53:e2	eth0 of Host 2 and eth0 of RS2 (vnet0 on Host 2) are bridged to br0
VIP	x.y.z.208	02:00:73:b6:53:c8
Client IP	192.243.119.145

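2. Confirm that Director1 correctly sees the service provided by RS2

On Director1, check with the following command:

[root@Director1 ~]# ipvsadm -ln --sort
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP x.y.z.208:80 rr
① -> x.y.z.226:80 Route 2 0 0

Line ① of the output shows that Director1 correctly sees the service provided by RS2, with the forwarding method Route (direct routing).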
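When several virtual services are defined, reading this listing by eye gets tedious. As a minimal sketch, the text printed by `ipvsadm -ln` can be parsed into a mapping from each virtual service to its real servers. The function name `parse_ipvsadm` and the `SAMPLE` text are illustrative and not part of ipvsadm; the sketch assumes the column layout shown above.

```python
# Sample text modelled on the listing above (illustrative, not captured live).
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  x.y.z.208:80 rr
  -> x.y.z.226:80                 Route   2      0          0
"""


def parse_ipvsadm(output):
    """Map each virtual service to a list of (real_server, forward, weight)."""
    services = {}
    current = None
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP") and len(fields) >= 3:
            # A virtual service line: protocol, VIP:port, scheduler.
            current = f"{fields[0]} {fields[1]}"
            services[current] = []
        elif fields[0] == "->" and current is not None and len(fields) >= 4:
            # A real server line under the current virtual service.
            # The "-> RemoteAddress:Port" header comes before any service
            # line, so it is skipped because current is still None.
            services[current].append((fields[1], fields[2], int(fields[3])))
    return services


SERVICES = parse_ipvsadm(SAMPLE)
```

A virtual service whose list is empty, or whose forwarding method is not `Route`, would point to a Director configuration problem rather than a network one.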

3. Confirm that Director1 correctly rewrites the MAC addresses

On Director1, run the following command:

[root@Director1 ~]# tcpdump -vvv -nnn -e -i eth0 host 192.243.119.145
② 11:39:19.372804 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47264, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0x000e (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780753501 ecr 0,nop,wscale 7], length 0
③ 11:39:19.372815 02:00:73:b6:53:c8 > 02:00:73:b6:53:e2, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47264, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0x000e (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780753501 ecr 0,nop,wscale 7], length 0
④ 11:39:20.372079 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47265, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0xfc25 (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780754501 ecr 0,nop,wscale 7], length 0
⑤ 11:39:20.372091 02:00:73:b6:53:c8 > 02:00:73:b6:53:e2, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47265, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0xfc25 (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780754501 ecr 0,nop,wscale 7], length 0
--- the client's 3rd, 4th, and 5th retransmissions are omitted

Frame ③ shows that Director1 rewrote the destination MAC address of frame ② to RS2's MAC address and the source MAC address to its own.

Frames ④ and ⑤ show the client (192.243.119.145) retransmitting the SYN after 1 s: the TS val in ④ (780754501) is 1000 ms larger than the TS val in ② (780753501), and the TCP sequence number is the same. (Note: TS val is the sender's timestamp, counted in milliseconds here.) The client and RS2 therefore never established a TCP connection. Director1 still rewrote the MAC addresses correctly for the retransmission.
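The retransmission check above is simple arithmetic on the TCP timestamp option, and it can be automated for longer captures. The following is a minimal sketch: `syn_retransmit_gaps` and `VM_CAPTURE` are hypothetical names, the sample lines are copied from the capture above, and the TS val tick is assumed to be 1 ms as in this article (the unit is chosen by the sender and may differ elsewhere).

```python
import re

# TCP lines from the Director1 capture: SYN, its forwarded copy,
# the retransmitted SYN, and its forwarded copy.
VM_CAPTURE = """\
192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0x000e (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780753501 ecr 0,nop,wscale 7], length 0
192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0x000e (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780753501 ecr 0,nop,wscale 7], length 0
192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0xfc25 (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780754501 ecr 0,nop,wscale 7], length 0
192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0xfc25 (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780754501 ecr 0,nop,wscale 7], length 0
"""

SYN_RE = re.compile(r"Flags \[S\].*?seq (\d+).*?TS val (\d+)")


def syn_retransmit_gaps(capture):
    """Return TS val differences between successive SYNs that reuse a seq.

    A forwarded copy of the same SYN carries the same TS val and is ignored;
    only a SYN with the same seq and a different TS val counts as a
    retransmission.
    """
    last_tsval = {}  # seq -> TS val of the most recent SYN with that seq
    gaps = []
    for line in capture.splitlines():
        match = SYN_RE.search(line)
        if not match:
            continue
        seq, tsval = int(match.group(1)), int(match.group(2))
        if seq in last_tsval and tsval != last_tsval[seq]:
            gaps.append(tsval - last_tsval[seq])
        last_tsval[seq] = tsval
    return gaps


GAPS = syn_retransmit_gaps(VM_CAPTURE)
```

For this capture the single gap is 780754501 - 780753501 = 1000 ticks, which matches the 1 s difference between the capture timestamps.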

4. Confirm that Host 1 correctly forwards the frames rewritten by the Director1 VM

On Host 1, run the following command:

[root@HOST1 ~]# tcpdump -vvv -nnn -e -i br0 host 192.243.119.145
⑥ 11:39:19.430993 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47264, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0x000e (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780753501 ecr 0,nop,wscale 7], length 0
⑦ 11:39:20.430238 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47265, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.51643 > x.y.z.208.80: Flags [S], cksum 0xfc25 (correct), seq 3639534333, win 14600, options [mss 1460,sackOK,TS val 780754501 ecr 0,nop,wscale 7], length 0
--- the client's 3rd, 4th, and 5th retransmissions are omitted

Frames ⑥ and ⑦ show that Host 1 did not forward the frames rewritten by the Director1 VM (⑦ is the client's retransmission of ⑥, and neither is followed by a copy with rewritten MAC addresses).

Clearly the problem is on Host 1: it does not forward the Ethernet frames that Director1 has rewritten.
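The comparison between the two capture points can be expressed as a small check: for every IP id in a capture, is there also a copy of the frame travelling from the Director's MAC to the RS's MAC? This is a sketch only. The helper names and sample constants are hypothetical, and the regular expression assumes the `tcpdump -e` header format shown in this article.

```python
import re

DIRECTOR_MAC = "02:00:73:b6:53:c8"
RS_MAC = "02:00:73:b6:53:e2"

# Link-level header lines as seen inside the Director1 VM (eth0).
VM_CAPTURE = """\
11:39:19.372804 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47264, offset 0, flags [DF], proto TCP (6), length 60)
11:39:19.372815 02:00:73:b6:53:c8 > 02:00:73:b6:53:e2, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47264, offset 0, flags [DF], proto TCP (6), length 60)
11:39:20.372079 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47265, offset 0, flags [DF], proto TCP (6), length 60)
11:39:20.372091 02:00:73:b6:53:c8 > 02:00:73:b6:53:e2, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47265, offset 0, flags [DF], proto TCP (6), length 60)
"""

# The same traffic as seen on the host bridge (br0).
HOST_CAPTURE = """\
11:39:19.430993 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47264, offset 0, flags [DF], proto TCP (6), length 60)
11:39:20.430238 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 47265, offset 0, flags [DF], proto TCP (6), length 60)
"""

# Timestamp, source MAC, destination MAC, then the IP id further along.
FRAME_RE = re.compile(
    r"^\S+ (?P<src>[0-9a-f:]{17})\s*>\s*(?P<dst>[0-9a-f:]{17}),.*\bid (?P<ip_id>\d+),"
)


def mac_pairs_by_ip_id(capture):
    """Group the (source MAC, destination MAC) pairs seen for each IP id."""
    pairs = {}
    for line in capture.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            key = int(match.group("ip_id"))
            pairs.setdefault(key, set()).add((match.group("src"), match.group("dst")))
    return pairs


def ids_missing_rewritten_copy(capture, director_mac, rs_mac):
    """IP ids that never appear with Director -> RS MAC addresses."""
    return sorted(
        ip_id
        for ip_id, pairs in mac_pairs_by_ip_id(capture).items()
        if (director_mac, rs_mac) not in pairs
    )


MISSING_IN_VM = ids_missing_rewritten_copy(VM_CAPTURE, DIRECTOR_MAC, RS_MAC)
MISSING_ON_HOST = ids_missing_rewritten_copy(HOST_CAPTURE, DIRECTOR_MAC, RS_MAC)
```

An empty result inside the VM together with a non-empty result on the bridge localizes the loss to the path between the VM's interface and br0, which is where the host's filtering rules apply.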

Let's look at why this happens.

5. Check the network configuration of the Director1 VM on Host 1

On Host 1, run the following command:

[root@HOST1 ~]# virsh dumpxml r2-10105
<interface type='bridge'>
<model type='virtio'/>
⑧ <filterref filter='clean-traffic'>
⑨ <parameter name='IP' value='x.y.z.200'/>
</filterref>
<alias name='net0'/>
</interface>

Lines ⑧ and ⑨ show that Director1 references a filter named 'clean-traffic', with the variable IP set to Director1's IP address, x.y.z.200.

On Host 1, run the following commands:

[root@HOST1 ~]# virsh nwfilter-dumpxml clean-traffic
<filterref filter='no-mac-spoofing'/>
⑩ <filterref filter='no-ip-spoofing'/>
<mac protocolid='ipv4'/>
</rule>
<filterref filter='allow-incoming-ipv4'/>
<filterref filter='no-arp-spoofing'/>
<mac protocolid='arp'/>
</rule>
<filterref filter='no-other-l2-traffic'/>
<filterref filter='qemu-announce-self'/>
</filter>

[root@HOST1 ~]# virsh nwfilter-dumpxml no-ip-spoofing
</rule>
⑪ <ip srcipaddr='$IP'/>
</rule>
⑫ <rule action='drop' direction='out' priority='1000'/>
</filter>

Lines ⑩, ⑪, and ⑫ show that Host 1 restricts the source address of packets sent by Director1 to x.y.z.200. Any packet leaving Director1 with a different source IP is dropped.

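Note:

In an LVS DR deployment, the Director rewrites the source and destination MAC addresses of the Ethernet frames coming from the client but keeps the source and destination IP addresses unchanged. The packets the Director sends out therefore carry a source IP that is not an address configured on this virtual machine. To the host this looks like IP spoofing, so the restriction on Host 1 filters them out.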
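The point of the note can be made concrete with a toy model of DR forwarding. This is a sketch under stated assumptions, not LVS code: the `Frame` type and `dr_forward` function are hypothetical, and the addresses are the ones from the table in this article.

```python
from dataclasses import dataclass, replace

DIRECTOR_IP = "x.y.z.200"
DIRECTOR_MAC = "02:00:73:b6:53:c8"
RS_MAC = "02:00:73:b6:53:e2"


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def dr_forward(frame, director_mac, rs_mac):
    """Model DR forwarding: rewrite only the Ethernet addresses.

    The IP header is passed through as received, so the source IP stays the
    client's address and the destination IP stays the VIP.
    """
    return replace(frame, src_mac=director_mac, dst_mac=rs_mac)


# The client's SYN as it arrives at the Director (destination IP is the VIP).
incoming = Frame(
    src_mac="84:78:ac:27:6c:41",
    dst_mac=DIRECTOR_MAC,
    src_ip="192.243.119.145",
    dst_ip="x.y.z.208",
)
forwarded = dr_forward(incoming, DIRECTOR_MAC, RS_MAC)

# True when the frame leaving the VM has a source IP other than the VM's own,
# which is exactly what an anti-spoofing filter on the host objects to.
looks_spoofed = forwarded.src_ip != DIRECTOR_IP
```

In this model `looks_spoofed` is true for every forwarded client packet, so a source-IP restriction on the Director's interface blocks all DR traffic rather than only some of it.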

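III. Solution

From the analysis above, to run an LVS cluster in DR mode on KVM we must disable the 'no-ip-spoofing' filter for the Director VM.

The procedure is:

1) Run:

virsh edit r2-10105

2) Delete the following block, which references clean-traffic and through it no-ip-spoofing. This also removes the other clean-traffic protections, such as no-mac-spoofing and no-arp-spoofing, for this VM:

<filterref filter='clean-traffic'>
<parameter name='IP' value='x.y.z.200'/>
</filterref>

3) Restart the virtual machine.

4) Verify with a packet capture, as follows.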

[root@HOST1 ~]# tcpdump -i br0 -s 0 -vvv -nnn -e host 192.243.119.145
tcpdump: listening on br0, link-type EN10MB (Ethernet), capture size 65535 bytes
⑬ 16:39:37.842828 84:78:ac:27:6c:41 > 02:00:73:b6:53:c8, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 59437, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.48377 > x.y.z.208.80: Flags [S], cksum 0x8721 (correct), seq 2886813932, win 14600, options [mss 1460,sackOK,TS val 885170369 ecr 0,nop,wscale 7], length 0
⑭ 16:39:37.842990 02:00:73:b6:53:c8 > 02:00:73:b6:53:e2, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 48, id 59437, offset 0, flags [DF], proto TCP (6), length 60)
    192.243.119.145.48377 > x.y.z.208.80: Flags [S], cksum 0x8721 (correct), seq 2886813932, win 14600, options [mss 1460,sackOK,TS val 885170369 ecr 0,nop,wscale 7], length 0
--- the remaining normal traffic is omitted

Frames ⑬ and ⑭ show that Host 1 now forwards the Ethernet frame rewritten by the Director1 VM, so the problem is solved.
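Before changing a live domain with `virsh edit`, the XML change from step 2 can be rehearsed offline on a saved copy of the interface definition. The following is a sketch using only Python's standard library. `strip_filterrefs` is a hypothetical helper, the sample XML contains only the elements shown in this article, and the sketch does not call virsh or libvirt.

```python
import xml.etree.ElementTree as ET

# Only the elements quoted in this article; a real definition has more.
INTERFACE_XML = """\
<interface type='bridge'>
  <model type='virtio'/>
  <filterref filter='clean-traffic'>
    <parameter name='IP' value='x.y.z.200'/>
  </filterref>
  <alias name='net0'/>
</interface>
"""


def strip_filterrefs(xml_text):
    """Return the XML with every <filterref> element (and its children) removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so that removing children does not
    # disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "filterref":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


CLEANED_XML = strip_filterrefs(INTERFACE_XML)
```

Comparing `CLEANED_XML` with the original makes it easy to confirm that only the filter reference disappears while the model and alias elements stay in place.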

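IV. A closer look at the mechanism

This case gives us a better understanding of KVM's network filters. How do these filters actually take effect in the system?

libvirt enforces them by installing iptables and ebtables rules on the host, which lets the host restrict the network traffic of its virtual machines. iptables filters layer-4 traffic by TCP and UDP port, while ebtables filters on layer-2 MAC addresses and layer-3 IP addresses. The following rules on Host 1 pin the virtual machine's source MAC address and source IP address: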

[root@HOST1 ~]# ebtables -t nat --list
Bridge table: nat

Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
-i vnet0 -j libvirt-I-vnet0 # traffic entering the host through vnet0 goes to the libvirt-I-vnet0 chain

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
-o vnet0 -j libvirt-O-vnet0 # traffic leaving the host through vnet0 goes to the libvirt-O-vnet0 chain

Bridge chain: libvirt-I-vnet0, entries: 9, policy: ACCEPT
-j I-vnet0-mac
-p IPv4 -j I-vnet0-ipv4-ip
-p IPv4 -j ACCEPT
-p ARP -j I-vnet0-arp-mac
-p ARP -j I-vnet0-arp-ip
-p ARP -j ACCEPT
-p 0x8035 -j I-vnet0-rarp
-p 0x835 -j ACCEPT
-j DROP

Bridge chain: libvirt-O-vnet0, entries: 4, policy: ACCEPT
-p IPv4 -j O-vnet0-ipv4
-p ARP -j ACCEPT
-p 0x8035 -j O-vnet0-rarp
-j DROP

Bridge chain: I-vnet0-mac, entries: 1, policy: ACCEPT
-s 2:0:73:b6:53:c8 -j RETURN # pins the source MAC address
-j DROP

Bridge chain: I-vnet0-ipv4-ip, entries: 3, policy: ACCEPT
-p IPv4 --ip-src 0.0.0.0 --ip-proto udp -j RETURN
-p IPv4 --ip-src x.y.z.200 -j RETURN # pins the source IP address
-j DROP

Bridge chain: O-vnet0-ipv4, entries: 1, policy: ACCEPT
-j ACCEPT
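To see how these chains treat a DR-forwarded packet, the IPv4 path through libvirt-I-vnet0 can be walked by hand. The sketch below models only that path (I-vnet0-mac, then I-vnet0-ipv4-ip, then ACCEPT). ARP and RARP handling is left out, and the function names are hypothetical rather than anything provided by ebtables or libvirt.

```python
ALLOWED_MAC = "2:0:73:b6:53:c8"  # as printed by ebtables, without leading zeros
ALLOWED_IP = "x.y.z.200"


def normalize_mac(mac):
    """Pad each octet to two hex digits so 2:0:73:... equals 02:00:73:..."""
    return ":".join(f"{int(part, 16):02x}" for part in mac.split(":"))


def ingress_verdict(src_mac, src_ip, ip_proto, allowed_mac, allowed_ip):
    """Verdict for an IPv4 frame sent by the VM (entering the bridge via vnet0)."""
    # Chain I-vnet0-mac: RETURN for the pinned source MAC, otherwise DROP.
    if normalize_mac(src_mac) != normalize_mac(allowed_mac):
        return "DROP (I-vnet0-mac)"
    # Chain I-vnet0-ipv4-ip: RETURN for UDP from 0.0.0.0 or for the pinned
    # source IP, otherwise DROP.
    if (src_ip == "0.0.0.0" and ip_proto == "udp") or src_ip == allowed_ip:
        # Back in libvirt-I-vnet0: the next rule is "-p IPv4 -j ACCEPT".
        return "ACCEPT"
    return "DROP (I-vnet0-ipv4-ip)"


# Traffic originated by Director1 itself.
own_traffic = ingress_verdict(
    "02:00:73:b6:53:c8", "x.y.z.200", "tcp", ALLOWED_MAC, ALLOWED_IP
)
# A client SYN forwarded in DR mode: Director1's MAC, but the client's IP.
dr_traffic = ingress_verdict(
    "02:00:73:b6:53:c8", "192.243.119.145", "tcp", ALLOWED_MAC, ALLOWED_IP
)
```

In this model the forwarded SYN passes the MAC check, because Director1 rewrote the source MAC to its own, and is then dropped by the source IP check, which agrees with what the captures on br0 showed.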

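V. Summary

When deploying an LVS cluster in a KVM environment, pay special attention to how the host's iptables and ebtables rules affect the virtual machines. In this setup the virtual machine is not proxying traffic through a user-space application. Instead it relies on "unconventional" techniques such as network address translation (NAT mode) or MAC address rewriting (DR mode).

This case also shows that when we face an unfamiliar problem that may involve the network, packet analysis can provide the information needed to locate and fix it.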


Original article: http://www.cnblogs.com/276815076/p/5887959.html
