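Kubernetes connects Pods running on different Nodes of a cluster, and by default every Pod can reach every other Pod. In some scenarios, however, certain Pods should not be able to talk to each other, and access control is needed. How can that be achieved?
Introduction
Kubernetes provides the NetworkPolicy feature, which supports network access control at the Namespace level and at the Pod level. It uses labels to select namespaces or pods, and with Calico the rules are ultimately enforced through iptables. This article briefly describes how Kubernetes NetworkPolicy works on Calico.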
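Control-plane data flow
A NetworkPolicy is a Kubernetes resource. It takes effect only after it has been defined, stored, and configured on the nodes. The flow in brief:
- A NetworkPolicy resource is created through the kubectl client;
- Calico's policy-controller watches NetworkPolicy resources and writes each one it receives into Calico's etcd datastore;
- On each node, calico-felix reads the policy from etcd and calls iptables to configure the corresponding rules.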
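Resource templates
NetworkPolicy supports access control at the Pod level and at the Namespace level. The templates below can be used as a reference when defining the resource.
Access by pod label
We want to control access to all pods in namespace myns that carry the label "role: backend": only traffic from Pods labeled "role: frontend" to TCP port 6379 is allowed in, and all other traffic is denied.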
kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: allow-frontend
  namespace: myns
spec:
  podSelector:
    matchLabels:
      role: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379
Access by namespace label
We want to control access to all Pods labeled "role: frontend": only traffic from Pods in namespaces labeled "user: bob" to TCP port 443 is allowed in, and all other traffic is denied.
kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: allow-tcp-443
spec:
  podSelector:
    matchLabels:
      role: frontend
  ingress:
  - ports:
    - protocol: TCP
      port: 443
    from:
    - namespaceSelector:
        matchLabels:
          user: bob
NetworkPolicy data structure definition
The examples above give a first impression of the NetworkPolicy resource object. Here is how Kubernetes defines the API type:
type NetworkPolicy struct {
    TypeMeta
    ObjectMeta
    Spec NetworkPolicySpec
}
type NetworkPolicySpec struct {
    PodSelector unversioned.LabelSelector  `json:"podSelector"`
    Ingress     []NetworkPolicyIngressRule `json:"ingress,omitempty"`
}
type NetworkPolicyIngressRule struct {
    Ports *[]NetworkPolicyPort `json:"ports,omitempty"`
    From  *[]NetworkPolicyPeer `json:"from,omitempty"`
}
type NetworkPolicyPort struct {
    Protocol *api.Protocol       `json:"protocol,omitempty"`
    Port     *intstr.IntOrString `json:"port,omitempty"`
}
type NetworkPolicyPeer struct {
    PodSelector       *unversioned.LabelSelector `json:"podSelector,omitempty"`
    NamespaceSelector *unversioned.LabelSelector `json:"namespaceSelector,omitempty"`
}
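In short, the resource designates two groups of Pods: the Pods whose access is being controlled, selected by spec.podSelector, and the Pods that are admitted, selected by the selectors under ingress.from.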
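To make the two selector roles concrete, here is a minimal Python sketch that evaluates an ingress connection against one policy. It is not Kubernetes or Calico code: the class and function names (`Policy`, `Peer`, `ingress_allowed`, and so on) are hypothetical, only `matchLabels` is modeled, and an empty `ports` or `from_` list is treated as "match anything", which is a simplification of the real API semantics.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Labels = Dict[str, str]


def matches(selector: Optional[Labels], labels: Labels) -> bool:
    """matchLabels semantics: every selector key/value must be present."""
    return all(labels.get(k) == v for k, v in (selector or {}).items())


@dataclass
class Peer:  # hypothetical stand-in for NetworkPolicyPeer
    pod_selector: Optional[Labels] = None
    namespace_selector: Optional[Labels] = None


@dataclass
class Port:  # hypothetical stand-in for NetworkPolicyPort
    protocol: str = "TCP"
    port: Optional[int] = None


@dataclass
class IngressRule:
    ports: List[Port] = field(default_factory=list)  # empty: any port (assumption)
    from_: List[Peer] = field(default_factory=list)  # empty: any source (assumption)


@dataclass
class Policy:
    namespace: str
    pod_selector: Labels
    ingress: List[IngressRule] = field(default_factory=list)


@dataclass
class Pod:
    namespace: str
    labels: Labels
    namespace_labels: Labels = field(default_factory=dict)


def peer_matches(policy: Policy, peer: Peer, src: Pod) -> bool:
    if peer.namespace_selector is not None:
        # A namespaceSelector admits every pod in a matching namespace.
        return matches(peer.namespace_selector, src.namespace_labels)
    # A podSelector peer only looks at pods in the policy's own namespace.
    return src.namespace == policy.namespace and matches(peer.pod_selector, src.labels)


def ingress_allowed(policy: Policy, src: Pod, dst: Pod,
                    protocol: str, port: int) -> Optional[bool]:
    """True/False if the policy selects dst, None if it does not apply."""
    if dst.namespace != policy.namespace or not matches(policy.pod_selector, dst.labels):
        return None
    for rule in policy.ingress:
        port_ok = not rule.ports or any(
            p.protocol == protocol and p.port in (None, port) for p in rule.ports)
        peer_ok = not rule.from_ or any(
            peer_matches(policy, peer, src) for peer in rule.from_)
        if port_ok and peer_ok:
            return True
    return False  # selected by the policy, but no rule admitted the traffic


# The allow-frontend template from the article, expressed with these classes.
allow_frontend = Policy(
    namespace="myns",
    pod_selector={"role": "backend"},
    ingress=[IngressRule(ports=[Port("TCP", 6379)],
                         from_=[Peer(pod_selector={"role": "frontend"})])],
)
backend = Pod("myns", {"role": "backend"})
frontend = Pod("myns", {"role": "frontend"})
```

With these objects, `ingress_allowed(allow_frontend, frontend, backend, "TCP", 6379)` admits the connection, while the same call with port 80, or with a source pod that lacks the frontend label, does not.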
Next, let's look at the implementation details of Network Policy on Kubernetes with Calico.
Test versions
The component versions used in the test:
- kubernetes:
  - master: v1.9.0
  - node: v1.9.0
- calico:
  - v2.5.0
- calico-policy-controller:
  - quay.io/calico/kube-policy-controller:v0.7.0
Runtime configuration
- On the Calico side, the resources created in addition to the basic configuration:
  - service-account: calico-policy-controller
  - rbac:
    - ServiceRole: calico-policy-controller
    - ServiceRoleBinding: calico-policy-controller
  - deployment: calico-policy-controller
- On the Kubernetes side, a new NetworkPolicy resource.
Runtime state
On top of the already working Kubernetes cluster we added the calico-policy-controller container, which mainly runs the controller process:
- calico-policy-controller:
Processes:
/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 /pause
    7 root       0:00 /dist/controller
   13 root       0:12 /dist/controller
Ports:
/ # netstat -apn | grep contr
tcp        0      0 10.138.102.219:45488    10.138.76.26:2379       ESTABLISHED 13/controller
tcp        0      0 10.138.102.219:44538    101.199.110.26:6443     ESTABLISHED 13/controller
As shown, the controller process is running and holds two established connections: port 6443 is the Kubernetes api-server, and port 2379 is Calico's etcd.
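Calico-felix policy configuration
Packet path
The diagram below shows Calico's packet-processing flow (taken from an external source). On each Node, calico-felix pulls the policy information from etcd and implements it with iptables. The most important chain is cali-pi-[POLICY]@filter.
Mark bits used when processing packets for Network Policy:
0x2000000: records whether policy evaluation has passed the packet; 1 means it has
Legend:
from-XXX: packets sent by XXX;
tw: short for "to workload endpoint";
to-XXX: packets sent to XXX;
po: short for "policy outbound";
cali-: prefix of Calico's rule chains;
pi: short for "policy inbound";
wl: short for "workload endpoint";
pro: short for "profile outbound";
fw: short for "from workload endpoint";
pri: short for "profile inbound".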
    (receive pkt)
 cali-PREROUTING@raw -> cali-from-host-endpoint@raw -> cali-PREROUTING@nat
          |                                 ^                   |
          |       (-i cali+)                |                   |
          +--- (from workload endpoint) ----+                   |
                                                                |
            (dest may be container's floating ip)       cali-fip-dnat@nat
                                                                |
                                                        (router decision)
                                                                |
                                    +--------------------------------------------+
                                    |                                            |
                            cali-INPUT@filter                           cali-FORWARD@filter
                        (-i cali+)  |                                (-i cali+)  |  (-o cali+)
                      +----------------------------+                +------------+-------------+
                      |                            |                |            |             |
               cali-wl-to-host          cali-from-host-endpoint     | cali-from-host-endpoint  |
                   @filter                      @filter             |         @filter          |
                      |                         < END >             |            |             |
                      |                                             |  cali-to-host-endpoint   |
                      |                                             |         @filter          |
                      |   will return to nat's                      |         < END >          |
                      |   cali-POSTROUTING                          |                          |
        cali-from-wl-dispatch@filter <------------------------------+             cali-to-wl-dispatch@filter
                      |            \--------------+                                            |
          +-----------------------+               |                                +----------------------+
          |                       |               |                                |                      |
 cali-fw-cali0ef24b1     cali-fw-cali0ef24b2      |                       cali-tw-cali03f24b1    cali-tw-cali03f24b2
       @filter                 @filter            |                             @filter                @filter
  (-i cali0ef24b1)        (-i cali0ef24b2)        |                        (-o cali0ef24b1)       (-o cali0ef24b2)
          |                       |               |                                |                      |
          +-----------------------+               |                                +----------------------+
                      |                           |                                            |
           cali-po-[POLICY]@filter                |                                 cali-pi-[POLICY]@filter
                      |                           |                                            |
          cali-pro-[PROFILE]@filter               |                                cali-pri-[PROFILE]@filter
                      |                           |                                            |
                   < END >                        +--------------------------------> cali-POSTROUTING@nat
                                                            +---------->/                      |
                                                            |                          cali-fip-snat@nat
                                                            |                                  |
                                                            |                        cali-nat-outgoing@nat
                                                            |                                  |
                                                            |                  (if dip is local: send to lookup)
                                                  +---------+--------+         (else: send to nic's qdisc)
                                                  |                  |                      < END >
                                    cali-to-host-endpoint@filter     |
                                                  |                  |
                                                  +------------------+
                                                            ^ (-o cali+)
                                                            |
                                                   cali-OUTPUT@filter
                                                            ^
                (send pkt)                                  |
            (router decision) -> cali-OUTPUT@nat -> cali-fip-dnat@nat
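The chain names in the diagram follow the abbreviations listed in the legend, so they can be decoded mechanically. The helper below is a small illustrative sketch: `describe_chain` and its output format are made up for this article, and it only covers the six per-endpoint, per-policy, and per-profile prefixes named in the legend.

```python
import re

# Abbreviations copied from the legend; anything else is left undecoded.
DIRECTION = {
    "fw": "from workload endpoint",
    "tw": "to workload endpoint",
    "po": "policy outbound",
    "pi": "policy inbound",
    "pro": "profile outbound",
    "pri": "profile inbound",
}

CHAIN_RE = re.compile(r"^cali-(fw|tw|po|pi|pro|pri)-(.+)$")


def describe_chain(name):
    """Return '<meaning>: <interface, policy or profile>' or None if unknown."""
    m = CHAIN_RE.match(name)
    if m is None:
        return None
    kind, target = m.groups()
    return f"{DIRECTION[kind]}: {target}"


if __name__ == "__main__":
    for chain in ("cali-tw-cali1f79f9e08f2",
                  "cali-pi-default.web-deny-all",
                  "cali-pri-k8s-pod-network",
                  "cali-FORWARD"):
        print(chain, "->", describe_chain(chain))
```

Dispatch and top-level chains such as cali-FORWARD or cali-to-wl-dispatch do not carry one of these prefixes, so the helper returns None for them.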
Next, we access a Pod that is protected by a "deny all traffic" policy and observe how iptables handles the packets:
Before traffic arrives
[root@host31 ~]# iptables -nxvL cali-tw-cali1f79f9e08f2 -t filter
Chain cali-tw-cali1f79f9e08f2 (1 references)
pkts bytes target prot opt in out source destination
0 0 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:fthBuDq5I1oklYOL */ /* Start of policies */ MARK and 0xfdffffff
0 0 cali-pi-default.web-deny-all all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:Kp-Liqb4hWavW9dD */ mark match 0x0/0x2000000
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:Qe6UBTrru3RfK2MB */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000
After traffic arrives
[root@host31 ~]# iptables -nxvL cali-tw-cali1f79f9e08f2 -t filter
Chain cali-tw-cali1f79f9e08f2 (1 references)
pkts bytes target prot opt in out source destination
3 180 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:fthBuDq5I1oklYOL */ /* Start of policies */ MARK and 0xfdffffff
3 180 cali-pi-default.web-deny-all all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:Kp-Liqb4hWavW9dD */ mark match 0x0/0x2000000
3 180 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:Qe6UBTrru3RfK2MB */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000
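As shown, the pkts counter of the DROP rule went from 0 to 3. The packets went through the MARK rule and the cali-pi-default.web-deny-all chain. Because that policy chain passed none of them, the 0x2000000 bit stayed clear, so they still matched 0x0/0x2000000 at the DROP rule and were discarded.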
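Comparing the two listings by eye works for three rules but gets tedious on longer chains. The sketch below shows one way to diff the packet counters of two `iptables -nxvL` captures in Python. The function names are hypothetical, the sample text is a trimmed copy of the two listings above (rule comments removed), and the code assumes both captures list the same rules in the same order.

```python
def parse_counters(listing):
    """Return [(target, pkts)] for every rule line of `iptables -nxvL` output."""
    rules = []
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines start with two integers (pkts, bytes); with -x they are exact.
        if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
            rules.append((fields[2], int(fields[0])))
    return rules


def counter_deltas(before, after):
    """Pair up rules by position and report how many packets each one gained."""
    return [(target, new - old)
            for (target, old), (_, new)
            in zip(parse_counters(before), parse_counters(after))]


BEFORE = """\
Chain cali-tw-cali1f79f9e08f2 (1 references)
pkts bytes target prot opt in out source destination
0 0 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 MARK and 0xfdffffff
0 0 cali-pi-default.web-deny-all all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x0/0x2000000
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x0/0x2000000
"""

AFTER = """\
Chain cali-tw-cali1f79f9e08f2 (1 references)
pkts bytes target prot opt in out source destination
3 180 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 MARK and 0xfdffffff
3 180 cali-pi-default.web-deny-all all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x0/0x2000000
3 180 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x0/0x2000000
"""

if __name__ == "__main__":
    for target, delta in counter_deltas(BEFORE, AFTER):
        print(f"{target}: +{delta} packets")
```

A rule whose counter rises together with the final DROP rule, as all three do here, lies on the path the dropped packets took.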
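Walkthrough of a test case
The following test case, "deny all inbound traffic", is used to walk through the whole flow.
Model
- DENY all traffic to an application
Check the labels of the web application
A service named web was created in the default namespace. Its IP and labels are as follows: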
[root@host02 /home/test]# kubectl get service --all-namespaces | grep web
default web ClusterIP 192.168.82.141 <none> 80/TCP 1d
[root@host02 /home/test/]# kubectl get pod --all-namespaces -o wide --show-labels | grep web
default web-667bdcb4d8-cpvbb 1/1 Running 0 1d 10.139.54.158 host30.add.bjdt.qihoo.net app=web,pod-template-hash=2236876084
Configure the policy
First, view the Kubernetes resource with kubectl:
[root@host02 /home/test]# kubectl get networkpolicy web-deny-all -o yaml
- apiVersion: extensions/v1beta1
  kind: NetworkPolicy
  metadata:
    name: web-deny-all
    namespace: default
  spec:
    podSelector:
      matchLabels:
        app: web
    policyTypes:
    - Ingress
Next, view the Calico resource with calicoctl and etcdctl:
[root@host02 /home/test]# calicoctl get policy default.web-deny-all -o yaml
- apiVersion: v1
  kind: policy
  metadata:
    name: default.web-deny-all
  spec:
    egress:
    - action: allow
      destination: {}
      source: {}
    order: 1000
    selector: calico/k8s_ns == 'default' && app == 'web'
[root@host02 /home/test]# /home/test/etcdctl-wrapper-v2.sh get /calico/v1/policy/tier/default/policy/default.web-deny-all
{"outbound_rules": [{"action": "allow"}], "order": 1000, "inbound_rules": [], "selector": "calico/k8s_ns == 'default' && app == 'web'"}
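The two outputs show how the Kubernetes policy maps onto the Calico one: the Calico name is the namespace and the policy name joined by a dot, and the selector combines a calico/k8s_ns term with the pod labels. The Python sketch below reproduces that mapping from the observed output only. It is not the policy-controller's actual code; the function names are hypothetical, and sorting the label terms is an assumption made to keep the result deterministic.

```python
import json


def calico_policy_name(namespace, name):
    """'default' + 'web-deny-all' -> 'default.web-deny-all' (as seen in calicoctl)."""
    return f"{namespace}.{name}"


def calico_selector(namespace, match_labels):
    """Build the selector string seen in the calicoctl and etcdctl output."""
    terms = [f"calico/k8s_ns == '{namespace}'"]
    # Label order is an assumption; the article only shows a single label.
    terms += [f"{key} == '{value}'" for key, value in sorted(match_labels.items())]
    return " && ".join(terms)


def deny_all_ingress_value(namespace, match_labels):
    """Datastore value shaped like the etcdctl output for a deny-all-ingress policy."""
    return {
        "outbound_rules": [{"action": "allow"}],
        "order": 1000,
        "inbound_rules": [],  # no inbound rule: nothing is admitted
        "selector": calico_selector(namespace, match_labels),
    }


if __name__ == "__main__":
    print(calico_policy_name("default", "web-deny-all"))
    print(json.dumps(deny_all_ingress_value("default", {"app": "web"})))
```

The empty inbound_rules list is what makes this policy a deny-all for ingress: the policy selects the web pods but contains no rule that admits anything.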
Felix logs for Network Policy configuration
Adding and deleting a Policy
2018-02-11 11:13:22.029 [INFO][257] label_inheritance_index.go 203: Updating selector selID=Policy(name=default.api-allow)
2018-02-11 09:39:35.642 [INFO][257] label_inheritance_index.go 209: Deleting selector Policy(name=default.api-allow)
Check the iptables rules on the node
[root@host30 ~]# iptables -nxvL cali-tw-cali96bc57f337a
Chain cali-tw-cali96bc57f337a (1 references)
pkts bytes target prot opt in out source destination
0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:oSVcrqJ8U46FxQEJ */ ctstate RELATED,ESTABLISHED
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:nudTdCphcvic4flm */ ctstate INVALID
2 120 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:QWGVPDFBXrYgBHjv */ MARK and 0xfeffffff
2 120 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:fnpcHeCllWo_kg1u */ /* Start of policies */ MARK and 0xfdffffff
2 120 cali-pi-default.web-deny-all all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:ibEcyP2JurQBR2JS */ mark match 0x0/0x2000000
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:dIb1kwxUZz8DgRje */ /* Return if policy accepted */ mark match 0x1000000/0x1000000
2 120 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:1O4PxUpswz0ZqJnr */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000
0 0 cali-pri-k8s-pod-network all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:rb9GDlntQSXL3Sen */
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:s2lDMKnLGp_JSpKk */ /* Return if profile accepted */ mark match 0x1000000/0x1000000
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:q8OkJmM7E9TcFsQr */ /* Drop if no profiles matched */
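The rule order in this chain can be read as a small decision procedure over two mark bits. The Python sketch below models it under stated assumptions: the bit names `ACCEPT_BIT` and `PASS_BIT` are labels chosen here from the rule comments, the policy and profile chains are represented as hypothetical callables that return the updated mark, and how a real Calico policy chain sets those bits is not shown in the listing.

```python
ACCEPT_BIT = 0x1000000  # tested by "Return if policy/profile accepted"
PASS_BIT = 0x2000000    # tested by "Drop if no policies passed packet"


def mark_match(mark, value, mask):
    """Semantics of iptables `mark match value/mask`."""
    return (mark & mask) == value


def run_tw_chain(mark, ctstate, policy_chain, profile_chain):
    """Walk the cali-tw-* rules in listing order and return the verdict."""
    if ctstate in ("RELATED", "ESTABLISHED"):
        return "ACCEPT"
    if ctstate == "INVALID":
        return "DROP"
    mark &= 0xfeffffff  # MARK and 0xfeffffff clears the 0x1000000 bit
    mark &= 0xfdffffff  # MARK and 0xfdffffff clears the 0x2000000 bit
    if mark_match(mark, 0x0, PASS_BIT):
        mark = policy_chain(mark)       # jump to cali-pi-<policy>
    if mark_match(mark, ACCEPT_BIT, ACCEPT_BIT):
        return "RETURN"                 # policy accepted
    if mark_match(mark, 0x0, PASS_BIT):
        return "DROP"                   # no policy passed the packet
    mark = profile_chain(mark)          # jump to cali-pri-<profile>
    if mark_match(mark, ACCEPT_BIT, ACCEPT_BIT):
        return "RETURN"                 # profile accepted
    return "DROP"                       # no profile matched


# Hypothetical chain behaviors used only for illustration.
def deny_all(mark):
    return mark                 # sets no bit, like web-deny-all


def accept(mark):
    return mark | ACCEPT_BIT


def pass_on(mark):
    return mark | PASS_BIT


if __name__ == "__main__":
    print(run_tw_chain(0xffffffff, "NEW", deny_all, accept))  # DROP
```

For a new connection to the web pod, the deny-all policy leaves both bits clear, so the walk ends at the "Drop if no policies passed packet" rule, which is the rule whose counter shows 2 packets in the listing.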
Access the service from another pod
[root@host02 /home/test]# kubectl run --rm -i -t --image=alpine test-$RANDOM -- sh
If you don't see a command prompt, try pressing enter.
/ # wget -qO- --timeout=3 http://192.168.82.141:80
wget: download timed out
/ #
As shown, access to port 80 of the service fails. Next, try pinging the corresponding Pod:
[root@web-test-74b4dbb994-5zcvq /]# ping 10.139.54.158
PING 10.139.54.158 (10.139.54.158) 56(84) bytes of data.
^C
--- 10.139.54.158 ping statistics ---
45 packets transmitted, 0 received, 100% packet loss, time 44000ms
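Pinging the Pod fails as well, which is the expected result of "deny all inbound traffic".
Summary
Kubernetes NetworkPolicy provides access control and addresses part of the network security problem. At the time of writing, however, support in Kubernetes and Calico is still incomplete, and some features (egress, for example) are still in progress. In addition, Calico configures a large number of iptables rules on every Node, and as more dimensions of control are added, operations and troubleshooting become harder. Users who need network access control should weigh these factors before deciding whether to adopt it.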