Troubleshooting CoreDNS resolution failures

Date: 2020-06-27 11:47:34

1. Check whether the CoreDNS pods have started normally.

kubectl -n kube-system get po|grep core
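
The grep above lists the CoreDNS pods, but a pod that is listed is not necessarily ready. As a minimal sketch, the filter below reads `kubectl get po` output on stdin and prints only the pods that are not fully ready. The name `not_ready_pods` is invented for this example, and the default column order (NAME, READY, STATUS) is assumed.

```shell
# Hypothetical helper: reads `kubectl get po` output on stdin and prints
# every pod whose STATUS is not Running or whose READY count is incomplete.
# Assumes the default column order: NAME READY STATUS ...
not_ready_pods() {
  awk '$1 != "NAME" {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
       }'
}
```

It could be used as `kubectl -n kube-system get po | grep core | not_ready_pods`; empty output means every listed pod is Running and ready.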

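2. If they are not running normally and you have confirmed that the YAML configuration is correct, edit the CoreDNS Deployment YAML and add nodeName (the node's name as shown by kubectl get nodes) to try scheduling the pod onto a different node. This shows whether the problem is specific to one node. Alternatively, go straight to step 4 to identify the faulty node.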

       spec:
         nodeName: 1.1.1.1
         containers:
         - name: xxx
           image: xxx
           ports:
           - containerPort: 8080
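
The same nodeName field can also be set without editing the YAML by hand, using kubectl patch. This is only a sketch under assumptions: the Deployment is assumed to be named coredns, `node-2` is a placeholder node name, and `node_pin_patch` is a helper made up for this example.

```shell
# Hypothetical helper: print a patch document that pins the pod template
# to the given node by setting spec.template.spec.nodeName.
node_pin_patch() {
  printf '{"spec":{"template":{"spec":{"nodeName":"%s"}}}}\n' "$1"
}

# Example (assumes the Deployment is called "coredns"; node-2 is a placeholder):
#   kubectl -n kube-system patch deployment coredns -p "$(node_pin_patch node-2)"
```

Because nodeName bypasses the scheduler, it is best treated as a temporary diagnostic setting and removed once the faulty node has been identified.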

3. If the CoreDNS pod starts and runs successfully once pinned with nodeName, verify that CoreDNS is usable by resolving a Service name from any node. Example:

   # CoreDNS address: 10.96.0.10
   # Service name of any service: tiller-deploy
   nslookup tiller-deploy.kube-system.svc.cluster.local 10.96.0.10
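
To repeat this check for several names, or on several nodes, the lookup can be wrapped in a small loop. The sketch below assumes that nslookup exits with a non-zero status when a lookup fails; `check_names` and the `DNS_SERVER` variable are names chosen for this example.

```shell
# Address of the cluster DNS Service; 10.96.0.10 is the value used above.
DNS_SERVER="${DNS_SERVER:-10.96.0.10}"

# Hypothetical helper: resolve each name against DNS_SERVER and report
# OK or FAIL per name. Returns non-zero if any lookup failed.
check_names() {
  failed=0
  for name in "$@"; do
    if nslookup "$name" "$DNS_SERVER" >/dev/null 2>&1; then
      echo "OK   $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `check_names tiller-deploy.kube-system.svc.cluster.local kubernetes.default.svc.cluster.local` prints one line per name, which makes it easy to compare results between nodes.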

4. If resolution fails on a particular node, test connectivity from that node to the CoreDNS pod.

   # pod IP (example): 10.53.5.165
   ping 10.53.5.165

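5. If the connectivity test fails, check whether flannel is running normally on the problem node. If it is running, continue with the checks below.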

   # 1. Check the flannel container on the problem node
   docker ps | grep flannel
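   # 2. Check the state of the flannel interface
   ifconfig flannel.1
   # 3. Compare the routing table with a healthy node to confirm that all routes are present. Example:
   route -n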
   10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 flannel.1
   10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
   10.244.3.0      10.244.3.0      255.255.255.0   UG    0      0        0 flannel.1
   10.244.4.0      10.244.4.0      255.255.255.0   UG    0      0        0 flannel.1
   10.244.5.0      10.244.5.0      255.255.255.0   UG    0      0        0 flannel.1
   10.244.6.0      10.244.6.0      255.255.255.0   UG    0      0        0 flannel.1
   10.244.7.0      10.244.7.0      255.255.255.0   UG    0      0        0 flannel.1
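   # If any of the checks above show a problem, find out whether the NetworkManager service is running; it can break flannel
   # Check its status
   systemctl status NetworkManager
   # Stop and disable it
   systemctl stop NetworkManager && systemctl disable NetworkManager
   
   # If this service is the cause, the logs will contain entries like:
   device (flannel.1): state change: unmanaged -> unavailable (reason 'connection-assumed')
   
   # Also check the resolver addresses configured on the node
   cat /etc/resolv.conf
   
   # Delete the flannel container so that it is recreated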
   docker rm -f flannel
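
Comparing routing tables by eye is error-prone when there are many nodes. The sketch below assumes the `route -n` output of a healthy node and of the problem node has been saved to two files; both function names are invented for this example.

```shell
# Hypothetical helper: read `route -n` output on stdin and print the
# destination networks that are routed through flannel.1, sorted.
flannel_routes() {
  awk '$NF == "flannel.1" { print $1 }' | sort
}

# Hypothetical helper: given two files holding saved `route -n` output
# (healthy node first, problem node second), print the flannel.1
# destinations that exist on the healthy node only.
missing_flannel_routes() {
  good=$(mktemp) && bad=$(mktemp) || return 1
  flannel_routes < "$1" > "$good"
  flannel_routes < "$2" > "$bad"
  comm -23 "$good" "$bad"
  rm -f "$good" "$bad"
}
```

A call such as `missing_flannel_routes healthy.txt problem.txt` prints one destination per line. A node normally does not route its own pod subnet through flannel.1, so the problem node's own subnet is likely to appear in the output and can be ignored.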

6. If flannel is not running

   # 1. Check whether kubelet is running
   netstat -tnlp| grep kubelet
   
   # 2. If it is not running, check whether swap is enabled; swap is the usual cause
   # Turn swap off temporarily (does not survive a reboot)
   swapoff -a
   # Turn swap off permanently
   sed -ri 's/.*swap.*/#&/' /etc/fstab
   
   # Use free to confirm that swap is now off
   free -m
                 total        used        free      shared  buff/cache   available
   Swap:             0           0           0
   # Start kubelet, which brings flannel back up
   systemctl start kubelet
   # Once kubelet and flannel are both running, go back to step 5 if problems remain, and test that the service is usable
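
The blanket sed expression in step 6 comments out every line that contains the word swap, including lines that are already commented. As a more conservative sketch (assuming GNU sed, as the original command does), the helpers below check for active swap and comment out only live swap entries. Both function names are invented here.

```shell
# Hypothetical helper: succeed if no swap area is active.
# /proc/swaps holds one header line plus one line per active swap area;
# an alternative file can be passed as $1.
swap_is_off() {
  awk 'NR > 1 { n++ } END { exit (n > 0) }' "${1:-/proc/swaps}"
}

# Hypothetical helper: comment out swap entries in an fstab-style file.
# Lines that are already commented are skipped, and "swap" must appear
# as a whitespace-separated field.
comment_out_swap() {
  sed -ri '/^[[:space:]]*#/! s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}
```

A cautious way to use this is to run `comment_out_swap` on a copy of /etc/fstab first, inspect the result, and only then apply it to the real file. After `swapoff -a`, `swap_is_off && echo "swap disabled"` confirms the state without reading the free output by eye.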

Original article: https://www.cnblogs.com/Wshile/p/13197461.html
