Fixing DNS query timeouts caused by a full conntrack table
Besides the conntrack-table-full problem, this also fixes the packet loss caused by conntrack races when a pod fires its IPv4 (A) and IPv6 (AAAA) queries in parallel and both land on the same CoreDNS instance.
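Before rolling this out, it is worth confirming the symptom on an affected node. A rough check, assuming conntrack-tools is installed on the node (a climbing insert_failed counter alongside DNS timeouts is the usual signature of the parallel A/AAAA race):

```bash
# How full is the conntrack table on this node?
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

# insert_failed > 0 usually means two UDP packets (e.g. the parallel A and
# AAAA queries) raced for the same conntrack entry and one was dropped.
conntrack -S | grep -o 'insert_failed=[0-9]*' | sort | uniq -c
```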
Configuring node-local-dns with kube-proxy in IPVS mode

Our kube-proxy runs in IPVS mode.
Save the following file: https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
Its contents are as follows:
It is recommended to copy registry.k8s.io/dns/k8s-dns-node-cache:1.23.0 into your own registry and point the manifest at that copy (a sketch of the retag/push step follows the manifest below).
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNSUpstream"
spec:
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  Corefile: |
    __PILLAR__DNS__DOMAIN__:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind __PILLAR__LOCAL__DNS__ __PILLAR__DNS__SERVER__
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
        health __PILLAR__LOCAL__DNS__:8080
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind __PILLAR__LOCAL__DNS__ __PILLAR__DNS__SERVER__
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind __PILLAR__LOCAL__DNS__ __PILLAR__DNS__SERVER__
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind __PILLAR__LOCAL__DNS__ __PILLAR__DNS__SERVER__
        forward . __PILLAR__UPSTREAM__SERVERS__
        prometheus :9253
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
      annotations:
        prometheus.io/port: "9253"
        prometheus.io/scrape: "true"
    spec:
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: true
      dnsPolicy: Default  # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: node-cache
        image: registry.k8s.io/dns/k8s-dns-node-cache:1.23.0
        resources:
          requests:
            cpu: 25m
            memory: 5Mi
        args: [ "-localip", "__PILLAR__LOCAL__DNS__,__PILLAR__DNS__SERVER__", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream" ]
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9253
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            host: __PILLAR__LOCAL__DNS__
            path: /health
            port: 8080
          initialDelaySeconds: 60
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - name: config-volume
          mountPath: /etc/coredns
        - name: kube-dns-config
          mountPath: /etc/kube-dns
      volumes:
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: config-volume
        configMap:
          name: node-local-dns
          items:
          - key: Corefile
            path: Corefile.base
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9253"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-local-dns
  name: node-local-dns
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: metrics
    port: 9253
    targetPort: 9253
  selector:
    k8s-app: node-local-dns
```
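If you do mirror the image as suggested above, the retag-and-push can look like the sketch below; registry.example.com/dns is a placeholder for your own repository:

```bash
# Pull the upstream image, retag it for your own registry, and push it.
docker pull registry.k8s.io/dns/k8s-dns-node-cache:1.23.0
docker tag  registry.k8s.io/dns/k8s-dns-node-cache:1.23.0 registry.example.com/dns/k8s-dns-node-cache:1.23.0
docker push registry.example.com/dns/k8s-dns-node-cache:1.23.0

# Point the DaemonSet at the mirrored image in nodelocaldns.yaml.
sed -i 's#registry.k8s.io/dns/k8s-dns-node-cache#registry.example.com/dns/k8s-dns-node-cache#' nodelocaldns.yaml
```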
Installation:
Hard-code 169.254.20.10 as the local listen address; it is a made-up link-local IP, and the same value can be used in every environment, production and mirror clusters alike.
```bash
kubedns=`kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}`
domain="cluster.local"
localdns="169.254.20.10"
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/,__PILLAR__DNS__SERVER__//g; s/__PILLAR__CLUSTER__DNS__/$kubedns/g" nodelocaldns.yaml
kubectl apply -f nodelocaldns.yaml
```
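After applying, a quick sanity check on any node (this assumes the cache's default dummy-interface name, nodelocaldns):

```bash
# Every node should report a ready node-local-dns pod.
kubectl -n kube-system rollout status ds/node-local-dns

# The cache binds 169.254.20.10 on a dummy interface on each node.
ip addr show nodelocaldns

# It also installs NOTRACK rules so DNS traffic to the local cache
# bypasses conntrack entirely.
iptables -t raw -S | grep 169.254.20.10
```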
Verification:
Exec into a pod, change the nameserver in /etc/resolv.conf to 169.254.20.10, then run nslookup to confirm that both cluster services and external domains resolve correctly.
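The same check can be done without touching resolv.conf by passing the cache address to nslookup directly, for example from a throwaway busybox pod (the image tag is only an example):

```bash
# Query node-local-dns explicitly for a cluster service and an external name.
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -- sh -c '
  nslookup kubernetes.default.svc.cluster.local 169.254.20.10 &&
  nslookup github.com 169.254.20.10
'
```

Both lookups should succeed from any pod, since 169.254.20.10 is answered by the cache running on the pod's own node.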
There are two main ways to make pods prefer node-local-dns for DNS resolution:
1. Cluster-wide: change the cluster DNS address that kubelet hands out to pods.

```bash
kubedns=`kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}`
sed -i "s/$kubedns/169.254.20.10/g" /var/lib/kubelet/config.yaml
systemctl restart kubelet
systemctl status -l kubelet
```
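To confirm approach 1 took effect, a quick check; note that only pods created after the kubelet restart pick up the new address, existing pods keep their old /etc/resolv.conf until recreated (busybox:1.36 is just an example image):

```bash
# The kubelet config should now list the node-local address under clusterDNS.
grep -A2 clusterDNS /var/lib/kubelet/config.yaml

# A newly created pod should get 169.254.20.10 as its nameserver.
kubectl run resolv-test --rm -it --image=busybox:1.36 --restart=Never -- cat /etc/resolv.conf
```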
2. Per pod: specify it in the pod's dnsConfig.

```yaml
dnsConfig:
  nameservers:
  - 169.254.20.10
  searches:
  - test.svc.cluster.local
  - svc.cluster.local
  - cluster.local
  - nykjsrv.cn
  options:
  - name: ndots
    value: '5'
```
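With the default dnsPolicy of ClusterFirst these nameservers are merged with the cluster default; to make a pod use only this list, set dnsPolicy: None. A minimal sketch of how the snippet fits into a pod spec (the pod name, image tag and search domains are illustrative):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: dnsconfig-test
spec:
  dnsPolicy: None          # use only the dnsConfig below
  dnsConfig:
    nameservers:
    - 169.254.20.10
    searches:
    - default.svc.cluster.local
    - svc.cluster.local
    - cluster.local
    options:
    - name: ndots
      value: '5'
  containers:
  - name: test
    image: busybox:1.36
    command: ["sleep", "3600"]
EOF
# The rendered resolv.conf should contain only 169.254.20.10.
kubectl exec dnsconfig-test -- cat /etc/resolv.conf
```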