How Ingress Nginx Works

May 7, 2021 21:00 · 2062 words · 5 minute read Network Nginx Kubernetes

Ingress is a Kubernetes API object that exposes services running inside the cluster to the outside world.

A minimal Ingress object (Kubernetes v1.19+):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80

The Ingress API definition changed in Kubernetes v1.19: among other things, the flat v1beta1 backend fields serviceName and servicePort became the nested service.name and service.port.number shown above.

Kubernetes version   Ingress API version
v1.5 - v1.17         extensions/v1beta1
v1.14 - v1.18        networking.k8s.io/v1beta1
v1.19+               networking.k8s.io/v1

To use Ingress at all, an Ingress Controller must first be deployed in the Kubernetes cluster:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.46.0/deploy/static/provider/cloud/deploy.yaml
$ kubectl get po -n ingress-nginx
NAME                                        READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create-6764l        0/1     Completed   0          13m
ingress-nginx-admission-patch-jlhrn         0/1     Completed   0          13m
ingress-nginx-controller-57cb5bf694-shflg   1/1     Running     0          13m

Next, deploy a Flask demo app and create an Ingress pointing at its Service:

$ kubectl apply -f https://raw.githubusercontent.com/crazytaxii/kubernetes-extra/master/deploy/flask-demo/all.yaml
$ cat <<EOF >flask-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: flask-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: flask-service
            port:
              number: 80
EOF
$ kubectl apply -f flask-ingress.yaml -n flask-demo
$ kubectl get svc -n ingress-nginx
NAME                                 TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   10.107.230.163   <pending>     80:31787/TCP,443:31913/TCP   54m
ingress-nginx-controller-admission   ClusterIP      10.100.9.152     <none>        443/TCP                      54m
$ curl http://10.211.55.62:31787
<html><head><title>Docker + Flask Demo</title></head><body><table><tr><td> Start Time </td> <td>2021-May-07 07:53:17</td> </tr><tr><td> Hostname </td> <td>flask-deployment-68c5b8bb95-fzzw4</td> </tr><tr><td> Local Address </td> <td>10.244.0.9</td> </tr><tr><td> Remote Address </td> <td>10.244.0.8</td> </tr><tr><td> Server Hit </td> <td>1</td> </tr></table></body></html>

Now the service deployed in the Kubernetes cluster is reachable via host IP + NodePort (the curl above hits node 10.211.55.62 on NodePort 31787).

Next, let's explore how the Ingress Controller, which is essentially OpenResty, steers traffic to the app containers in the cluster.

The Ingress Controller runs in the cluster as the Pod ingress-nginx-controller-57cb5bf694-shflg in the ingress-nginx namespace. Whether it is plain NGINX or OpenResty, there is always a base configuration file, /etc/nginx/nginx.conf:

$ kubectl exec ingress-nginx-controller-57cb5bf694-shflg -n ingress-nginx -- cat /etc/nginx/nginx.conf
http {
	lua_package_path "/etc/nginx/lua/?.lua;;";

	lua_shared_dict balancer_ewma 10M;
	lua_shared_dict balancer_ewma_last_touched_at 10M;
	lua_shared_dict balancer_ewma_locks 1M;
	lua_shared_dict certificate_data 20M;
	lua_shared_dict certificate_servers 5M;
	lua_shared_dict configuration_data 20M;
	lua_shared_dict global_throttle_cache 10M;
	lua_shared_dict ocsp_response_cache 5M;

	resolver 10.96.0.10 valid=30s ipv6=off;

	upstream upstream_balancer {
		### Attention!!!
		#
		# We no longer create "upstream" section for every backend.
		# Backends are handled dynamically using Lua. If you would like to debug
		# and see what backends ingress-nginx has in its memory you can
		# install our kubectl plugin https://kubernetes.github.io/ingress-nginx/kubectl-plugin.
		# Once you have the plugin you can use "kubectl ingress-nginx backends" command to
		# inspect current backends.
		#
		###

		server 0.0.0.1; # placeholder

		balancer_by_lua_block {
			balancer.balance()
		}
	}

	## start server _
	server {
		server_name _ ;

		listen 80 default_server reuseport backlog=511 ;
		listen 443 default_server reuseport backlog=511 ssl http2 ;

		location / {

			set $namespace      "flask-demo";
			set $ingress_name   "flask-ingress";
			set $service_name   "flask-service";
			set $service_port   "80";
			set $location_path  "/";
			set $global_rate_limit_exceeding n;

			rewrite_by_lua_block {
				lua_ingress.rewrite({
					force_ssl_redirect = false,
					ssl_redirect = true,
					force_no_ssl_redirect = false,
					use_port_in_redirects = false,
				global_throttle = { namespace = "", limit = 0, window_size = 0, key = { }, ignored_cidrs = { } },
				})
				balancer.rewrite()
				plugins.run()
			}

			set $proxy_upstream_name "flask-demo-flask-service-80";
			set $proxy_host          $proxy_upstream_name;
			set $pass_access_scheme  $scheme;

			set $pass_server_port    $server_port;

			set $best_http_host      $http_host;
			set $pass_port           $pass_server_port;

			proxy_pass http://upstream_balancer;
		}
	}
	## end server _

	# default server, used for NGINX healthcheck and access to nginx stats
	server {
		listen 127.0.0.1:10246;
		set $proxy_upstream_name "internal";

		location /configuration {
			client_max_body_size                    21m;
			client_body_buffer_size                 21m;
			proxy_buffering                         off;

			content_by_lua_block {
				configuration.call()
			}
		}

		location / {
			content_by_lua_block {
				ngx.exit(ngx.HTTP_NOT_FOUND)
			}
		}
	}
}

The configuration file is long, so only the key parts are shown above. proxy_pass points at the upstream_balancer upstream, and inside upstream_balancer we find balancer_by_lua_block { lua-script }, OpenResty's mechanism for application-layer load balancing driven by Lua.
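
As a standalone illustration (not ingress-nginx source), here is roughly how balancer_by_lua_block is used, assuming OpenResty with lua-resty-core's ngx.balancer module available: the upstream's server line is only a placeholder, and the Lua code selects the real peer for each request.

upstream dynamic_backend {
	server 0.0.0.1; # placeholder, never actually used

	balancer_by_lua_block {
		local ngx_balancer = require("ngx.balancer")
		-- a real balancer would pick the peer dynamically;
		-- this hard-coded address is just for illustration
		local ok, err = ngx_balancer.set_current_peer("10.244.0.9", 5000)
		if not ok then
			ngx.log(ngx.ERR, "failed to set peer: ", err)
		end
	}
}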

lua_package_path "/etc/nginx/lua/?.lua;;" tells us the Lua scripts live under /etc/nginx/lua. Heading straight to the ingress-nginx source repository https://github.com/kubernetes/ingress-nginx/tree/controller-v0.46.0, we find the definition of the balancer.balance() function: https://github.com/kubernetes/ingress-nginx/blob/controller-v0.46.0/rootfs/etc/nginx/lua/balancer.lua#L311-L335

function _M.balance()
  local balancer = get_balancer()
  if not balancer then
    return
  end

  local peer = balancer:balance()
  if not peer then
    ngx.log(ngx.WARN, "no peer was returned, balancer: " .. balancer.name)
    return
  end

  if peer:match(PROHIBITED_PEER_PATTERN) then
    ngx.log(ngx.ERR, "attempted to proxy to self, balancer: ", balancer.name, ", peer: ", peer)
    return
  end

  ngx_balancer.set_more_tries(1)

  local ok, err = ngx_balancer.set_current_peer(peer)
  if not ok then
    ngx.log(ngx.ERR, "error while setting current upstream peer ", peer,
            ": ", err)
  end
end

The balancer object comes from the get_balancer() function:

local function get_balancer()
  local backend_name = ngx.var.proxy_upstream_name -- flask-demo-flask-service-80
  local balancer = balancers[backend_name]
  if not balancer then
    return
  end

  ngx.ctx.balancer = balancer

  return balancer
end
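
So balancers is simply a module-level Lua table keyed by upstream name. Purely as an illustration (the instances below are hypothetical stand-ins, not actual source), its shape after syncing looks like this:

-- stand-in for what implementation:new(backend) would return
local round_robin_instance = {}

-- balancers maps each $proxy_upstream_name value to a balancer instance
local balancers = {
  ["flask-demo-flask-service-80"] = round_robin_instance,
  ["upstream-default-backend"] = round_robin_instance,
}

-- get_balancer() then boils down to a table lookup:
print(balancers["flask-demo-flask-service-80"] ~= nil) -- true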

In /etc/nginx/nginx.conf the nginx variable proxy_upstream_name has already been set to flask-demo-flask-service-80, so the next step is to find the code that populates the global balancers table: https://github.com/kubernetes/ingress-nginx/blob/controller-v0.46.0/rootfs/etc/nginx/lua/balancer.lua#L107-L139

local function sync_backend(backend)
  if not backend.endpoints or #backend.endpoints == 0 then
    balancers[backend.name] = nil
    return
  end

  local implementation = get_implementation(backend)
  local balancer = balancers[backend.name]

  if not balancer then
    balancers[backend.name] = implementation:new(backend)
    return
  end

  -- every implementation is the metatable of its instances (see .new(...) functions)
  -- here we check if `balancer` is the instance of `implementation`
  -- if it is not then we deduce LB algorithm has changed for the backend
  if getmetatable(balancer) ~= implementation then
    ngx.log(ngx.INFO,
        string.format("LB algorithm changed from %s to %s, resetting the instance",
                      balancer.name, implementation.name))
    balancers[backend.name] = implementation:new(backend)
    return
  end

  balancer:sync(backend)
end
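
A note on the getmetatable(balancer) ~= implementation check above: every balancer implementation doubles as the metatable of the instances it creates, so the metatable identifies which LB algorithm built an instance. A minimal standalone sketch of that Lua idiom (not ingress-nginx source):

-- two hypothetical "implementations", each acting as a class
local round_robin = {}
local ewma = {}

function round_robin.new(self, backend)
  local o = { backend = backend }
  setmetatable(o, self) -- the implementation becomes the instance's metatable
  self.__index = self
  return o
end

local b = round_robin:new({ name = "flask-demo-flask-service-80" })
print(getmetatable(b) == round_robin) -- true: built by round_robin
print(getmetatable(b) == ewma)        -- false: a different LB algorithm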

sync_backend is in turn called in a loop by sync_backends() as it iterates over the new_backends table (sync_backends() itself is scheduled from the module's init_worker() to run periodically in every nginx worker):

local function sync_backends()
  local backends_data = configuration.get_backends_data()
  if not backends_data then
    balancers = {}
    return
  end

  local new_backends, err = cjson.decode(backends_data)
  if not new_backends then
    ngx.log(ngx.ERR, "could not parse backends data: ", err)
    return
  end

  local balancers_to_keep = {}
  for _, new_backend in ipairs(new_backends) do
    if is_backend_with_external_name(new_backend) then
      local backend_with_external_name = util.deepcopy(new_backend)
      backends_with_external_name[backend_with_external_name.name] = backend_with_external_name
    else
      sync_backend(new_backend)
    end
    balancers_to_keep[new_backend.name] = true
  end

  -- finally, drop balancers whose backends no longer exist
  for backend_name, _ in pairs(balancers) do
    if not balancers_to_keep[backend_name] then
      balancers[backend_name] = nil
      backends_with_external_name[backend_name] = nil
    end
  end
end

sync_backends() first calls configuration.get_backends_data() to fetch all the backend data:

function _M.get_backends_data()
  return configuration_data:get("backends")
end
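
configuration_data is the shared dict declared as lua_shared_dict configuration_data 20M in the nginx.conf above; a lua_shared_dict is a shared-memory key/value store visible to all nginx worker processes. A simplified sketch of both sides of the exchange (not the actual source):

-- runs inside OpenResty; ngx.shared.configuration_data corresponds to the
-- `lua_shared_dict configuration_data 20M;` directive in nginx.conf
local configuration_data = ngx.shared.configuration_data

local backends_json = '[{"name":"flask-demo-flask-service-80"}]' -- example payload

-- write side: what handle_backends() does when new backends are POSTed
local ok, err = configuration_data:set("backends", backends_json)

-- read side: what get_backends_data() does on every sync
local raw_backends = configuration_data:get("backends")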

The value under the "backends" key has already been written by the time we read it, namely whenever configuration.handle_backends() handles a POST request:

local function handle_backends()
  if ngx.var.request_method == "GET" then
    ngx.status = ngx.HTTP_OK
    ngx.print(_M.get_backends_data())
    return
  end

  local backends = fetch_request_body()
  if not backends then
    ngx.log(ngx.ERR, "dynamic-configuration: unable to read valid request body")
    ngx.status = ngx.HTTP_BAD_REQUEST
    return
  end

  -- this is where the "backends" key in the shared dict gets written
  local success, err = configuration_data:set("backends", backends)
  if not success then
    ngx.log(ngx.ERR, "dynamic-configuration: error updating configuration: " .. tostring(err))
    ngx.status = ngx.HTTP_BAD_REQUEST
    return
  end

  ngx.status = ngx.HTTP_CREATED
end

local function fetch_request_body()
  ngx.req.read_body()
  local body = ngx.req.get_body_data()

  if not body then
    -- request body might've been written to tmp file if body > client_body_buffer_size
    local file_name = ngx.req.get_body_file()
    local file = io.open(file_name, "rb")

    if not file then
      return nil
    end

    body = file:read("*all")
    file:close()
  end

  return body
end

This reads the body of an HTTP request, but nothing here shows where that request comes from. A global search for handle_backends() turns up the call() function:

function _M.call()
  -- a lot of code here

  if ngx.var.request_uri == "/configuration/backends" then
    handle_backends()
    return
  end

  ngx.status = ngx.HTTP_NOT_FOUND
  ngx.print("Not found!")
end

So the path of that HTTP request is /configuration/backends: the controller's Go process POSTs the up-to-date backend list to this endpoint whenever the relevant cluster state changes. We already saw the /configuration location in the nginx configuration:

location /configuration {
    client_max_body_size                    21m;
    client_body_buffer_size                 21m;
    proxy_buffering                         off;

    content_by_lua_block {
        configuration.call()
    }
}

Let's curl it by hand inside the Ingress Controller Pod, i.e. the ingress-nginx-controller-57cb5bf694-shflg container:

$ kubectl exec ingress-nginx-controller-57cb5bf694-shflg -n ingress-nginx -- curl -s http://127.0.0.1:10246/configuration/backends
[{"name":"flask-demo-flask-service-80","service":{"metadata":{"creationTimestamp":null},"spec":{"ports":[{"name":"http","protocol":"TCP","port":80,"targetPort":5000}],"selector":{"app":"flask"},"clusterIP":"10.109.208.21","type":"ClusterIP","sessionAffinity":"None"},"status":{"loadBalancer":{}}},"port":80,"sslPassthrough":false,"endpoints":[{"address":"10.244.0.10","port":"5000"},{"address":"10.244.0.11","port":"5000"},{"address":"10.244.0.9","port":"5000"}],"sessionAffinityConfig":{"name":"","mode":"","cookieSessionAffinity":{"name":""}},"upstreamHashByConfig":{"upstream-hash-by-subset-size":3},"noServer":false,"trafficShapingPolicy":{"weight":0,"header":"","headerValue":"","headerPattern":"","cookie":""}},{"name":"upstream-default-backend","port":0,"sslPassthrough":false,"endpoints":[{"address":"127.0.0.1","port":"8181"}],"sessionAffinityConfig":{"name":"","mode":"","cookieSessionAffinity":{"name":""}},"upstreamHashByConfig":{},"noServer":false,"trafficShapingPolicy":{"weight":0,"header":"","headerValue":"","headerPattern":"","cookie":""}}]

The flask-demo data is right there. Once it has all the backends data for the cluster, sync_backends() walks the backend list and, keyed by each backend's name field, inserts an implementation:new(backend) object into the balancers table. Unless configured otherwise, implementation is the round_robin load-balancing algorithm: https://github.com/kubernetes/ingress-nginx/blob/controller-v0.46.0/rootfs/etc/nginx/lua/balancer/round_robin.lua#L9-L19

function _M.new(self, backend)
  local nodes = util.get_nodes(backend.endpoints)
  local o = {
    instance = self.factory:new(nodes),
    traffic_shaping_policy = backend.trafficShapingPolicy,
    alternative_backends = backend.alternativeBackends,
  }
  setmetatable(o, self)
  self.__index = self
  return o
end

function _M.balance(self)
  return self.instance:find()
end
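
The factory in _M.new above is the resty.roundrobin class from OpenResty's lua-resty-balancer library, so instance:find() is plain weighted round-robin. Stripped of everything ingress-nginx adds, the core behavior is roughly this sketch, reusing the three Flask endpoints seen in the /configuration/backends output:

local resty_roundrobin = require("resty.roundrobin")

-- the same shape of nodes table that get_nodes() builds below:
-- "ip:port" keys, weight 1 each
local rr = resty_roundrobin:new({
  ["10.244.0.9:5000"] = 1,
  ["10.244.0.10:5000"] = 1,
  ["10.244.0.11:5000"] = 1,
})

-- each find() returns the next peer in (weighted) round-robin order
local peer = rr:find() -- e.g. "10.244.0.10:5000"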

Everything is falling into place: backend.endpoints is exactly the Pod IPs and ports of the three Flask Pods. The get_nodes function is defined in util.lua: https://github.com/kubernetes/ingress-nginx/blob/controller-v0.46.0/rootfs/etc/nginx/lua/util.lua#L16-L26

function _M.get_nodes(endpoints)
  local nodes = {}
  local weight = 1

  for _, endpoint in pairs(endpoints) do
    local endpoint_string = endpoint.address .. ":" .. endpoint.port
    nodes[endpoint_string] = weight
  end

  return nodes
end

Back in the balancer.lua code above: once get_balancer() has fetched the balancer object, calling balancer:balance() returns self.instance:find(), which picks one of the three endpoints and hands it to ngx_balancer.set_current_peer().

And that is why, when you access a service in a Kubernetes cluster through Ingress, the load balancer proxies directly to Pod IPs rather than to the Service's Cluster IP.