Multus CNI
Feb 7, 2022 15:30 · 2924 words · 6 minute read
Deploying Multus CNI
The cluster already uses Calico as its network plugin, configured in IPIP mode.
$ cat /etc/cni/net.d/10-calico.conflist | jq
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "multuscni-test0",
      "mtu": 0,
      "ipam": {
        "type": "calico-ipam"
      },
      "policy": {
        "type": "k8s"
      },
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {
        "portMappings": true
      }
    },
    {
      "type": "bandwidth",
      "capabilities": {
        "bandwidth": true
      }
    }
  ]
}
- Deploy the Multus CNI plugin:
    $ kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset-thick-plugin.yml
    customresourcedefinition.apiextensions.k8s.io/network-attachment-definitions.k8s.cni.cncf.io created
    clusterrole.rbac.authorization.k8s.io/multus created
    clusterrolebinding.rbac.authorization.k8s.io/multus created
    serviceaccount/multus created
    daemonset.apps/kube-multus-ds created
    $ kubectl get po -n kube-system | grep multus
    kube-multus-ds-5wmn9   1/1   Running   0   51s
- Create a NetworkAttachmentDefinition:
    $ cat <<EOF | kubectl apply -f -
    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: macvlan-conf
    spec:
      config: '{
          "cniVersion": "0.3.1",
          "type": "macvlan",
          "master": "eth1",
          "mode": "bridge",
          "ipam": {
            "type": "host-local",
            "ranges": [
              [
                {
                  "subnet": "10.37.132.0/24",
                  "rangeStart": "10.37.132.20",
                  "rangeEnd": "10.37.132.50",
                  "gateway": "10.37.132.1"
                }
              ]
            ]
          }
        }'
    EOF
    networkattachmentdefinition.k8s.cni.cncf.io/macvlan-conf created
    $ kubectl get network-attachment-definition
    NAME           AGE
    macvlan-conf   28s
- Create a test Pod:
    $ cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-case-01
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-conf
    spec:
      containers:
      - name: pod-case-01
        image: docker.io/centos/tools:latest
        command:
        - /sbin/init
    EOF
    pod/pod-case-01 created
    $ ps -ef | grep "/sbin/init"
    root 30104 30083 0 11:08 ? 00:00:00 /sbin/init
    $ nsenter -n -t 30104
    $ ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
        link/ipip 0.0.0.0 brd 0.0.0.0
    4: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
        link/ether 9a:76:9b:84:bd:17 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 192.168.1.12/32 scope global eth0
           valid_lft forever preferred_lft forever
    5: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
        link/ether 62:74:fb:df:a8:35 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 10.37.132.20/24 brd 10.37.132.255 scope global net1
           valid_lft forever preferred_lft forever
    $ ip r
    default via 169.254.1.1 dev eth0
    10.37.132.0/24 dev net1 proto kernel scope link src 10.37.132.20
    169.254.1.1 dev eth0 scope link
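- The first interface is eth0, with IP 192.168.1.12 (the Calico default network).
- The second interface is net1, with IP 10.37.132.20 (the first address of the range defined in macvlan-conf).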
How Multus CNI works
The configuration file of the Multus CNI plugin:
$ cat /etc/cni/net.d/00-multus.conf | jq
{
  "capabilities": {
    "bandwidth": true,
    "portMappings": true
  },
  "cniVersion": "0.3.1",
  "delegates": [
    {
      "cniVersion": "0.3.1",
      "name": "k8s-pod-network",
      "plugins": [
        {
          "datastore_type": "kubernetes",
          "ipam": {
            "type": "calico-ipam"
          },
          "kubernetes": {
            "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
          },
          "log_file_path": "/var/log/calico/cni/cni.log",
          "log_level": "info",
          "mtu": 0,
          "nodename": "multuscni-test0",
          "policy": {
            "type": "k8s"
          },
          "type": "calico"
        },
        {
          "capabilities": {
            "portMappings": true
          },
          "snat": true,
          "type": "portmap"
        },
        {
          "capabilities": {
            "bandwidth": true
          },
          "type": "bandwidth"
        }
      ]
    }
  ],
  "logLevel": "verbose",
  "logToStderr": true,
  "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
  "name": "multus-cni-network",
  "type": "multus"
}
The Calico CNI configuration appears in full inside the delegates field of the Multus plugin configuration.
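To make the relationship between the two files concrete, here is a minimal sketch of how an existing conflist could be embedded as the only delegate of a Multus-style configuration. It is not Multus' own generation code: the helper wrapAsDelegate and the multusConf struct are hypothetical names introduced for illustration, and only a few of the fields shown in 00-multus.conf are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// multusConf models a small subset of the fields seen in 00-multus.conf.
// Hypothetical type for illustration; not the Multus types.NetConf struct.
type multusConf struct {
	CNIVersion string            `json:"cniVersion"`
	Name       string            `json:"name"`
	Type       string            `json:"type"`
	Delegates  []json.RawMessage `json:"delegates"`
}

// wrapAsDelegate embeds an existing CNI configuration, unchanged, as the
// first (default network) delegate. Hypothetical helper.
func wrapAsDelegate(conflist []byte) (*multusConf, error) {
	if !json.Valid(conflist) {
		return nil, fmt.Errorf("default network configuration is not valid JSON")
	}
	return &multusConf{
		CNIVersion: "0.3.1",
		Name:       "multus-cni-network",
		Type:       "multus",
		Delegates:  []json.RawMessage{json.RawMessage(conflist)},
	}, nil
}

func main() {
	// Trimmed-down stand-in for /etc/cni/net.d/10-calico.conflist.
	calico := []byte(`{"name":"k8s-pod-network","cniVersion":"0.3.1","plugins":[{"type":"calico"}]}`)
	conf, err := wrapAsDelegate(calico)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(conf, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the delegate as raw JSON mirrors what we observe in the file: the Calico configuration is carried along verbatim rather than being reinterpreted.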
Deploying Multus also created a CustomResourceDefinition (CRD) for the NetworkAttachmentDefinition kind. Its API type is defined in the k8snetworkplumbingwg/network-attachment-definition-client project: https://github.com/k8snetworkplumbingwg/network-attachment-definition-client/blob/master/pkg/apis/k8s.cni.cncf.io/v1/types.go
// +genclient
// +genclient:noStatus
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +resourceName=network-attachment-definitions
type NetworkAttachmentDefinition struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec NetworkAttachmentDefinitionSpec `json:"spec"`
}
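Creating the container network stack
The resource definition of the test Pod pod-case-01 carries the annotation k8s.v1.cni.cncf.io/networks: macvlan-conf, where macvlan-conf is the NetworkAttachmentDefinition we created.
As an implementation of a CNI plugin, multus follows the CNI specification and implements operations such as ADD and DEL: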
- ADD https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L524-L702:
    func CmdAdd(args *skel.CmdArgs, exec invoke.Exec, kubeClient *k8s.ClientInfo) (cnitypes.Result, error) {
        n, err := types.LoadNetConf(args.StdinData)
        logging.Debugf("CmdAdd: %v, %v, %v", args, exec, kubeClient)
        if err != nil {
            return nil, cmdErr(nil, "error loading netconf: %v", err)
        }
        // a lot of code here
    }
- DEL https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L730-L871:
    func CmdDel(args *skel.CmdArgs, exec invoke.Exec, kubeClient *k8s.ClientInfo) error {
        in, err := types.LoadNetConf(args.StdinData)
        logging.Debugf("CmdDel: %v, %v, %v", args, exec, kubeClient)
        if err != nil {
            return err
        }
        // a lot of code here
    }
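This article only examines how Multus CNI creates the network stack when Kubernetes starts a Pod, i.e. the implementation of the ADD operation. The DEL operation performed when a Pod is destroyed is not covered.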
- Load the delegate plugins and add them to the multus configuration: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/k8sclient/k8sclient.go#L309-L374
    _, kc, err := k8s.TryLoadPodDelegates(pod, n, kubeClient, resourceMap)
    if err != nil {
        return nil, cmdErr(k8sArgs, "error loading k8s delegates k8s args: %v", err)
    }
    - Check whether the Pod annotations carry the v1.multus-cni.io/default-network key (a user-specified default network): https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/k8sclient/k8sclient.go#L586-L613
        // TryLoadPodDelegates attempts to load Kubernetes-defined delegates and add them to the Multus config.
        // Returns the number of Kubernetes-defined delegates added or an error.
        func TryLoadPodDelegates(pod *v1.Pod, conf *types.NetConf, clientInfo *ClientInfo, resourceMap map[string]*types.ResourceInfo) (int, *ClientInfo, error) {
            // a lot of code here
            delegate, err := tryLoadK8sPodDefaultNetwork(clientInfo, pod, conf)
            if err != nil {
                return 0, nil, logging.Errorf("TryLoadPodDelegates: error in loading K8s cluster default network from pod annotation: %v", err)
            }
            if delegate != nil {
                logging.Debugf("TryLoadPodDelegates: Overwrite the cluster default network with %v from pod annotations", delegate)
                conf.Delegates[0] = delegate
            }
        }
    - Check whether the Pod annotations carry the k8s.v1.cni.cncf.io/networks key (user-specified NetworkAttachmentDefinitions): https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/k8sclient/k8sclient.go#L436-L452
        // GetPodNetwork gets net-attach-def annotation from pod
        func GetPodNetwork(pod *v1.Pod) ([]*types.NetworkSelectionElement, error) {
            logging.Debugf("GetPodNetwork: %v", pod)
            netAnnot := pod.Annotations[networkAttachmentAnnot]
            defaultNamespace := pod.ObjectMeta.Namespace
            if len(netAnnot) == 0 {
                return nil, &NoK8sNetworkError{"no kubernetes network found"}
            }
            networks, err := parsePodNetworkAnnotation(netAnnot, defaultNamespace)
            if err != nil {
                return nil, err
            }
            return networks, nil
        }
      The pod-case-01 Pod we defined does carry the annotation k8s.v1.cni.cncf.io/networks: macvlan-conf. The parsePodNetworkAnnotation function splits the value of k8s.v1.cni.cncf.io/networks, uses each element to initialize a types.NetworkSelectionElement, and appends it to a slice: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/k8sclient/k8sclient.go#L173-L240
        func parsePodNetworkAnnotation(podNetworks, defaultNamespace string) ([]*types.NetworkSelectionElement, error) {
            if strings.IndexAny(podNetworks, "[{\"") >= 0 {
                if err := json.Unmarshal([]byte(podNetworks), &networks); err != nil {
                    return nil, logging.Errorf("parsePodNetworkAnnotation: failed to parse pod Network Attachment Selection Annotation JSON format: %v", err)
                }
            } else {
                // Comma-delimited list of network attachment object names
                for _, item := range strings.Split(podNetworks, ",") {
                    // Remove leading and trailing whitespace.
                    item = strings.TrimSpace(item)
                    // Parse network name (i.e. <namespace>/<network name>@<ifname>)
                    netNsName, networkName, netIfName, err := parsePodNetworkObjectName(item)
                    if err != nil {
                        return nil, logging.Errorf("parsePodNetworkAnnotation: %v", err)
                    }
                    networks = append(networks, &types.NetworkSelectionElement{
                        Name:             networkName,
                        Namespace:        netNsName,
                        InterfaceRequest: netIfName,
                    })
                }
            }
        }
      In our case networkName is macvlan-conf.
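The comma-delimited branch of that parsing can be tried out in isolation. The following is a simplified sketch using only the standard library, not the Multus implementation: the selection struct stands in for types.NetworkSelectionElement, parseNetworks is a hypothetical name, the JSON form of the annotation is left out, and the name validation done by the real parsePodNetworkObjectName is reduced to an emptiness check.

```go
package main

import (
	"fmt"
	"strings"
)

// selection is a simplified stand-in for types.NetworkSelectionElement.
type selection struct {
	Namespace string
	Name      string
	Interface string
}

// parseNetworks handles only the comma-delimited annotation form, where each
// item looks like [<namespace>/]<network name>[@<ifname>].
func parseNetworks(annotation, defaultNamespace string) ([]selection, error) {
	var out []selection
	for _, item := range strings.Split(annotation, ",") {
		item = strings.TrimSpace(item)
		sel := selection{Namespace: defaultNamespace}
		// Optional "@<ifname>" suffix.
		if i := strings.Index(item, "@"); i >= 0 {
			item, sel.Interface = item[:i], item[i+1:]
		}
		// Optional "<namespace>/" prefix.
		if i := strings.Index(item, "/"); i >= 0 {
			sel.Namespace, item = item[:i], item[i+1:]
		}
		if item == "" {
			return nil, fmt.Errorf("empty network name in annotation %q", annotation)
		}
		sel.Name = item
		out = append(out, sel)
	}
	return out, nil
}

func main() {
	// The annotation value used by pod-case-01, in the Pod's namespace.
	nets, err := parseNetworks("macvlan-conf", "default")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", nets)
}
```

With the annotation from pod-case-01 the result is a single element whose name is macvlan-conf, whose namespace falls back to the Pod's namespace, and whose interface request is empty.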
    - Fetch the specified NetworkAttachmentDefinition from the Kubernetes cluster: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/k8sclient/k8sclient.go#L242-L294
        func getKubernetesDelegate(client *ClientInfo, net *types.NetworkSelectionElement, confdir string, pod *v1.Pod, resourceMap map[string]*types.ResourceInfo) (*types.DelegateNetConf, map[string]*types.ResourceInfo, error) {
            logging.Debugf("getKubernetesDelegate: %v, %v, %s, %v, %v", client, net, confdir, pod, resourceMap)
            customResource, err := client.NetClient.NetworkAttachmentDefinitions(net.Namespace).Get(context.TODO(), net.Name, metav1.GetOptions{})
            if err != nil {
                errMsg := fmt.Sprintf("cannot find a network-attachment-definition (%s) in namespace (%s): %v", net.Name, net.Namespace, err)
                if client != nil {
                    client.Eventf(pod, v1.EventTypeWarning, "NoNetworkFound", errMsg)
                }
                return nil, resourceMap, logging.Errorf("getKubernetesDelegate: " + errMsg)
            }
            // a lot of code here
            configBytes, err := netutils.GetCNIConfig(customResource, confdir)
            if err != nil {
                return nil, resourceMap, err
            }
            delegate, err := types.LoadDelegateNetConf(configBytes, net, deviceID, resourceName)
            if err != nil {
                return nil, resourceMap, err
            }
        }
      Parsing yields the configuration string stored in the spec.config field of the NetworkAttachmentDefinition:
        {
          "cniVersion": "0.3.1",
          "type": "macvlan",
          "master": "eth1",
          "mode": "bridge",
          "ipam": {
            "type": "host-local",
            "ranges": [
              [
                {
                  "subnet": "10.37.132.0/24",
                  "rangeStart": "10.37.132.20",
                  "rangeEnd": "10.37.132.50",
                  "gateway": "10.37.132.1"
                }
              ]
            ]
          }
        }
      That is, the configuration of the macvlan CNI plugin.
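Since this string is plain CNI JSON, it can be decoded like any other configuration. The sketch below decodes the macvlan-conf string into a small struct and runs a sanity check that the range addresses fall inside the subnet. The macvlanConf and ipRange types and the parseConfig function are hypothetical names for this example, and the containment check is our own illustration rather than a reproduction of what Multus or host-local validates.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// ipRange mirrors one entry of ipam.ranges in the config shown above.
type ipRange struct {
	Subnet     string `json:"subnet"`
	RangeStart string `json:"rangeStart"`
	RangeEnd   string `json:"rangeEnd"`
	Gateway    string `json:"gateway"`
}

// macvlanConf models only the fields used by macvlan-conf (hypothetical type).
type macvlanConf struct {
	CNIVersion string `json:"cniVersion"`
	Type       string `json:"type"`
	Master     string `json:"master"`
	Mode       string `json:"mode"`
	IPAM       struct {
		Type   string      `json:"type"`
		Ranges [][]ipRange `json:"ranges"`
	} `json:"ipam"`
}

// sampleConfig is the spec.config string of macvlan-conf.
const sampleConfig = `{
  "cniVersion": "0.3.1",
  "type": "macvlan",
  "master": "eth1",
  "mode": "bridge",
  "ipam": {
    "type": "host-local",
    "ranges": [[{
      "subnet": "10.37.132.0/24",
      "rangeStart": "10.37.132.20",
      "rangeEnd": "10.37.132.50",
      "gateway": "10.37.132.1"
    }]]
  }
}`

// parseConfig decodes the config and checks that every address given in a
// range lies inside that range's subnet (an illustrative check only).
func parseConfig(raw string) (*macvlanConf, error) {
	var c macvlanConf
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return nil, fmt.Errorf("decoding config: %w", err)
	}
	if c.Type == "" {
		return nil, fmt.Errorf("config has no plugin type")
	}
	for _, set := range c.IPAM.Ranges {
		for _, r := range set {
			_, subnet, err := net.ParseCIDR(r.Subnet)
			if err != nil {
				return nil, fmt.Errorf("invalid subnet %q: %w", r.Subnet, err)
			}
			for _, addr := range []string{r.RangeStart, r.RangeEnd, r.Gateway} {
				if addr == "" {
					continue
				}
				if ip := net.ParseIP(addr); ip == nil || !subnet.Contains(ip) {
					return nil, fmt.Errorf("address %q is not inside %s", addr, subnet)
				}
			}
		}
	}
	return &c, nil
}

func main() {
	c, err := parseConfig(sampleConfig)
	if err != nil {
		panic(err)
	}
	fmt.Printf("plugin=%s master=%s mode=%s ipam=%s\n", c.Type, c.Master, c.Mode, c.IPAM.Type)
}
```

The type field is what matters most for the rest of the walk-through: it names the plugin binary (here macvlan) that will be invoked for this delegate.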
      Through the NetworkAttachmentDefinition we defined the configuration of the macvlan CNI plugin (used to configure the second interface in the Pod network stack), and multus uses the NetworkAttachmentDefinition name carried in the Pod annotation to load that plugin configuration as a delegate. The number of values in k8s.v1.cni.cncf.io/networks matches the number of interfaces in the Pod network stack other than eth0, and each of these CNI plugin configurations is appended to the delegates slice.
      Finally, delegates is appended to the Delegates field of the multus CNI configuration struct: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/k8sclient/k8sclient.go#L338-L351
        delegates, err := GetNetworkDelegates(clientInfo, pod, networks, conf, resourceMap)
        if err != nil {
            if _, ok := err.(*NoK8sNetworkError); ok {
                return 0, clientInfo, nil
            }
            return 0, nil, logging.Errorf("TryLoadPodDelegates: error in getting k8s network for pod: %v", err)
        }
        if err = conf.AddDelegates(delegates); err != nil {
            return 0, nil, err
        }
- Once the configuration struct is complete, iterate over its Delegates field: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L600-L686
    - Get the interface name: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L93-L106
        func getIfname(delegate *types.DelegateNetConf, argif string, idx int) string {
            logging.Debugf("getIfname: %v, %s, %d", delegate, argif, idx)
            if delegate.IfnameRequest != "" {
                return delegate.IfnameRequest
            }
            if delegate.MasterPlugin {
                // master plugin always uses the CNI-provided interface name
                return argif
            }
            // Otherwise construct a unique interface name from the delegate's
            // position in the delegate list
            return fmt.Sprintf("net%d", idx)
        }
      This is why the second interface in the Pod network stack is named net1 by default: the macvlan delegate sits at index 1 of the delegate list and the annotation did not request an interface name.
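The naming rule is easy to check with a stand-alone copy of the logic. The sketch below mirrors the three branches of getIfname shown above on a simplified delegate struct; the struct, the ifName function, and the delegate names other than k8s-pod-network and macvlan-conf are hypothetical and exist only for this example.

```go
package main

import "fmt"

// delegate is a simplified stand-in for types.DelegateNetConf.
type delegate struct {
	Name          string
	IfnameRequest string // set when the annotation uses <name>@<ifname>
	MasterPlugin  bool   // true for the cluster default network
}

// ifName follows the same three branches as getIfname above.
func ifName(d delegate, cniIfName string, idx int) string {
	if d.IfnameRequest != "" {
		return d.IfnameRequest
	}
	if d.MasterPlugin {
		return cniIfName
	}
	return fmt.Sprintf("net%d", idx)
}

func main() {
	delegates := []delegate{
		{Name: "k8s-pod-network", MasterPlugin: true},
		{Name: "macvlan-conf"},
		{Name: "macvlan-data", IfnameRequest: "data0"}, // hypothetical attachment
		{Name: "macvlan-extra"},                        // hypothetical attachment
	}
	for i, d := range delegates {
		fmt.Printf("%-16s -> %s\n", d.Name, ifName(d, "eth0", i))
	}
}
```

Because the generated name comes from the delegate's position, the four delegates here are named eth0, net1, data0 and net3: an explicit interface request does not shift the numbering of the delegates that follow it.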
    - Then call the ADD operation of the delegate plugin to configure the Pod network stack:
      https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L344-L349
        result, err = confAdd(rt, delegate.Bytes, multusNetconf, exec)
        if err != nil {
            return nil, err
        }
      https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L168-L186
        func confAdd(rt *libcni.RuntimeConf, rawNetconf []byte, multusNetconf *types.NetConf, exec invoke.Exec) (cnitypes.Result, error) {
            logging.Debugf("confAdd: %v, %s", rt, string(rawNetconf))
            // In part, adapted from K8s pkg/kubelet/dockershim/network/cni/cni.go
            binDirs := filepath.SplitList(os.Getenv("CNI_PATH"))
            binDirs = append([]string{multusNetconf.BinDir}, binDirs...)
            cniNet := libcni.NewCNIConfigWithCacheDir(binDirs, multusNetconf.CNIDir, exec)
            conf, err := libcni.ConfFromBytes(rawNetconf)
            if err != nil {
                return nil, logging.Errorf("error in converting the raw bytes to conf: %v", err)
            }
            result, err := cniNet.AddNetwork(context.Background(), conf, rt)
            if err != nil {
                return nil, err
            }
            return result, nil
        }
    - The ADD operation of the corresponding CNI plugin is executed once for each delegate.
    - Add the default gateway according to the delegate CNI plugin's configuration: https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.8/pkg/multus/multus.go#L650-L656
        if adddefaultgateway {
            tmpResult, err = netutils.SetDefaultGW(args, ifName, delegate.GatewayRequest, &tmpResult)
            if err != nil {
                return nil, cmdErr(k8sArgs, "error setting default gateway: %v", err)
            }
        }
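That is the complete process by which the Multus CNI plugin initializes the network stack when a Pod is created. Multus itself does not create interfaces or set up routing tables. Instead, it reads the configuration of the cluster's default network plugin and of the CNI plugins defined by NetworkAttachmentDefinitions, then invokes their ADD operations to configure the Pod network stack.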
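The delegation pattern described in this walk-through can be condensed into a few lines. The following is a rough sketch under loud assumptions, not Multus code: the delegate struct, the addFunc type, addAll, and errPluginFailed are all hypothetical, the plugin invocation is replaced by a callback instead of libcni's AddNetwork, and interface names are derived from the list position only. Cleanup after a failed ADD is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// delegate is a simplified stand-in for one entry of the Delegates field.
type delegate struct {
	Name string
	Type string
}

// addFunc stands in for invoking a delegate plugin's ADD operation.
type addFunc func(d delegate, ifName string) error

// errPluginFailed is a placeholder error used by the example callbacks.
var errPluginFailed = errors.New("plugin failed")

// addAll calls add once per delegate, in order. The first delegate (the
// default network) gets the interface name supplied by the runtime; the
// others get net<index>. It stops at the first failure.
func addAll(delegates []delegate, cniIfName string, add addFunc) ([]string, error) {
	var configured []string
	for i, d := range delegates {
		name := cniIfName
		if i > 0 {
			name = fmt.Sprintf("net%d", i)
		}
		if err := add(d, name); err != nil {
			return configured, fmt.Errorf("delegate %q (%s): %w", d.Name, name, err)
		}
		configured = append(configured, name)
	}
	return configured, nil
}

func main() {
	delegates := []delegate{
		{Name: "k8s-pod-network", Type: "calico"},
		{Name: "macvlan-conf", Type: "macvlan"},
	}
	// A callback that only logs; a real meta-plugin would execute the
	// plugin binary named by d.Type here.
	logOnly := func(d delegate, ifName string) error {
		fmt.Printf("ADD %s (type %s) -> %s\n", d.Name, d.Type, ifName)
		return nil
	}
	ifaces, err := addAll(delegates, "eth0", logOnly)
	if err != nil {
		panic(err)
	}
	fmt.Println("configured interfaces:", ifaces)
}
```

For pod-case-01 this corresponds to two ADD invocations, one for Calico on eth0 and one for macvlan on net1, which matches the two interfaces we saw inside the Pod's network namespace.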