KubeVirt: Repeated cloud-init Initialization

Mar 15, 2024 22:30 · KubeVirt Virtualization Linux Kubernetes

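The problem in one sentence: on KubeVirt v0.51.0, after a VM is restarted, cloud-init resets the password that the user changed inside the Guest OS back to the value from the user data.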

VM restart API: https://kubevirt.io/api-reference/v0.51.0/operations.html#_v1restart

The VirtualMachine is defined as follows:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    vm.mececs.io/name: ecs-test0
  name: ecs-test0
  namespace: default
spec:
  running: true
  template:
    metadata:
      annotations:
        attachnet1.mec-nets.ovn.kubernetes.io/allow_live_migration: "true"
        attachnet1.mec-nets.ovn.kubernetes.io/default_route: "true"
        attachnet1.mec-nets.ovn.kubernetes.io/logical_switch: ovn-default
        attachnet1.mec-nets.ovn.kubernetes.io/pod_nic_type: macvtap-port
        k8s.v1.cni.cncf.io/networks: mec-nets/attachnet1
        ovn.kubernetes.io/allow_live_migration: "true"
      labels:
        kubevirt.io/vm: ecs-test0
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 2
          threads: 1
        devices:
          disks:
          - bootOrder: 1
            disk:
              bus: virtio
            name: bootdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
          interfaces:
          - macvtap: {}
            name: attachnet1
        machine:
          type: q35
        memory:
          guest: 2Gi
        resources:
          limits:
            cpu: "2"
            memory: 2Gi
          requests:
            cpu: "2"
            memory: 2Gi
      hostname: ecs-test0
      networks:
      - multus:
          networkName: mec-nets/attachnet1
        name: attachnet1
      volumes:
      - name: bootdisk
        persistentVolumeClaim:
          claimName: ecs-test0-bootpvc-ws90iy
      - cloudInitConfigDrive: # OpenStack config drive
          userData: |-
            #cloud-config
            user: root
            password: atomic
            ssh_pwauth: True
            chpasswd: { expire: False }            
        name: cloudinitdisk

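Config Drive is a mechanism that OpenStack VMs use to pass data to cloud-init. The metadata (its file layout) follows the Config Drive format, and cloud-init reads it when the VM boots and carries out the requested operations.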

Let's first look at how a KubeVirt VM consumes cloud-init. After creating the VirtualMachine object, inspect its libvirt domain definition:

$ kubectl get po -l "kubevirt.io/vm=ecs-test0"
NAME                            READY   STATUS    RESTARTS   AGE
virt-launcher-ecs-test0-lgd4w   1/1     Running   0          89s

$ kubectl exec -it virt-launcher-ecs-test0-lgd4w -- virsh dumpxml 1
# ...
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='block' device='disk' model='virtio-non-transitional'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native' discard='unmap'/>
      <source dev='/dev/bootdisk' index='2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <boot order='1'/>
      <alias name='ua-bootdisk'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk' model='virtio-non-transitional'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' discard='unmap'/>
      <source file='/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/ecs-test0/noCloud.iso' index='1'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='ua-cloudinitdisk'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </disk>
# ...

The cloud-init data is attached to the VM (libvirt domain) as an ISO file and shows up in the Guest OS as a /dev/vdX disk:

$ virtctl console ecs-test0
Successfully connected to ecs-test0 console. The escape sequence is ^]

CentOS Linux 7 (Core)
Kernel 3.10.0-1160.6.1.el7.x86_64 on an x86_64

ecs-test0 login: root
Password:
[root@ecs-test0 ~]# lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda    253:0    0   1M  0 disk
vdb    253:16   0  15G  0 disk
└─vdb1 253:17   0  15G  0 part /

Inside the Guest OS, mount it (/dev/vda) on /mnt:

[root@ecs-test0 ~]# mount /dev/vda /mnt/
mount: /dev/vda is write-protected, mounting read-only

[root@ecs-test0 ~]# cat /mnt/openstack/latest/meta_data.json
{"instance_id":"ecs-test0.default","hostname":"ecs-test0","uuid":"31b0f086-8f77-567d-91c0-6e9fef54cb2e"}

Spoiler: the UUID is the key. Take a look at a few files:

[root@ecs-test0 ~]# ls -al /var/lib/cloud/instances
total 0
drwxr-xr-x  3 root root  50 Mar 15 16:36 .
drwxr-xr-x. 8 root root 105 Mar 15 16:36 ..
drwxr-xr-x  5 root root 218 Mar 15 16:36 31b0f086-8f77-567d-91c0-6e9fef54cb2e

[root@ecs-test0 ~]# cat /var/lib/cloud/data/instance-id
31b0f086-8f77-567d-91c0-6e9fef54cb2e

[root@ecs-test0 ~]# cloud-init --version
/bin/cloud-init 19.4
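The two values shown above (the uuid in meta_data.json and the contents of /var/lib/cloud/data/instance-id) can also be compared with a small script. This is a minimal sketch, not part of cloud-init or KubeVirt: the function names are hypothetical, and the default paths simply assume the drive is mounted on /mnt as in the session above.

```python
import json
import os


def read_config_drive_uuid(mount_point):
    """Return the 'uuid' field of openstack/latest/meta_data.json under mount_point."""
    path = os.path.join(mount_point, "openstack", "latest", "meta_data.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)["uuid"]


def read_stored_instance_id(cloud_dir):
    """Return the instance ID recorded under cloud_dir, or None if the file is missing."""
    path = os.path.join(cloud_dir, "data", "instance-id")
    try:
        with open(path, encoding="utf-8") as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def stored_id_matches_drive(mount_point="/mnt", cloud_dir="/var/lib/cloud"):
    """True if the recorded instance ID equals the uuid on the mounted config drive."""
    return read_stored_instance_id(cloud_dir) == read_config_drive_uuid(mount_point)
```

Calling stored_id_matches_drive() in the guest above would be expected to return True, since both files contain the same UUID.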

Now look at the cloud-init source code that decides whether initialization should run:

https://github.com/canonical/cloud-init/blob/19.4/cloudinit/stages.py#L331-L349

class Init(object):
    # a lot of code here...
    def previous_iid(self):
        if self._previous_iid is not None:
            return self._previous_iid

        dp = self.paths.get_cpath('data')
        iid_fn = os.path.join(dp, 'instance-id')
        try:
            self._previous_iid = util.load_file(iid_fn).strip()
        except Exception:
            self._previous_iid = NO_PREVIOUS_INSTANCE_ID

        LOG.debug("previous iid found to be %s", self._previous_iid)
        return self._previous_iid

    def is_new_instance(self):
        previous = self.previous_iid()
        ret = (previous == NO_PREVIOUS_INSTANCE_ID or
               previous != self.datasource.get_instance_id())
        return ret

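cloud-init uses the file /var/lib/cloud/data/instance-id to store the instance UUID from the "previous" initialization. If that value is missing or differs from the UUID in the metadata, cloud-init treats the VM as a "new" instance (one that has never been initialized) and runs initialization. As the files above show, the stored value here is the uuid field of meta_data.json.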

In other words, cloud-init re-initializes because the UUID in the metadata has changed.

Next, let's see how KubeVirt generates the cloud-init ISO file. Before defining and starting the libvirt domain, virt-launcher has to prepare all the files it needs, and the cloud-init ISO is one of them.

https://github.com/kubevirt/kubevirt/blob/v0.51.0/pkg/cloud-init/cloud-init.go#L113-L152

// ReadCloudInitVolumeDataSource scans the given VMI for CloudInit volumes and
// reads their content into a CloudInitData struct. Does not resolve secret refs.
func ReadCloudInitVolumeDataSource(vmi *v1.VirtualMachineInstance, secretSourceDir string) (cloudInitData *CloudInitData, err error) {
    // a lot of code here...
    for _, volume := range vmi.Spec.Volumes {
        // a lot of code here...
        if volume.CloudInitConfigDrive != nil {

            keys, err := resolveConfigDriveSecrets(vmi, secretSourceDir)
            if err != nil {
                return nil, err
            }

            cloudInitData, err = readCloudInitConfigDriveSource(volume.CloudInitConfigDrive)
            cloudInitData.ConfigDriveMetaData = readCloudInitConfigDriveMetaData(string(vmi.UID), vmi.Name, hostname, vmi.Namespace, keys, flavor)
            cloudInitData.VolumeName = volume.Name
            return cloudInitData, err
        }
    }
    return nil, nil
}

// https://github.com/kubevirt/kubevirt/blob/v0.51.0/pkg/cloud-init/cloud-init.go#L377-L385
func readCloudInitConfigDriveMetaData(uid, name, hostname, namespace string, keys map[string]string, instanceType string) *ConfigDriveMetadata {
    return &ConfigDriveMetadata{
        InstanceType:  instanceType,
        UUID:          uid,
        InstanceID:    fmt.Sprintf("%s.%s", name, namespace),
        Hostname:      hostname,
        PublicSSHKeys: keys,
    }
}

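KubeVirt v0.51.0 uses the VMI's UID as the UUID in the metadata. When the restart API is called, virt-controller deletes the VirtualMachineInstance object and then creates a new one, so the VMI's UID necessarily changes. As a result, cloud-init re-initializes and overwrites the new password the user had set.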
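The whole chain of events can be replayed with a toy model. This is only an illustrative sketch: ToyGuest and simulate_restart are hypothetical names, random UUIDs stand in for VMI UIDs, and the sentinel string is a placeholder for cloud-init's NO_PREVIOUS_INSTANCE_ID. The comparison inside boot() mirrors the is_new_instance() logic quoted earlier.

```python
import uuid

# Placeholder sentinel for "no instance-id file exists yet".
NO_PREVIOUS_INSTANCE_ID = "NO_PREVIOUS_INSTANCE_ID"


class ToyGuest:
    """Toy model of a guest whose disk survives VMI re-creation (hypothetical)."""

    def __init__(self):
        # Stands in for /var/lib/cloud/data/instance-id on the persistent boot disk.
        self.previous_iid = NO_PREVIOUS_INSTANCE_ID
        self.root_password = None

    def boot(self, metadata_uuid, user_data_password):
        """Apply the user data only if the instance looks new; return that verdict."""
        is_new = (self.previous_iid == NO_PREVIOUS_INSTANCE_ID
                  or self.previous_iid != metadata_uuid)
        if is_new:
            self.root_password = user_data_password
            self.previous_iid = metadata_uuid
        return is_new


def simulate_restart(stable_uuid):
    """Boot, let the user change the password, restart, and return the final password."""
    guest = ToyGuest()
    first_uuid = str(uuid.uuid4())          # UID of the first VMI
    guest.boot(first_uuid, "atomic")        # first boot sets the user-data password
    guest.root_password = "changed-by-user"
    # A restart re-creates the VMI; with the v0.51.0 behavior its UID is new.
    second_uuid = first_uuid if stable_uuid else str(uuid.uuid4())
    guest.boot(second_uuid, "atomic")
    return guest.root_password
```

With stable_uuid=False the function returns "atomic" (the user's password is lost); with stable_uuid=True it returns "changed-by-user", which is the behavior one would want across restarts.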

The community fixed this issue in PR#7961, which was released with KubeVirt v0.53.0:

func cloudInitUUIDFromVMI(vmi *v1.VirtualMachineInstance) string {
    if vmi.Spec.Domain.Firmware == nil {
        return uuid.NewRandom().String()
    }
    return string(vmi.Spec.Domain.Firmware.UUID)
}

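The fix uses the firmware UUID as the UUID in the cloud-init Config Drive metadata, which makes it persistent. If the VirtualMachine definition does not specify a firmware UUID, one is derived from the VirtualMachine/VirtualMachineInstance name according to the following rule: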

https://github.com/kubevirt/kubevirt/blob/v1.0.0/pkg/virt-controller/watch/vm.go#L1608-1631

const magicUUID = "6a1a24a1-4061-4607-8bf4-a3963d0c5895"

var firmwareUUIDns = uuid.Parse(magicUUID)

// setStableUUID makes sure the VirtualMachineInstance being started has a 'stable' UUID.
// The UUID is 'stable' if doesn't change across reboots.
func setupStableFirmwareUUID(vm *virtv1.VirtualMachine, vmi *virtv1.VirtualMachineInstance) {

    logger := log.Log.Object(vm)

    if vmi.Spec.Domain.Firmware == nil {
        vmi.Spec.Domain.Firmware = &virtv1.Firmware{}
    }

    existingUUID := vmi.Spec.Domain.Firmware.UUID
    if existingUUID != "" {
        logger.V(4).Infof("Using existing UUID '%s'", existingUUID)
        return
    }

    vmi.Spec.Domain.Firmware.UUID = types.UID(uuid.NewSHA1(firmwareUUIDns, []byte(vmi.ObjectMeta.Name)).String())
}

As a result, the UUID no longer changes when the VMI is recreated.
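The derivation can be reproduced outside KubeVirt to see why it is stable. The sketch below assumes that Go's uuid.NewSHA1 produces a standard name-based (version 5) UUID, which is what Python's uuid.uuid5 implements; stable_firmware_uuid is a hypothetical helper name, and the namespace constant is copied from the excerpt above.

```python
import uuid

# Namespace copied from the KubeVirt excerpt above (magicUUID).
FIRMWARE_UUID_NAMESPACE = uuid.UUID("6a1a24a1-4061-4607-8bf4-a3963d0c5895")


def stable_firmware_uuid(vmi_name, existing_uuid=""):
    """Keep an explicitly set UUID; otherwise derive one from the VMI name."""
    if existing_uuid:
        return existing_uuid
    # Name-based UUID: the same namespace and name always give the same result.
    return str(uuid.uuid5(FIRMWARE_UUID_NAMESPACE, vmi_name))


if __name__ == "__main__":
    # The value depends only on the name, so it survives VMI re-creation.
    print(stable_firmware_uuid("ecs-test0"))
    print(stable_firmware_uuid("ecs-test0") == stable_firmware_uuid("ecs-test0"))
```

Because the input is only the name, two VMIs with the same name get the same UUID, while renaming a VM yields a different one.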