Releasing QEMU RBD Volume Storage Space Synchronously
Jun 30, 2023 23:00
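The problem in one sentence: on a libvirt + QEMU instance backed by Ceph storage, deleting files from a data volume does not immediately free the corresponding storage in Ceph.
The guest OS is CentOS 7.8 with kernel 3.10.0.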
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 50G 0 disk
vda 253:0 0 20G 0 disk
└─vda1 253:1 0 20G 0 part /
vdb 253:16 0 1M 0 disk
sda is the data disk. Format it and mount it at /mnt:
$ mkfs.xfs /dev/sda
Discarding blocks...Done.
meta-data=/dev/sda isize=512 agcount=4, agsize=3276800 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=13107200, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=6400, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
$ mount /dev/sda /mnt/
[94035.914556] XFS (sda): Mounting V5 Filesystem
[94036.028276] XFS (sda): Ending clean mount
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 50G 0 disk /mnt
vda 253:0 0 20G 0 disk
└─vda1 253:1 0 20G 0 part /
vdb 253:16 0 1M 0 disk
Check how much storage the rbd volume backing /dev/sda uses in the Ceph cluster:
$ rbd du mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c
warning: fast-diff map is not enabled for csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c. operation may be slow.
NAME PROVISIONED USED
csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c 50 GiB 44 MiB
The rbd volume is provisioned at 50 GiB, but only 44 MiB is actually used.
Now write 20 GB of data under /mnt:
$ dd if=/dev/zero of=/mnt/test bs=1GB count=20
20+0 records in
20+0 records out
20000000000 bytes (20 GB) copied, 268.295 s, 74.5 MB/s
$ df -h /mnt/
Filesystem Size Used Avail Use% Mounted on
/dev/sda 50G 19G 32G 38% /mnt
Check the rbd volume's storage usage in Ceph:
$ rbd du mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c
warning: fast-diff map is not enabled for csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c. operation may be slow.
NAME PROVISIONED USED
csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c 50 GiB 19 GiB
Then delete the large test file generated by dd:
$ rm -rf /mnt/test
Check the rbd volume's storage usage in Ceph once more:
$ rbd du mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c
warning: fast-diff map is not enabled for csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c. operation may be slow.
NAME PROVISIONED USED
csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c 50 GiB 19 GiB
The storage held by the rbd volume on the Ceph side has not been released, which wastes storage resources.
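The before/after comparison can be scripted by pulling the USED column out of the `rbd du` output. The helper below is only a sketch: `rbd_used` is a hypothetical name, and it assumes the plain-text layout shown in this post (image name, then size and unit pairs such as `50 GiB 19 GiB`); other Ceph releases may format the columns differently.

```shell
# rbd_used NAME
# Reads `rbd du` plain-text output on stdin and prints the USED value
# (number and unit) for the row whose first column equals NAME.
# Assumption: sizes are printed as "<number> <unit>" pairs, as in this post.
rbd_used() {
    awk -v name="$1" '$1 == name { print $(NF-1), $NF }'
}
```

A possible invocation on the Ceph side would be `rbd du mec-ecs-pool/<image> | rbd_used <image>`, run once before and once after deleting the file.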
Discard
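With Ceph 0.46+ and QEMU 1.1+, Ceph block devices support the discard operation: the guest OS filesystem can send TRIM requests so that the block device reclaims space that is no longer in use.
The libvirt domain definition must satisfy the following:
- The data disk uses the SCSI bus
- The QEMU disk driver has discard set to unmap
- The SCSI controller (a PCI device) has its model set to virtio-scsi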
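Whether these requirements have taken effect can be checked from inside the guest: the kernel exposes `queue/discard_max_bytes` under sysfs for each block device, and a value of 0 generally means the device does not accept discard requests (`lsblk -D` shows the same information). The function below is a rough sketch; `supports_discard` and the `SYSFS_ROOT` override are hypothetical names introduced only for illustration and testing.

```shell
# supports_discard DEV
# Prints "yes", "no", or "unknown" depending on the discard_max_bytes
# attribute that the guest kernel exposes for DEV (e.g. sda).
# SYSFS_ROOT is an illustrative override so the check can run without real hardware.
supports_discard() {
    dev="$1"
    f="${SYSFS_ROOT:-/sys}/block/$dev/queue/discard_max_bytes"
    if [ ! -r "$f" ]; then
        echo "unknown"
        return 2
    fi
    if [ "$(cat "$f")" -gt 0 ]; then
        echo "yes"
    else
        echo "no"
        return 1
    fi
}
```

Inside the guest, `supports_discard sda` would be expected to print `yes` only once the disk is attached with discard enabled.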
Shut down the instance, then modify its domain with virsh edit:
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' discard='unmap'/>
<auth username='csi-rbd-provisioner'>
<secret type='ceph' uuid='8fedf300-282c-4531-a66d-ca2691aaa88b'/>
</auth>
<source protocol='rbd' name='mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c' index='1'>
<host name='192.168.81.37' port='6789'/>
<host name='192.168.81.38' port='6789'/>
<host name='192.168.81.39' port='6789'/>
</source>
<target dev='sda' bus='scsi'/>
<serial>ecs-test0-datavol</serial>
<alias name='ua-datadisk'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='scsi' index='0' model='virtio-scsi'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</controller>
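After saving, it can be worth confirming that the edit actually landed in the domain definition. The sketch below counts `<driver>` elements that lack `discard='unmap'`; `count_drivers_without_unmap` is a hypothetical helper, and it assumes the single-quoted, one-element-per-line formatting that `virsh dumpxml` produces in the listing above. It also counts drivers of every disk (including any CD-ROM or config drive), so a non-zero result needs a manual look.

```shell
# count_drivers_without_unmap
# Reads libvirt domain XML on stdin and prints how many <driver> lines
# do not carry discard='unmap'.
# Assumption: each <driver .../> element sits on a single line.
count_drivers_without_unmap() {
    grep "<driver " | grep -vc "discard='unmap'" || true
}
```

For example, `virsh dumpxml <domain> | count_drivers_without_unmap`, where `<domain>` stands for the instance name.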
After saving, boot the instance, log in, and run fstrim:
# guest os
$ mount /dev/sda /mnt/
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 50G 0 disk /mnt
vda 253:0 0 20G 0 disk
└─vda1 253:1 0 20G 0 part /
vdb 253:16 0 1M 0 disk
$ fstrim /mnt/
# ceph
$ rbd du mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c
warning: fast-diff map is not enabled for csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c. operation may be slow.
NAME PROVISIONED USED
csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c 50 GiB 68 MiB
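fstrim explicitly triggered the release of the rbd volume's storage, and the roughly 20 GB of space was reclaimed. To have file deletion trigger TRIM automatically, mount the data volume with the discard option: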
# guest os
$ mount -o discard /dev/sda /mnt/
[98639.925204] XFS (sda): Mounting V5 Filesystem
[98640.082683] XFS (sda): Ending clean mount
$ dd if=/dev/zero of=/mnt/test bs=1GB count=20
20+0 records in
20+0 records out
20000000000 bytes (20 GB) copied, 261.615 s, 76.4 MB/s
$ df -h /mnt/
Filesystem Size Used Avail Use% Mounted on
/dev/sda 50G 19G 32G 38% /mnt
# ceph
$ rbd du mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c
warning: fast-diff map is not enabled for csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c. operation may be slow.
NAME PROVISIONED USED
csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c 50 GiB 18 GiB
# guest os
$ rm -rf /mnt/test
$ df -h /mnt/
Filesystem Size Used Avail Use% Mounted on
/dev/sda 50G 33M 50G 1% /mnt
# ceph
$ rbd du mec-ecs-pool/csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c
warning: fast-diff map is not enabled for csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c. operation may be slow.
NAME PROVISIONED USED
csi-vol-65a6dd12-162a-11ee-81b6-38ca843ae36c 50 GiB 76 MiB
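Because automatic reclamation depends on the discard mount option being active, a quick check of the mount table can rule out a forgotten option. This is a minimal sketch: `mount_has_discard` and the `MOUNTS_FILE` override are hypothetical names, and it assumes the usual `/proc/mounts` layout (device, mount point, filesystem type, options).

```shell
# mount_has_discard MOUNTPOINT
# Succeeds (exit 0) if MOUNTPOINT is listed with the "discard" option.
# MOUNTS_FILE is an illustrative override; it defaults to /proc/mounts.
mount_has_discard() {
    mnt="$1"
    mounts="${MOUNTS_FILE:-/proc/mounts}"
    awk -v m="$mnt" '
        $2 == m { opts = $4 }   # remember the options of the matching mount
        END { exit (index("," opts ",", ",discard,") > 0 ? 0 : 1) }
    ' "$mounts"
}
```

In the guest this could be used as `mount_has_discard /mnt && echo "online discard enabled"`. Note that the mount point is given without a trailing slash.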
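If the guest OS (Linux) kernel is version 5.0 or later, the domain no longer needs the SCSI bus and driver; VirtIO can be used directly, provided QEMU itself supports discard on virtio-blk (added in QEMU 4.0).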
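The kernel-version condition can be tested in the guest before deciding which bus to use. The function below is a sketch under the assumption that `uname -r` starts with `major.minor`; `kernel_at_least` is a hypothetical helper, and the optional third argument exists only so the comparison can be tried against arbitrary release strings.

```shell
# kernel_at_least MAJOR MINOR [RELEASE]
# Succeeds if RELEASE (default: `uname -r`) is at least MAJOR.MINOR.
kernel_at_least() {
    want_major="$1"
    want_minor="$2"
    release="${3:-$(uname -r)}"
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For instance, `kernel_at_least 5 0 || echo "keep virtio-scsi"` would print the message on a 3.10.0 kernel like the one used in this post.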