Nova: the life cycle of a libvirt image

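This article describes in detail how a virtual machine disk image moves through OpenStack when a VM is booted via libvirt, including the download from glance, caching, preprocessing and optional backing, with particular attention to the use of CoW (copy-on-write) images and space preallocation strategies.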

Translated from: http://www.pixelbeat.org/docs/openstack_libvirt_images/

The main stages of a Virtual Machine disk image as it transfers through OpenStack to be booted under libvirt are:

[Figure: OpenStack libvirt disk image flow]

Initially the image is downloaded from glance and cached in libvirt base. We'll consider the options for handling a qcow2 image stored in glance, since that format supports compression and preserves image sparseness, and so can be downloaded from glance quite efficiently. This article focuses on the flow and transformations in "libvirt base", which is used to cache, preprocess and optionally back VM disk images.
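
As a rough illustration of the caching step, here is a minimal Python sketch of populating a cache directory by writing to a temporary ".part" name and renaming it into place, which is what the $hex.part file names in the command tables suggest. The function name fetch_to_cache and the fetch callable are hypothetical stand-ins for the glance download; this is not Nova's actual code.

```python
import os
import tempfile


def fetch_to_cache(fetch, cache_dir, name):
    """Populate cache_dir/name once, going through a temporary '.part' file.

    `fetch` is a hypothetical callable that writes the image bytes to the
    binary file object it is given (it stands in for the glance download).
    """
    target = os.path.join(cache_dir, name)
    if os.path.exists(target):
        return target  # cache hit: nothing to download

    part = target + ".part"
    with open(part, "wb") as out:
        fetch(out)
        out.flush()
        os.fsync(out.fileno())
    # A rename within one file system is atomic, so other readers never
    # see a partially written image under the final name.
    os.replace(part, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        path = fetch_to_cache(lambda out: out.write(b"image-bytes"), base, "abc123")
        print(path)
```

A second call with the same name returns the cached path without invoking fetch again, which is the point of keeping images in "libvirt base".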

Configuration

First we'll summarize the config variables involved, before presenting the operations associated with each config combination, in each OpenStack release. Note I'm describing upstream OpenStack here, and not my employer's Red Hat OpenStack which has back-ported enhancements between versions where appropriate.

 

| Config | Default | Release | Description |
|---|---|---|---|
| use_cow_images | True | Cactus | Whether to use CoW images for "libvirt instance disks" |
| force_raw_images | True | Essex | Allows disabling convert to raw in "libvirt base" for operational reasons |
| libvirt_images_type | 'default' | Folsom | Deprecates use_cow_images and allows selecting LVM libvirt images |
| [libvirt]/images_type | 'default' | Icehouse | Deprecates libvirt_images_type in the [DEFAULT] section |
| preallocate_images | 'none' | Grizzly | Instance disks preallocation mode. 'space' => fallocate images |
| resize_fs_using_block_device | False | Havana | Allows enabling of direct resize for qcow2 images |

 

The main reason that raw images are written in "libvirt base" by default (since Diablo) is to remove possible compression from the qcow2 image received from glance. Note that compression in qcow2 images is read-only: newly written data is stored uncompressed, so the decompression cost applies only to reads from portions of the image that have not been written since. Users may want to change this option, depending on the CPU resources and I/O bandwidth available. For example, systems with slower I/O or less space available may want to accept the higher CPU requirements of compression in order to minimize input bandwidth. Note raw images are used unconditionally with libvirt_images_type=lvm.

Whether to use CoW images for the "libvirt instance disks" also depends on the I/O characteristics of the user's system. Without CoW, more space will be used for the common parts of the disk image. On the flip side, depending on the backing store and host caching, better concurrency may be achieved by having each VM operate on its own copy.
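
To make the space side of that tradeoff concrete, here is a small back-of-the-envelope sketch in Python. The function name and all figures are illustrative assumptions, not measurements, and the model ignores details such as resized copies kept in the cache.

```python
def estimated_usage_gib(instances, common_gib, private_gib, use_cow):
    """Rough host disk usage in GiB for a number of instances of one image.

    common_gib:  data shared by every instance (the base image contents)
    private_gib: data unique to each instance (its own writes)
    """
    if use_cow:
        # One shared backing image, plus each instance's private data.
        return common_gib + instances * private_gib
    # One cached base image, plus a full private copy per instance.
    return common_gib + instances * (common_gib + private_gib)


print(estimated_usage_gib(10, 2, 0.5, use_cow=True))
print(estimated_usage_gib(10, 2, 0.5, use_cow=False))
```

With these assumed numbers (10 instances, 2 GiB of common data, 0.5 GiB of private data each), the CoW layout needs about 7 GiB and the copy layout about 27 GiB.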

Enabling preallocation of space for the "libvirt instance disks" can help with both space guarantees and I/O performance. Even when not using CoW instance disks, the copy each VM gets is sparse and so the VM may fail unexpectedly at run time with ENOSPC. By running fallocate(1) on the instance disk images, we immediately and efficiently allocate the space for them in the file system (if supported). Also run time performance should be improved as the file system doesn't have to dynamically allocate blocks at run time, reducing CPU overhead and more importantly file fragmentation.
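
The difference between a sparse file and a preallocated one can be sketched in Python as follows. This is only an illustration: it uses os.posix_fallocate rather than the fallocate(1) utility, and on file systems without native support the C library may emulate the call by writing zeros, which is slower. The helper names are hypothetical.

```python
import os
import tempfile


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512


def make_sparse(path, size):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


def preallocate(path, size):
    """Reserve `size` bytes for an existing file, keeping its contents."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Raises OSError (e.g. ENOSPC) immediately if space is unavailable.
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)


def demo(size=4 * 1024 * 1024):
    """Return (allocated before, allocated after) for a scratch file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "disk")
        make_sparse(path, size)
        before = allocated_bytes(path)
        preallocate(path, size)
        return before, allocated_bytes(path)


if __name__ == "__main__":
    print(demo())
```

The useful property is that a shortage of space shows up as an error at preallocation time, rather than as ENOSPC inside a running VM.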

Disk image operations

For each release and config combination, here are the created files and associated operations in getting a qcow2 image from glance through to being booted in a libvirt Virtual Machine.

Folsom, force_raw_images=True, use_cow_images=True

This results in each instance booting from a CoW image, backed by a resized raw image.

| Nova command | Source code | Notes |
|---|---|---|
| wget http://glance/$image -O base_/$hex.part | images.fetch | |
| qemu-img convert -O raw $hex.part $hex.converted | images.fetch_to_raw | Creates sparse file |
| mv $hex.converted $hex; rm $hex.part | images.fetch_to_raw | |
| | imagebackend.create_image | |
| cp $hex $hex_$size | libvirt.utils.copy_image | Creates sparse file |
| qemu-img resize $hex_$size $size | disk.extend | |
| resize2fs $hex_$size | disk.extend | Unpartitioned ext[234] |
| qemu-img create -f qcow2 -o backing_file=... $instance_dir/disk | libvirt.utils.create_image | |

Folsom, force_raw_images=True, use_cow_images=False

This results in each instance booting from a copy of a resized raw image.

| Nova command | Source code | Notes |
|---|---|---|
| wget http://glance/$image -O base_/$hex.part | images.fetch | |
| qemu-img convert -O raw $hex.part $hex.converted | images.fetch_to_raw | Creates sparse file |
| mv $hex.converted $hex; rm $hex.part | images.fetch_to_raw | |
| | imagebackend.create_image | |
| cp $hex $instance_dir/disk | libvirt.utils.copy_image | |
| qemu-img resize disk | disk.extend | |
| resize2fs disk | disk.extend | Unpartitioned ext[234] |

Folsom, force_raw_images=False, use_cow_images=False

This results in each instance booting from a copy of a resized qcow2 image.

| Nova command | Source code | Notes |
|---|---|---|
| wget http://glance/$image -O base_/$hex.part | images.fetch | |
| mv $hex.part $hex | images.fetch_to_raw | |
| | imagebackend.create_image | |
| cp $hex $instance_dir/disk | libvirt.utils.copy_image | |
| qemu-img resize disk | disk.extend | |
| resize2fs disk | disk.extend | Ignored for qcow2 ¹ |

Folsom, force_raw_images=False, use_cow_images=True

This results in each instance booting from a CoW image, backed by a resized qcow2 image.

| Nova command | Source code | Notes |
|---|---|---|
| wget http://glance/$image -O base_/$hex.part | images.fetch | |
| mv $hex.part $hex | images.fetch_to_raw | |
| | imagebackend.create_image | |
| cp $hex $hex_$size | libvirt.utils.copy_image | |
| qemu-img resize $hex_$size $size | disk.extend | |
| resize2fs $hex_$size | disk.extend | Ignored for qcow2 ¹ |
| qemu-img create -f qcow2 -o backing_file=... $instance_dir/disk | libvirt.utils.create_image | |
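
The four Folsom combinations above differ only in two places: whether the download is converted to raw, and whether the instance gets a CoW overlay or a plain copy. The following Python sketch makes that branching explicit. It only builds the command strings from the tables and executes nothing; the function name folsom_commands is hypothetical, and $image, $hex, $size and $instance_dir are the same placeholders the tables use.

```python
def folsom_commands(force_raw_images=True, use_cow_images=True):
    """Return the Folsom command sequence for one config combination."""
    cmds = ["wget http://glance/$image -O base_/$hex.part"]

    if force_raw_images:
        # Convert the downloaded qcow2 image to a (sparse) raw base image.
        cmds += [
            "qemu-img convert -O raw $hex.part $hex.converted",
            "mv $hex.converted $hex; rm $hex.part",
        ]
    else:
        # Keep the qcow2 image as downloaded.
        cmds.append("mv $hex.part $hex")

    if use_cow_images:
        # Resize a per-size copy in the cache, then create a CoW overlay.
        cmds += [
            "cp $hex $hex_$size",
            "qemu-img resize $hex_$size $size",
            "resize2fs $hex_$size",  # ignored when the base is qcow2
            "qemu-img create -f qcow2 -o backing_file=... $instance_dir/disk",
        ]
    else:
        # Give the instance its own copy and resize that.
        cmds += [
            "cp $hex $instance_dir/disk",
            "qemu-img resize disk",
            "resize2fs disk",  # ignored when the copy is qcow2
        ]
    return cmds


for line in folsom_commands(force_raw_images=False, use_cow_images=False):
    print(line)
```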

Grizzly, force_raw_images=True, use_cow_images=True

Grizzly introduces a change for use_cow_images=True, where it will resize in the $instance_dir rather than in base_. So the resize will not be cached, but this is a minimal CPU tradeoff per instance boot for the extra space saved in base_. We'll just present the default config values here, which illustrate the only significant change from Folsom.

This results in each instance booting from a resized CoW image, backed by a raw image.

| Nova command | Source code | Notes |
|---|---|---|
| wget http://glance/$image -O base_/$hex.part | images.fetch | |
| qemu-img convert -O raw $hex.part $hex.converted | images.fetch_to_raw | Creates sparse file |
| mv $hex.converted $hex; rm $hex.part | images.fetch_to_raw | |
| | imagebackend.create_image | |
| qemu-img create -f qcow2 -o backing_file=... $instance_dir/disk | libvirt.utils.create_image | |
| qemu-img resize disk $size | disk.extend | |
| resize2fs disk | disk.extend | Grizzly always ignores ² |

[² Update Sep 2013: Stanislaw Pitucha also noticed that the above referenced Grizzly change introduced a regression where unpartitioned qcow2 images were no longer resized. See the Havana resize_fs_using_block_device option below for details.]

Grizzly, preallocate_images='space'

Grizzly also has new fallocate functionality in this area, controlled by the preallocate_images config option. If that is set to 'space', then after the operations above the $instance_dir/ images will be fallocated, to immediately determine if enough space is available, and to possibly improve VM I/O performance by avoiding ongoing allocation and giving better locality of block allocations.

¹ Havana, resize_fs_using_block_device=False

As noted in the first Grizzly change above, Stanislaw Pitucha noticed that the change introduced a regression where unpartitioned qcow2 images were no longer resized. He supplied a fix to resize qcow2 directly rather than relying on the raw image being available, which also caters for the force_raw_images=False case that even pre-Grizzly releases did not handle. This new option can be used to enable that support, but there are some large performance and possible security issues, so it's not enabled by default. This support will be available in the upcoming Havana release.

General performance considerations

Performance in this area has improved with each OpenStack release. Some of the main topics to consider, for both past and future changes, are:

Minimize I/O

Note these were implemented in Essex:

  • Copy images, then resize, rather than vice versa
  • Directly generating images in the $instance_dir/
  • Intelligent reading of sparse input
  • Reproduction of sparse input on output
  • Use compression

Note this was implemented in Folsom:

Minimize storage

  • Use compression
  • Use sparse output/generation
  • Avoid resized copies when not needed
  • Use CoW if appropriate
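
The sparse handling mentioned in the lists above (reading sparse input intelligently and reproducing it on output) can be sketched in Python as follows. This is an illustration only, assuming a platform such as Linux where os.SEEK_DATA and os.SEEK_HOLE are available; the function name sparse_copy is hypothetical and this is not the code Nova or cp uses.

```python
import errno
import os
import tempfile


def sparse_copy(src, dst, chunk=1 << 20):
    """Copy src to dst, reading only data extents and leaving holes as holes."""
    src_fd = os.open(src, os.O_RDONLY)
    try:
        size = os.fstat(src_fd).st_size
        with open(dst, "wb") as out:
            offset = 0
            while offset < size:
                try:
                    start = os.lseek(src_fd, offset, os.SEEK_DATA)
                except OSError as exc:
                    if exc.errno == errno.ENXIO:
                        break  # only a hole remains before end of file
                    raise
                end = os.lseek(src_fd, start, os.SEEK_HOLE)
                # Seeking past the end of the output leaves a hole there.
                out.seek(start)
                pos = start
                while pos < end:
                    buf = os.pread(src_fd, min(chunk, end - pos), pos)
                    if not buf:
                        break
                    out.write(buf)
                    pos += len(buf)
                offset = end
            # Set the final length so that a trailing hole is kept too.
            out.truncate(size)
    finally:
        os.close(src_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src")
        with open(src, "wb") as f:
            f.seek(8 * 1024 * 1024)
            f.write(b"tail")
        sparse_copy(src, os.path.join(d, "dst"))
        print(os.stat(os.path.join(d, "dst")).st_blocks)
```

On file systems that do not report holes, the whole file is treated as data, so the copy is still correct but no longer saves I/O or space.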

Improve caching

  • Avoid thrashing the page cache with large intermediate images
  • Improve low level caching through better storage allocation

Preprocessing

  • Preprocessing of images may be possible, for example with preallocation=metadata, which trades an initial CPU cost for possibly much better run time I/O performance
  • Such cost would be somewhat alleviated by asynchronous population of the base_ cache

Reposted from: https://www.cnblogs.com/sammyliu/p/4427567.html
