Hyper-V: advantages and dangers of checkpoints

When I first heard about virtual machine (VM) snapshots, I imagined that taking one meant making a full copy of the VM’s current state and storing it somewhere safe. If anything went wrong, restoring would simply be a matter of loading that copy. After working with snapshots, I understood how the process actually works and realized that I was wrong.

To clarify the terminology: starting with Windows Server 2012 R2, Microsoft renamed Hyper-V “snapshots” to “checkpoints”. System Center Virtual Machine Manager has always called them “checkpoints,” so this harmonization of Microsoft terminology is a good thing, even though the PowerShell cmdlets still refer to “snapshots.” There is no technical difference between the two.

For Hyper-V on Windows Server 2008 and 2012, Microsoft recommends using checkpoints only in test and development environments, not in production. They are nonetheless still supported (except when running Exchange or SQL), which can be confusing and leaves system administrators wondering whether or not to use Hyper-V checkpoints on production systems. Little has changed since Windows Server 2012, and Microsoft has taken this cautious approach for several reasons.

The first reason is that server performance degrades as soon as a checkpoint is in use, because of the I/O spike it causes. You cannot eliminate the problem entirely, but you can mitigate it by storing checkpoints on a different disk from the one holding the VM’s VHD file.

The second reason is that each checkpoint takes up extra disk space, which is released only once the checkpoint has been deleted and its changes merged (on Windows Server 2008, that merge waits until the VM is shut down). Worse, the effects on your environment can be disastrous, as explained later in this article.

Behind the scenes of Hyper-V checkpoints

So what exactly happens when you click on the checkpoint option of a VM in the Hyper-V Manager?

First, Hyper-V creates a differencing disk with the .AVHD file extension. The location of this file depends on the path configured for checkpoints.

Initially, the file is relatively small (32 MB in my test), but in the background the VM’s original .VHD file is frozen: from this point on it is only read, never written.

A copy of the configuration file, which has the .XML extension, is made so that hardware changes made to the VM itself are also covered. The current memory state is saved in another file, with the .BIN extension, which makes it possible to restore the checkpoint exactly as it was.

The fourth file, with a .VSV extension, holds the saved state of the devices associated with the VM.

The VM stays available and running with no visible impact (apart from reduced performance), but the Hyper-V host starts juggling the .VHD and .AVHD files for reads and writes.

When a read request arrives from the VM, the host first checks whether the differencing disk contains a record of the corresponding data. If not, the host reads the data from the original .VHD file.

In the case of a write request, the change is made in the .AVHD file. The following example is very rudimentary (in reality, there would be millions of 1s and 0s!), but it is representative of the operations performed on the VM’s data:

0001110001 – Original VHD
_____0____ – AVHD Checkpoint 1
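
The read and write rules above can be sketched in a few lines of Python. This is only a toy model, not how Hyper-V stores data: the dictionary used for the differencing layer and the names `read` and `write` are assumptions made for illustration.

```python
# Toy model of one base disk plus one differencing layer.
# The dict and the function names are illustrative assumptions,
# not the on-disk format of .VHD or .AVHD files.
BASE = "0001110001"  # original VHD, no longer written to


def write(diff, index, bit):
    """Record a changed bit in the differencing layer; the base is untouched."""
    diff[index] = bit


def read(base, diff, index):
    """Return the bit from the differencing layer if present, else from the base."""
    return diff.get(index, base[index])


diff = {}
write(diff, 5, "0")  # the single change shown in the example above
current = "".join(read(BASE, diff, i) for i in range(len(BASE)))
print(current)  # what the VM sees after the write
```

The base string never changes: every write lands in the layer, and every read falls back to the base only when the layer has nothing for that position.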

Each time the data is modified, the .AVHD file stores only a record of that change. When Hyper-V checkpoints multiply, things can get complicated:

0001110001 – Original VHD
_____0____ – AVHD Checkpoint 1
_____11___ – AVHD Checkpoint 2
_1____0___ – AVHD Checkpoint 3

Each checkpoint gives rise to a separate .AVHD file, which tracks changes from the moment it is created until you delete the checkpoint or create another one.

In the above example, when checkpoint 2 is created, the .AVHD file of checkpoint 1 becomes read-only. When checkpoint 3 is created, the file of checkpoint 2 becomes read-only as well, and that of checkpoint 1 remains read-only, just like the original VHD.
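
To make the chain concrete, here is a hedged Python sketch that resolves reads across the three layers from the example. The helper names `parse_layer`, `read`, and `view` are hypothetical, and the underscore notation is simply the one used in this article.

```python
def parse_layer(text):
    """Turn the article's notation ('_' = unchanged) into {index: bit}."""
    return {i: ch for i, ch in enumerate(text) if ch != "_"}


def read(base, layers, index):
    """Search the newest layer first, then older ones, then the base disk."""
    for layer in reversed(layers):
        if index in layer:
            return layer[index]
    return base[index]


def view(base, layers):
    """Assemble the full bit string visible through the given layers."""
    return "".join(read(base, layers, i) for i in range(len(base)))


base = "0001110001"
layers = [parse_layer(s) for s in ("_____0____", "_____11___", "_1____0___")]

print(view(base, layers))      # all three layers: what the running VM sees
print(view(base, layers[:1]))  # base plus the first layer only
```

Only the newest layer would accept writes in this model; the older layers and the base are consulted for reads but never modified.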

As you can see, disk usage can quickly spiral as the number of checkpoints grows, which also affects performance. Although little has changed compared with the original data set, 50% more space is needed to track all three checkpoints.
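
The 50% figure can be checked with a short calculation on the example data. The sketch below counts the bits stored in each layer and compares that with how many bits actually differ from the original at the end; the variable names are illustrative only.

```python
base = "0001110001"
layers = ["_____0____", "_____11___", "_1____0___"]

# Bits physically stored across the three differencing layers.
stored_bits = sum(ch != "_" for layer in layers for ch in layer)
overhead = stored_bits / len(base)

# Apply the layers oldest to newest to get the current state.
bits = list(base)
for layer in layers:
    for i, ch in enumerate(layer):
        if ch != "_":
            bits[i] = ch

# Bits that really differ from the original once everything is applied.
net_changed = sum(a != b for a, b in zip(base, bits))

print(f"stored: {stored_bits} bits, overhead: {overhead:.0%}, net change: {net_changed} bit(s)")
```

Five stored bits for a ten-bit disk gives the 50% overhead, even though some positions were rewritten back to their original value along the way.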

Windows Server operating systems perform many tasks in the background, and all these small writes and modifications add up faster than you might think.

It’s also worth noting that a single checkpoint file cannot exceed the size of the original VHD:

0001110001 – Original VHD
1110001110 – AVHD Checkpoint 1

In this example, every data bit has changed. As a result, this checkpoint cannot consume any more disk space; usage grows further only if you create another checkpoint. This is one of the main risks of Hyper-V checkpoints. If you run out of disk space, all VMs go into the “Critical – Paused” state, which is of course very bad in production. From the user’s point of view, a paused VM is no more useful than a stopped one, and VMs in this state cannot resume as long as disk space remains insufficient.

The amount of disk space to allocate to checkpoints is difficult to estimate, but it’s good practice to put them on a different disk from the one that holds the VM’s VHD file. If a checkpoint then uses all the available space, other VMs are not affected because they still have plenty of room, unless, of course, all your VMs have active checkpoints on the same disk.
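
Because running out of space is the main danger, a simple free-space check on the checkpoint volume can act as an early warning. The following is a minimal sketch using Python’s standard `shutil.disk_usage`; the function name `has_headroom` and the 20% threshold are assumptions for illustration, not a recommendation from Microsoft.

```python
import shutil


def has_headroom(path, min_free_fraction=0.2):
    """Return True if the volume holding `path` has at least the given share free.

    The 0.2 default is an arbitrary illustrative threshold.
    """
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


# "." stands in for the folder configured as the checkpoint location.
if not has_headroom("."):
    print("Warning: checkpoint volume is running low on free space")
```

In practice you would pass the path configured for checkpoints and run the check on a schedule, so that you can delete old checkpoints before VMs are paused.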

It is easy to check whether checkpoints are active on a Hyper-V host using the following PowerShell command:

Get-VM | Get-VMSnapshot

This command lists all checkpoints, which makes it easy to find the VMs from which you want to remove them.

Limit and delete checkpoints

If you consider checkpoints too risky, or if you want to limit them to certain VMs, you can point the checkpoint location at a path that does not exist. If you do, make sure staff cannot access this setting, so that it cannot be changed back.

To delete a checkpoint, open the Hyper-V Manager, select the VM concerned, and then right-click the checkpoint in the “Checkpoint” window. In the context menu, choose the “Delete Snapshot” option.

This operation merges the recorded changes into the original VHD file and then deletes the other files that were created along with the checkpoint. Note that if you are still using Windows Server 2008, the merge is applied only after the VM is shut down. In Windows Server 2012, however, it runs live.
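
Conceptually, deleting checkpoints folds the differencing layers back into the base disk. The Python sketch below illustrates that idea on the earlier example data; it is a simplified model with a hypothetical `merge_into_base` helper, not a description of Hyper-V’s actual merge implementation.

```python
def merge_into_base(base, layers):
    """Fold every layer into the base, oldest first, so newer writes win."""
    bits = list(base)
    for layer in layers:
        for index, bit in layer.items():
            bits[index] = bit
    return "".join(bits)


base = "0001110001"
layers = [
    {5: "0"},          # changes recorded after checkpoint 1
    {5: "1", 6: "1"},  # changes recorded after checkpoint 2
    {1: "1", 6: "0"},  # changes recorded after checkpoint 3
]

merged = merge_into_base(base, layers)
layers.clear()  # the differencing files are removed once merged
print(merged)   # single disk containing the VM's current data
```

After the merge there is one disk again, holding the same data the VM saw through the chain, and the space used by the layers can be reclaimed.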
