VM gets unresponsive when removing a snapshot (snapshot consolidation)

One of our customers told us about a situation he ran into: some of his VMware vSphere VMs became unresponsive for about an hour when he removed a VM snapshot. According to the VMware KB article, this can last 30 minutes or much longer.

Issue cause

As described in VMware KB article 2039754, the issue occurs if the virtual machine generates data faster than the consolidation rate.

That means if the VM is running a write-intensive application, e.g. a database system, the amount of data written to disk can exceed the amount of data transferred by the snapshot consolidation. Snapshot consolidation is an asynchronous job that transfers data from the snapshots to the base disks, so VMware stuns the VM to get a chance to delete the snapshots faster than the guest application generates new data.

If the snapshot consolidation cannot keep up because the incoming data rate is too high, VMware increases the asynchronous consolidation cycle from 5 to 10, 20, 30 minutes and so on until the 9th iteration. After that, each snapshot consolidation cycle takes 60 minutes. When the maximum number of iterations is reached, the VM is stunned and a synchronous consolidation is forced.

As the customer noticed, the snapshot consolidation can take a while, and the VM is unresponsive during that time. How long it takes depends on the overall system performance.
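To make the race between guest writes and consolidation more tangible, here is a deliberately simplified toy model in Python. It is not VMware's implementation: the function name, the parameters, the default of 10 iterations and the 1 MB "small enough" threshold are all assumptions chosen for illustration. Each pass merges the current delta while the guest keeps writing into a new one.

```python
def simulate_consolidation(delta_mb, write_mb_s, consolidate_mb_s, max_iterations=10):
    """Toy model of asynchronous consolidation passes (illustrative only).

    Returns (iterations_used, remaining_delta_mb). The remaining delta is what
    a final pass with the VM stunned would still have to merge.
    """
    for iteration in range(1, max_iterations + 1):
        # Time needed to merge the current delta into the base disk.
        pass_seconds = delta_mb / consolidate_mb_s
        # Data the guest writes during that pass ends up in a new delta.
        delta_mb = write_mb_s * pass_seconds
        if delta_mb < 1.0:  # hypothetical "small enough to finish" threshold
            return iteration, delta_mb
    # Iteration limit reached: the leftover delta needs a stunned, synchronous pass.
    return max_iterations, delta_mb


if __name__ == "__main__":
    # Guest writes slower than consolidation: the delta shrinks on every pass.
    print(simulate_consolidation(10_000, write_mb_s=20, consolidate_mb_s=100))
    # Guest writes faster than consolidation: the delta grows until the limit.
    print(simulate_consolidation(10_000, write_mb_s=120, consolidate_mb_s=100))
```

In this model the delta only converges when the write rate stays below the consolidation rate. Otherwise it grows with every pass, which is the situation in which the iteration limit is hit.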

Learn more about the typical snapshot-related issues that opvizor finds automatically within minutes:

Sign Up for opvizor!

Detection

During the snapshot consolidation you can see entries in the vmware.log similar to the following (extract from VMware KB article 2039754):

vmx| Checkpoint_Unstun: vm stopped for 3711025 us
vmx| Checkpoint_Unstun: vm stopped for 574655 us
vmx| Checkpoint_Unstun: vm stopped for 2191061 us
Create snapshot smvi_2a175570-ed2f-.... Operation completed
Consolidate starts
Intermediate snapshot taken, took 1.8s
VM runs for 2 seconds, while consolidate of scsi0:0 is in progress
Move to next disk, no more iterations for scsi0:0 are necessary, stunned for 0.6s
Consolidate of scsi0:1 finished, another iteration is needed. Intermediate snapshot is deleted, and another is created. VM stunned for 2.7s.
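The Checkpoint_Unstun entries report the stun time in microseconds, which is hard to read at a glance. The following Python sketch pulls these values out of a log text and converts them to seconds. The function name is hypothetical and the regular expression only assumes the line format shown in the extract above; real vmware.log files may contain further variants.

```python
import re

# Matches lines such as "vmx| Checkpoint_Unstun: vm stopped for 3711025 us".
STUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def stun_durations_seconds(log_text):
    """Return every reported stun duration in seconds, in order of appearance."""
    return [int(us) / 1_000_000 for us in STUN_RE.findall(log_text)]


if __name__ == "__main__":
    sample = (
        "vmx| Checkpoint_Unstun: vm stopped for 3711025 us\n"
        "vmx| Checkpoint_Unstun: vm stopped for 574655 us\n"
        "vmx| Checkpoint_Unstun: vm stopped for 2191061 us\n"
    )
    durations = stun_durations_seconds(sample)
    print(durations)
    print("longest stun in seconds:", max(durations))
```

To check a real VM, read the contents of its vmware.log into a string and pass it to the function. Long or frequent stuns point to a consolidation that is struggling to keep up.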

Resolution

You can patch your system:

  • ESXi 4.x Patch 07
  • ESXi 5.0 Patch 09
  • ESXi 5.1 Update 2

There is also a workaround:

Shut down the virtual machine and set the following VM configuration parameters:

  • snapshot.maxIterations = 20 (or higher)
  • snapshot.maxConsolidateTime = 60 seconds
  • alternatively, snapshot.maxIterations = 0 to avoid the asynchronous iterations and stun the VM directly

or use this PowerShell (PowerCLI) script:

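[ps]
# Requires VMware PowerCLI and an existing vCenter connection (Connect-VIServer)

$snapiter = New-Object VMware.Vim.OptionValue
$snapiter.Key = "snapshot.maxIterations"
$snapiter.Value = "20" # set to 0 to avoid asynchronous iterations and directly stun the VM

$snapmaxcon = New-Object VMware.Vim.OptionValue
$snapmaxcon.Key = "snapshot.maxConsolidateTime"
$snapmaxcon.Value = "60"

# Use += so that both options end up in ExtraConfig;
# a plain assignment would overwrite the first option with the second.
$vmConfigSpec = New-Object VMware.Vim.VirtualMachineConfigSpec
$vmConfigSpec.ExtraConfig += $snapiter
$vmConfigSpec.ExtraConfig += $snapmaxcon

# Apply the settings to every VM known to the connected vCenter
foreach ($vm in (Get-View -ViewType VirtualMachine)) {
    $vm.ReconfigVM($vmConfigSpec)
}
[/ps]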

Try Snapwatcher to manage typical snapshot-related issues:

opvizor Snapwatcher - Say Goodbye to inconsistent VM Snapshots

This is the first automated solution that constantly monitors all snapshots across your entire VMware vCenter environment to catch and repair old, broken or inconsistent VM snapshots. Our powerful grid allows you to manage actions on hundreds of snapshots in one simple dashboard view. Snapwatcher does the hard work of protecting your disk space from broken snapshots so you don’t have to.

You can download and try Snapwatcher by clicking “Try Snapwatcher by opvizor!”


