VMware CPU Co-Stop and SQL Server Performance

A very useful article from David Klee about the intricacies of virtualization CPU scheduling and its impact on the performance of the VMs.

You can read the complete article here, or see some excerpts below.

VMware CPU Co-Stop

You’ll need access into VMware’s vCenter Server to see this metric. It’s not visible from inside the VM. At least read-only access is advised, and if you’re a DBA, you need access into this layer anyway so you can better do your job.

First of all, a hypervisor metric called Ready Time / Wait Time Per Dispatch indicates that the hypervisor is queuing up the VM’s requests for executing tasks on the physical CPUs. The amount of time a vCPU spends waiting in the scheduling queue during an individual scheduling interval is measured, and the resulting percentage performance hit on that vCPU is reported. This metric is fairly easy to understand once you are given the math and an explanation. But what about the scheduling of a widely parallelized process or task, such as a database query that is executed on all vCPUs? Is it different?

It’s worse.

In early versions of the hypervisors, if you had multiple vCPUs on a VM and executed a task that needed all available vCPUs, the host would need to have each physical CPU free before that task would be executed in parallel. It was very strict, and it meant that the performance overhead was exceptionally high. Newer versions of the hypervisor have a relaxed CPU ‘co-scheduler’, where the CPU queues for a VM’s vCPUs might not execute exactly in parallel, but the performance impact on the source VM and background VMs on the same host is not as great. However, as a result of this overhead, the overall time taken to execute that task is limited by the slowest physical CPU, or most bottlenecked CPU queue, in the group. Even with relaxed co-scheduling, sometimes all the vCPUs need to be scheduled to run simultaneously, and this is where Co-Stop comes into play.

VMware’s CPU Co-Stop metric shows you the amount of time that a parallelized request spends trying to line up the vCPU schedulers for the simultaneous execution of a task on multiple vCPUs. It’s measured in milliseconds spent in the queue per vCPU per polling interval. Higher is bad. Very bad. The operating system is constantly reviewing the running processes and checking their runtime states. It can detect that a CPU isn’t keeping up with the others, and might actually flag a CPU as BAD if it can’t keep up and the difference is too great.

If you see blips above zero, you’ve got a performance challenge. The higher the number gets, the worse the performance impact can be. And… it’s not just the performance of this VM. It’s the performance of all of the VMs on the host. The vCPUs on the other VMs are sure to be impacted by this scheduling delay, and their performance will be negatively impacted as well.

To view the CPU Co-Stop values for one of your SQL Server VMs, open the vCenter Server web client link provided to you by your VM administrators, select your VM, select the Monitor tab, select Performance, and then Advanced. Click the CPU view and then Chart Options. Select ‘Co-Stop’ under real-time stats.

CPU Co-Stop and SQL Server Performance

Photo courtesy of http://www.davidklee.net/

The numbers that you see are milliseconds per 20-second polling interval that are spent in this paused state.
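To put those raw numbers in perspective, it can help to express them as a share of the polling interval. The following is a minimal sketch, assuming the value read from the chart is for a single vCPU and that the 20-second real-time interval mentioned above applies; the function name `costop_percent` is made up for this illustration and is not part of any VMware tool.

```python
POLLING_INTERVAL_S = 20  # real-time chart interval mentioned in the article


def costop_percent(costop_ms: float, interval_s: float = POLLING_INTERVAL_S) -> float:
    """Return the share of one polling interval (in percent) spent in Co-Stop.

    Assumption: costop_ms is the value for a single vCPU, not a sum over
    several vCPUs.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Convert the interval to milliseconds, then express the ratio in percent.
    return costop_ms / (interval_s * 1000.0) * 100.0


if __name__ == "__main__":
    for ms in (10, 500, 8000):
        print(f"{ms} ms -> {costop_percent(ms):.2f}% of the interval")
```

With these assumptions, 10 ms corresponds to 0.05% of a 20-second interval and 8000 ms to 40%. Note that this is only the share of time the vCPU was stopped; the performance hit felt inside the VM can be larger.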

VMware CPU Co-Stop and SQL Server Performance

Photo courtesy of http://www.davidklee.net/

This VM is healthy (at least from this view). Occasionally you’ll see small blips, and that’s OK. It’s when you see higher Co-Stop times that are sustained over minutes or hours where you get the large performance impacts. I’ve seen this value over 8000 per vCPU, and that VM took a performance hit over 90%.

I want to see sustained numbers no greater than 10ms per polling interval. Not 10 percent, mind you… 10ms.
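The difference between an occasional blip and a sustained problem can be sketched in a few lines. The example below assumes you have exported the per-interval Co-Stop values (in milliseconds) for one vCPU into a plain list. The 10 ms threshold comes from the text above, while `min_samples` (how many consecutive intervals count as "sustained") and the function name are assumptions chosen for illustration.

```python
from typing import Iterable, List, Tuple


def sustained_costop(samples_ms: Iterable[float],
                     threshold_ms: float = 10.0,
                     min_samples: int = 3) -> List[Tuple[int, int]]:
    """Find runs of consecutive samples above threshold_ms.

    Returns (start, end) index pairs, both inclusive, for every run that is
    at least min_samples long. Shorter runs are treated as harmless blips.
    """
    samples = list(samples_ms)
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current run began, if any
    for i, value in enumerate(samples):
        if value > threshold_ms:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_samples:
                runs.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the data.
    if start is not None and len(samples) - start >= min_samples:
        runs.append((start, len(samples) - 1))
    return runs


if __name__ == "__main__":
    # Hypothetical values: one isolated blip, then a sustained stretch.
    data = [0, 2, 40, 0, 15, 30, 55, 80, 3, 12, 14]
    print(sustained_costop(data))
```

With a 20-second interval, `min_samples=3` means roughly one minute above the threshold; a stricter or looser window may suit your environment better.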

SQL Server will widely parallelize high-impact queries across as many CPUs as it can, up to the number set in the Max Degree of Parallelism setting. If some of these vCPUs are artificially slowed down to let the others catch up, your query performance will suffer. This delay is one that is more immediately felt than Ready time.

How to detect it?

If you’re looking for a very modern way to check and monitor not just Co-Stopping but all important performance metrics without spending much time on configuration and setup, you should give Performance Analyzer a try. It doesn’t take more than a couple of minutes and you see the first results.

VMware Co-Stop Performance Issue

That can be even more useful when you combine it with our Database View, e.g. for MS SQL, as database performance is hurt a lot by Co-Stopping.

Microsoft SQL Analysis

What can we do about it?

You can reduce the number of vCPUs on the VM, but only if you can prove that you do not need all of them. You can “right-size” the VMs with the right number of vCPUs and/or the right vNUMA configuration to match the physical server’s NUMA configuration. You can move background workloads to other hosts to reduce the load on the host that your impacted VM is running on. You can reduce the background activity on the physical CPUs by throttling back background VMs using resource pools or CPU throttling, if necessary.
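As a rough aid for the right-sizing discussion with your VM administrators, the sketch below compares a VM's vCPU count with the host's physical NUMA layout. It encodes a common rule of thumb only, not an official VMware sizing rule: the labels, the function name, and the "divides evenly across nodes" check are all assumptions made for this illustration, and hyperthreading is ignored.

```python
import math


def vcpu_layout(vcpus: int, cores_per_node: int, numa_nodes: int) -> str:
    """Classify a vCPU count against the host's physical NUMA layout.

    Rule of thumb only (an assumption, not vendor guidance):
      - "single-node":       the VM fits within one NUMA node
      - "multi-node-even":   the VM spans nodes and splits evenly across them
      - "multi-node-uneven": the VM spans nodes but does not split evenly
      - "oversized":         more vCPUs than physical cores on the host
    """
    if vcpus <= 0 or cores_per_node <= 0 or numa_nodes <= 0:
        raise ValueError("all arguments must be positive")
    if vcpus > cores_per_node * numa_nodes:
        return "oversized"
    if vcpus <= cores_per_node:
        return "single-node"
    # Smallest number of NUMA nodes that can hold all vCPUs.
    nodes_needed = math.ceil(vcpus / cores_per_node)
    if vcpus % nodes_needed == 0:
        return "multi-node-even"
    return "multi-node-uneven"


if __name__ == "__main__":
    # Hypothetical host: 2 NUMA nodes with 12 physical cores each.
    for n in (8, 15, 16, 32):
        print(n, "vCPUs ->", vcpu_layout(n, cores_per_node=12, numa_nodes=2))
```

Anything other than "single-node" or "multi-node-even" is a prompt to look closer, not proof of a problem; the actual Co-Stop values remain the deciding evidence.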

Whatever you do, go to your VM administrators if you experience sustained high vCPU Co-Stop times. They might not be aware of this problem, or even know what Co-Stop is.

Please be warned that if you are running a critical SQL Server or other mission-critical application with a snapshot on the VM, you might experience high vCPU Co-Stop times, and your VM’s performance can suffer considerably. VMware even has a KB article on this topic. Please make your VM administrators aware of this challenge. Do not use snapshots as a temporary backup for your critical VMs – EVER.

