Virtual Machine Performance Trends

During these extraordinary times, we received more customer inquiries than ever before. Customers all over the world had to fight performance issues and degraded server response times because of the sudden mass of people working from home. Plenty of new virtual desktops were deployed on the existing resources.

Of course, you can meet these new resource requests with more resources. But a lack of budget and long delivery times are often the roadblocks to a quick solution. Luckily, we could often help by using the existing resources much more efficiently: finding and eliminating the nasty configuration issues that massively decrease performance.

The one thing at the top of the list is … NUMA (NUMA Remote Node usage, NUMA migrations). Your systems can easily suffer over 50% performance degradation – that’s quite something if it’s your database everyone is working with.

NUMA performance trends

As there are many different ways to optimize NUMA memory usage, we thought of something really helpful. Instead of only detecting bad NUMA configuration and tracking the NUMA performance metrics, we wanted to show a trend. That way you can make changes and track their impact on a single VM, an ESXi host or a whole cluster over time.

The main NUMA metrics to check

Before we look into the trends, let’s start with the most important configurations that result in better or worse NUMA alignment.

The main NUMA metrics every VMware vSphere admin should keep an eye on are:

  • NUMA Home Node usage
  • NUMA Remote Node usage
  • NUMA Home Node migration

All of these metrics are affected by:

  • Host configuration
    • VMware ESXi version (the newer the version, the better the options)
    • CPU Sockets
    • NUMA Nodes per Socket
    • NUMA Node memory size
    • BIOS NUMA node interleaving (some poor configurations have NUMA node interleaving active = very bad idea)
    • BIOS Power management (don’t select balanced if you expect the highest performance) – https://kb.vmware.com/s/article/1018206
  • VM configuration
    • VM Hardware version
    • CPU Sockets (over-provisioning can hurt)
    • CPU Cores per Socket (should be aligned with the underlying hardware)
    • Memory size (don’t configure more than the capacity of a NUMA node, or make sure that vNUMA is active)
    • CPU/Memory Hot-Add activated (no good outcome for NUMA: it defeats even the VMkernel’s best attempt at proper NUMA alignment, so be very careful with it)
    • vNUMA (activated by default at 9 vCPUs and higher, or controlled with the advanced option numa.vcpu.min)

I’m sure there is more to look at, but these are the most common things we’ve caught in recent years.
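The VM-side items in the list above can be turned into a simple pre-flight check. What follows is a minimal Python sketch, not a VMware API: `HostNuma`, `VmConfig` and `numa_warnings` are hypothetical names, and the values would have to come from your own inventory export. It assumes that vNUMA is only exposed at or above the vCPU threshold and when CPU Hot-Add is off; the threshold of 9 mirrors the default mentioned above and is kept as a parameter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostNuma:
    """Hypothetical description of one ESXi host's NUMA layout."""
    cores_per_node: int
    memory_gb_per_node: int


@dataclass
class VmConfig:
    """Hypothetical subset of VM settings that matter for NUMA."""
    name: str
    vcpus: int
    cores_per_socket: int
    memory_gb: int
    cpu_hot_add: bool = False


def numa_warnings(vm: VmConfig, host: HostNuma, vnuma_min_vcpus: int = 9) -> List[str]:
    """Return human-readable warnings for settings that tend to hurt NUMA alignment."""
    warnings = []
    # Assumption: vNUMA needs enough vCPUs and CPU Hot-Add switched off.
    vnuma_active = vm.vcpus >= vnuma_min_vcpus and not vm.cpu_hot_add
    if vm.cpu_hot_add:
        warnings.append("CPU Hot-Add is enabled")
    if vm.memory_gb > host.memory_gb_per_node and not vnuma_active:
        warnings.append("memory exceeds one NUMA node and vNUMA is not active")
    if vm.cores_per_socket > host.cores_per_node:
        warnings.append("cores per socket exceed the cores of one NUMA node")
    return warnings


host = HostNuma(cores_per_node=12, memory_gb_per_node=128)
for vm in (VmConfig("db01", 8, 8, 256), VmConfig("web01", 4, 4, 32)):
    print(vm.name, numa_warnings(vm, host))
```

The check is deliberately conservative: it only reports the conditions named in the list and leaves the decision about what to change to the admin.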

The benchmark

When you start optimizing your environment for better resource usage and less resource waste, it’s important to set a benchmark. I would recommend using PowerCLI scripts to document the most important configuration, so you have something to compare against.

Some good scripts for inventory and NUMA checks:

Inventory Reporting for vSphere:

https://github.com/AsBuiltReport/AsBuiltReport.VMware.vSphere

Virtual Machine Compute Optimizer Report:

https://flings.vmware.com/virtual-machine-compute-optimizer

Definitely create one of these before you start optimizing, as it helps you keep track.

One of the many benefits of using Opvizor Performance Analyzer is that it can store common configuration and performance data for months or even years, so the benchmark is built in: just select a past time range. By the way, we’re working hard on a new feature to compare VM performance before and after certain changes.
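Once a report has been exported before and after a change, comparing the two is mostly a dictionary diff. The sketch below assumes each report has already been reduced to a plain mapping of VM name to settings; the `diff_snapshots` helper, the VM names and the setting keys are made up for illustration and are not the output format of either tool linked above.

```python
from typing import Any, Dict, Tuple

Snapshot = Dict[str, Dict[str, Any]]


def diff_snapshots(before: Snapshot, after: Snapshot) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Return {vm: {setting: (old, new)}} for every setting that changed."""
    changes = {}
    for vm in sorted(set(before) | set(after)):
        old, new = before.get(vm, {}), after.get(vm, {})
        changed = {
            key: (old.get(key), new.get(key))
            for key in sorted(set(old) | set(new))
            if old.get(key) != new.get(key)
        }
        if changed:
            changes[vm] = changed
    return changes


# Illustrative data only.
baseline = {
    "db01": {"vcpus": 16, "cores_per_socket": 1, "cpu_hot_add": True},
    "web01": {"vcpus": 2, "cores_per_socket": 2, "cpu_hot_add": False},
}
current = {
    "db01": {"vcpus": 16, "cores_per_socket": 8, "cpu_hot_add": False},
    "web01": {"vcpus": 2, "cores_per_socket": 2, "cpu_hot_add": False},
}
print(diff_snapshots(baseline, current))
```

Keeping the diff next to the performance data makes it easier to tie a change in NUMA behaviour to the configuration change that caused it.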

Performance Trends

Out of the box you get many different performance-related dashboards that help you find NUMA misalignment and the resulting performance degradation. As a rule of thumb, the higher the NUMA migrations, the worse; the higher the NUMA Remote Node usage, the worse.

Of course, these counters go up temporarily when a VM is migrated, or simply from time to time, but they should never stay up permanently or reach hundreds of migrations (or gigabytes of remote memory) a day.

Dashboards to get a quick overview of which VMs suffer the most:
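The rule of thumb above separates a temporary spike from a permanent problem. Here is a minimal sketch of that distinction in Python, assuming you have one sample per VM and day; the `DailyNuma` class, the `classify` function and both default thresholds (100 migrations, 1 GB of remote memory) are illustrative assumptions, not values taken from any product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DailyNuma:
    """Hypothetical per-day NUMA counters for one VM."""
    migrations: int    # NUMA home node migrations counted that day
    remote_gb: float   # memory served from a remote NUMA node that day


def classify(days: List[DailyNuma], max_migrations: int = 100, max_remote_gb: float = 1.0) -> str:
    """Label a VM 'ok', 'temporary' (some days over) or 'persistent' (every day over)."""
    over = [d.migrations >= max_migrations or d.remote_gb >= max_remote_gb for d in days]
    if not any(over):
        return "ok"
    if all(over):
        return "persistent"
    return "temporary"


# Illustrative data only: one spike in the middle of the week.
week = [DailyNuma(3, 0.0), DailyNuma(250, 0.2), DailyNuma(5, 0.0)]
print(classify(week))
```

A VM labelled "persistent" is the one worth investigating first; a "temporary" label often just reflects a vMotion or a short burst of load.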

Starter: VMware Virtual Machines

Check NUMA Performance per VM

ESXi details: VMware Performance: ESXi host

Check the performance of the ESXi host and its NUMA node metrics

Check the trends

Existing customers can access these dashboards in the Customer Portal. In case you want to test them during the trial please contact our sales team.

Compare VM performance today, yesterday and weeks ago

The performance comparison can be done for an individual VM, an ESXi host, a cluster or even across multiple vCenters. The time range to compare is also flexible: you can pick 2 days, 7 days or 30 days (or any other range).

To compare the whole environment and see how each of these days changed from a NUMA Remote Node usage or NUMA migration perspective, check out the special NUMA dashboard.

How did the situation change compared to 7 days ago, and which day had the highest impact?
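The two questions the dashboard answers can also be computed from any exported daily series. The sketch below assumes a mapping from ISO date strings to a single daily value such as NUMA Remote Node usage in GB; the function names and the sample numbers are invented for illustration.

```python
from typing import Dict, Tuple


def change_since(series: Dict[str, float], today: str, past: str) -> float:
    """Difference between today's value and the value on an earlier day."""
    return series[today] - series[past]


def worst_day(series: Dict[str, float]) -> Tuple[str, float]:
    """Day with the largest increase over the previous day in the series."""
    days = sorted(series)  # ISO dates sort chronologically as strings
    jumps = [(series[cur] - series[prev], cur) for prev, cur in zip(days, days[1:])]
    increase, day = max(jumps)
    return day, increase


# Illustrative data only: remote node usage in GB per day.
remote_gb = {
    "2020-05-01": 2.0, "2020-05-02": 2.5, "2020-05-03": 6.0, "2020-05-04": 5.5,
    "2020-05-05": 5.0, "2020-05-06": 4.0, "2020-05-07": 3.0, "2020-05-08": 1.0,
}
print(change_since(remote_gb, "2020-05-08", "2020-05-01"))
print(worst_day(remote_gb))
```

A negative result from `change_since` means remote usage dropped over the week, while `worst_day` points to the date to check against your change log. Note that `worst_day` needs at least two days of data.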

Bottom Line

Our customers found plenty of ways to get far more performance out of their existing environment and saved tens or hundreds of thousands, sometimes even a million dollars, in new hardware by removing the performance blockers.

Check your NUMA metrics – you might be surprised how many performance issues have their root there.
