Performance Benchmarking Microsoft’s Cobalt 100 VMs on Databricks

In the ever-evolving landscape of cloud computing, performance is king. As data scientists and engineers, we constantly seek ways to optimize workflows and extract maximum value from our infrastructure.

In this post, we compare the performance of Cobalt 100 and x86 VMs on Databricks using TPC-DS datasets. We aim to understand how the two VM types, Cobalt 100 (Standard_d4pds_v6) and x86 (Standard_d4ds_v5), perform under different data loads, so we ran the benchmarks with both 10GB and 100GB TPC-DS datasets.

The Experiment

VM Types

I picked VMs of the same size for both categories. Both have 4 CPU cores and 16 GB of memory.

  • Cobalt 100 VM type: Standard_d4pds_v6
  • x86 VM type: Standard_d4ds_v5

Cluster Configurations

  • Workers — 2
  • Driver — 1
  • Databricks Runtime version — 15.4 LTS
  • Photon disabled
  • Autoscaling disabled
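For readers who want to reproduce the setup, the settings above can be sketched as a cluster spec. This is a minimal sketch, not the exact configuration used in the post: the field names follow the Databricks Clusters API as I understand it, while the cluster names, the `15.4.x-scala2.12` version key, and the choice of the same VM type for the driver are assumptions.

```python
import json


def build_cluster_spec(name, node_type_id):
    """Return a cluster spec dict mirroring the settings listed above."""
    return {
        "cluster_name": name,                  # hypothetical name
        "spark_version": "15.4.x-scala2.12",   # assumed version key for 15.4 LTS
        "node_type_id": node_type_id,          # worker VM type
        "driver_node_type_id": node_type_id,   # assumption: driver uses the same VM type
        "num_workers": 2,                      # fixed size, so no "autoscale" block
        "runtime_engine": "STANDARD",          # Photon disabled
    }


# VM type names are copied from the post as written.
cobalt_spec = build_cluster_spec("tpcds-cobalt100", "Standard_d4pds_v6")
x86_spec = build_cluster_spec("tpcds-x86", "Standard_d4ds_v5")

print(json.dumps(cobalt_spec, indent=2))
```

Keeping everything except `node_type_id` identical makes the VM type the only variable between the two clusters.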

See the screenshots of both the cluster setups below.

Cobalt 100 Cluster config
x86 cluster config

Benchmarking Methodology

We used the TPC-DS benchmark dataset [1], a standard for evaluating performance in data processing systems. The tests were conducted on both 10GB and 100GB datasets. Each query set was executed five times per VM type, and we report the median runtime to reduce the effect of anomalous runs.
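The median-of-five step can be sketched as follows. The runtimes in the example are made-up placeholder values, not measurements from this benchmark, and `median_runtime` is a hypothetical helper name.

```python
from statistics import median


def median_runtime(runs_seconds):
    """Return the median of the recorded runtimes for one query."""
    if not runs_seconds:
        raise ValueError("need at least one run")
    return median(runs_seconds)


# Hypothetical runtimes (seconds) for five runs of one query.
# The fourth run is a deliberate outlier, e.g. a cold start.
example_runs = [41.2, 39.8, 40.5, 58.9, 40.1]

print(median_runtime(example_runs))  # 40.5
```

With these placeholder values the mean would be pulled up to about 44 seconds by the single slow run, while the median stays at 40.5, which is why the median is the more robust summary here.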

Benchmark Results for the 100GB Dataset

The benchmark consisted of ~120 different queries [2] run on both the Cobalt 100 and x86-based machines. Below are some key results showing the median runtimes for each machine across various queries:

The results from the 100GB dataset highlight significant performance differences between the two VM types. Let’s look at the details.


Overall performance comparison

Overall performance comparison for the 100GB dataset

For the 100GB TPC-DS dataset, Cobalt 100 VMs were ~18% faster in total runtime.

Below are the runtimes for each VM type:

  • Cobalt 100: 3,644 seconds
  • x86: 4,450 seconds
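The ~18% figure can be reproduced from these two runtimes if "faster" is read as the reduction in runtime relative to x86. That reading is my assumption; the post does not state the formula.

```python
cobalt_total_s = 3644  # Cobalt 100 runtime from the post, in seconds
x86_total_s = 4450     # x86 runtime from the post, in seconds

# Reduction in runtime relative to the x86 baseline.
time_saved_pct = (x86_total_s - cobalt_total_s) / x86_total_s * 100

# Equivalent speedup factor (x86 time divided by Cobalt 100 time).
speedup = x86_total_s / cobalt_total_s

print(f"{time_saved_pct:.1f}% less time, {speedup:.2f}x speedup")
# 18.1% less time, 1.22x speedup
```

Note that the two numbers answer slightly different questions: the runtime dropped by about 18%, which corresponds to roughly 1.22x the throughput.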

Here’s some detailed analysis.

  • 93 out of 118 queries ran faster on the Cobalt 100 VMs than on the x86 machines.
  • 25 queries ran faster on the x86 machines than the Cobalt 100 VMs.
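A split like the one above can be derived from per-query median runtimes. The sketch below assumes two dictionaries keyed by query name; the function names and the tie-handling rule are illustrative choices of mine, not something the post specifies.

```python
def compare_queries(cobalt_medians, x86_medians):
    """Split queries by which VM type had the lower median runtime.

    Ties are counted for x86 in this sketch, so a query only counts
    as a Cobalt 100 win when it is strictly faster there.
    """
    cobalt_faster, x86_faster = [], []
    for query, cobalt_s in cobalt_medians.items():
        if cobalt_s < x86_medians[query]:
            cobalt_faster.append(query)
        else:
            x86_faster.append(query)
    return cobalt_faster, x86_faster


def share_pct(count, total):
    """Percentage of `total` represented by `count`."""
    return count / total * 100


# Counts reported in the post for the 100GB dataset.
print(f"{share_pct(93, 118):.1f}%")  # 78.8%
```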

Query performance distribution (100 GB dataset)

  • This translates into ~79% of queries performing better on Cobalt 100 VMs.
  • Here are the top gainers on Cobalt 100 VMs.

Top performing queries on Cobalt 100 (100 GB dataset)
Top performance improvements (100 GB dataset)
Performance improvement distribution (100 GB dataset)

  • I analyzed the top 3 queries; they involve complex analytical operations such as CTEs, window functions, and multiple sub-queries, which are generally memory-intensive.
  • On the other end of the spectrum, the five queries that fared best on the x86 machines are shown below.

Top-performing queries on x86 (100 GB dataset)

Benchmark Results for the 10GB Dataset

The performance trends observed in the smaller dataset align with the 100GB dataset benchmark results. The performance delta is a bit narrower, though. Details below.

  • Overall Performance: Similar to the larger dataset, Cobalt 100 VMs generally outperformed x86 VMs.
  • Cobalt 100 VMs are ~13% faster compared to the x86 machines. See the screenshot below.

Overall performance comparison for the 10GB dataset

  • 87 out of 118 queries ran faster on Cobalt VMs. The number was 93 for the 100GB dataset.
  • Here are the top queries on Cobalt 100 VMs.

Top performing queries on Cobalt 100 (10 GB dataset)

  • Some outliers ran faster on Cobalt 100 during the TPC-DS 100GB benchmark but flipped during the 10GB benchmark. See below.

  • However, the runtimes for these queries on the two VM types are very close in the 10GB benchmark.
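One way to find such flipped queries, along with how large the gap is at 10GB, is sketched below. The query names and runtimes are hypothetical placeholders, and `flipped_queries` is an illustrative helper, not part of the original analysis.

```python
def flipped_queries(medians_100gb, medians_10gb):
    """Return queries faster on Cobalt 100 at 100GB but slower at 10GB.

    Each argument maps query name -> (cobalt_seconds, x86_seconds).
    The result pairs each flipped query with how much slower Cobalt 100
    was at 10GB, as a percentage of the x86 runtime.
    """
    flipped = []
    for query, (cobalt_100, x86_100) in medians_100gb.items():
        cobalt_10, x86_10 = medians_10gb[query]
        if cobalt_100 < x86_100 and cobalt_10 > x86_10:
            gap_pct = (cobalt_10 - x86_10) / x86_10 * 100
            flipped.append((query, round(gap_pct, 1)))
    return flipped


# Hypothetical median runtimes in seconds, for illustration only.
example_100gb = {"q_a": (50.0, 60.0), "q_b": (30.0, 40.0)}
example_10gb = {"q_a": (10.2, 10.0), "q_b": (5.0, 6.0)}

print(flipped_queries(example_100gb, example_10gb))  # [('q_a', 2.0)]
```

A small gap percentage, as in this made-up example, would indicate that the flip is within run-to-run noise rather than a real reversal.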

Key Takeaways

  • Overall Performance Boost: Cobalt 100 VMs demonstrated a significant advantage in total runtime, potentially offering substantial efficiency gains for large-scale data processing tasks.
  • Query-Dependent Benefits: The performance improvements varied widely across different queries, indicating that the benefits of Cobalt 100 may be more pronounced for certain types of operations or data patterns.
  • Optimization Opportunities: The mixed results suggest that there may be opportunities for query optimization to take full advantage of Cobalt 100’s architecture.

Conclusion

Though it’s still early days for the Cobalt 100 VMs, in our initial benchmarking, Cobalt 100 outperformed x86 in overall runtime on both the 10GB and 100GB TPC-DS datasets, with the larger gains on the larger dataset and on complex, memory-intensive queries.

As organizations look to optimize performance in the cloud, Cobalt 100 VMs offer a compelling alternative for Databricks workloads. Their modern architecture provides not only performance benefits but also potentially better cost-efficiency.

As with any technology decision, it’s crucial to conduct your own benchmarks with representative workloads to determine the best fit for your organization’s needs. The promising results seen here suggest that Cobalt 100 VMs could be a game-changer for many data-intensive applications, offering a new level of performance for the most demanding analytics tasks.

Have you experimented with Cobalt 100 VMs in your workflows? I’d love to hear about your experiences and insights in the comments below!
