Courtesy: Michał Prostko (Intel) and Izabella Raulin (Intel)
In this post, we explore the performance of MongoDB on Microsoft Azure across several Virtual Machine (VM) sizes from the D-series, which is recommended for general-purpose workloads.
Benchmarks were conducted on the following Linux VMs: Dpsv5, Dasv5, Dasv4, Dsv5, and Dsv4. They have been chosen to represent both the DS-Series v5 and DS-Series v4, showcasing a variety of CPU types. The scenarios included testing instances with 4 vCPUs, 8 vCPUs, and 16 vCPUs to provide comprehensive insights into MongoDB performance and performance-per-dollar across different compute capacities.
Our examination showed that, among instances with the same number of vCPUs, the Dsv5 instances consistently delivered the most favorable performance and the best performance-per-dollar advantage for running MongoDB.
MongoDB Leading in NoSQL Ranking
MongoDB leads the NoSQL database category in the DB-Engines Ranking. Its closest competitors, Amazon DynamoDB and Databricks, trail significantly in score, so MongoDB is expected to maintain its leadership position.
MongoDB Adoption in Microsoft Azure
Enterprises using Microsoft Azure can opt for a self-managed MongoDB deployment or use the cloud-native MongoDB Atlas service. MongoDB Atlas is a fully managed cloud database service that simplifies the deployment, management, and scaling of MongoDB databases. Naturally, this convenience comes at additional cost. It also imposes restrictions: for example, we cannot choose the instance type on which the service runs.
In this study, the deployment of MongoDB through self-managed environments within Azure’s ecosystem was deliberately chosen to retain autonomy and control over Azure’s infrastructure. This approach allowed for comprehensive benchmarking across various instances, providing insights into performance and the total cost of ownership associated only with running these instances.
Methodology
In the investigation into MongoDB’s performance across various Microsoft Azure VMs, the same methodology was followed as in our prior study conducted on the Google Cloud Platform. Below is a recap of the benchmarking procedures along with the tooling information necessary to reproduce the tests.
Benchmarking Software – YCSB
The Yahoo! Cloud Serving Benchmark (YCSB), an open-source benchmarking tool, is a popular benchmark for testing MongoDB’s performance. The most recent release of the YCSB package, version 0.17.0, was used.
The MongoDB benchmark used a workload of 90% read operations and 10% updates, which in our opinion reflects the most likely distribution of operations. To measure system performance thoroughly, we configured YCSB to populate the MongoDB database with 10 million records and execute up to 10 million operations on the dataset, using the recordcount and operationcount properties. To maximize CPU utilization on the selected instances and minimize the impact of other variables such as disk and network speeds, we configured each MongoDB instance with at least 12 GB of WiredTiger cache. This ensured that the entire dataset could be held in the internal cache, minimizing the impact of disk access. Furthermore, 64 client threads were used to simulate concurrency. Other YCSB parameters, unless mentioned below, remained at their defaults.
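The settings above can be sketched as a YCSB invocation. This is a minimal illustration, not the exact command used in the study: the host name `sut-host`, the choice of `workloads/workloadb` as the base file, and the override of its read/update split are assumptions. The size estimate relies on the YCSB core workload defaults of 10 fields of 100 bytes per record, which suggests why a 12 GB cache can hold the dataset.

```python
"""Sketch of the YCSB settings described above; host and base workload are assumptions."""

RECORD_COUNT = 10_000_000
OPERATION_COUNT = 10_000_000
CLIENT_THREADS = 64
WIREDTIGER_CACHE_GB = 12

# YCSB core workload defaults, assumed unchanged: 10 fields of 100 bytes per record.
FIELD_COUNT = 10
FIELD_LENGTH_BYTES = 100

# Hypothetical connection string; replace with the address of the SUT.
MONGODB_URL = "mongodb://sut-host:27017/ycsb"


def ycsb_command(phase: str) -> list[str]:
    """Build a YCSB command line for the 'load' or 'run' phase."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    properties = {
        "recordcount": RECORD_COUNT,
        "operationcount": OPERATION_COUNT,
        "readproportion": 0.9,
        "updateproportion": 0.1,
        "mongodb.url": MONGODB_URL,
    }
    # Assumption: workloadb (read-mostly) serves as the base file and its
    # default read/update split is overridden to 90/10.
    command = ["bin/ycsb", phase, "mongodb", "-s", "-P", "workloads/workloadb"]
    for key, value in properties.items():
        command += ["-p", f"{key}={value}"]
    command += ["-threads", str(CLIENT_THREADS)]
    return command


def raw_dataset_bytes() -> int:
    """Field payload only; keys, BSON overhead, and indexes are not included."""
    return RECORD_COUNT * FIELD_COUNT * FIELD_LENGTH_BYTES


if __name__ == "__main__":
    print(" ".join(ycsb_command("load")))
    print(" ".join(ycsb_command("run")))
    print(f"raw dataset: {raw_dataset_bytes() / 1e9:.1f} GB, "
          f"cache: {WIREDTIGER_CACHE_GB} GB")
```

On the server side, the cache size could be set with the `--wiredTigerCacheSizeGB` option of `mongod`; whether the study used that option or the equivalent configuration file setting is not stated.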
Setup
Each test consisted of a pair of VMs of identical size: one VM running MongoDB v7.0.0, designated as the Server Under Test (SUT), and one VM running YCSB, designated as the load generator. Both VMs ran in the Azure West US Region as on-demand instances, and the prices from this region were used to calculate performance-per-dollar indicators.
Scenarios
MongoDB performance on Microsoft Azure was evaluated by testing various Virtual Machines from the D-series, which are part of the general-purpose machine family. These VMs are recommended for their balanced CPU-to-memory ratio and their capability to handle most production workloads, including databases, as per Azure’s documentation.
The objective of the study is to compare performance and performance-per-dollar metrics across different processors for the latest generation and its predecessor. Considering that the newer Dasv6 and Dadsv6 series are currently in preview, the v5 generation represents the latest generally available option. We selected five VM sizes that offer a representative cross-section of the general-purpose D-Series: Dsv5 and Dsv4 powered by Intel Xeon Scalable Processors, Dasv5 and Dasv4 powered by AMD EPYC processors, and Dpsv5 powered by Ampere Altra Arm-based processors. The testing scenarios included instances with 4, 8, and 16 vCPUs.
Challenges in VM type selection on Azure
On Microsoft Azure, a single VM size can be backed by multiple CPU families, so different VMs created under the same VM size can be provisioned on different CPU types. Azure does not provide a way to specify the desired CPU during instance creation, either through the Azure Portal or through the API. The CPU type can only be determined from within the operating system once the instance is created and running. Because we wanted the SUT and the client instance to have the same CPU type, it took multiple attempts to obtain matching instances. We observed that larger instances (with more vCPUs) tended to get newer CPU generations more frequently, while smaller instances were more likely to get older ones. Consequently, for the smaller Dsv5 and Dsv4 instances, we never came across VMs with 4th Generation Intel Xeon Scalable Processors.
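One way to check the CPU type from within a Linux guest is to read the "model name" entries in /proc/cpuinfo and compare them between the SUT and the client. The sketch below is an assumption about how such a check could be done, not the procedure the study used; the helper names are hypothetical. On Arm-based VMs, /proc/cpuinfo may not contain a "model name" field, in which case a tool such as lscpu would be needed instead.

```python
from pathlib import Path


def cpu_model_names(cpuinfo_text: str) -> set[str]:
    """Return the distinct 'model name' values found in /proc/cpuinfo text."""
    names = set()
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "model name":
            names.add(value.strip())
    return names


def same_cpu_type(sut_cpuinfo: str, client_cpuinfo: str) -> bool:
    """True when both hosts report the same, non-empty set of CPU models."""
    sut = cpu_model_names(sut_cpuinfo)
    client = cpu_model_names(client_cpuinfo)
    return bool(sut) and sut == client


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        # Prints the CPU model(s) of the local machine; may be empty on Arm.
        print(cpu_model_names(cpuinfo.read_text()))
```

A pair of freshly created VMs would be kept only if `same_cpu_type` returns True for their cpuinfo contents; otherwise one of them would be recreated and checked again.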
More details about the VM sizes used for testing are provided in Appendix A. For each scenario, a minimum of three runs was conducted. If the results varied by more than 3%, an additional measurement was taken to eliminate outliers. The final value is the median of the three recorded values that remain.
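The run-acceptance rule can be sketched as follows. The text does not say how the 3% variation was computed or how an outlier was chosen, so both are assumptions here: variation is taken as (max - min) / median, and the three runs closest to the overall median are kept.

```python
from statistics import median

MAX_SPREAD = 0.03  # 3% threshold from the text


def relative_spread(runs: list[float]) -> float:
    """Assumed definition of variation: (max - min) relative to the median."""
    return (max(runs) - min(runs)) / median(runs)


def needs_additional_run(runs: list[float]) -> bool:
    """True when the recorded runs vary by more than the threshold."""
    return relative_spread(runs) > MAX_SPREAD


def final_value(runs: list[float]) -> float:
    """Median of the three runs closest to the overall median.

    Assumption: runs farthest from the median are treated as outliers.
    """
    if len(runs) < 3:
        raise ValueError("at least three runs are required")
    center = median(runs)
    kept = sorted(runs, key=lambda run: abs(run - center))[:3]
    return median(kept)


if __name__ == "__main__":
    # Hypothetical throughput values (ops/sec), for illustration only.
    runs = [100.0, 101.0, 120.0]
    if needs_additional_run(runs):
        runs.append(102.0)  # stands in for a fourth measurement
    print(final_value(runs))
```

With the hypothetical values above, the 120.0 run is dropped once the fourth measurement is added, and the reported value is the median of the remaining three.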
Results
The measurements were conducted in March 2024, with Linux VMs running Ubuntu 22.04.4 LTS and kernel 6.5.0 in each case. To better illustrate the differences between the individual instance types, normalized values were computed relative to the performance of the Dsv5 instance powered by the 3rd Generation Intel Xeon Scalable Processor. The raw results are shown in Appendix A.
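The normalization and the performance-per-dollar indicator can be expressed as a short calculation. This is a sketch under assumptions: performance-per-dollar is taken as throughput divided by the monthly compute price, and the VM names and numbers below are hypothetical placeholders, not measured results.

```python
def normalize(values: dict[str, float], baseline: str) -> dict[str, float]:
    """Express every value relative to the baseline entry (baseline = 1.0)."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


def performance_per_dollar(
    throughput: dict[str, float], monthly_price: dict[str, float]
) -> dict[str, float]:
    """Assumed definition: throughput divided by monthly compute price."""
    return {name: throughput[name] / monthly_price[name] for name in throughput}


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; not results from the study.
    throughput = {"baseline_vm": 200.0, "other_vm": 150.0}   # ops/sec
    monthly_price = {"baseline_vm": 100.0, "other_vm": 50.0}  # USD/month

    print(normalize(throughput, "baseline_vm"))
    per_dollar = performance_per_dollar(throughput, monthly_price)
    print(normalize(per_dollar, "baseline_vm"))
```

Normalizing both metrics to the same baseline makes it easy to see cases where a cheaper instance with lower throughput still comes out ahead, or behind, on performance-per-dollar.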
Although both the 16 vCPU Dsv4 and Dsv5 VMs are powered by the 3rd Generation Intel Xeon Scalable Processor 8370C and share the same compute cost of $654.08/month, their MongoDB workload performance scores differ, in favor of the Dsv5 instance. This difference can be attributed to the fact that the tested 16 vCPU Dsv4, as a representative of the 4th generation of the D-series, is expected to be more aligned with other representatives of its generation (see Table 1). The results for the Dasv4 and Dasv5 VMs, powered by 3rd Generation AMD EPYC 7763v, show a similar outcome: in each tested case, Dasv5-series VMs outperformed Dasv4-series VMs.
Observations:
- Dsv5 VMs, powered by the 3rd Generation Intel Xeon Scalable Processor, offer both the highest performance and the best performance-per-dollar among the instances tested in each scenario (4 vCPUs, 8 vCPUs, and 16 vCPUs).
- Dasv5 is less expensive than Dsv5, yet it provides lower performance. Therefore, the Total Cost of Ownership (TCO) is in favor of the Dsv5 instances.
- Dpsv5 VMs, powered by Ampere Altra Arm-based processors, have the lowest cost among the tested VM sizes. However, their performance falls behind, resulting in the lowest performance-per-dollar among the tested VMs.
Conclusion
The presented benchmark analysis covers MongoDB performance and performance-per-dollar across 4 vCPU, 8 vCPU, and 16 vCPU instances representing general-purpose VM sizes available on Microsoft Azure and powered by processors from various vendors. Results show that among the tested instances, Dsv5 VMs, powered by 3rd Generation Intel Xeon Scalable Processors, provide the best performance for the MongoDB benchmark and lead in performance-per-dollar.
Appendix A