Performance comparison between Cassandra versions 4.1 and 5
As you probably know, Apache Cassandra is an open-source distributed NoSQL database designed to handle large amounts of data across many servers without a single point of failure. This article compares how performance changed between the two versions.
The main goals
From the perspective of performance and response time, we wanted to see the differences between Cassandra versions 4.1 and 5. We also wanted to confirm that the new features, such as UCS (Unified Compaction Strategy) with a relevant configuration, Java 17, etc., bring the expected benefits in standard scenarios.
We had seen a nice comparison between Cassandra versions 3 and 4, and the open question for us was whether the latest Cassandra version would bring a big performance boost or only cosmetic changes.
NOTE: We used the official measurement tool (‘cassandra-stress’) from the latest version, 5.0.2.
Management summary
Our testing with consistency level LOCAL_QUORUM produced these key results:
Cassandra version 5 has on average 38% better performance and 26% better response time for write operations than Cassandra version 4.1
Cassandra version 5 has on average 12% better performance and 9% better response time for read operations than Cassandra version 4.1
Cassandra version 5 performs much better (it can sustain higher throughput) than Cassandra version 4.1 on the same HW
Performance has made very nice progress, and we are looking forward to Cassandra version 5.1 with additional new features such as ACID, etc. You can see a sample of the test details below.
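The article does not state how the percentages were derived. A minimal sketch, assuming they are relative changes against the 4.1 baseline, could look like the following. The function names and the input numbers are hypothetical and are not measured values from our tests.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: relative change of a candidate run against a baseline run.

# "Higher is better" metrics such as throughput (ops/s).
throughput_gain_pct() {
  awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.1f\n", (cand - base) / base * 100 }'
}

# "Lower is better" metrics such as response time (ms).
latency_reduction_pct() {
  awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.1f\n", (base - cand) / base * 100 }'
}

# Illustrative inputs only (not data from the article).
throughput_gain_pct 1000 1250   # prints 25.0
latency_reduction_pct 8.0 6.0   # prints 25.0
```

Keeping the two directions separate avoids sign confusion: a throughput increase and a latency decrease both show up as positive improvements.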
Test outputs
Here are a few sample outputs from the tests (publishing every detail would not make sense, so we show only samples to give a relevant picture):
Details of write outputs
Details of read outputs
Test setup/Environments setting
We built two similar clusters for Cassandra testing with these specifications:
Hybrid Cluster, 2x data center
SW
Test setup/Key characteristics of tests
We used the standard, official performance testing tool ‘cassandra-stress’ (from the official Cassandra 5.0.2 distribution) and tuned the tests with these settings.
Typical test commands in ‘cassandra-stress’:
./apache-cassandra-5.0.2/tools/bin/cassandra-stress write duration=5m cl=LOCAL_ONE no-warmup -node 10.129.xx.xx,10.129.xx.xx,10.129.xx.xx -mode user=perf password=xxx prepared protocolVersion=4 connectionsPerHost=24 maxPending=384 -schema "replication(strategy=NetworkTopologyStrategy,factor=3)" "compaction(strategy=SizeTieredCompactionStrategy,max_threshold=32,min_threshold=4)" -rate "threads<=100" -reporting output-frequency=5s > "./stress-output/$curr_date/$curr_date v4 write_LOCAL_QUORUM_STCS_100xTHR.txt"
./apache-cassandra-5.0.2/tools/bin/cassandra-stress read duration=5m cl=LOCAL_QUORUM no-warmup -node 10.129.xx.xx,10.129.xx.xx,10.129.xx.xx -mode user=perf password=xxx prepared protocolVersion=4 connectionsPerHost=24 maxPending=384 -rate "threads<=100" -reporting output-frequency=5s > "./stress-output/$curr_date/$curr_date v4 read_LOCAL_QUORUM_STCS_100xTHR.txt"
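The commands above redirect into a directory named after `$curr_date`, which has to exist before the run. A minimal sketch of that preparation step follows; the timestamp format and the `report_path` helper are assumptions for illustration, since the article does not show how `curr_date` is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: curr_date is a timestamp; the exact format used in the tests is not shown.
curr_date="$(date +%Y-%m-%d_%H-%M-%S)"
out_dir="./stress-output/${curr_date}"

# The shell redirect fails if the target directory is missing, so create it first.
mkdir -p "$out_dir"

# Hypothetical helper: build a report file name following the pattern used above,
# e.g. "<date> v4 write_LOCAL_QUORUM_STCS_100xTHR.txt".
report_path() {
  local version="$1" op="$2" cl="$3" compaction="$4" threads="$5"
  printf '%s/%s %s %s_%s_%s_%sxTHR.txt\n' \
    "$out_dir" "$curr_date" "$version" "$op" "$cl" "$compaction" "$threads"
}

report_path v4 write LOCAL_QUORUM STCS 100
```

Because the file name contains spaces, the path has to stay quoted when it is used as a redirect target, as in the commands above.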
A few final notes