WHAT IS CASSANDRA
Ashish Ranjan
IT Recruiter- Talent Acquisition || B.TECH(EEE) || Tech & Non-Tech Hiring || Leadership Hiring || Corporate Hiring
Cassandra is a free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra supports clusters spanning multiple datacenters,[2] with asynchronous masterless replication allowing low-latency operations for all clients. Cassandra was designed to combine Amazon's Dynamo distributed storage and replication techniques with Google's Bigtable data and storage engine model.[3]
History
Avinash Lakshman, one of the authors of Amazon's Dynamo, and Prashant Malik initially developed Cassandra at Facebook to power the Facebook inbox search feature. Facebook released Cassandra as an open-source project on Google Code in July 2008.[4] In March 2009, it became an Apache Incubator project.[5] On February 17, 2010, it graduated to a top-level project.[6]
Facebook developers named their database after the Trojan mythological prophet Cassandra, with classical allusions to a curse on an oracle.[7]
Releases
Releases after graduation include:
Version  Original release date  Latest version  Release date  Status[16]
0.6      2010-04-12             0.6.13          2011-04-18    No longer maintained
0.7      2011-01-10             0.7.10          2011-10-31    No longer maintained
0.8      2011-06-03             0.8.10          2012-02-13    No longer maintained
1.0      2011-10-18             1.0.12          2012-10-04    No longer maintained
1.1      2012-04-24             1.1.12          2013-05-27    No longer maintained
1.2      2013-01-02             1.2.19          2014-09-18    No longer maintained
2.0      2013-09-03             2.0.17          2015-09-21    No longer maintained
2.1      2014-09-16             2.1.22          2020-08-31    No longer maintained
2.2      2015-07-20             2.2.19          2020-11-04    No longer maintained
3.0      2015-11-09             3.0.28          2022-05-13    Still supported
3.11     2017-06-23             3.11.14         2022-05-13    Still supported
4.0      2021-07-26             4.0.7           2022-08-25    Still supported
4.1      2022-06-17             4.1.0           2022-12-13    Latest release
Main features
Distributed
Every node in the cluster has the same role. There is no single point of failure. Data is distributed across the cluster (so each node contains different data), but there is no master as every node can service any request.
Supports replication and multi-data-center replication
Replication strategies are configurable.[17] Cassandra is designed as a distributed system, for deployment of large numbers of nodes across multiple data centers. Key features of Cassandra’s distributed architecture are specifically tailored for multi-data-center deployment, redundancy, failover and disaster recovery.
Scalability
Designed to have read and write throughput both increase linearly as new machines are added, with the aim of no downtime or interruption to applications.
Fault-tolerant
Data is automatically replicated to multiple nodes for fault-tolerance. Replication across multiple data centers is supported. Failed nodes can be replaced with no downtime.
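How keys end up on several peer nodes can be illustrated with a small token-ring sketch. This is a simplified model, not Cassandra's actual code: the `Ring` class and `token` function are hypothetical names, MD5 stands in for the real partitioner, and replicas are simply the next distinct nodes clockwise from the key's position, with no awareness of racks or data centers.

```python
import hashlib
from bisect import bisect_left


def token(key: str) -> int:
    """Map a string to a ring position (MD5 is a stand-in hash for this sketch)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """A toy masterless ring: every node is a peer that owns one token."""

    def __init__(self, nodes):
        # Sort nodes by their ring position so lookups can use binary search.
        self._entries = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key: str, replication_factor: int):
        """Return the nodes that would hold copies of `key`."""
        size = len(self._entries)
        count = min(replication_factor, size)
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect_left(self._tokens, token(key)) % size
        return [self._entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42", replication_factor=3))
```

Because placement depends only on the key and the set of nodes, any node can compute where a key lives and route a request there, which is one way to picture the absence of a master.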
Tunable consistency
Cassandra is typically classified as an AP system, meaning that availability and partition tolerance are generally considered to be more important than consistency.[18] Writes and reads offer a tunable level of consistency, all the way from "writes never fail" to "block for all replicas to be readable", with the quorum level in the middle.[19]
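The trade-off between these levels can be made concrete with a little arithmetic. The sketch below is illustrative only and does not call Cassandra: the helper names `quorum` and `overlaps` are made up for this example, and it assumes the commonly stated rule that a read is guaranteed to touch at least one replica holding the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3  # assumed replication factor for the example
levels = {"ONE": 1, "QUORUM": quorum(RF), "ALL": RF}

# Print, for each read/write pairing, whether reads are guaranteed to overlap writes.
for read_name, r in levels.items():
    for write_name, w in levels.items():
        print(f"read={read_name:<6} write={write_name:<6} overlap={overlaps(r, w, RF)}")
```

With a replication factor of 3, reading and writing at the quorum level gives overlap while tolerating one unavailable replica, whereas reading and writing a single replica favors availability and latency over consistency.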
MapReduce support
Cassandra has Hadoop integration, with MapReduce support. There is also support for Apache Pig and Apache Hive.[20]
Query language
Cassandra introduced the Cassandra Query Language (CQL). CQL is a simple interface for accessing Cassandra, as an alternative to the traditional Structured Query Language (SQL).
Eventual consistency
Cassandra manages eventual consistency of reads, upserts and deletes through tombstones.
Cassandra Query Language
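The role of tombstones can be sketched with a toy last-write-wins merge. This is a simplified model under stated assumptions, not Cassandra's storage engine: `Cell`, `upsert`, `delete`, `reconcile` and `read` are hypothetical names, timestamps are plain integers, and timestamp ties are not handled specially.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int         # write time supplied by the writer
    value: Optional[str]   # None marks a tombstone, i.e. a recorded delete

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def upsert(value: str, timestamp: int) -> Cell:
    """A write is the same operation whether or not the cell already exists."""
    return Cell(timestamp, value)


def delete(timestamp: int) -> Cell:
    """A delete is stored as a marker rather than removing data in place."""
    return Cell(timestamp, None)


def reconcile(*versions: Cell) -> Cell:
    """Pick the version with the highest timestamp (last write wins)."""
    return max(versions, key=lambda cell: cell.timestamp)


def read(*versions: Cell) -> Optional[str]:
    """Merge versions gathered from replicas; a winning tombstone reads as missing."""
    if not versions:
        return None
    return reconcile(*versions).value


stale = upsert("alice@example.com", timestamp=100)  # replica that missed the delete
marker = delete(timestamp=200)                      # replica that saw the delete
print(read(stale, marker))
```

Without the marker, a replica that processed the delete would have nothing to contribute to the merge, and the stale value from the other replica would reappear. Recording the delete as a timestamped cell lets it win the comparison.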
Cassandra introduced the Cassandra Query Language (CQL). CQL is a simple interface for accessing Cassandra, as an alternative to the traditional Structured Query Language (SQL). CQL adds an abstraction layer that hides implementation details of this structure and provides native syntaxes for collections and other common encodings. Language drivers are available for Java (JDBC), Python (DBAPI2), Node.JS (Datastax), Go (gocql) and C++.[21]
The keyspace in Cassandra is a namespace that defines data replication across nodes. Therefore, replication is defined at the keyspace level. Below is an example of keyspace creation, including a column family, in CQL 3.0:[22]
CREATE KEYSPACE MyKeySpace