WHAT IS CASSANDRA

Cassandra is a free and open-source, distributed, wide-column store NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra supports clusters spanning multiple datacenters,[2] with asynchronous masterless replication allowing low-latency operations for all clients. Cassandra was designed to combine Amazon's Dynamo distributed storage and replication techniques with Google's Bigtable data and storage engine model.[3]

History

Avinash Lakshman, one of the authors of Amazon's Dynamo, and Prashant Malik initially developed Cassandra at Facebook to power the Facebook inbox search feature. Facebook released Cassandra as an open-source project on Google Code in July 2008.[4] In March 2009, it became an Apache Incubator project.[5] On February 17, 2010, it graduated to a top-level project.[6]

Facebook developers named their database after the Trojan mythological prophet Cassandra, with classical allusions to a curse on an oracle.[7]

Releases

Releases after graduation include:

  • 0.6, released Apr 12 2010, added support for integrated caching and Apache Hadoop MapReduce[8]
  • 0.7, released Jan 08 2011, added secondary indexes and online schema changes[9]
  • 0.8, released Jun 2 2011, added the Cassandra Query Language (CQL), self-tuning memtables, and support for zero-downtime upgrades[10]
  • 1.0, released Oct 17 2011, added integrated compression, leveled compaction, and improved read-performance[11]
  • 1.1, released Apr 23 2012, added self-tuning caches, row-level isolation, and support for mixed SSD/spinning disk deployments[12]
  • 1.2, released Jan 2 2013, added clustering across virtual nodes, inter-node communication, atomic batches, and request tracing[13]
  • 2.0, released Sep 4 2013, added lightweight transactions (based on the Paxos consensus protocol), triggers, and improved compactions
  • 2.1 released Sep 10 2014[14]
  • 2.2 released July 20, 2015
  • 3.0 released November 11, 2015
  • 3.1 through 3.10 were monthly releases using a tick-tock-like release model, with even-numbered releases providing both new features and bug fixes and odd-numbered releases including bug fixes only.[15]
  • 3.11 released June 23, 2017, as a stable release series with bug fixes to the last tick-tock feature release.
  • 4.0 released July 26, 2021.
  • 4.1 released December 13, 2022.

Version   Original release date   Latest version   Release date   Status[16]
0.6       2010-04-12              0.6.13           2011-04-18     No longer maintained
0.7       2011-01-10              0.7.10           2011-10-31     No longer maintained
0.8       2011-06-03              0.8.10           2012-02-13     No longer maintained
1.0       2011-10-18              1.0.12           2012-10-04     No longer maintained
1.1       2012-04-24              1.1.12           2013-05-27     No longer maintained
1.2       2013-01-02              1.2.19           2014-09-18     No longer maintained
2.0       2013-09-03              2.0.17           2015-09-21     No longer maintained
2.1       2014-09-16              2.1.22           2020-08-31     No longer maintained
2.2       2015-07-20              2.2.19           2020-11-04     No longer maintained
3.0       2015-11-09              3.0.28           2022-05-13     Still supported
3.11      2017-06-23              3.11.14          2022-05-13     Still supported
4.0       2021-07-26              4.0.7            2022-08-25     Still supported
4.1       2022-06-17              4.1.0            2022-12-13     Latest release

Main features

Distributed

Every node in the cluster has the same role. There is no single point of failure. Data is distributed across the cluster (so each node contains different data), but there is no master as every node can service any request.

Replication and multi-data-center replication

Replication strategies are configurable.[17] Cassandra is designed as a distributed system for deploying large numbers of nodes across multiple data centers. Key features of its distributed architecture are specifically tailored for multi-data-center deployment, redundancy, failover, and disaster recovery.

Scalability

Read and write throughput are both designed to increase linearly as new machines are added, with the aim of no downtime or interruption to applications.

Fault-tolerant

Data is automatically replicated to multiple nodes for fault tolerance. Replication across multiple data centers is supported. Failed nodes can be replaced with no downtime.
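
As a rough illustration of how data can be spread over equal-role nodes and still survive a node failure, here is a minimal sketch of a hash ring with replica selection. It is not Cassandra's implementation: the function names, the node names, the use of MD5 as the hash, and the one-token-per-node layout are all assumptions made for the example.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is a stand-in hash, an assumption)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replicas(nodes, partition_key, replication_factor):
    """Return the nodes holding a key: the first node at or after the key's
    token, then the next nodes clockwise until enough replicas are chosen."""
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token(partition_key))
    count = min(replication_factor, len(ring))
    # Modulo wraps the walk around the end of the ring.
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node cluster and key.
nodes = ["node-a", "node-b", "node-c", "node-d"]
owners = replicas(nodes, "user:42", replication_factor=3)

# If one replica fails, the remaining replicas can still serve the key.
failed = owners[0]
survivors = [node for node in owners if node != failed]
print(owners, survivors)
```

Because every node can compute the same replica list from the key alone, no master is needed to route a request in this sketch.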

Tunable consistency

Cassandra is typically classified as an AP system, meaning that availability and partition tolerance are generally considered more important than consistency.[18] Writes and reads offer a tunable level of consistency, all the way from "writes never fail" to "block for all replicas to be readable", with the quorum level in the middle.[19]
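
The trade-off can be made concrete with a little arithmetic. The sketch below assumes the usual rule that a quorum is a majority of replicas, and that a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts together exceed the replication factor. The helper names are hypothetical; ONE, QUORUM, and ALL are three of the consistency levels Cassandra offers.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond at a given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def read_overlaps_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return r + w > replication_factor


# With three replicas, QUORUM writes plus QUORUM reads overlap; ONE plus ONE do not.
print(read_overlaps_write("QUORUM", "QUORUM", 3))
print(read_overlaps_write("ONE", "ONE", 3))
```

Lower levels favor availability and latency, while higher levels favor consistency, which is the tuning the paragraph above describes.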

MapReduce support

Cassandra has Hadoop integration, with MapReduce support. There is also support for Apache Pig and Apache Hive.[20]

Query language

Cassandra introduced the Cassandra Query Language (CQL). CQL is a simple interface for accessing Cassandra, as an alternative to the traditional Structured Query Language (SQL).

Eventual consistency

Cassandra manages eventual consistency of reads, upserts and deletes through tombstones.
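
To see why a deletion marker is needed, consider replicas that disagree about a value. The sketch below is a simplified model, not Cassandra's storage code: the `Cell` type, its fields, and the rule that a tombstone wins a timestamp tie are assumptions made for illustration. It reconciles versions by picking the newest timestamp, so a tombstone newer than a stale value keeps the deleted data from reappearing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One replica's version of a value (hypothetical model)."""
    timestamp: int
    value: Optional[str] = None
    tombstone: bool = False  # True marks a delete instead of a value


def reconcile(*versions: Cell) -> Cell:
    """Pick the newest version; on a timestamp tie the tombstone wins."""
    return max(versions, key=lambda cell: (cell.timestamp, cell.tombstone))


def read(*versions: Cell) -> Optional[str]:
    """Return the visible value, or None when the winning version is a delete."""
    winner = reconcile(*versions)
    return None if winner.tombstone else winner.value


stale = Cell(timestamp=10, value="alice@example.com")  # replica that missed the delete
deleted = Cell(timestamp=20, tombstone=True)            # replica that recorded the delete
rewritten = Cell(timestamp=30, value="alice@new.example")

print(read(stale, deleted))             # the delete hides the stale value
print(read(stale, deleted, rewritten))  # a later upsert becomes visible again
```

Without the tombstone, a replica that missed the delete would be the only one holding a version, and the old value would look current.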

Cassandra Query Language

Cassandra introduced the Cassandra Query Language (CQL). CQL is a simple interface for accessing Cassandra, as an alternative to the traditional Structured Query Language (SQL). CQL adds an abstraction layer that hides implementation details of the underlying storage structure and provides native syntaxes for collections and other common encodings. Language drivers are available for Java (JDBC), Python (DBAPI2), Node.js (DataStax), Go (gocql) and C++.[21]

The keyspace in Cassandra is a namespace that defines data replication across nodes; replication is therefore defined at the keyspace level. Below is an example of keyspace creation, including a column family, in CQL 3.0:[22]

CREATE KEYSPACE MyKeySpace        
