#bigdata 29e - NoSQL with HBase, Cassandra and MongoDB

The relational database management system (RDBMS) is a technology used on a large scale in commercial systems such as banking and flight reservations, and in other applications where data is structured. SQL (Structured Query Language) is the query language designed for these applications.

Relational database applications stand out for the consistency of their data schemas. They can be scaled, but they were not designed for unlimited scaling.

The need to analyze data in large volumes, from different sources and in different formats, has given rise to NoSQL (Not Only SQL) technology. NoSQL databases are not relational and do not depend on fixed schemas (the rules governing data or objects).

In essence, all NoSQL implementations aim at the scalable handling of large volumes of unstructured data.

NoSQL databases are built to scale out by adding nodes and focus more on performance. They replicate data across multiple network nodes and read, write, and process data at high speed using distributed parallel processing paradigms.
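To make the idea of replicating data across several nodes concrete, here is a minimal Python sketch of hash-based placement. It is only an illustration under stated assumptions: the node names, the `replicas_for` function, and the replication factor of 3 are all hypothetical, and real systems use more elaborate schemes.

```python
import hashlib

# Hypothetical cluster: four node names invented for this example.
NODES = ["node-a", "node-b", "node-c", "node-d"]


def replicas_for(key, nodes=NODES, replication_factor=3):
    """Pick the nodes that hold copies of `key`.

    The key is hashed to choose a starting node, and the following
    nodes (wrapping around the list) hold the extra copies.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


# The same key always maps to the same nodes, so any client can find it.
placement = replicas_for("user:42")
print(placement)
```

Because the placement depends only on the key and the node list, reads and writes for different keys can be spread across the cluster and handled in parallel.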

NoSQL can be used for real-time data analysis, such as personalizing sites based on user behavior tracking, and for IoT (Internet of Things) workloads such as vehicle telematics or mobile device telemetry.

NoSQL Types

The three main types of NoSQL are:

  1. Column Database (column-oriented)
  2. Key-Value Database (key / value oriented)
  3. Document Database (document-oriented)

1 — Column Database

A NoSQL database that stores data in tables and manages it by columns instead of rows is called a columnar database management system (CDBMS).

Each column is stored in its own data file.

One of the benefits is that data can be compressed, allowing operations such as minimum, maximum, sum, counting, and averages to be executed quickly.

Column stores can also be self-indexing, using less disk space than a relational database system containing the same data.
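As a rough illustration of why a column layout helps with aggregates and compression, the Python sketch below turns a few rows into per-column lists. The sample records and the `column_stats` and `run_length_encode` helpers are hypothetical names made up for this example; real column stores work on disk files and use far more elaborate encodings.

```python
# Hypothetical sample data: three sales records in a row-oriented layout.
rows = [
    {"id": 1, "region": "south", "amount": 120.0},
    {"id": 2, "region": "north", "amount": 80.0},
    {"id": 3, "region": "south", "amount": 100.0},
]

# Column-oriented layout: one list of values per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def column_stats(values):
    """Aggregate a single column without touching the other columns."""
    return {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": len(values),
        "avg": sum(values) / len(values),
    }


def run_length_encode(values):
    """Compress consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


amount_stats = column_stats(columns["amount"])
compressed_regions = run_length_encode(sorted(columns["region"]))
print(amount_stats)
print(compressed_regions)
```

Values in one column share a type and often repeat, which is why a simple scheme such as run-length encoding can shrink them, and why an aggregate only needs to scan one list.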

Apache HBase

It is a column-oriented NoSQL database. It is popular because it was built to run on top of Hadoop with HDFS.

It was modeled on the concepts of Bigtable, the column-oriented database developed by Google.

It is well suited to real-time search, reading and accessing large volumes of data.

2 — Key-Value Database

A key/value oriented NoSQL database stores data in collections of key/value pairs. For example, a student ID number may be the key, and the student's name may be the value.

It works like a dictionary: each entry stores a value, such as an integer, a string, or a structured object (for example JSON or an array), along with the key used to reference that value.
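The dictionary analogy can be sketched in a few lines of Python. This is an in-memory toy, not how any particular product is implemented; the `KeyValueStore` class, the `student:` key prefix, and the sample students are hypothetical.

```python
import json


class KeyValueStore:
    """A toy in-memory key/value store backed by a Python dict."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Writing to an existing key simply overwrites the old value.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookups are always by key; the store does not inspect values.
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()

# Student ID as the key, student name as the value.
store.put("student:1001", "Ana Silva")

# The value can also be a serialized structure such as JSON.
store.put("student:1002", json.dumps({"name": "Bruno Costa", "course": "Physics"}))

print(store.get("student:1001"))
print(json.loads(store.get("student:1002"))["course"])
```

Note that the store treats each value as opaque: to find a student by course, a client would have to read and decode every value, which is the limitation document databases address.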

Apache Cassandra

Cassandra is a powerful NoSQL database built on a partitioned key/value model, and it is often also classified as a wide-column store.

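Facebook initially developed it and released it as open source in 2008. It is highly scalable and fault tolerant.

It was originally built to power Facebook's inbox search, and it handles very large volumes of data in real time. MapReduce jobs can be run over its data through Hadoop integration.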

Cassandra can run without Hadoop, but it becomes powerful when connected to Hadoop and HDFS.

3 — Document Database (document-oriented)

Document-oriented NoSQL databases are similar to key/value stores, except that each value is a structured document.

They organize documents into collections analogous to relational tables, and queries can be based on the values inside the documents, not just on keys.
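A small Python sketch shows the difference from a pure key/value lookup: documents sit in a collection and can be selected by the values of their fields. The `find` helper and the sample student documents are hypothetical and only mimic the idea; they are not the API of any real database.

```python
# Hypothetical collection: a list of documents, each one a dict.
collection = [
    {"_id": 1, "name": "Ana", "course": "Physics", "year": 2},
    {"_id": 2, "name": "Bruno", "course": "History", "year": 1},
    {"_id": 3, "name": "Carla", "course": "Physics", "year": 1},
]


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Query by value, not by key.
physics_students = find(collection, course="Physics")
first_year_physics = find(collection, course="Physics", year=1)

print([doc["name"] for doc in physics_students])
print([doc["name"] for doc in first_year_physics])
```

A real document database would use indexes rather than scanning every document, but the query shape is the same: match on field values.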

MongoDB

It is a document-oriented NoSQL database developed by MongoDB Inc. A community edition is available free of charge; it is not an Apache Foundation project.

MongoDB stores data as JSON-like documents without requiring a fixed schema, meaning fields may vary from one document to another, and the data structure may change over time.
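The following Python sketch illustrates what varying fields look like in practice, using plain JSON strings and the standard `json` module rather than a real database. The two sample documents and their field names are hypothetical.

```python
import json

# Hypothetical documents from the same collection with different fields.
raw_documents = [
    '{"name": "Ana", "email": "ana@example.com"}',
    '{"name": "Bruno", "phone": "555-0100", "tags": ["vip"]}',
]

documents = [json.loads(raw) for raw in raw_documents]

# No schema is declared up front; the set of fields is discovered from the data.
all_fields = sorted(set().union(*(doc.keys() for doc in documents)))

# Readers must tolerate missing fields, for example by supplying a default.
emails = [doc.get("email", "unknown") for doc in documents]

print(all_fields)
print(emails)
```

The flexibility moves some responsibility to the application: code that reads the documents has to decide what to do when a field is absent.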

It can be run individually without Hadoop, but it becomes powerful when connected to Hadoop and HDFS.

CURIOSITIES

  1. Traditional companies such as Microsoft, IBM, Oracle, and Amazon offer relational database products and SQL services and dominate the market for commercial database applications.
  2. The best-known open-source relational database is MySQL.
  3. Relational databases have advantages in two areas: schemas, which allow data to be controlled and validated, and relationships, which connect the different tables.
  4. NoSQL allows relationships by nesting documents. For example, a parent document could have a child document nested directly inside it.
  5. Many NoSQL query engines natively support the ability to perform queries and associations based on complex, nested documents.
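Items 4 and 5 can be illustrated with a short Python sketch of a parent document that carries its children inside it. The document contents and the `get_path` helper, which follows a dotted path such as `address.city`, are hypothetical and only approximate what a query engine does.

```python
# Hypothetical parent document with nested child documents.
parent = {
    "_id": 1,
    "name": "Maria",
    "address": {"city": "Lisbon", "country": "Portugal"},
    "children": [
        {"name": "Leo", "age": 7},
        {"name": "Nina", "age": 4},
    ],
}


def get_path(doc, path):
    """Follow a dotted path through nested dicts; return None if it is missing."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Reach into a nested document without a join.
city = get_path(parent, "address.city")

# Filter the nested child documents by one of their own fields.
school_age = [child["name"] for child in parent["children"] if child["age"] > 5]

print(city)
print(school_age)
```

Because the related data travels with the parent, a single read returns everything, where a relational design would typically join two tables.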

More information about this article

Article selected from the eBook “Big Data for Executives and Market Professionals.”

eBook in English: Amazon or Apple Store

eBook in Portuguese: Amazon or Apple Store

Article by José Antonio Ribeiro Neto