How is Cassandra column-oriented and at the same time suitable for OLTP?



We all know that a column-oriented database is not the right vehicle for OLTP, so how can Cassandra be suitable for OLTP?

It is a common mistake to refer to Cassandra as a column-oriented database. In fact, it is a column-family database.

So what is the difference between these two?

Let's walk through both concepts.


Column-oriented databases are systems that store data on disk column by column. This layout suits aggregation-heavy workloads (OLAP), because a query can read a single column directly without first reading every row.

Consider this list of users:


Traditional databases would store this data in rows:

1,alice12,Alice,<null>,[email protected];
2,b0b,Bob,Vasquez,[email protected];
3,ch5r71e,Charli,Yang,<null>;
        

Column-oriented databases would store the same list by column:

1,2,3;
alice12,b0b,ch5r71e;
Alice,Bob,Charli;
<null>,Vasquez,Yang;
[email protected],[email protected],<null>;        
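The two layouts above can be sketched in a few lines of Python. This is only an in-memory illustration, not how any real engine lays out bytes on disk. The column names (`id`, `username`, `first_name`, `last_name`, `email`) are assumed for readability, since the listing shows only values, and the email strings are kept exactly as they appear above.

```python
# Column names are an assumption for illustration; the listing above has values only.
COLUMNS = ("id", "username", "first_name", "last_name", "email")

# Row-oriented layout: each record is kept together.
rows = [
    (1, "alice12", "Alice", None, "[email protected]"),
    (2, "b0b", "Bob", "Vasquez", "[email protected]"),
    (3, "ch5r71e", "Charli", "Yang", None),
]


def to_columnar(rows, columns):
    """Pivot row tuples into a dict mapping column name -> list of values."""
    return {name: list(values) for name, values in zip(columns, zip(*rows))}


def fetch_row(columnar, index):
    """Rebuild one record from the columnar layout by visiting every column."""
    return tuple(columnar[name][index] for name in columnar)


# Column-oriented layout: each column is kept together.
columnar = to_columnar(rows, COLUMNS)

# An aggregate over one column reads only that column's values.
max_id = max(columnar["id"])

# Reading a single record has to touch all five columns.
second_user = fetch_row(columnar, 1)
```

The aggregate touches one list, while rebuilding a single user touches all of them. That is the trade-off that makes the columnar layout attractive for OLAP and awkward for OLTP-style single-record reads and writes.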


But why does Cassandra fall under the category of a column family database rather than a column-oriented one?

If you take a look at the README file in the Apache Cassandra Git repo, it says:

Apache Cassandra is a partitioned row store. Rows are organized into tables with a required primary key.
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent matter. Cassandra will automatically repartition as machines are added and removed from the cluster.
Row store means that like relational databases, Cassandra organizes data by rows and columns.

And this is what makes Cassandra a good fit for write-heavy use cases with small, highly constrained queries (OLTP).
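A toy model may help make "partitioned row store" concrete. The sketch below is a deliberately simplified assumption, not Cassandra's implementation: it picks a node by hashing the partition key modulo the node count, whereas Cassandra itself uses a token ring with replication, and it does not repartition when nodes change. The node names and the `PartitionedRowStore` class are hypothetical.

```python
import hashlib
from collections import defaultdict

# Hypothetical cluster members, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def node_for(partition_key: str) -> str:
    """Choose a node from a hash of the partition key (simplified: hash mod node count)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class PartitionedRowStore:
    """Toy store: rows are kept whole and placed on a node by partition key."""

    def __init__(self):
        # node name -> partition key -> full row
        self.storage = defaultdict(dict)

    def write(self, partition_key: str, row: dict) -> None:
        # A write goes to exactly one node and stores the entire row together.
        self.storage[node_for(partition_key)][partition_key] = dict(row)

    def read(self, partition_key: str):
        # A lookup by key goes straight to the owning node; no scan is needed.
        return self.storage[node_for(partition_key)].get(partition_key)


store = PartitionedRowStore()
store.write("b0b", {"id": 2, "first_name": "Bob", "last_name": "Vasquez"})
bob = store.read("b0b")
```

Both the write and the read by key touch a single node and a single whole row, which is the access pattern the paragraph above describes as small, highly constrained queries.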


The same is true for other column-family databases such as HBase.

