Unnoticed Gem - HBASE

The last decade in technology and data was dominated by Hadoop and NoSQL. Organizations raced to adopt these technologies and did their best to set up big data and data science environments. Open source communities fueled this rat race by frequently adding new products to the Hadoop ecosystem. This cycle of hype and disruption may continue for the next few years. There is a lot to say about each Hadoop component and its rise and fall. In this article I would like to focus on an unnoticed gem: HBASE.

All of this was triggered by Google's papers on the Google File System (GFS) and Bigtable. HDFS evolved from GFS and is still the center of attention. Surprisingly, HBASE, which is built on the Bigtable model, could not capture the imagination of the crowd.

This information is taken from the DB-Engines ranking site: https://db-engines.com/en/ranking_trend

What went wrong with HBASE?

NoSQL databases like MongoDB and Cassandra provided alternatives to HBASE. Commercial support and stronger marketing behind these databases gave them the edge. Many organizations settled on Hive with HDFS as the better option for data exploration. A shortage of the technical expertise that HBASE requires also hurt its popularity. Slowly this gem was ignored and, in many cases, forgotten.

Is HBASE required at all?

In my view, adopting HDFS without setting up a layer of HBASE on top of it looks like a big mistake. HDFS files are effectively write-once: data can be appended but not updated in place, which makes plain HDFS a poor fit for warehouse and ETL use cases. Ironically, organizations built their own smart solutions as a workaround for this. Few realized that a superior, ready-made solution was already available in the form of HBASE.

HBASE is one of the cheapest options for storing billions of records that can be updated and retrieved randomly. Its selective record retrieval is quite fast too. It can handle a massive volume of inserts and updates (transactions per second). These features can make it a perfect backend system for many use cases.
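To picture how a store like HBASE can offer random updates on top of write-once files, here is a toy sketch in Python. It is not HBASE code: the class name, the `flush_threshold` parameter, and the use of plain dictionaries as stand-ins for immutable files are all assumptions made for illustration. The idea it shows is that writes land in an in-memory buffer, the buffer is periodically written out as a new file that is never modified, and reads check the newest data first.

```python
from typing import Dict, List, Optional


class TinyLogStructuredStore:
    """Toy model (hypothetical): a mutable key-value view over write-once files."""

    def __init__(self, flush_threshold: int = 2) -> None:
        # In-memory buffer that absorbs inserts and updates.
        self.memstore: Dict[str, str] = {}
        # Each dict stands in for a file that is written once and never changed.
        self.immutable_files: List[Dict[str, str]] = []
        self.flush_threshold = flush_threshold

    def put(self, row_key: str, value: str) -> None:
        """Insert or update a row; only the in-memory buffer is touched."""
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Write the buffer out as a new file and start with an empty buffer."""
        if self.memstore:
            self.immutable_files.append(dict(self.memstore))
            self.memstore = {}

    def get(self, row_key: str) -> Optional[str]:
        """Return the newest value: buffer first, then files from newest to oldest."""
        if row_key in self.memstore:
            return self.memstore[row_key]
        for stored_file in reversed(self.immutable_files):
            if row_key in stored_file:
                return stored_file[row_key]
        return None
```

An update to an existing row never rewrites an old file; the newer value simply shadows the older one at read time. A real system also needs compaction, deletes, and a write-ahead log, which this sketch leaves out.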

What's next? What makes me call it a gem?

Apache Phoenix is trying to make HBASE much simpler for SQL users. HBASE with Phoenix can act as a platform for low-latency queries and data discovery.

HBASE comes with a REST API, which makes it a ready-to-plug-in backend for digital channels. I was able to build a webpage in a few minutes to retrieve and display data from HBASE.
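As a rough illustration of what such a retrieval can look like, here is a minimal Python sketch that reads one row through the REST gateway. It assumes a REST server is already running and reachable at a base URL such as `http://localhost:8080` (a placeholder, not taken from my setup), and that the table name and row key you pass in exist. As far as I understand the gateway's JSON representation, row keys, column names, and values come back base64-encoded under `Row`, `key`, `Cell`, `column`, and `$`; the helper names `decode_row` and `fetch_row` are my own.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Dict


def decode_row(payload: dict) -> Dict[str, Dict[str, str]]:
    """Convert REST JSON into {row_key: {"family:qualifier": value}}."""
    rows: Dict[str, Dict[str, str]] = {}
    for row in payload.get("Row", []):
        # Keys, column names and values are assumed to be base64-encoded UTF-8.
        key = base64.b64decode(row["key"]).decode("utf-8")
        cells: Dict[str, str] = {}
        for cell in row.get("Cell", []):
            column = base64.b64decode(cell["column"]).decode("utf-8")
            cells[column] = base64.b64decode(cell["$"]).decode("utf-8")
        rows[key] = cells
    return rows


def fetch_row(base_url: str, table: str, row_key: str) -> Dict[str, Dict[str, str]]:
    """GET <base_url>/<table>/<row_key> and decode the JSON response."""
    url = "{}/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(table, safe=""),
        urllib.parse.quote(row_key, safe=""),
    )
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return decode_row(json.load(response))
```

A call such as `fetch_row("http://localhost:8080", "customers", "row1")` (hypothetical table and key) would return a plain dictionary that a web page can render directly.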

I am always a fan of AWS :). Recently I tried an Amazon EMR cluster with HBASE. AWS has made HBASE installation and configuration quite simple. The most remarkable feature is the possibility of using S3 as the storage layer for HBASE. I am fairly sure that in the coming years we will see HBASE rising in popularity.



