Getting Started with Apache Zeppelin with Airbnb

Link to the full blog post here.

I've been playing around with Apache Zeppelin for a few months now (at first it was less playing and more frustration while getting it to work). Now that I use it regularly, I find it incredibly useful for data visualization and business intelligence.

Apache Zeppelin describes itself as "a web-based notebook that enables interactive data analytics". Think of it as an IPython notebook for interactive visualizations that supports more languages than just Python for munging your data. Once I got PySpark working on it, it became my go-to for displaying business data and analytics. Right now it has only a handful of chart options: bar graphs, line graphs, pie charts, and scatter plots. It's also currently incubating at Apache, and it's open source!
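To make the "munge your data, then chart it" idea concrete, here is a minimal sketch in plain Python. It relies on Zeppelin's display system, which renders paragraph output that starts with `%table` (tab-separated columns, one row per line) as a table you can then switch to a bar, line, pie, or scatter chart. The helper name `to_zeppelin_table` and the sample rows are made up for illustration and are not from any real dataset.

```python
def to_zeppelin_table(header, rows):
    """Format a header and rows as a string for Zeppelin's %table display.

    Columns are separated by tabs and rows by newlines, which is the
    layout Zeppelin's display system expects after the %table marker.
    """
    lines = ["\t".join(str(col) for col in header)]
    for row in rows:
        lines.append("\t".join(str(cell) for cell in row))
    return "%table " + "\n".join(lines)


# Hypothetical sample data, invented purely for illustration.
header = ["room_type", "listings"]
rows = [("Entire home", 120), ("Private room", 80), ("Shared room", 15)]

# Inside a Zeppelin Python or PySpark paragraph, printing this string
# should render a table with the chart-type buttons above it.
print(to_zeppelin_table(header, rows))
```

Outside Zeppelin the same call simply prints tab-separated text, which makes the formatting easy to check before pasting it into a notebook.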

 

At the company level, I think Zeppelin is probably the easiest new visualization tool to introduce, because the queries behind each chart can be written in Spark SQL. SQL comes naturally to analysts and most product managers, so once the data is loaded and formatted appropriately, anyone who knows SQL can start building their own visualizations with much less effort.
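As a rough illustration of the kind of query an analyst might write, here is a small sketch. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL so that it runs anywhere; in Zeppelin, a similar `SELECT` would sit in a `%sql` paragraph against a table registered in Spark. The `listings` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3


def average_price_by_neighbourhood(rows):
    """Return (neighbourhood, average price) pairs, sorted by neighbourhood.

    sqlite3 stands in for Spark SQL here; the table and column names
    are hypothetical and chosen only for this example.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE listings (neighbourhood TEXT, price REAL)")
        conn.executemany("INSERT INTO listings VALUES (?, ?)", rows)
        query = """
            SELECT neighbourhood, AVG(price) AS avg_price
            FROM listings
            GROUP BY neighbourhood
            ORDER BY neighbourhood
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Made-up sample rows, not real data.
sample = [("Mission", 100.0), ("Mission", 200.0), ("SoMa", 150.0)]

for neighbourhood, avg_price in average_price_by_neighbourhood(sample):
    print(neighbourhood, avg_price)
```

A grouped result like this (one label column, one numeric column) is the shape that maps most directly onto a bar or pie chart.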

Ultimately it looks and functions a bit like Tableau, minus the thousands of dollars for a Tableau license. The hope is that, as the data grows, Zeppelin's speed and functionality scale linearly when it runs on a Spark cluster. I believe the end goal is to push huge amounts of data through it and potentially visualize billions of data points, with Spark doing most of the heavy lifting.

Well, how does it work?! I'll show a quick demo of the install and then some initial code. Here's the link to the current Zeppelin GitHub repo.

Read more at the full blog post here
