Getting Started with Apache Zeppelin with Airbnb

Jay Feng

Land your dream data science job with Interview Query

Published: December 28, 2015

I've been playing around with Apache Zeppelin for a few months now (at first it was less playing and more frustration at getting it to work). After using it consistently for a while, I find it incredibly useful for data visualization and business intelligence.

Apache Zeppelin describes itself as "a web-based notebook that enables interactive data analytics". Think of it as an IPython notebook for interactive visualizations, but one that supports more languages than just Python for munging your data. Once I got PySpark working on it, it became my go-to for displaying business data and analytics. Right now it has only a handful of chart options: bar graphs, line graphs, pie charts, and scatter plots. It's also open source and currently in incubation at Apache!

At the company level, I've found it is probably the best way to introduce a new visualization tool, because the queries behind each chart can be written in Spark SQL. SQL comes naturally to all analysts and most product managers, so anyone who knows SQL can start building their own visualizations with a lot more ease, provided the data is loaded and formatted appropriately.
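To make that concrete, here is a minimal sketch of the kind of aggregate query an analyst might write to feed a bar chart. It is not Zeppelin or Spark code: Python's built-in sqlite3 stands in for Spark SQL so the snippet runs anywhere, and the `bookings` table, its columns, and the sample rows are all made up for illustration. The last helper formats the result as tab-separated text prefixed with `%table`, which, as far as I understand Zeppelin's display system, is the form it renders as a table or chart.

```python
import sqlite3

# Hypothetical sample data; the table and column names are illustrative only.
BOOKINGS = [
    ("Paris", 120.0),
    ("Paris", 80.0),
    ("Tokyo", 200.0),
    ("Lisbon", 50.0),
]

# The kind of query an analyst would type into a SQL paragraph.
QUERY = """
    SELECT city, COUNT(*) AS bookings, SUM(price) AS revenue
    FROM bookings
    GROUP BY city
    ORDER BY revenue DESC, city
"""


def run_query(rows):
    """Load rows into an in-memory SQLite table (a stand-in for Spark SQL)
    and return the column names plus the aggregated result rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE bookings (city TEXT, price REAL)")
        conn.executemany("INSERT INTO bookings VALUES (?, ?)", rows)
        cursor = conn.execute(QUERY)
        header = [col[0] for col in cursor.description]
        return header, cursor.fetchall()
    finally:
        conn.close()


def to_zeppelin_table(header, rows):
    """Format a result set as '%table' text: tab-separated columns,
    one row per line, header row first."""
    lines = ["\t".join(header)]
    lines += ["\t".join(str(value) for value in row) for row in rows]
    return "%table " + "\n".join(lines)


if __name__ == "__main__":
    header, result = run_query(BOOKINGS)
    print(to_zeppelin_table(header, result))
```

In Zeppelin itself the query would run against a table registered through Spark rather than SQLite, and the chart type would be picked from the buttons above the result.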

Ultimately it looks and functions a bit like Tableau, minus the thousands of dollars for a Tableau license. And as the data grows, the hope is that Zeppelin's speed and functionality scale linearly when it runs on a Spark cluster. I believe the end goal is to push huge amounts of data through it and potentially visualize billions of data points, with Spark doing most of the heavy lifting.

So how does it work? I'll walk through a quick demo of the install and then some initial code. Here's the link to the current Zeppelin GitHub repo.
