Spark RDDs Vs DataFrames vs SparkSQL - Part 3: Web Server Log Analysis

This is the third tutorial in the Spark RDDs Vs DataFrames vs SparkSQL blog post series. The first one is available here and the second one is here. In the first part, we saw how to retrieve, sort and filter data. In the second part, we saw how to work with multiple tables. In this tutorial, we will see how to analyze web server logs. If you like this tutorial series, check out my other recent Spark blog posts, Analyzing the Bible and the Quran using Spark and Spark DataFrames: Exploring Chicago Crimes. The data and the notebooks can be downloaded from my GitHub repository.

The article for this blog post is available here.

All five parts, more than 100 pages in total, are available in PDF format here.
