Introducing English SDK for Spark
Created by the English SDK for Spark

Spark sees 1B+ downloads per year.

Spark's popularity is not surprising, given how many data engineering, data science, and data analytics tools have Spark capabilities. I have personally used Spark to pull in terabytes of data in both batch and streaming modes without having to change much of my code.

The biggest challenge when I was learning Spark was the coding language: having to look up the slight differences between PySpark and plain Python, or figuring out how to transform a DataFrame and plot it. Fortunately, this will soon be less of an issue with the English SDK for Spark.

In this tutorial, I will show you how to implement the English SDK for Spark with a California Housing sample dataset. This is just a small glimpse into what the future holds for this tool.

The code can be found here and was run in Google Colaboratory with the provided sample dataset.

Read more below.

#spark #datascience #ml #dataengineering #data

An Vanna

Head of Database & Data Governance | Exadata X9M | Oracle OCP 11g | ADB Cloud Specialist 2019

1y

Thank you for sharing

Colin Manko

Data Engineer | Python Developer | Data & Software Design

1y

I'm concerned about the esoteric bugs, though it's a great step toward making Spark more usable. Thanks for sharing!

Manoj Kumar

Data Analytics Manager @KPMG UK | Generative AI enthusiast

1y
KRISHNAN N NARAYANAN

Sales Associate at American Airlines

1y

Thanks for sharing

Udaykiran Noti

Software engineer

1y

Cf


More articles by Sarah Floris, MS
