Why Apache Spark?

Amir Maleki

Senior Data Solution Designer at Australian Energy Market Operator (AEMO)

Published: February 12, 2019


Let's review and polish the title. Maybe the main question here is "Why not Apache Spark?"

When the main focus is working with big data and streaming data, we need four things: speed, ease of use, coverage of all our needs, and the ability to run on different data sources. I will focus on these four points to see whether Spark could be a good solution or ...

Speed

Run workloads up to 100x faster than Hadoop MapReduce, according to apache.org. (Wow)

Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.
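To give a feel for what a DAG scheduler has to work with, here is a toy model in plain Python. It is not Spark's API and not how Spark is implemented; the class `LazyDataset` and its methods are hypothetical names made up for this sketch. The idea it illustrates is that transformations are only recorded as a plan, and the work happens in one pass when an action such as `collect` is called.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark's API)."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)  # the recorded plan; nothing has run yet

    def map(self, fn):
        # Transformation: extend the plan and return a new dataset.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: extend the plan and return a new dataset.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def plan(self):
        # Names of the recorded steps, in order.
        return [kind for kind, _ in self._steps]

    def collect(self):
        # Action: apply every recorded step to each element in a single pass.
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


numbers = LazyDataset(range(10))
pipeline = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
planned = pipeline.plan()     # the plan exists before any element is touched
result = pipeline.collect()   # the work happens here
```

Because the whole plan is known before anything runs, an engine is free to reorder or fuse steps, which is the kind of opportunity a real scheduler and optimizer can exploit.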

Ease of Use

Write applications quickly in Java, Scala, Python, R, and SQL. (Amazing)

Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells.

Generality

Combine SQL, streaming, and complex analytics.

Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.

Runs Everywhere

Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources.

You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
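From the application's point of view, these cluster managers differ mainly in the master URL handed to Spark. The helper below is a hypothetical convenience function, not part of Spark; it only builds the URL strings in the formats Spark documents, and the host names used with it are placeholders.

```python
def master_url(target, host=None, port=None):
    """Build a Spark master URL for a cluster manager (hypothetical helper)."""
    if target == "yarn":
        # The cluster location comes from the Hadoop configuration, not the URL.
        return "yarn"
    if host is None:
        raise ValueError(f"{target} needs a host")
    if target == "standalone":
        return f"spark://{host}:{port or 7077}"   # 7077 is the default master port
    if target == "mesos":
        return f"mesos://{host}:{port or 5050}"   # 5050 is the default Mesos port
    if target == "kubernetes":
        if port is None:
            raise ValueError("kubernetes needs the API server port")
        return f"k8s://https://{host}:{port}"
    raise ValueError(f"unknown target: {target}")


# Placeholder hosts, for illustration only.
standalone = master_url("standalone", "master.example.com")
kubernetes = master_url("kubernetes", "k8s.example.com", 6443)
yarn = master_url("yarn")
```

A string like these is what would be passed to `SparkSession.builder.master(...)` or to the `--master` option of `spark-submit`.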

[ Source: apache.org ]
