Course: Apache Spark Deep Learning Essential Training
The origins of Spark and Databricks
- [Instructor] Spark started in 2009 as a research project in the UC Berkeley RAD Lab. The researchers in the lab had previously been working on Hadoop MapReduce and observed that MapReduce was inefficient for iterative and interactive computing jobs. So, right from the beginning, Spark was designed to be fast for interactive queries and iterative algorithms. It brought in ideas like support for in-memory storage and efficient fault recovery. Research papers about Spark were published at academic conferences, and soon after its creation it was already 10 to 20 times faster than MapReduce for certain jobs. In Matei's 2009 paper, the authors say that while Spark was still a working prototype, the performance results they were getting were very encouraging. Even at that time, Spark could outperform Hadoop on machine learning workloads by a factor of 10, and you can see this on page five of their paper. As part of their experiments into Spark's performance, they performed a logistic regression job…