Self-Learn Yourself Apache Spark in 21 Blogs – #8
Kumar Chinnakali
In this blog, let's discuss how to load data, what lambdas are, how to transform data, and more on transformations. If you'd like a quick read of the other blogs in this learning series, do check them out.
Apache Spark can load data from input sources like HDFS, S3, Cassandra, an RDBMS, Parquet, and Avro, as well as from in-memory collections. Let's see how we can use these from the command line; a short sketch follows each list below.
Memory Loading Methods
- parallelize
- makeRDD
- range
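For instance, here is a minimal sketch of the in-memory methods from the spark-shell, where the SparkContext is available as sc (the sample values are placeholders, just for illustration):
// Distribute a local collection as an RDD
val rdd1 = sc.parallelize(Seq(1, 2, 3, 4, 5))
// makeRDD is an alias of parallelize for Scala collections
val rdd2 = sc.makeRDD(Seq("a", "b", "c"))
// range creates an RDD of Longs from start (inclusive) to end (exclusive)
val rdd3 = sc.range(0, 10)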
External Loading Methods
- textFile
- wholeTextFiles
- sequenceFile("file:///Data/SampleSequenceFile", classOf[Text], classOf[IntWritable])
- objectFile
- hadoopFile
- newAPIHadoopFile
- hadoopRDD
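As a sketch, assuming a local file at file:///Data/Sample.txt and a directory file:///Data/ (placeholder paths), the file-based methods look like this in the spark-shell:
import org.apache.hadoop.io.{IntWritable, Text}
// textFile: one record per line of the file(s)
val lines = sc.textFile("file:///Data/Sample.txt")
// wholeTextFiles: one (filename, content) pair per file in a directory
val files = sc.wholeTextFiles("file:///Data/")
// sequenceFile: key/value records stored as Hadoop Writable types
val seq = sc.sequenceFile("file:///Data/SampleSequenceFile", classOf[Text], classOf[IntWritable])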
Now let's discuss what a lambda expression is; lambdas have already appeared in a few examples and will appear in later examples too. Lambda expressions are also known as anonymous functions. Below is a lambda expression:
rdd.flatMap(line => line.split(" "))
Let us now discuss how to convert a named method to a lambda expression.
Named method:
def addOne(item: Int) = {
  item + 1
}
val intList = List(1, 2)
for (item <- intList) yield {
  addOne(item)
}
Lambda:
def addOne(item: Int) = {
  item + 1
}
val intList = List(1, 2)
intList.map(x => {
  addOne(x)
})
It can still be fine-tuned by inlining the function body into the lambda:
val intList = List(1, 2)
intList.map(item => item + 1)
One more note: Scala supports multi-line lambdas via curly braces, as sketched below.
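For example, with curly braces the lambda body can span several statements, and the last expression is the result:
val intList = List(1, 2)
intList.map { item =>
  val doubled = item * 2  // intermediate statement
  doubled + 1             // result: List(3, 5)
}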
Now let's discuss how to apply transformations to derive meaningful information.