Why do Java parallel streams perform poorly?
Java 8 brought us lots of cool new features such as lambda expressions, new date/time APIs, streams, functional interfaces, etc. I have been using Java 8 for quite some time now. One topic that usually comes up in tech meetings is when to use serial streams and when to use parallel streams. I have read some articles on why parallel streams sometimes perform worse than serial streams. Let's talk about this.
Parallel streams use the FORK/JOIN framework, introduced in Java 7, which creates a thread pool sized to the number of cores in the system.
The FORK operation splits a task into smaller tasks. The smaller tasks are split again until a task cannot be split any further.
The JOIN operation collects the results of the split tasks and merges them into the final result.
Since this is a multi-threaded system, there is overhead in managing the threads, allocating tasks, and context switching.
Parallel streams work exactly like this; the diagram above illustrates the process.
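As a minimal sketch of this fork/join behaviour (the class and method names are mine, not from any library), the snippet below sums a range in parallel: the range is forked into subtasks on the common ForkJoinPool and the partial sums are joined back together.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

public class ParallelSum {
    // Sum 1..1_000_000 in parallel: the range is split into chunks,
    // each chunk is summed on a common-pool worker, and the partial
    // sums are combined (joined) into the final result.
    static long parallelSum() {
        return IntStream.rangeClosed(1, 1_000_000)
                .parallel()
                .asLongStream()
                .sum();
    }

    public static void main(String[] args) {
        // The common pool's size is derived from the number of cores.
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println("Sum: " + parallelSum()); // 500000500000
    }
}
```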
Before using a parallel stream on any source, we should always analyse how hard it is to split that source, i.e. how costly it is to jump to the middle element of the stream.
Say, for example, we are splitting a stream built from a collection:
Array: easy to split in half (using the index)
ArrayList: easy again, using the index
HashSet/TreeSet: can be split in half with moderate effort
LinkedList: HARD to split in half, because we must traverse the first half to find the split point
In some cases it may be worth converting a LinkedList to an array first and then using a parallel stream.
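A quick sketch of that workaround (the helper name `sumViaArray` is mine): copy the LinkedList into an array-backed source once, so the splitter gets O(1) access to the middle element instead of traversing links.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class LinkedListSplit {
    // A LinkedList splits poorly, so we pay one O(n) copy up front
    // and then stream over the array, which splits in O(1) by index.
    static int sumViaArray(List<Integer> linked) {
        Integer[] arr = linked.toArray(new Integer[0]);
        return Arrays.stream(arr)
                .parallel()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>();
        for (int i = 1; i <= 100; i++) linked.add(i);
        System.out.println(sumViaArray(linked)); // 5050
    }
}
```

Whether the copy pays off depends on the list size and the cost of the per-element work; it is worth measuring rather than assuming.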
Now let's analyse the different types of operations and how they perform in parallel streams.
Parallel Friendly Operations:
These operations perform well in parallel streams, e.g.
-> filter
-> map
-> flatMap
These operations work on each element independently, so the partial results only need to be concatenated.
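To illustrate (the class and method names here are just for the example), filter and map each look at one element at a time, so every chunk can be processed on its own and the per-chunk results simply concatenated in encounter order:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StatelessOps {
    // filter and map are stateless per-element operations: each chunk
    // is filtered and mapped independently, then the chunk results are
    // concatenated, preserving the original encounter order.
    static List<Integer> evenSquares(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(10)); // [4, 16, 36, 64, 100]
    }
}
```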
Parallel Unfriendly Operations:
These are not very efficient in parallel streams, e.g.
-> limit(n): needs to know how many elements have already been consumed, to decide whether the current element should be included or not
-> takeWhile(predicate) (Java 9): needs to know whether the predicate was already violated by a previous element
-> dropWhile(predicate) (Java 9): the mirror image of takeWhile; needs to know whether a previous element already failed the predicate
Their result depends on the past, so different chunks of elements cannot be processed independently.
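A small sketch of the limit(n) case (class and method names are illustrative): on an ordered parallel stream, limit(3) must still return the first three elements, which forces the chunks to coordinate on how many elements have been taken so far.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class LimitCost {
    // limit(3) on an ordered parallel stream is still correct, but the
    // chunks must coordinate so that only the FIRST three elements of
    // the encounter order survive, which is cross-chunk bookkeeping.
    static List<Integer> firstThree() {
        return IntStream.rangeClosed(1, 1_000)
                .parallel()
                .limit(3)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstThree()); // [1, 2, 3]
    }
}
```

If any three elements would do, relaxing the ordering (e.g. via unordered()) makes limit much cheaper, because no chunk has to wait to learn whether its elements come "first".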
Parallel Unfriendly Intermediate Operations:
Operations like sorted() and distinct() can work on different chunks of data, but the merge step requires some reprocessing.
sorted(): the merge step needs to sort the combined data from the different chunks again
distinct(): same as sorted(), merging needs some reprocessing. It can be sped up by making the stream unordered
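A sketch of that speed-up (the class and method names are mine): when we only need the set of distinct values, declaring the stream unordered lets distinct() skip the work of preserving encounter order across chunks.

```java
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UnorderedDistinct {
    // Mapping i -> i % 10 produces many duplicates of the values 0..9.
    // unordered() before distinct() tells the pipeline we do not care
    // which duplicate survives, so chunks can deduplicate more freely.
    static Set<Integer> distinctMod(int n) {
        return IntStream.range(0, n)
                .parallel()
                .map(i -> i % 10)
                .boxed()
                .unordered()
                .distinct()
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(distinctMod(1_000).size()); // 10
    }
}
```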
Standard terminal operations are parallel friendly provided their functional arguments are stateless,
e.g. forEach, count, allMatch, reduce, sum, max, min
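For instance (illustrative names again), reduce with a stateless, associative accumulator combines the partial results from each chunk without any shared mutable state:

```java
import java.util.stream.IntStream;

public class StatelessReduce {
    // (a, b) -> a * b is associative and touches no external state,
    // so each chunk computes its own partial product and the partial
    // products are multiplied together at the join step.
    static int product(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .reduce(1, (a, b) -> a * b);
    }

    public static void main(String[] args) {
        System.out.println(product(5)); // 120
    }
}
```

A non-associative accumulator (such as subtraction) would give different results depending on how the chunks happen to split, which is exactly why statelessness and associativity matter here.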
Collectors:
Standard collectors are parallel friendly, e.g. toList, toSet, etc.
Grouping collectors are relatively efficient, e.g. toMap, groupingBy, although merging the per-chunk maps adds some overhead.
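As a sketch of avoiding that merge overhead (the class and method names are mine): groupingBy builds one map per chunk and merges them, while groupingByConcurrent, on an unordered parallel stream, writes into a single shared ConcurrentMap instead.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class GroupingExample {
    // groupingByConcurrent accumulates all chunks into one shared
    // ConcurrentMap, skipping the per-chunk map merge that plain
    // groupingBy would perform. Counts are order-independent, so
    // giving up encounter order costs us nothing here.
    static ConcurrentMap<Boolean, Long> countByParity(int n) {
        return IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(Collectors.groupingByConcurrent(
                        i -> i % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByParity(10)); // {false=5, true=5}
    }
}
```

Note that the concurrent variant does not preserve encounter order within groups, so it fits order-insensitive downstreams like counting() better than order-sensitive ones.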
So now we have an idea that different sources and different operations vary in their parallel behaviour. If parallel streams are performing poorly for you, it could be down to inappropriate use, i.e. running parallel streams over hard-to-split sources or through parallel-unfriendly operations.