登录查看更多内容

Array Manipulation Hackerrank problem analysis and solution optimization (prefix sum + difference array).

Yovo Manolov

Software Engineer

发布日期: 2024年6月6日

Hi, guys, today I want to analyze the Array Manipulation problem from Hackerrank. We will take a look at a suboptimal solution, which is naturally easiest to figure out. Then we will discuss, analyze, and understand an optimized version that solves this problem which uses prefix sum and a difference array to achieve the needed time complexity.

For starters, this is the link to the problem: https://www.hackerrank.com/challenges/crush/problem?h_l=interview%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=arrays

Short problem description:

So, we have "n" which is the length of the array that we will populate, taking the data from the double linkedList queries parameter.

    public static long arrayManipulation(int n, List<List<Integer>> queries) {
    // Write your code here
    }

For example, the linked list contains the following data:

queries = [[1,5,3],[4,8,7],[6,9,1]]

each sublist has 3 integers e.g. [1,5,3]

(1) is the one-based index, from which we need to start populating the n-length array.
(5) is also a one-based index where we need to stop populating the array.
(3) that's the value that we will use to populate the array.

When I say populate In the task the population is an addition operation.

The expected return value of the method after all iterations is to return the max element value from the array with length n

In the example above that's the value 10.

Suboptimal solution:

So my solution was not optimal but it demonstrates the basic logic that needs to be applied to solve the problem:

   public static long arrayManipulation(int n, List<List<Integer>> queries) {
              
        long maxVal = Integer.MIN_VALUE;
        
        int[] calcArray = new int[n];  // the array in which the data will be calculated
        
        for(List<Integer> q : queries){  //queries length  can be up to 2*10^5 
            int fromIndex = q.get(0);        
            int toIndex = q.get(1);
            int valueToApply= q.get(2);
            
             // here I use the same array to do the addition operation as well as 
            // finding the max value in the array.
            for(int j=fromIndex-1 ;j < toIndex  ;j++){
                calcArray[j] += valueToApply;    
                if(maxVal < calcArray[j]) maxVal = calcArray[j];
            }        
        }
        
        return maxVal;
    }

Worst case time complexity of this solution is O(m*n)

where:

m - number of queries

n - is the scope (fromIndex-1 - toIndex-1)

Space complexity:

The array:

 calcArray is of size n, so it requires O(n) space.

The Queries:

queries list contains 'm' sublists each of 3 integers. The list requires O(m) space.

So the overall space complexity is O(n+m)

The optimised approach

The optimized approach requires a technique named "Prefix sum"

So what is prefix sum:

Input: arr[] = {10, 20, 10, 5, 15}
Output: prefixSum[] = {10, 30, 40, 45, 60}

Explanation: While traversing the array, update the element by adding it with its previous element. 

prefixSum[0] = 10, 
prefixSum[1] = prefixSum[0] + arr[1] = 30, 
prefixSum[2] = prefixSum[1] + arr[2] = 40 and so on.

How do we combine the prefix sum approach with diffArray to achieve time complexity of O(n+m) for the final updated array, instead of O(n*m) for the previous example

领英推荐

C++23: A New Way of Error Handling with std::expected

Rainer Grimm 1 年前

LeetCode Curated SQL Solutions and Discussion - Week 3…

Ankur Ranjan 2 年前

Optimizing JVM performance with 'jstat'

Prateek Jain 2 个月前

The purpose of the "difference array" in the prefix sum approach is to efficiently handle range updates on an array, instead of directly updating each element in the specified range of each query. The differenceArray allows us to make constant-time updates, which can be processed to get the final array values.

Let's take a look at a step-by-step explanation of how the difference array works:

Initialization:

First, we create a difference array of size n+1 to handle the range updates at the boundaries.

Applying updates:

For a query that adds a value v from index a to b:

Increment diffArray[a-1] by v (starting the increment at the index a).

Decrement diffArray[b] by v (ending the increment after the index b).

Calculating final values:

We use the prefix sum approach described above applied to the difference array to calculate the final array.

The prefix sum of the diffArray at any point gives us the net effect of all updates up to that index.

Let's see a simple example of the diff array and prefix sum methods combined:

If we have the following conditions:

n = 5  // size of the array

queries = [
    [1, 2, 100],
    [2, 5, 100],
    [3, 4, 100]
]

1. we initialize the diffArray which must have a size of n+1 = 6

diffArray = [0, 0, 0, 0, 0, 0]

2. We apply the first query [1,2,100]

//Increment diffArray[a -1] by v  (index 0 by 100)
// Decrement diffArray[b] by v (decr index 2 by 100)

diffArray = [100, 0, -100, 0, 0, 0]

3. We apply the second query [2,5,100]

//Increment diffArray[a-1] by v    ( index 1 by 100 ) 
// Decrement diffArray[b] by v  (decr index 5 by 100 )

diffArray = [100, 100, -100, 0, 0, -100]

4. We apply the third query [3,4, 100]

//Increment diffArray[a-1] by v    ( index 2 by 100 )  from -100 to 0
// Decrement diffArray[b] by v (decr index 4 by 100 ) from 0 to -100

diffArray = [100, 100, 0, 0, -100, -100]

5. Calculate the final values using the prefix sum:

currentSum = 0
finalArray[0] = currentSum += diffArray[0] ;
finalArray[1] = currentSum += diffArray[1] ;
finalArray[2] = currentSum += diffArray[2] ;
finalArray[3] = currentSum += diffArray[3] ;
finalArray[4] = currentSum += diffArray[4] ;

result: 
finalArray = [100, 200, 200, 200, 100]

Using the difference array, each query update is done in O(1) time complexity, and calculating the final values from the difference array takes O(n) time, making the overall time complexity O(n+m), which makes this approach significantly more time efficient than directly updating the array for each query. The space complexity is O(n) for the diff array and O(m) for the queries.

Thus, the optimized space complexity remains O(n+m)

This is an example of the prefix sum + difference array problem solution :

public static long arrayManipulation(int n, List<List<Integer>> queries) {
    long[] diffArray = new long[n + 1];  // Using a diff array for range updates

    // Process each query and update the diffArray
    for (List<Integer> q : queries) {
        int fromIndex = q.get(0) - 1;
        int toIndex = q.get(1);
        int valueToApply = q.get(2);

        diffArray[fromIndex] += valueToApply;
        if (toIndex < n) {
            diffArray[toIndex] -= valueToApply;
        }
    }

    // Find the maximum value using the prefix sum array
    long maxVal = Long.MIN_VALUE;
    long currentSum = 0;
    for (int i = 0; i < n; i++) {
        currentSum += diffArray[i];
        if (currentSum > maxVal) {
            maxVal = currentSum;
        }
    }

    return maxVal;
}

Implementation tips for Java:

the parallel prefix implementation could be replaced by the

prefixedSumArray = Arrays.parallelPrefix(diffArray, Math::addExact);

also extracting the max value could be replaced by

Collections.max(Arrays.asList(arr));

If you reached the end, I'm glad the article was interesting for you : )

要查看或添加评论，请登录

Yovo Manolov的更多文章

Special string again (string manipulation) + time and space complexity analysis

2024年7月28日

Special string again (string manipulation) + time and space complexity analysis

Summary The problem is the following: Given a string s of length n, determine the number of special substrings within…

1 条评论
Fraudulent activity problem from Hackerrank - from the sorting problems. Analysis of three different solutions. ( min-max heap, counting sort)

2024年6月25日

Fraudulent activity problem from Hackerrank - from the sorting problems. Analysis of three different solutions. ( min-max heap, counting sort)

Reference link to the problem: https://www.hackerrank.
Sherlock and anagrams (hacker rank) from the Dictionaries and Hashmaps section + extensive time and space complexity analysis.

2024年6月8日

Sherlock and anagrams (hacker rank) from the Dictionaries and Hashmaps section + extensive time and space complexity analysis.

Hi, guys, today we will discuss the problem "Sherlock and anagrams" The link to the problem is the following:…
New Year Chaos - HackerRank algorithmic problem - clarifying comments

2024年6月4日

New Year Chaos - HackerRank algorithmic problem - clarifying comments

Hi, guys, I want to discuss an algorithmic problem that's probably easy for most of you, but initially, I found it hard…

Array Manipulation Hackerrank problem analysis and solution optimization (prefix sum + difference array).

Yovo Manolov

Software Engineer

Short problem description:

Suboptimal solution:

The optimised approach

领英推荐

Yovo Manolov的更多文章

社区洞察

其他会员也浏览了

The Type-Traits Library: std::is_base_of

An Infinite Data Stream with Coroutines in C++20

Unpacking a UPX Packed Sample with IDA

Advanced State Management in React

Difference b/w Cell and RefCell

"Magic behind the numbers"

Using Geocode by Awesome Table

First things first – Timing graph

Introduction to PropertyGraphIndex

Handling dates

Short problem description:

Suboptimal solution:

The optimised approach

领英推荐

Yovo Manolov的更多文章

Special string again (string manipulation) + time and space complexity analysis

Fraudulent activity problem from Hackerrank - from the sorting problems. Analysis of three different solutions. ( min-max heap, counting sort)

Sherlock and anagrams (hacker rank) from the Dictionaries and Hashmaps section + extensive time and space complexity analysis.

New Year Chaos - HackerRank algorithmic problem - clarifying comments

社区洞察

其他会员也浏览了

The Type-Traits Library: std::is_base_of

An Infinite Data Stream with Coroutines in C++20

Unpacking a UPX Packed Sample with IDA

Advanced State Management in React

Difference b/w Cell and RefCell

"Magic behind the numbers"

Using Geocode by Awesome Table

First things first – Timing graph

Introduction to PropertyGraphIndex

Handling dates