Array Manipulation Hackerrank problem analysis and solution optimization (prefix sum + difference array).

Array Manipulation Hackerrank problem analysis and solution optimization (prefix sum + difference array).

Hi, guys, today I want to analyze the Array Manipulation problem from Hackerrank. We will take a look at a suboptimal solution, which is naturally easiest to figure out. Then we will discuss, analyze, and understand an optimized version that solves this problem which uses prefix sum and a difference array to achieve the needed time complexity.

For starters, this is the link to the problem: https://www.hackerrank.com/challenges/crush/problem?h_l=interview%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=arrays

Short problem description:


Screenshot from HackerRank


So, we have "n" which is the length of the array that we will populate, taking the data from the double linkedList queries parameter.

    public static long arrayManipulation(int n, List<List<Integer>> queries) {
    // Write your code here
    }        

For example, the linked list contains the following data:

queries = [[1,5,3],[4,8,7],[6,9,1]]

each sublist has 3 integers e.g. [1,5,3]

  • (1) is the one-based index, from which we need to start populating the n-length array.
  • (5) is also a one-based index where we need to stop populating the array.
  • (3) that's the value that we will use to populate the array.


When I say populate In the task the population is an addition operation.

The expected return value of the method after all iterations is to return the max element value from the array with length n

In the example above that's the value 10.


Suboptimal solution:

So my solution was not optimal but it demonstrates the basic logic that needs to be applied to solve the problem:


   public static long arrayManipulation(int n, List<List<Integer>> queries) {
              
        long maxVal = Integer.MIN_VALUE;
        
        int[] calcArray = new int[n];  // the array in which the data will be calculated
        
        for(List<Integer> q : queries){  //queries length  can be up to 2*10^5 
            int fromIndex = q.get(0);        
            int toIndex = q.get(1);
            int valueToApply= q.get(2);
            
             // here I use the same array to do the addition operation as well as 
            // finding the max value in the array.
            for(int j=fromIndex-1 ;j < toIndex  ;j++){
                calcArray[j] += valueToApply;    
                if(maxVal < calcArray[j]) maxVal = calcArray[j];
            }        
        }
        
        return maxVal;
    }        

Worst case time complexity of this solution is O(m*n)

where:

m - number of queries

n - is the scope (fromIndex-1 - toIndex-1)

Space complexity:

  • The array:

 calcArray is of size n, so it requires O(n) space.        

  • The Queries:

queries list contains 'm' sublists each of 3 integers. The list requires O(m) space.        

So the overall space complexity is O(n+m)


The optimised approach

The optimized approach requires a technique named "Prefix sum"

So what is prefix sum:

Input: arr[] = {10, 20, 10, 5, 15}
Output: prefixSum[] = {10, 30, 40, 45, 60}

Explanation: While traversing the array, update the element by adding it with its previous element. 

prefixSum[0] = 10, 
prefixSum[1] = prefixSum[0] + arr[1] = 30, 
prefixSum[2] = prefixSum[1] + arr[2] = 40 and so on.        

How do we combine the prefix sum approach with diffArray to achieve time complexity of O(n+m) for the final updated array, instead of O(n*m) for the previous example

  • The purpose of the "difference array" in the prefix sum approach is to efficiently handle range updates on an array, instead of directly updating each element in the specified range of each query. The differenceArray allows us to make constant-time updates, which can be processed to get the final array values.


Let's take a look at a step-by-step explanation of how the difference array works:

  • Initialization:

First, we create a difference array of size n+1 to handle the range updates at the boundaries.

  • Applying updates:

For a query that adds a value v from index a to b:

Increment diffArray[a-1] by v (starting the increment at the index a).

Decrement diffArray[b] by v (ending the increment after the index b).

  • Calculating final values:

We use the prefix sum approach described above applied to the difference array to calculate the final array.

The prefix sum of the diffArray at any point gives us the net effect of all updates up to that index.


Let's see a simple example of the diff array and prefix sum methods combined:

If we have the following conditions:

n = 5  // size of the array

queries = [
    [1, 2, 100],
    [2, 5, 100],
    [3, 4, 100]
]        

1. we initialize the diffArray which must have a size of n+1 = 6

diffArray = [0, 0, 0, 0, 0, 0]        

2. We apply the first query [1,2,100]

//Increment diffArray[a -1] by v  (index 0 by 100)
// Decrement diffArray[b] by v (decr index 2 by 100)

diffArray = [100, 0, -100, 0, 0, 0]        

3. We apply the second query [2,5,100]

//Increment diffArray[a-1] by v    ( index 1 by 100 ) 
// Decrement diffArray[b] by v  (decr index 5 by 100 )

diffArray = [100, 100, -100, 0, 0, -100]        

4. We apply the third query [3,4, 100]

//Increment diffArray[a-1] by v    ( index 2 by 100 )  from -100 to 0
// Decrement diffArray[b] by v (decr index 4 by 100 ) from 0 to -100

diffArray = [100, 100, 0, 0, -100, -100]        

5. Calculate the final values using the prefix sum:

currentSum = 0
finalArray[0] = currentSum += diffArray[0] ;
finalArray[1] = currentSum += diffArray[1] ;
finalArray[2] = currentSum += diffArray[2] ;
finalArray[3] = currentSum += diffArray[3] ;
finalArray[4] = currentSum += diffArray[4] ;

result: 
finalArray = [100, 200, 200, 200, 100]
        

Using the difference array, each query update is done in O(1) time complexity, and calculating the final values from the difference array takes O(n) time, making the overall time complexity O(n+m), which makes this approach significantly more time efficient than directly updating the array for each query. The space complexity is O(n) for the diff array and O(m) for the queries.

Thus, the optimized space complexity remains O(n+m)


This is an example of the prefix sum + difference array problem solution :

public static long arrayManipulation(int n, List<List<Integer>> queries) {
    long[] diffArray = new long[n + 1];  // Using a diff array for range updates

    // Process each query and update the diffArray
    for (List<Integer> q : queries) {
        int fromIndex = q.get(0) - 1;
        int toIndex = q.get(1);
        int valueToApply = q.get(2);

        diffArray[fromIndex] += valueToApply;
        if (toIndex < n) {
            diffArray[toIndex] -= valueToApply;
        }
    }

    // Find the maximum value using the prefix sum array
    long maxVal = Long.MIN_VALUE;
    long currentSum = 0;
    for (int i = 0; i < n; i++) {
        currentSum += diffArray[i];
        if (currentSum > maxVal) {
            maxVal = currentSum;
        }
    }

    return maxVal;
}        

Implementation tips for Java:

the parallel prefix implementation could be replaced by the

prefixedSumArray = Arrays.parallelPrefix(diffArray, Math::addExact);        

also extracting the max value could be replaced by

Collections.max(Arrays.asList(arr));        

If you reached the end, I'm glad the article was interesting for you : )

要查看或添加评论,请登录

Yovo Manolov的更多文章

社区洞察

其他会员也浏览了