登录查看更多内容

Algorithm For Counting Inversions

Harith Bakhrani

Backend Engineer | DevOps Engineer | Cloud | GCP | Kubernetes | Docker | AWS | Python | Remote

发布日期: 2019年10月30日

+ 关注

Pre-requisites:

Good knowledge of programming
Familiarity with recursion
Familiarity with algorithms

What are Inversions?

Inversion in a list of numbers indicates how far a list is from being sorted. Let us say we have a variable: arr = [1,2,4,3,5,6], in the variable arr we have one inversion of the numbers 4 and 3 because they are in an unsorted order. Thus, the formal definition will be:

Given an array arr, an inversion would occur if arr[i] > arr[j] and i < j.

Just to cement what an inversion is, I would re-define it using other words:

If an element on the left is greater than an element on the right, then that is an inversion.

Why Count Inversions?

At this point, you might be wondering what are some real-life implications of counting inversions, why do we need to count inversions?

There are a couple of scenarios where this might help. One would be to have a numerical similarity measure that quantifies how close two lists are to each other. So, for example, I can create two lists that contain some products that two different people have ranked, beginning from the top-ranked to the least-ranked. From these two lists, I can then calculate inversions which would quantify how similar/dissimilar these two people are to one another. Finding people with similar preferences helps some websites to implement collaborative filtering.

The Worst-Case Scenario

What will the worst-case scenario look like for this algorithm?

In the worst-case scenario, the numbers would be totally inverted, so instead of having a list of numbers as: [ 1 , 2 , 3 , 4 , 5 , 6 ], we would have it as [ 6 , 5 , 4 , 3 , 2 , 1 ]. The mathematical formula for calculating the number of inversions given a list of length n and assuming that the numbers are totally inverted is: n(n-1) / 2.

To Code

Using Brute Force

Using nested loops, we can loop through the list two times. Below is pseudocode for counting the number of inversions in a list:

initialize arr to [1,2,4,3,5,6]
initialize i to 0
initialize inversions to 0

while i is less than length of arr
  initialize j to i+1
  
  while j is less than length of arr
    if arr[i] is greater than arr[j]
      then increment inversions by 1

print out inversions

As you can see we are going through the entire list two times, thus our time complexity for our “brute force” algorithm would be O(n2).

As Software Engineers, the question we need to ask is, can we do better? YES, we can do better!

Divide & Conquer!

If you are familiar with merge sort, then this would be easy for you, if you are not, then worry not, things would make sense as you progress!

Divide and conquer algorithms are used to continuously divide the list into two and then do whatever you would like to do. In the case of merge sort, we recursively divide the list into two and sort it out. Here is the pseudocode for merge sort:

function mergeSort(array) {
  if length of array is equal to 1
    return array

  initialize firstHalf to mergeSort(first half of the array)
  initialize secondHalf to mergeSort(second half of the array)
  initialize sortedArray to an empty array
  initialize i to 0
  initialize j to 0

  while i < length of firstHalf and j < length of secondHalf
    if firstHalf[i] > secondHalf[j]
      append secondHalf[j] to sortedArray
    else
      append firstHalf[i] to sortedArray

  append the remaining elements of firstHalf to sortedArray
  append the remaining elements of secondHalf to sortedArray

  return sortedArray
}

Let me give you a visual tour of what happens during the merge sort. At first, the elements get recursively divided into two until only one element is left:

When only one element is left, it is returned (it does not need sorting, of course). The elements are then merged in sorted order and thus we get our sorted list.

Elements get merged to form the final sorted list

Wait, hold on a second! Where did our count inversions algorithm go???

Having known merge sort, we can now count inversions by keeping track of inversions variable:

function countInversions(array) {
  if length of array is equal to 1
    return array

  midPoint = length of array \ 2

  firstHalf, inversions1 = mergeSort(first half of the array)
  secondHalf, inversions2 = mergeSort(second half of the array)

  initialize sortedArray to an empty array
  initialize i to 0
  initialize j to 0
  initialize inversions to 0

  while i < length of firstHalf and j < length of secondHalf
    if firstHalf[i] > secondHalf[j]
      append secondHalf[j] to sortedArray
      inversions = inversions + (midPoint - i)
    else
      append firstHalf[i] to sortedArray

  append the remaining elements of firstHalf to sortedArray
  append the remaining elements of secondHalf to sortedArray

  totalInversions = inversions1 + inversions2 + inversions

  return sortedArray, totalInversions
}

Voila! There you go! We have a countInversions function that takes in an array as its parameter and spits out a sorted array together with the total number of inversions.

You might be having one question left though: why are we adding the length of the remaining elements in the firstHalf array to the inversions variable (inversions = inversions + (midPoint — i))? Let us assume we have reached a step where we have two lists:

The first check would be between 2 and 1, 2 is greater than 1 and so 1 would come first before 2. The number of inversions is two because the numbers 2 and 3 are both greater than 1. Thus, all the remaining elements in the first list would be less than 1 and their count would add up to the number of inversions we would have!

Practice!

Phew! Too much of theory and pseudocode there! How about you go and try to implement it your self on HackerRank: https://www.hackerrank.com/challenges/ctci-merge-sort/problem?h_l=interview&playlist_slugs%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=sorting

Enjoy ??

Yip Thy Diep Ta

3 年

1 次回应

Elijah Rwothoromo

Sr. Software Developer | Engineer

5 年

This beautifully written! Thank you!

1 次回应

查看更多评论

要查看或添加评论，请登录

Harith Bakhrani的更多文章

Video Edit - Handbrake

2016年9月7日

Video Edit - Handbrake

In the name of Allah , the Entirely Merciful, the Especially Merciful. Hello World! Just a few days back, I landed my…

Algorithm For Counting Inversions

Harith Bakhrani

Backend Engineer | DevOps Engineer | Cloud | GCP | Kubernetes | Docker | AWS | Python | Remote

Pre-requisites:

What are Inversions?

Why Count Inversions?

The Worst-Case Scenario

To Code

Using Brute Force

Divide & Conquer!

Practice!

Harith Bakhrani的更多文章

社区洞察

其他会员也浏览了

A Programmer's Toolkit: Navigating the World of Algorithms

std::format in C++20

Functors, Monads and Their Role in Category Theory: A Short Practical Exploration in Go

Algorithms

The Algorithm

Selecting the Ideal Programming Language for Machine Learning: Comparing C++ and Python

R programming for data science ( A very good course on weekends)

Magic of Hashtables

What is Vectorization in GenAI? Explained!!

Top Programming Languages for AI & Machine Learning in 2019

Pre-requisites:

What are Inversions?

Why Count Inversions?

The Worst-Case Scenario

To Code

Using Brute Force

Divide & Conquer!

Practice!

Harith Bakhrani的更多文章

Video Edit - Handbrake

社区洞察

其他会员也浏览了

A Programmer's Toolkit: Navigating the World of Algorithms

std::format in C++20

Functors, Monads and Their Role in Category Theory: A Short Practical Exploration in Go

Algorithms

The Algorithm

Selecting the Ideal Programming Language for Machine Learning: Comparing C++ and Python

R programming for data science ( A very good course on weekends)

Magic of Hashtables

What is Vectorization in GenAI? Explained!!

Top Programming Languages for AI & Machine Learning in 2019