The Dijkstra You Never Knew

The Dijkstra You Never Knew

THE DIJKSTRA YOU NEVER KNEW: (#BITBYBIT 1)


Here's a question: What's the time complexity of Dijkstra's algorithm (the one we use for single source shortest paths)? (E is the number of edges, V is the number of vertices)

A.) (E+V)logE

B.) (E+V)logV

C.) E + V^2

D.) E + VlogV

I bet most of you would have guessed either option A or B. Well, you know what? Every option is correct in its own way. It depends on the implementation.

First, let's break Dijkstra's algorithm into two steps:

Step 1: You find a vertex 'X' with the minimum achievable distance (also making sure that it wasn't visited before).

Step 2: You relax all its edges (i.e., d[i] = min(d[i], d[x] + wt)).



Breaking down each implementation:

Now, the time complexity of your implementation would exactly depend on these two steps. Let's break down each implementation one by one.

A) (E+V)logE: This complexity can be achieved using a priority queue (based on binary heaps). Since we cannot delete nodes in such a structure (only popping from the top is allowed), there can be a maximum of E nodes in the priority queue at a given time. Thus, a pop or insert operation would take log E time.

  • Step 1: V log E (overall), as we would pop V vertices in total.
  • Step 2: E log E (overall), as we can perform the insertion operation a total of E times.

Thus, using a binary heap/priority queue would lead to a complexity of (E+V) log E.

B) Using a C++ set based on red-black trees: Such a structure allows both insertion and deletion (allows popping from non-top positions). The explanation remains the same as the priority queue implementation, just that, since we delete edges from time to time, only a maximum of V elements can stay in the set at a given time. Thus, the complexity becomes (E+V) log V.

C) Traversing the distance array to find the minimum: This may seem like a stupid idea, as we can use structures that can do this in log V. But as they say, everything has a purpose in this universe.

  • Step 1: We find the minimum distance vertex by traversing the distance array in O(V). As we will have to do this a total of V times, the total complexity would be O(V^2).
  • Step 2: We relax edges. Every operation would be O(1), as we're just changing values in an array. The overall contribution of this operation would be O(E).

Thus, this implementation takes O(E + V^2).

D) The absolute best implementation is using Fibonacci heaps. (I know you've never heard of this structure. In fact, it isn't even available in C++ STL, but it is present in the Almighty Boost library - it's the secret of my energy).

It is a complicated structure that can achieve Step 1 in O(V log V) and Step 2 in O(E). Thus, giving the overall best complexity of O(E + V log V).


Which one should I use then?

So, now you may be scratching your head, and I know your brain is screaming, "Which implementation should I use then?" Here are some rules of thumb:

  1. The theoretical best is using Fibonacci heaps in every case. But you will either have to use the Boost library (once again, the secret of my energy) or implement it yourself. Which means in interviews, competitions, you will have to resort to other options.
  2. Set vs. Priority Queue (PQ): On paper, we see that set wins, but the comparison factor being log V vs. log E. However, when you understand these structures better, you will learn that the set has a significantly higher constant factor. Thus, in theory, set wins, but practically sometimes PQ may win over set. So which one should you use? It is up to you. I prefer set over PQ as when I delete elements from the set, it feels as if I'm deleting the negativity from my life.
  3. And coming to the O(V^2 + E) implementation (yes, the one that you thought was useless), it is actually useful though. Do you remember what E is in the worst case? Yes, it is V^2. Thus, in the worst case, the set implementation and PQ implementation converge to (V^2) log V, while this solution converges to V^2. Thus, when the graph is dense (E ~ V^2), this solution may save your TLEs. (But almost every question you see would have sparse graph constraints: V <= 1e5, E <= V).

CONCLUSION:

Worst case:

(> implies better than, i.e., lesser time complexity)

Fibonacci heap > Iterating over the distance array > Set implementation > PQ

Average case:

Fibonacci heap > Set > PQ > Iterating over the distance array


Author's Note: This article is a part of my new series, #BitByBit , where we will break down concepts for coding interviews. I'm simultaneously running another article series called #UnlockingTheCode , where I share roadmaps and resources to cover topics related to data structures.

#BitByBit #UnlockingTheCode

#CodingInterviews #DataStructures #AlgorithmAnalysis #ProgrammingConcepts #CodingTips #TechInterviews #SoftwareEngineering


要查看或添加评论,请登录

社区洞察

其他会员也浏览了