Mastering HashSet in C++: Unraveling the Power of unordered_set

Mastering HashSet in C++: Unraveling the Power of unordered_set

In C++, the term “HashSet” is often confused with unordered_set, but they are essentially the same thing. C++ does not have a direct HashSet class like in some other programming languages (e.g., Java), but the unordered_set class serves the same purpose. Let's dive into the differences between unordered_set and set, and learn how to use unordered_set (or "HashSet") in C++ with examples.

What is unordered_set in?C++?

The unordered_set in C++ is a container that stores unique elements, just like a mathematical set. The key difference from the set is that unordered_set does not maintain any order of the elements. It stores elements in an arbitrary order, and it uses hashing to achieve faster insertion, deletion, and lookup times.

Time Complexity:

  • unordered_set: O(1) average for insert, delete, and search.
  • set: O(log n) due to the internal use of binary search trees (like Red-Black Trees).

Key Properties of unordered_set:

  1. Unique Elements: It only stores unique elements, meaning no duplicates.
  2. Hashing: It uses a hash table for storing elements, providing average constant time complexity (O(1)) for lookups, insertions, and deletions.
  3. No Specific Order: Elements are not stored in any particular order.
  4. Faster Operations: Due to the use of hashing, unordered_set is usually faster than set for most operations.

Difference Between set and unordered_set

Order:

  • set: Elements are ordered based on their values.
  • unordered_set: Elements are stored arbitrarily, based on their hash values.

Underlying Data Structure:

  • set: Uses a balanced binary search tree (usually a Red-Black Tree).
  • unordered_set: Uses a hash table.

Using unordered_set in C++: A Step-by-Step Guide

#include<bits/stdc++.h>
#include<unordered_set>

using namespace std; 

int main(){
  // Declare an unordered_set of integers
  unordered_set<int> hashSet;
  
  // Inserting elements
  hashSet.insert(10);
  hashSet.insert(20);
  hashSet.insert(30);
  
  // Duplicate elements are ignored
  hashSet.insert(10);
  
  // Print the elements (order isd not guaranteed)
  cout << "Element in Hashset: ";
  for(const auto &element : hashSet){
    cout<< element<< " ";
  }
  cout<<endl;

  // Checking if an element exist
  int value = 20;
  if(hashSet.find(value) != hashSet.end()){
    cout<<value<<" is present in the hashset."<<endl;
  }
  else{
    cout<<value<<" is not present in the hashset."<<endl;
  }
  
  // erasing an element
  hashSet.erase(10);

  // Size of the unordered_set
  cout<<"Size of hashSet: "<< hashSet.size() << endl;
  return 0;
}        

Now, let’s try to solve some LeetCode questions to get more clear understanding of unordered_set and set.

Example1. Unique Number of Occurrences

Given an array of integers arr, return true if the number of occurrences of each value in the array is unique or false otherwise.

Example 1:
Input: arr = [1,2,2,1,1,3]
Output: true
Explanation: The value 1 has 3 occurrences, 2 has 2 and 3 has 1. 
No two values have the same number of occurrences.

Example 2:
Input: arr = [1,2]
Output: false

Example 3:
Input: arr = [-3,0,1,-3,1,1,1,-3,10,0]
Output: true        
class Solution {
public:
    bool uniqueOccurrences(vector<int>& arr) {
        // Create an unordered_map to store the frequency of each element in arr
        unordered_map<int, int> umap; 
        
        // Step 1: Count the occurrences of each element in the array
        for(int i = 0; i < arr.size(); i++) {
            // Increment the frequency of arr[i] in the map
            umap[arr[i]]++;  
        }
        
        // Create a set to store unique frequencies of the elements
        set<int> freq;  

        // Step 2: Check if all frequencies are unique
        // Iterate through each (key, value) pair in the unordered_map
        for(auto it : umap) {  
            if(freq.find(it.second) != freq.end()) {
                // If the frequency is already in the set, return false  
                return false;
            }
            // Insert the frequency into the set
            freq.insert(it.second);  
        }
        // If no duplicate frequencies were found, return true
        return true;  
    }
};

// Time Complexity: O(n log n) in the worst case
// Space Complexity: O(n)        

Example2. Valid Sudoku

Determine if a 9 x 9 Sudoku board is valid. Only the filled cells need to be validated according to the following rules:

  1. Each row must contain the digits 1-9 without repetition.
  2. Each column must contain the digits 1-9 without repetition.
  3. Each of the nine 3 x 3 sub-boxes of the grid must contain the digits 1-9 without repetition.

Note:

  • A Sudoku board (partially filled) could be valid but is not necessarily solvable.
  • Only the filled cells need to be validated according to the mentioned rules.

Example 1:
Input: board = 
[["5","3",".",".","7",".",".",".","."]
,["6",".",".","1","9","5",".",".","."]
,[".","9","8",".",".",".",".","6","."]
,["8",".",".",".","6",".",".",".","3"]
,["4",".",".","8",".","3",".",".","1"]
,["7",".",".",".","2",".",".",".","6"]
,[".","6",".",".",".",".","2","8","."]
,[".",".",".","4","1","9",".",".","5"]
,[".",".",".",".","8",".",".","7","9"]]
Output: true

Example 2:
Input: board = 
[["8","3",".",".","7",".",".",".","."]
,["6",".",".","1","9","5",".",".","."]
,[".","9","8",".",".",".",".","6","."]
,["8",".",".",".","6",".",".",".","3"]
,["4",".",".","8",".","3",".",".","1"]
,["7",".",".",".","2",".",".",".","6"]
,[".","6",".",".",".",".","2","8","."]
,[".",".",".","4","1","9",".",".","5"]
,[".",".",".",".","8",".",".","7","9"]]
Output: false
Explanation: Same as Example 1, except with the 5 in the top left corner 
being modified to 8. Since there are two 8's in the top left 3x3 sub-box, 
it is invalid.        
class Solution {
public:
    bool isValidSudoku(vector<vector<char>>& board) {
        int m = board.size(); // rows
        int n = board[0].size(); // columns

        unordered_set<string> hashSet;
        for(int i = 0; i < m; i++){
            for(int j = 0; j < n; j++){
                if(board[i][j] != '.'){
                    // Construct unique identifiers for rows, columns, and boxes
                    string rowStr = string(1, board[i][j]) + " in row " + to_string(i);
                    string colStr = string(1, board[i][j]) + " in col " + to_string(j);  // Corrected column index
                    string boxStr = string(1, board[i][j]) + " in box " + to_string(i / 3) + "-" + to_string(j / 3);

                    // Check if the element already exists in the row, column, or box
                    if(hashSet.find(rowStr) != hashSet.end() || hashSet.find(colStr) != hashSet.end() || hashSet.find(boxStr) != hashSet.end()){
                        return false;  // Sudoku rule violation
                    }

                    // Insert the current element's row, column, and box information into the set
                    hashSet.insert(rowStr);
                    hashSet.insert(colStr);
                    hashSet.insert(boxStr);
                }
            }
        }
        return true;
    }
};        

Most Important Logic to notice here is that, we create three strings to uniquely identify each value in the board with respect to:

  1. The row it’s in: rowStr.
  2. The column it’s in: colStr.
  3. The 3x3 sub-box it’s in: boxStr.

  • string(1, board[i][j]) creates a string of length 1, containing the character at board[i][j] (the number in the cell).
  • to_string(i) converts the row index i to a string, and to_string(j) does the same for the column index j.
  • The string boxStr identifies the 3x3 sub-box, calculated by dividing the row and column indices by 3 (i / 3 and j / 3). This helps group the grid into the 3x3 sections required by Sudoku rules.

For example:

  • If board[0][0] contains '5', the strings will be:
  • rowStr = "5 in row 0",
  • colStr = "5 in col 0",
  • boxStr = "5 in box 0-0"

Then check if any of the three strings (rowStr, colStr, boxStr) already exist in the set hashSet

Time and Space Complexity:

  • Time Complexity: The solution iterates through all 81 cells in the 9x9 Sudoku board, and for each non-empty cell, it performs constant-time operations (hashing, finding, and inserting into the unordered_set). Therefore, the overall time complexity is O(81) = O(1), since the board size is fixed at 9x9.
  • Space Complexity: The space complexity is O(81) = O(1) as well, since in the worst case, the unordered_set will store at most 81 strings (one for each unique digit placed in a row, column, and 3x3 box).


要查看或添加评论,请登录

RISHABH SINGH的更多文章

社区洞察

其他会员也浏览了