Internal Working of HashSet in Java

What is HashSet?

  • HashSet is a collection that stores only unique elements.
  • It is based on HashMap (but only stores keys).
  • It allows null values but does not maintain insertion order.
  • Provides O(1) time complexity for add(), remove(), and contains().

Internal Working of HashSet

  • Internally backed by HashMap.
  • Elements of HashSet are stored as keys in HashMap.
  • The values in HashMap are dummy objects (PRESENT).
  • Uses hash codes for efficient storage.

How add() Works Internally?

  1. Calls hashCode() to determine bucket index in HashMap.
  2. If the bucket is empty, the element is added.
  3. If a collision occurs, it:Uses Linked List (Java 7) orBalanced Tree (Java 8, if more than 8 elements in a bucket).
  4. Duplicates are ignored.

Internal Code

  • Uses HashMap.put(key, value), where PRESENT is a static dummy object.

Performance & Time Complexity

  • Java 8 optimizations:If a bucket contains more than 8 elements, it converts from Linked List to a Balanced Tree (O(log n)).

Handling Collisions in HashSet

  • Java uses separate chaining to handle collisions.
  • If multiple elements have the same hash, they are stored in a linked list.
  • Java 8 introduced Red-Black Trees (TreeNodes) for better efficiency.

Common Mistakes & Pitfalls

? Overriding equals() without hashCode()

  • If you override equals(), always override hashCode() to avoid duplicate issues. ? Not using immutable objects in HashSet
  • Avoid modifying elements after adding them to HashSet to prevent unexpected behavior.

Real-World Use Cases of HashSet

? Removing duplicates from a list – Fast, unordered duplicate removal for large datasets. ? Fast lookups in caching – O(1) time complexity for checking existence. ? Preventing duplicate usernames in registration – Ensures uniqueness with fast O(1) lookups.

Final Thoughts

  • HashSet is fast, efficient, and widely used in Java.
  • No duplicates and O(1) performance make it ideal for sets.
  • Always handle collisions and override hashCode() & equals() properly.
  • Use immutable objects to prevent unexpected behavior.


Example: Collision in HashSet

import java.util.HashSet;

import java.util.Objects;

class Employee {

String name;

int id;

public Employee(String name, int id) {

this.name = name;

this.id = id;

}

@Override

public int hashCode() {

return id % 10; // Forces collision when id has the same last digit

}

@Override

public boolean equals(Object obj) {

if (this == obj) return true;

if (obj == null || getClass() != obj.getClass()) return false;

Employee employee = (Employee) obj;

return id == employee.id && Objects.equals(name, employee.name);

}

@Override

public String toString() {

return "Employee{name='" + name + "', id=" + id + "}";

}

}

public class HashSetCollisionExample {

public static void main(String[] args) {

HashSet<Employee> employees = new HashSet<>();

Employee e1 = new Employee("Alice", 101);

Employee e2 = new Employee("Bob", 111); // Same hash (101 % 10 == 111 % 10)

Employee e3 = new Employee("Charlie", 121); // Same hash

employees.add(e1);

employees.add(e2);

employees.add(e3);

System.out.println("Employees in HashSet:");

for (Employee e : employees) {

System.out.println(e);

}

// Checking if an object with the same values exists

System.out.println("Contains Alice (101)? " + employees.contains(new Employee("Alice", 101)));

}

}


要查看或添加评论,请登录

社区洞察

其他会员也浏览了