The Java HashSet class implements the Set interface and is backed by a hash table, which is actually a HashMap instance. It makes no guarantee about the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element. It also offers constant-time performance for the basic operations add, remove, contains, and size, assuming the hash function disperses the elements properly among the buckets, as discussed later in this article.
Java HashSet Features
A few important features of HashSet are mentioned below:
- Implements Set Interface.
- The underlying data structure for HashSet is a hash table (a HashMap instance).
- As it implements the Set Interface, duplicate values are not allowed.
- Iteration order is not guaranteed to match insertion order. Elements are placed in buckets based on their hash codes.
- A null element is allowed in HashSet.
- HashSet also implements Serializable and Cloneable interfaces.
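The features listed above can be checked with a short sketch. The class name HashSetFeaturesDemo and its build() helper are illustrative names chosen for this example, not part of any library.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetFeaturesDemo {

    // Builds a set from values that include duplicates and null.
    static Set<String> build() {
        Set<String> set = new HashSet<>();
        set.add("one");
        set.add("two");
        set.add("one");  // duplicate: ignored, add() returns false
        set.add(null);   // a null element is permitted
        set.add(null);   // a second null is just another duplicate
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = build();
        System.out.println("size = " + set.size());
        System.out.println("contains null = " + set.contains(null));
        // Iteration order is unspecified, so the printed order may vary.
        System.out.println(set);
    }
}
```

The set ends up with three elements ("one", "two", and null) regardless of how many times each was added.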
Declaration of HashSet
public class HashSet<E> extends AbstractSet<E> implements Set<E>, Cloneable, Serializable
where E is the type of elements stored in a HashSet.
HashSet Java Example
Java
// Java program to illustrate the concept
// of Collection objects storage in a HashSet
import java.util.*;

class CollectionObjectStorage {
    public static void main(String[] args)
    {
        // Instantiate an object of HashSet
        HashSet<ArrayList<Integer>> set = new HashSet<>();

        // create ArrayList list1
        ArrayList<Integer> list1 = new ArrayList<>();

        // create ArrayList list2
        ArrayList<Integer> list2 = new ArrayList<>();

        // Add elements using add method
        list1.add(1);
        list1.add(2);
        list2.add(1);
        list2.add(2);
        set.add(list1);
        set.add(list2);

        // print the set size to understand the
        // internal storage of ArrayList in Set
        System.out.println(set.size());
    }
}
1
Before storing an Object, HashSet checks whether there is an existing entry using hashCode() and equals() methods. In the above example, two lists are considered equal if they have the same elements in the same order. When you invoke the hashCode() method on the two lists, they both would give the same hash since they are equal.
Note: HashSet does not store duplicate items. If you add two objects that are equal, it stores only the first one; here, that is list1.
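The same check applies to user-defined classes. The following is a minimal sketch, assuming two hypothetical classes: Point, which overrides equals() and hashCode(), and RawPoint, which inherits both from Object. Neither class comes from the original article.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {

    // Hypothetical value class: two points are equal when x and y match.
    static final class Point {
        final int x, y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    // Same fields, but equals() and hashCode() are inherited from Object,
    // so equality means object identity.
    static final class RawPoint {
        final int x, y;

        RawPoint(int x, int y) { this.x = x; this.y = y; }
    }

    static int distinctPoints() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // equal to the first: not stored
        return set.size();
    }

    static int distinctRawPoints() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        set.add(new RawPoint(1, 2)); // a different object: stored again
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPoints());    // 1
        System.out.println(distinctRawPoints()); // 2
    }
}
```

A class that is meant to be deduplicated by value in a HashSet therefore needs to override both methods consistently.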
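The hierarchy of HashSet is as follows: HashSet extends AbstractSet, which extends AbstractCollection, and it implements the Set, Cloneable, and Serializable interfaces.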
Internal Working of a HashSet
Several Set implementations in java.util, including HashSet, LinkedHashSet, and TreeSet, are internally backed by a Map. HashSet uses a HashMap to store its elements. You may wonder how this works, since a HashMap entry needs a key-value pair while we pass only a single value to a HashSet.
Storage in HashMap: The value we insert into a HashSet acts as the key in the map, and for the value Java uses a single constant object. So in every key-value pair, the value is the same.
Implementation of HashSet in the JDK source (abridged)
private transient HashMap<E, Object> map;

// Constructor - 1
// All the constructors are internally creating HashMap Object.
public HashSet()
{
    // Creating internally backing HashMap object
    map = new HashMap<>();
}

// Constructor - 2
public HashSet(int initialCapacity)
{
    // Creating internally backing HashMap object
    map = new HashMap<>(initialCapacity);
}

// Dummy value to associate with an Object in Map
private static final Object PRESENT = new Object();
If we look at the add() method of the HashSet class:
public boolean add(E e) { return map.put(e, PRESENT) == null; }
Notice that the add() method of the HashSet class internally calls the put() method of the backing HashMap object, passing the element you specified as the key and the constant PRESENT as its value. The remove() method works in the same manner: it internally calls the remove() method of the backing map.
public boolean remove(Object o) { return map.remove(o) == PRESENT; }
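To see the delegation idea in isolation, here is a simplified sketch of a set built on a HashMap. MiniHashSet is a hypothetical teaching class written for this example; it is not the JDK source and omits iteration, serialization, and the other Set methods.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration only; not the JDK implementation.
class MiniHashSet<E> {

    // One shared dummy object used as the value for every key.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // put() returns the previous value, or null if the key was absent,
    // so a null result means the element was newly added.
    boolean add(E e) { return map.put(e, PRESENT) == null; }

    // remove() returns the value that was mapped, or null if absent.
    boolean remove(Object o) { return map.remove(o) == PRESENT; }

    boolean contains(Object o) { return map.containsKey(o); }

    int size() { return map.size(); }

    public static void main(String[] args) {
        MiniHashSet<String> set = new MiniHashSet<>();
        System.out.println(set.add("a"));      // true
        System.out.println(set.add("a"));      // false
        System.out.println(set.contains("a")); // true
        System.out.println(set.remove("a"));   // true
        System.out.println(set.size());        // 0
    }
}
```

Because every element is a map key, uniqueness of elements follows directly from the uniqueness of keys in the map.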
HashSet can store not only individual objects but also collections of objects, such as ArrayList<E>, LinkedList<E>, and Vector<E>, and it keeps them unique in the same way.
Constructors of HashSet class
To create a HashSet, we create an object of the HashSet class using one of its constructors. The following constructors are available in this class.
1. HashSet()
This constructor is used to build an empty HashSet object in which the default initial capacity is 16 and the default load factor is 0.75. If we wish to create an empty HashSet with the name hs, then, it can be created as:
HashSet<E> hs = new HashSet<E>();
2. HashSet(int initialCapacity)
This constructor is used to build an empty HashSet object in which the initialCapacity is specified at the time of object creation. Here, the default loadFactor remains 0.75.
HashSet<E> hs = new HashSet<E>(initialCapacity);
3. HashSet(int initialCapacity, float loadFactor)
This constructor is used to build an empty HashSet object in which the initialCapacity and loadFactor are specified at the time of object creation.
HashSet<E> hs = new HashSet<E>(initialCapacity, loadFactor);
4. HashSet(Collection)
This constructor is used to build a HashSet object containing all the elements from the given collection. In short, this constructor is used when any conversion is needed from any Collection object to the HashSet object. If we wish to create a HashSet with the name hs, it can be created as:
HashSet<E> hs = new HashSet<E>(c);
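The four constructors can be exercised side by side as follows. This is a small sketch; the capacity 64 and load factor 0.5f are arbitrary values picked for illustration, and fromCollection() is a helper introduced only for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetConstructorsDemo {

    // Converts any list to a set, dropping duplicates along the way.
    static Set<Integer> fromCollection(List<Integer> source) {
        return new HashSet<>(source);
    }

    public static void main(String[] args) {
        // 1. Default: initial capacity 16, load factor 0.75
        Set<String> a = new HashSet<>();

        // 2. Custom initial capacity, default load factor
        Set<String> b = new HashSet<>(64);

        // 3. Custom initial capacity and load factor
        Set<String> c = new HashSet<>(64, 0.5f);

        // 4. Copy of another collection
        Set<Integer> d = fromCollection(Arrays.asList(3, 1, 3, 2, 1));

        System.out.println(a.isEmpty() && b.isEmpty() && c.isEmpty()); // true
        System.out.println(d.size());                                  // 3
    }
}
```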
Below is the implementation of the above topics:
Java
// Java program to Demonstrate Working
// of HashSet Class

// Importing required classes
import java.util.*;

// Main class
// HashSetDemo
class GFG {

    // Main driver method
    public static void main(String[] args)
    {
        // Creating an empty HashSet
        HashSet<String> h = new HashSet<String>();

        // Adding elements into HashSet
        // using add() method
        h.add("India");
        h.add("Australia");
        h.add("South Africa");

        // Adding duplicate elements
        h.add("India");

        // Displaying the HashSet
        System.out.println(h);
        System.out.println("List contains India or not:"
                           + h.contains("India"));

        // Removing items from HashSet
        // using remove() method
        h.remove("Australia");
        System.out.println("List after removing Australia:"
                           + h);

        // Display message
        System.out.println("Iterating over list:");

        // Iterating over hashSet items
        Iterator<String> i = h.iterator();

        // Holds true while at least one element remains
        while (i.hasNext())

            // Iterating over elements
            // using next() method
            System.out.println(i.next());
    }
}
[South Africa, Australia, India]
List contains India or not:true
List after removing Australia:[South Africa, India]
Iterating over list:
South Africa
India
Methods in HashSet
METHOD | DESCRIPTION |
---|---|
add(E e) | Adds the specified element if it is not already present and returns true; returns false if the element is already present. |
clear() | Used to remove all the elements from the set. |
contains(Object o) | Used to return true if an element is present in a set. |
remove(Object o) | Used to remove the element if it is present in set. |
iterator() | Used to return an iterator over the element in the set. |
isEmpty() | Used to check whether the set is empty or not. Returns true for empty and false for a non-empty condition for set. |
size() | Used to return the size of the set. |
clone() | Used to create a shallow copy of the set. |
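The less obvious entries in the table, clone(), clear(), and isEmpty(), can be illustrated with a brief sketch. The class and method names below are hypothetical and exist only for this example.

```java
import java.util.HashSet;

class HashSetBasicMethodsDemo {

    // Returns true if clearing a clone leaves the original untouched.
    @SuppressWarnings("unchecked")
    static boolean cloneIsStructurallyIndependent() {
        HashSet<String> original = new HashSet<>();
        original.add("a");
        original.add("b");
        HashSet<String> copy = (HashSet<String>) original.clone();
        copy.clear();
        return copy.isEmpty() && original.size() == 2;
    }

    // Returns true if the clone holds the very same element object,
    // which is what "shallow copy" means: elements are not cloned.
    @SuppressWarnings("unchecked")
    static boolean cloneSharesElements() {
        HashSet<StringBuilder> original = new HashSet<>();
        StringBuilder element = new StringBuilder("shared");
        original.add(element);
        HashSet<StringBuilder> copy = (HashSet<StringBuilder>) original.clone();
        return copy.iterator().next() == element;
    }

    public static void main(String[] args) {
        System.out.println(cloneIsStructurallyIndependent()); // true
        System.out.println(cloneSharesElements());            // true
    }
}
```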
Performing Various Operations on HashSet
Let’s see how to perform a few frequently used operations on the HashSet.
1. Adding Elements in HashSet
To add an element to a HashSet, use the add() method. Insertion order is not retained in the HashSet, and duplicate elements are not allowed: adding an element that is already present leaves the set unchanged.
Example
Java
// Java program to Add Elements to a HashSet

// Importing required classes
import java.util.*;

// Main class
// AddingElementsToHashSet
class GFG {

    // Main driver method
    public static void main(String[] args)
    {
        // Creating an empty HashSet of string entities
        HashSet<String> hs = new HashSet<String>();

        // Adding elements using add() method
        hs.add("Geek");
        hs.add("For");
        hs.add("Geeks");

        // Printing all string entries inside the Set
        System.out.println("HashSet elements : " + hs);
    }
}
HashSet elements : [Geek, For, Geeks]
2. Removing Elements in HashSet
The values can be removed from the HashSet using the remove() method.
Example
Java
// Java program Illustrating Removal Of Elements of HashSet

// Importing required classes
import java.util.*;

// Main class
// RemoveElementsOfHashSet
class GFG {

    // Main driver method
    public static void main(String[] args)
    {
        // Creating an empty HashSet
        HashSet<String> hs = new HashSet<String>();

        // Adding elements to above Set
        // using add() method
        hs.add("Geek");
        hs.add("For");
        hs.add("Geeks");
        hs.add("A");
        hs.add("B");
        hs.add("Z");

        // Printing the elements of the HashSet
        System.out.println("Initial HashSet " + hs);

        // Removing the element B
        hs.remove("B");

        // Printing the updated HashSet elements
        System.out.println("After removing element " + hs);

        // remove() returns false if the element is not present
        System.out.println("Element AC exists in the Set : "
                           + hs.remove("AC"));
    }
}
Initial HashSet [A, B, Geek, For, Geeks, Z]
After removing element [A, Geek, For, Geeks, Z]
Element AC exists in the Set : false
3. Iterating through the HashSet
We can iterate through the elements of a HashSet using the iterator() method. The most common approach, however, is the enhanced for loop.
Example
Output when the set is traversed twice, once with each approach:
A, B, Geek, For, Geeks, Z,
A, B, Geek, For, Geeks, Z,
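Both iteration styles can be sketched as follows, assuming the same six strings used in the removal example. The class name and the helper methods viaIterator() and viaForEach() are illustrative; the printed order depends on the hash codes and is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class HashSetIterationDemo {

    // Collects the elements in the order an explicit Iterator yields them.
    static List<String> viaIterator(Set<String> set) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    // Collects the elements in the order the enhanced for loop yields them.
    static List<String> viaForEach(Set<String> set) {
        List<String> seen = new ArrayList<>();
        for (String s : set) {
            seen.add(s);
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<String> hs = new HashSet<>(
            Arrays.asList("Geek", "For", "Geeks", "A", "B", "Z"));

        // Both lines print the same sequence, because the enhanced
        // for loop uses the set's iterator behind the scenes.
        System.out.println(String.join(", ", viaIterator(hs)));
        System.out.println(String.join(", ", viaForEach(hs)));
    }
}
```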
Time Complexity of HashSet Operations: The underlying data structure for HashSet is a hash table, so the average-case time complexity of the add, remove, and look-up (contains) operations is O(1).
Performance of HashSet
HashSet extends the AbstractSet<E> class and implements the Set<E>, Cloneable, and Serializable interfaces, where E is the type of elements maintained by this set. The direct known subclass of HashSet is LinkedHashSet.
Iterating over a HashSet requires time proportional to the sum of the HashSet instance’s size (the number of elements) plus the “capacity” of the backing HashMap instance (the number of buckets). Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
- Initial Capacity: The initial capacity is the number of buckets when the hash table (which HashSet uses internally) is created. The number of buckets is automatically increased when the number of elements exceeds the threshold set by the load factor.
- Load Factor: The load factor is a measure of how full the HashSet is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
Load Factor = (Number of stored elements in the table) / (Size of the hash table)
Example: If the internal capacity is 16 and the load factor is 0.75, the number of buckets is automatically increased when the number of elements in the table exceeds 12 (16 × 0.75).
Effect on performance:
Load factor and initial capacity are the two main factors that affect the performance of HashSet operations. A load factor of 0.75 offers a good trade-off between time and space costs. A higher load factor reduces the memory overhead but increases the cost of add and search operations in the hash table, because more elements end up sharing each bucket. To reduce rehashing, choose the initial capacity wisely: if the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operation will ever occur.
Note: The HashSet implementation is not synchronized: if multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set. If no such object exists, the set should be “wrapped” using the Collections.synchronizedSet method. This is best done at creation time, to prevent accidental unsynchronized access to the set, as shown below:
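The sizing rule above reduces to a little arithmetic. The sketch below uses two hypothetical helpers, capacityFor() and resizeThreshold(), written for this example; they are not HashSet methods. The backing HashMap may round the requested capacity up internally, so the computed value is a lower bound on the bucket count rather than the exact figure.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetSizingDemo {

    // Smallest int capacity strictly greater than expected / loadFactor,
    // which is the condition under which no rehash is needed.
    static int capacityFor(int expectedElements, float loadFactor) {
        return (int) (expectedElements / loadFactor) + 1;
    }

    // Number of elements a table of the given capacity can hold
    // before it is resized.
    static int resizeThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(resizeThreshold(16, 0.75f)); // 12
        System.out.println(capacityFor(1000, 0.75f));   // 1334

        // Pre-size a set that is expected to hold about 1000 elements.
        Set<Integer> ids = new HashSet<>(capacityFor(1000, 0.75f));
        for (int i = 0; i < 1000; i++) {
            ids.add(i);
        }
        System.out.println(ids.size()); // 1000
    }
}
```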
Set s = Collections.synchronizedSet(new HashSet(…));
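A fuller sketch of the wrapper in use is given here, assuming a few worker threads that each add a disjoint range of integers. The method name concurrentAdds() and the thread counts are made up for this example. Note that iteration over the wrapped set still has to be guarded manually with a synchronized block on the wrapper.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {

    // Starts the given number of threads, each adding perThread distinct
    // integers, and returns how many elements the set holds afterwards.
    static int concurrentAdds(int threads, int perThread) {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i); // each call is synchronized by the wrapper
                }
            });
            workers[t].start();
        }

        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Iterating requires holding the wrapper's lock explicitly.
        synchronized (set) {
            int count = 0;
            for (Integer ignored : set) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 1000)); // 4000
    }
}
```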
Methods Used with HashSet
1. Methods inherited from class java.util.AbstractSet
Method | Description |
---|---|
equals(Object o) | Compares the specified object with this set for equality. Returns true only if the object is also a set, the two sets have the same size, and every element of one is contained in the other, irrespective of order. |
hashCode() | Returns the hash code value for this set. |
removeAll(Collection<?> c) | Removes from this set all of its elements that are contained in the specified collection. Returns true if this set changed as a result of the call. |
2. Methods inherited from class java.util.AbstractCollection
METHOD | DESCRIPTION |
---|---|
addAll(collection) | Adds all of the elements of the specified collection to this set. Elements that are already present are ignored, and no particular order is maintained. |
containsAll(collection) | Checks whether the set contains all the elements present in the given collection. Returns true if the set contains all of them and false if any element is missing. |
retainAll(collection) | Retains only the elements of this set that are contained in the given collection. Returns true if this set changed as a result of the call. |
toArray() | Returns an array containing the same elements as the set. |
toString() | Returns a string representation of the elements of the HashSet. |
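The bulk methods above correspond to the usual set operations. A minimal sketch follows, using helper names (union, intersection, difference) that are introduced here for illustration; each helper copies its first argument so the inputs are left unmodified.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class BulkOperationsDemo {

    // addAll: every element of b is added to a copy of a.
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // retainAll: only elements also present in b are kept.
    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // removeAll: elements present in b are removed.
    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> b = new HashSet<>(Arrays.asList(3, 4, 5));

        System.out.println(union(a, b).size());        // 5
        System.out.println(intersection(a, b).size()); // 2
        System.out.println(difference(a, b).size());   // 2
        System.out.println(a.containsAll(intersection(a, b))); // true
    }
}
```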
3. Methods declared in interface java.util.Collection
METHOD | DESCRIPTION |
---|---|
parallelStream() | Returns a possibly parallel Stream with this collection as its source. |
removeIf(Predicate<? super E> filter) | Removes all of the elements of this collection that satisfy the given predicate. |
stream() | Returns a sequential Stream with this collection as its source. |
toArray(IntFunction<T[]> generator) | Returns an array containing all of the elements in this collection, using the provided generator function to allocate the returned array. |
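Two of these default methods, removeIf() and stream(), are shown in the short sketch below. The helpers dropOdd() and squares() are hypothetical names for this example only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

class CollectionDefaultsDemo {

    // removeIf: deletes every element matching the predicate.
    // Works on a copy so the caller's set is not modified.
    static Set<Integer> dropOdd(Set<Integer> input) {
        Set<Integer> copy = new HashSet<>(input);
        copy.removeIf(n -> n % 2 != 0);
        return copy;
    }

    // stream: maps each element and collects the results into a new set,
    // so values that square to the same number collapse into one.
    static Set<Integer> squares(Set<Integer> input) {
        return input.stream()
                    .map(n -> n * n)
                    .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new HashSet<>(Arrays.asList(-2, 1, 2, 3, 4));
        System.out.println(dropOdd(numbers).size()); // 3
        System.out.println(squares(numbers).size()); // 4
    }
}
```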
4. Methods declared in interface java.lang.Iterable
METHOD | DESCRIPTION |
---|---|
forEach(Consumer<? super T> action) | Performs the given action for each element of the Iterable until all elements have been processed or the action throws an exception. |
5. Methods declared in interface java.util.Set
METHOD | DESCRIPTION |
---|---|
addAll(Collection<? extends E> c) | Adds all of the elements in the specified collection to this set if they’re not already present (optional operation). |
containsAll(Collection<?> c) | Returns true if this set contains all of the elements of the specified collection. |
equals(Object o) | Compares the specified object with this set for equality. |
hashCode() | Returns the hash code value for this set. |
removeAll(Collection<?> c) | Removes from this set all of its elements that are contained in the specified collection (optional operation). |
retainAll(Collection<?> c) | Retains only the elements in this set that are contained in the specified collection (optional operation). |
toArray() | Returns an array containing all of the elements in this set. |
toArray(T[] a) | Returns an array containing all of the elements in this set; the runtime type of the returned array is that of the specified array. |
FAQs on HashSet in Java
Q1. What is HashSet in Java?
Answer:
HashSet is a class that extends AbstractSet and implements the Set interface. It stores unique elements in a hash table.
Q2. Why is HashSet used?
Answer:
HashSet is used to avoid duplicate data and to look up values quickly.
Q3. Differences between HashSet and HashMap.
Answer:
Basis | HashSet | HashMap |
---|---|---|
Implementation | HashSet implements the Set interface. | HashMap implements the Map interface. |
Duplicates | HashSet doesn’t allow duplicate values. | HashMap stores key-value pairs and does not allow duplicate keys. If a key is added again, the old value is replaced with the new value. |
Number of objects during storing objects | HashSet requires only one object: add(Object o). | HashMap requires two objects, put(K key, V value), to add an element to the HashMap object. |
Dummy value | HashSet internally uses HashMap to add elements. In HashSet, the argument passed to the add(Object) method serves as the key K. Java internally associates a dummy value with each value passed to the add(Object) method. | HashMap does not have any concept of a dummy value. |
Storing or adding mechanism | HashSet internally uses a HashMap object to store or add the objects. | HashMap internally uses hashing to store or add objects. |
Speed | HashSet delegates every operation to its backing HashMap, so it is at best as fast as HashMap; the difference is usually negligible. | HashMap is at least as fast as HashSet, since HashSet is a thin layer on top of it. |
Insertion | HashSet uses the add() method for adding or storing data. | HashMap uses the put() method for storing data. |
Example | HashSet is a set, e.g. {1, 2, 3, 4, 5, 6, 7}. | HashMap is a key -> value (key to value) map, e.g. {a -> 1, b -> 2, c -> 2, d -> 1}. |
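The difference in how the two classes treat a repeated key or element shows up in the return values of add() and put(). The following sketch uses illustrative helper names (addTwice, putTwice, valueAfterPutTwice) that do not come from the original article.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HashSetVsHashMapDemo {

    // Second add() of the same element returns false and changes nothing.
    static boolean addTwice() {
        Set<String> set = new HashSet<>();
        set.add("k");
        return set.add("k");
    }

    // Second put() of the same key returns the previous value
    // and replaces it with the new one.
    static Integer putTwice() {
        Map<String, Integer> map = new HashMap<>();
        map.put("k", 1);
        return map.put("k", 2);
    }

    // The value stored under the key after both put() calls.
    static Integer valueAfterPutTwice() {
        Map<String, Integer> map = new HashMap<>();
        map.put("k", 1);
        map.put("k", 2);
        return map.get("k");
    }

    public static void main(String[] args) {
        System.out.println(addTwice());           // false
        System.out.println(putTwice());           // 1
        System.out.println(valueAfterPutTwice()); // 2
    }
}
```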
Q4. Differences between HashSet and TreeSet in Java.
Answer:
Basis | HashSet | TreeSet |
---|---|---|
Speed and internal implementation | HashSet takes constant time on average for operations like search, insert, and delete, so it is faster than TreeSet. HashSet is implemented using a hash table. | TreeSet takes O(log n) for search, insert, and delete, which is higher than HashSet, but TreeSet keeps its data sorted. It also supports operations like higher() (returns the least element strictly greater than the given one), floor(), ceiling(), etc. These operations are also O(log n) in TreeSet and are not supported in HashSet. TreeSet is implemented using a self-balancing binary search tree (red-black tree) and is backed by TreeMap in Java. |
Ordering | Elements in HashSet are not ordered. | TreeSet maintains objects in sorted order, defined by either the Comparable or the Comparator interface in Java. TreeSet elements are sorted in ascending order by default. It offers several methods to deal with the ordered set, like first(), last(), headSet(), tailSet(), etc. |
Null Object | HashSet allows the null object. | TreeSet with natural ordering doesn’t allow the null object and throws NullPointerException, because TreeSet uses the compareTo() method to compare keys, and compareTo() throws java.lang.NullPointerException on null. |
Comparison | HashSet uses the equals() method to compare two objects in the Set and to detect duplicates. | TreeSet uses the compareTo() (or compare()) method for the same purpose. equals() and compareTo() should be consistent, i.e. for two equal objects equals() should return true and compareTo() should return zero. If they are not, the TreeSet will not obey the general contract of the Set interface and may treat equal objects as distinct or distinct objects as duplicates. |
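The ordering and null-handling rows of the table can be verified with a brief sketch. The class and helper names below are chosen for this example, and the TreeSet uses natural ordering (no custom Comparator).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class HashSetVsTreeSetDemo {

    // A TreeSet iterates in ascending order, so copying it into a list
    // yields the distinct values sorted.
    static List<Integer> sortedView(List<Integer> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    // A TreeSet with natural ordering rejects null with a NullPointerException.
    static boolean treeSetRejectsNull() {
        Set<String> tree = new TreeSet<>();
        try {
            tree.add(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A HashSet accepts null like any other element.
    static boolean hashSetAcceptsNull() {
        Set<String> hash = new HashSet<>();
        return hash.add(null);
    }

    public static void main(String[] args) {
        System.out.println(sortedView(Arrays.asList(5, 1, 3, 1))); // [1, 3, 5]
        System.out.println(treeSetRejectsNull());                  // true
        System.out.println(hashSetAcceptsNull());                  // true
    }
}
```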
This article is contributed by Dharmesh Singh.