

You may have heard of a Hash or HashMap, but have you ever heard of a HashSet? It is an implementation of the Set interface that is backed internally by a HashMap. Residing in the java.util package, Set extends the Collection interface and represents a collection of objects that does not allow the storage of duplicate values; HashSet additionally makes no guarantee about the order in which its elements are iterated. In this programming tutorial, we will learn all about the HashSet. It is one of the most popular Set implementations in Java, as well as an integral part of the Collections framework.
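To make the "no duplicates" rule concrete, here is a minimal sketch. The class name DuplicateDemo and the helper collect() are made up for this illustration; the only library behavior it relies on is that add() returns false when the element is already in the set.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateDemo {
    // Hypothetical helper: adds each value to a fresh HashSet and returns the set.
    static Set<String> collect(String... values) {
        Set<String> set = new HashSet<>();
        for (String value : values) {
            // add() returns false when the element is already present
            boolean added = set.add(value);
            System.out.println("add(\"" + value + "\") -> " + added);
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = collect("One", "Two", "One");
        System.out.println("size = " + set.size()); // size = 2
    }
}
```

The second attempt to add "One" is rejected, so the set ends up with two elements rather than three.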
Creating a HashSet in Java
To create a Java HashSet, developers must first import the java.util.HashSet class.
There are four ways to create a HashSet in Java:
- HashSet(): Constructs a new, empty set; the backing HashMap instance has default initial capacity of 16 and load factor of 0.75.
- HashSet(Collection<? extends E> c): Constructs a new set containing the elements in the specified collection.
- HashSet(int initialCapacity): Constructs a new, empty set; the backing HashMap instance has the specified initial capacity and default load factor (0.75).
- HashSet(int initialCapacity, float loadFactor): Constructs a new, empty set; the backing HashMap instance has the specified initial capacity and the load factor.
Here are examples of each of the above constructors:
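import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Empty HashSet with the default initial capacity and load factor
Set<String> hashset = new HashSet<>();

// Creating a HashSet from a Collection
List<String> list = Arrays.asList("One", "Two", "Three", "Four");
Set<String> set = new HashSet<>(list);

// HashSet with initial capacity
HashSet<Integer> numbers = new HashSet<>(5);

// HashSet with initial capacity and load factor (the load factor is a float, hence 0.8f)
HashSet<Integer> moreNumbers = new HashSet<>(8, 0.8f);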
Determining the Ideal Initial Capacity and Load Factor
It is important to understand the effects of the initial capacity and load factor on HashSet performance as they are the two main factors affecting the performance of HashSet operations.
The initial capacity sets the number of buckets when the internal hash table is created. The number of buckets is increased automatically once the set grows past the threshold determined by the load factor.
The load factor specifies the HashSet’s fullness threshold at which its capacity is automatically increased. Once the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table’s internal data structures are rebuilt in a process known as rehashing, which gives the hash table approximately twice the number of buckets. For example, if a HashSet has an internal capacity of 16 and a load factor of 0.75, the threshold is 16 × 0.75 = 12, so the number of buckets is increased when the table grows beyond 12 elements.
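The threshold arithmetic can be sketched in a few lines. This only models the rule described above (capacity multiplied by load factor, with the capacity doubling on each rehash); it does not inspect the internals of a real HashSet, and the class and method names are made up for the illustration.

```java
class ResizeThresholdDemo {
    // Number of entries a table with the given capacity holds before it must grow.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;        // default initial capacity
        float loadFactor = 0.75f; // default load factor

        // Print the threshold for three successive capacities: 16, 32, 64
        for (int i = 0; i < 3; i++) {
            System.out.println("capacity " + capacity
                    + " -> rehash once size exceeds " + threshold(capacity, loadFactor));
            capacity *= 2; // rehashing roughly doubles the number of buckets
        }
    }
}
```

With the defaults, the thresholds come out as 12, 24 and 48.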
Iterating over the HashSet requires time that’s proportional to the sum of the HashSet instance’s size (the number of elements) plus the capacity of the backing HashMap instance (the number of buckets). As such, it is vitally important not to set the initial capacity too high or the load factor too low if iteration performance is important.
The default load factor of 0.75 provides a good overall starting point with regards to performance. Increasing the load factor from there reduces memory overhead, but it also increases the cost of add and search operations, because more elements end up sharing each bucket. Rehashing is itself an expensive operation, so to minimize it we should choose the initial capacity carefully. If the initial capacity is greater than the maximum number of entries divided by the load factor, then no rehash operation will ever occur.
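As a rough sketch of that sizing rule, the helper below computes an initial capacity from an expected number of elements. The name capacityFor() and the figure of 1000 elements are assumptions made for this example, not part of the HashSet API.

```java
import java.util.HashSet;
import java.util.Set;

class PresizedSetDemo {
    // Smallest capacity c such that expectedSize <= c * loadFactor,
    // so adding expectedSize elements never crosses the rehash threshold.
    static int capacityFor(int expectedSize, float loadFactor) {
        return (int) Math.ceil(expectedSize / (double) loadFactor);
    }

    public static void main(String[] args) {
        int expectedSize = 1000;  // assumed upper bound on the number of elements
        float loadFactor = 0.75f; // the default load factor

        int initialCapacity = capacityFor(expectedSize, loadFactor);
        Set<Integer> ids = new HashSet<>(initialCapacity, loadFactor);
        for (int i = 0; i < expectedSize; i++) {
            ids.add(i);
        }
        // prints: initial capacity = 1334, size = 1000
        System.out.println("initial capacity = " + initialCapacity + ", size = " + ids.size());
    }
}
```

The trade-off is memory and iteration time: a generously sized table avoids rehashing, but it leaves more empty buckets to allocate and walk over.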
Adding and Removing Elements in HashSet
The HashSet class offers two methods for adding elements to the set:
- add() – inserts the specified element to the set
- addAll() – inserts all the elements of the specified collection to the set
Likewise for removing elements in a HashSet:
- remove() – removes the specified element from the set
- removeAll() – removes from the set all of its elements that are contained in the specified collection
Here is some short example code that shows the above Java HashSet methods in action:
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        HashSet<Integer> numbers = new HashSet<>();

        // Using add() method
        numbers.add(2);
        numbers.add(5);
        numbers.add(6);
        System.out.println("HashSet: " + numbers);

        // Using addAll() method
        List<Integer> moreNumbers = Arrays.asList(7, 8, 9, 1);
        numbers.addAll(moreNumbers);
        System.out.println("HashSet: " + numbers);

        // Using remove() method
        boolean numberRemoved = numbers.remove(5);
        System.out.println("Is 5 removed? " + numberRemoved);

        // Using removeAll() method; passing the set itself removes every element
        boolean numbersRemoved = numbers.removeAll(numbers);
        System.out.println("Are all elements removed? " + numbersRemoved);
    }
}
Notice that the two methods for removing elements both return a boolean value: remove() returns true if the element was present in the set, and removeAll() returns true if the set changed as a result of the call. In the program above, both calls return true.
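The return value of removeAll() is easier to see when the argument is a different collection. The sketch below uses a made-up helper, removeAllFrom(), and also calls clear(), a standard Collection method not covered above, which empties a set without needing an argument.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RemovalDemo {
    // Hypothetical helper: removes the given values and reports whether the set changed.
    static boolean removeAllFrom(Set<Integer> set, Integer... unwanted) {
        return set.removeAll(Arrays.asList(unwanted));
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));

        // 2 and 4 are removed; 10 is not in the set and is ignored
        boolean changed = removeAllFrom(numbers, 2, 4, 10);
        System.out.println("changed = " + changed + ", size = " + numbers.size()); // changed = true, size = 3

        // Nothing left to remove this time, so the set is unchanged
        boolean changedAgain = removeAllFrom(numbers, 2, 4);
        System.out.println("changed again = " + changedAgain); // changed again = false

        // clear() removes every element
        numbers.clear();
        System.out.println("empty = " + numbers.isEmpty()); // empty = true
    }
}
```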
Advanced Set Operations in Java
The HashSet class includes several methods for performing various set operations, such as:
- Union of Sets, via the addAll() method.
- Intersection of sets, via the retainAll() method.
- Difference between two sets, via the removeAll() method.
- Check if a set is a subset of another set, via the containsAll() method.
Here is a program that incorporates all of the above set operations:
import java.util.HashSet;

public class Main {
    public static void main(String[] args) {
        System.out.println("UNION:");
        HashSet<String> guitars = new HashSet<>();
        guitars.add("Fender");
        guitars.add("Gibson");
        guitars.add("Jackson");
        System.out.println("guitars = " + guitars);

        HashSet<String> moreGuitars = new HashSet<>();
        moreGuitars.add("Washburn");
        moreGuitars.add("Yamaha");
        System.out.println("moreGuitars = " + moreGuitars);

        // Union of two sets
        guitars.addAll(moreGuitars);
        System.out.println("guitars now contains " + guitars);

        System.out.println("\nINTERSECTION:");
        guitars = new HashSet<>();
        guitars.add("Fender");
        guitars.add("Gibson");
        System.out.println("guitars = " + guitars);

        moreGuitars = new HashSet<>();
        moreGuitars.add("Fender");
        moreGuitars.add("Yamaha");
        System.out.println("moreGuitars = " + moreGuitars);

        // Intersection of two sets
        moreGuitars.retainAll(guitars);
        System.out.println("Intersection is: " + moreGuitars);

        System.out.println("\nDIFFERENCE:");
        guitars = new HashSet<>();
        guitars.add("Fender");
        guitars.add("Gibson");
        guitars.add("Jackson");
        System.out.println("guitars = " + guitars);

        moreGuitars = new HashSet<>();
        moreGuitars.add("Washburn");
        moreGuitars.add("Gibson");
        moreGuitars.add("Jackson");
        System.out.println("moreGuitars = " + moreGuitars);

        // Difference between guitars and moreGuitars HashSets
        guitars.removeAll(moreGuitars);
        System.out.println("Difference: " + guitars);

        System.out.println("\nSUBSET:");
        guitars = new HashSet<>();
        guitars.add("Fender");
        guitars.add("Gibson");
        guitars.add("Jackson");
        System.out.println("guitars = " + guitars);

        moreGuitars = new HashSet<>();
        moreGuitars.add("Gibson");
        moreGuitars.add("Jackson");
        System.out.println("moreGuitars = " + moreGuitars);

        // Check if moreGuitars is a subset of guitars
        boolean result = guitars.containsAll(moreGuitars);
        System.out.println("Is moreGuitars HashSet a subset of guitars HashSet? " + result);
    }
}
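The program prints its results grouped by set operation type: the union contains all five guitar brands, the intersection and the difference each contain only Fender, and the subset check returns true. Because a HashSet does not guarantee iteration order, the order of the elements within each printed set may vary.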
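One thing to keep in mind is that addAll(), retainAll() and removeAll() all modify the set they are called on. When the original sets need to stay intact, a common approach is to copy first and then apply the operation to the copy. The following is a sketch of that idea; the class SetOps and its three helper methods are made up for this example and are not part of the JDK.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetOps {
    // Each helper copies the first set so neither argument is modified.
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> guitars = new HashSet<>(Arrays.asList("Fender", "Gibson", "Jackson"));
        Set<String> moreGuitars = new HashSet<>(Arrays.asList("Gibson", "Yamaha"));

        System.out.println("union size = " + union(guitars, moreGuitars).size());    // union size = 4
        System.out.println("intersection = " + intersection(guitars, moreGuitars));  // intersection = [Gibson]
        System.out.println("difference size = " + difference(guitars, moreGuitars).size()); // difference size = 2
        System.out.println("guitars still has " + guitars.size() + " elements");     // guitars still has 3 elements
    }
}
```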
Final Thoughts on the Java HashSet
In this Java programming tutorial we learned all about the Java HashSet, a class for working with unordered collections of unique objects. Jam-packed with useful functionality, it is easy to see why the HashSet is one of the most popular Set implementations in Java.