Sets in Swift Explained

By Aasif Khan | December 22, 2021 6:24 pm | 5-min read

Sets in Swift are powerful. They’re similar to arrays and dictionaries, but also very different… The Set collection type is an interesting aspect of Swift programming. Let’s find out how you can use it!

In this post, you’ll learn:

How to use Set in Swift
Why Set is such a powerful collection type
How time complexity and hash tables affect Set
How to compare sets, to find differences and similarities
What ingredients a cappuccino and a latte have in common…

Working with Sets in Swift
Sets vs. Arrays vs. Dictionaries
Using Sets with Custom Objects
Comparing Sets
Further Reading

Working with Sets in Swift

Let’s first discuss how you can work with Set in Swift. Creating empty sets, adding and removing items – it’s very similar to working with arrays and dictionaries.

Here’s a quick example:

let fruit:Set = [“apple”, “banana”, “strawberry”, “jackfruit”]
print(fruit)

In the above code, we’re creating a new set called fruit using literal syntax. The type Set is explicitly provided. Otherwise we’d create an array, because of type inference.

The type of fruit is actually Set or set-of-strings. Just like arrays and dictionaries, Set is a generic struct. A set can contain any type that conforms to the Hashable protocol. (More on this later.)

You can add an item to a set like this:

var fruit = Set()
fruit.insert(“pineapple”)
print(fruit)

In the above code, an empty set is created using initializer syntax, with Set(). You’re providing the type of the Set, followed by parentheses (). Then, an item “pineapple” is inserted into the set.

Removing an item from a set is simple, too. Like this:

var fruit:Set = [“apple”, “banana”, “strawberry”, “jackfruit”]
fruit.remove(“banana”)
print(fruit)

The items in a set need to be unique! You can’t add the same two items to a Set. Like this:

var fruit:Set = [“apple”, “banana”, “orange”]
let result = fruit.insert(“orange”)
print(fruit)

print(result)

In the above code, the “orange” string is not inserted in the set. The return value from the insert(_:) function, which is assigned to the result constant, tells us exactly that:

(inserted: false, memberAfterInsert: “orange”)
And just like arrays and dictionaries, a Set has a number of helpful functions, such as:

isEmpty returns true when the set has no items
count returns the number of items in the set
first returns the first item in the set
randomElement() returns a random item from the set
first(where:) returns items that satisfy a given predicate

Let’s move on, to find out what makes Set special…

Sets vs. Arrays vs. Dictionaries

Sets are similar to arrays and dictionaries, but they’re also very different.

An array associates numeric indices, i.e. 0…n, with values. Arrays have a fixed, consecutive order. A good example is an list of your favorite movies, ordered from most-favorite to least-favorite.
A dictionary associates keys with values, and keys need to be hashable, such as String. Dictionaries aren’t ordered. A good example is a highscore: a list of people’s names and their respective highscores.

A set is a one-dimensional list of items, just like an array. It doesn’t have numeric indexes (like arrays), but values that need to conform to Hashable (like dictionaries). It’s almost like the values are the keys. This characteristic makes Set so special!

Imagine we want to figure out if The Matrix (from 1999) is one of your favorite movies. We could design an algorithm to find out. Like this:

let favorites = [“Rocky”, “The Matrix”, “Lord of the Rings”]

for movie in favorites
{
if movie == “The Matrix” {
print(“Yes, The Matrix is a favorite of yours!”)
break
}
}

Note: In the above code, favorites is an array — not a Set.

The above code searches for “The Matrix” by comparing that string to each of the movies in the favorites array. At worst, it’ll have to iterate over each of the items in favorites to find a match.

We say that the time complexity of the above algorithm is O(n), so-called linear time. That’s OK for 3 movies, but imagine how long it takes as the list of movies gets arbitrarily large. You may need to search millions of movies one by one!

Computer scientists came up with a solution to this problem, called a hash table. Dictionaries use a hash table for keys, and sets use them for values.

For sets, the hash table mechanism turns each of the set’s values into a unique string, called a hash. Based on some arithmetic, this hash corresponds to a known memory address of an item in the set.

As a result, we can find any item using its hash, without needing to search through every item in the list. We know the value, so we know the hash, so we know the memory address. This lookup operation has a time complexity of O(1), called constant time. This idea is revolutionary, because it means we can efficiently find any item in the collection!

Why not use an array? Looking up an item in an array is also an O(1) operation. You need to know an array’s index number to look up an item, which isn’t always practical.

Why not use a dictionary? Looking up an item in a dictionary, by its key, is an O(1) operation. A Set is similar to a dictionary in this regard, except that you don’t have to provide hashable keys yourself. In a dictionary, a pair’s key isn’t based on its value.

Let’s get back to the favorite movies example. Finding out if an arbitrary movie is one of your favorite movies, is simple:

let movies:Set = [“Rocky”, “The Matrix”, “Lord of the Rings”]

if movies.contains(“Rocky”) {
print(“Rocky is one of your favorite movies!”)
}

In the above code, we’re creating that same movies collection as a Set of strings. With the contains(_:) function we check if the string “Rocky” exists in the set. When that’s true, some text is printed as output.

The contains(_:) function uses the hash table mechanism. It doesn’t use a loop to search! It calculates the hash value of the string “Rocky”, and uses that value to directly access the item in the collection. If there’s an item for that hash value, contains(_:) returns true. If it doesn’t find anything, we can be certain it doesn’t exist in the set.

The big difference is the mechanism used to find items in the collection. An array will need to inspect every item in the array one by one. A set can directly access the value based on its hash, without the need to inspect the entire collection.

The time complexity of a set’s contains(_:) function is O(1) (documentation). The time complexity of an array’s contains(where:) function is O(n) (documentation).

Using Sets with Custom Objects

But… what if you want to keep a list of favourite movies, with meta data, and find out if a movie is contained in that list? Movie names aren’t enough, then.

First, let’s create a simple Movie struct, with a few properties. Like this:

struct Movie
{
var name:String
var year:Int
var rating:Int
}

This movie needs initializers, so let’s add those too:

init(name: String, year: Int, rating: Int) {
self.name = name
self.year = year
self.rating = rating
}

init(withName name: String) {
self.name = name
self.year = 0
self.rating = 0
}

The first init() function produces an object of type Movie with all 3 of its properties name, year and rating. This initializer is typically added by default, unless you add your own initializers to a struct (or class).

The second initializer init(withName:) is just there for convenience. It doesn’t construct a “complete” movie, just a stub that only has a name. The properties year and rating are set to 0. We’ll use it later on.

Before we can use Movie in a set, the struct needs to conform to the Hashable protocol. We’ll need to make 3 adjustments.

First, change the struct’s definition to include the Hashable protocol. Like this:

struct Movie: Hashable
{

Second, we’ll need to add a function == to the Movie struct. Like this:

static func == (lhs: Movie, rhs: Movie) -> Bool {
return lhs.name == rhs.name
}

The above function looks complicated, but it isn’t! Here’s what happens:

The == function is part of the Hashable protocol. It’s used to compare two objects of Movie, to determine if they are equal. (Yes, == is a valid function name!)
If they’re equal, the function returns true. If they’re not, the function returns false.
The two Movie objects we’re comparing are called lhs and rhs, which stand for left-hand-side and right-hand-side. Remember math? Every equation has those two sides.
We’re determining equality by comparing the name properties of the two Movie objects. We’re literally comparing the names of the movies. If the names are the same, the objects are the same.

Comparing movie names is a rudimentary approach to determine if two Movie objects are equal. This excludes movies with the same name, but different release years, for example. A solution would be to compare both name and year in the == function.

Third, we need to implement the hash(into:) function. This function is also part of the Hashable protocol. Here’s how:

func hash(into hasher: inout Hasher) {
hasher.combine(name)
}

With this function we can provide data to the hash function. This hash function is used to calculate a hash table for the set. The function is passed a reference to a Hasher object.

We’ll add the movie name to the hasher, using the combine(_:) function. It’s important that both the == and the hash(into:) function use the same properties.

At this point, a Movie object has a property called hashValue. We can also test that two Movie objects are equal. Like this:

let a = Movie(name: “Rocky”, year: 1976, rating: 5)
let b = Movie(withName: “Rocky”)

print(a == b)
// Output: true

print(a.hashValue)
// Output: something like “5474910254413230307”

Alright, let’s move on! We’re now going to create a set of movies. Like this:

let movies:Set = [
Movie(name: “Rocky”, year: 1976, rating: 5),
Movie(name: “The Matrix”, year: 1999, rating: 10),
Movie(name: “Lord Of The Rings”, year: 2001, rating: 8),
Movie(name: “The Secret Life Of Walter Mitty”, year: 2013, rating: 7),
Movie(name: “Deadpool”, year: 2016, rating: 3)
]

With this set, we can do a few cool things. What about finding out if a movie is a favorite? Here’s how:

if movies.contains(Movie(withName: “Rocky”)) {
print(“Yes! Rocky is a favorite.”)
}

In the above code, we’re creating a new Movie object with the Movie(withName:) initializer. Because of how we implemented Hashable, we can use Movie objects to test membership of the set. The above code checks if Rocky exists in the movies set. This is an O(1) operation.

And what about finding your most favorite movie? Here’s how:

if let movie = movies.sorted(by: { $0.rating > $1.rating }).first {
print(“My favorite movie is: \(movie.name)”)
}

The above code uses the sorted(by:) function to sort movies by its rating property (highest-to-lowest). The result of sorted(by:) is an array, and we’re getting the first (i.e., highest rated) item in the array with the first property. This entire operation has an O(log n) time complexity.

What if we want to find the rating of a particular movie? Here’s how:

if let index = movies.firstIndex(of: Movie(withName: “Lord Of The Rings”)) {
let lotr = movies[index]
print(“I rated LOTR with ranking = \(lotr.rating)”)
}

In the above code we’re getting the index of the movie Lord Of The Rings with the firstIndex(of:) function. Just as with contains(_:), the movie is found with its index (a hash value). This is an O(1) operation. In the conditional, we’re using subscript syntax to get the Movie object from movies, and print its rating.

It’s sometimes counter-intuitive to use a Set in conjuction with firstIndex(of:). Why not just put them in an array or a dictionary? Most object collections have unique indices anyway, such as objectId in Parse Server and $key for Firebase.

The reason we’re using Set here is because we don’t care about the order of movies, and we want to find a movie by its name in O(1) time. Searching arrays and dictionaries by a property takes O(n) time. We could have used the movie name as a dictionary key, but that has zero advantages over using Set. (See “Why not use …” above.)

When your collection requires a fixed order, use array. If you want the O(1) advantages of using Set, you can create a separate index with a set. Put the property you want to search for in this Set, and make sure it “mirrors” the array with the actual objects exactly. Use the set to test if an item is contained in the collection, and the array (or dictionary) to get the relevant object. This is effectively a search index.

OK, one last thing. Here’s how you select the movies with a rating of 8 or greater:

let top = movies.filter({ $0.rating >= 8 })

for movie in top {
print(movie.name)
}

The above code isn’t unique to Set – any collection or sequence can be filtered with filter(_:). What’s interesting is that the type of top is the same as movies, i.e. Set. That’s efficient!

Let’s move on!

Comparing Sets

The superpower of Set is that it can efficiently determine if a given element is part of a set. We call this attribute “membership”. Is an element a member of a set?

Because of this, Set is useful to compare collections with each other, and find similarities and differences between sets. There are a few distinct comparisons we can make between two sets:

A union is a combination of sets. It’s easiest to remember as “A plus B”.

A subtraction removes items from set A that exist in set B. It’s easiest to remember as “A minus B”.

An intersection is a set with items that exist in both sets. It’s easiest to remember as common items shared between A and B.

A symmetric difference is a set with items that exist in either A or B, but not in both. They’re items that aren’t shared between A and B.

Here, check out the ingredients in a few typical Italian coffees:

let cappuccino:Set = [“espresso”, “milk”, “milk foam”]
let americano:Set = [“espresso”, “water”]
let machiato:Set = [“espresso”, “milk foam”]
let latte:Set = [“espresso”, “milk”]

Can we find some differences and similarities between coffees? Let’s find out!

Union

First, let’s make a union of machiato and latte. Here’s how:

let result = machiato.union(latte)
print(result)
// Output: [“espresso”, “milk foam”, “milk”]

Coincidentally, these are the exact ingredients of a cappuccino! Can we test this equality? Yes! Here’s how:

if machiato.union(latte) == cappuccino {
print(“machiato + latte has the same ingredients as cappuccino…”)
}

For the love of coffee: The amount of espresso, milk and foam differs between a cappuccino, latte and machiato. In this tutorial, we’re just comparing ingredients.

Subtraction

Second, let’s subtract americano from cappuccino. What’s a cappuccino without espresso? This:

let result = cappuccino.subtracting(americano)
print(result)
// Output: [“milk foam”, “milk”]

It doesn’t matter that there’s no water in cappuccino, because you can’t subtract what isn’t there anyway. And keep in mind that a – b isn’t the same as b – a. Like this:

let result = americano.subtracting(cappuccino)
print(result)
// Output: [“water”]
Intersection

Third, let’s make an intersection of latte and cappuccino. What ingredients are shared between the two coffees? These:

let result = latte.intersection(cappuccino)
print(result)
// Output: [“espresso”, “milk”]

A cappuccino and a latte both have espresso and milk. Can we also find what’s shared between all coffees?

let result = latte.intersection(cappuccino).intersection(machiato).intersection(americano)
print(result)
// Output: [“espresso”]

Awesome! Be careful though with chaining these operations, because the order matters. Removing “espresso” from machiato would break the chain, for example, even though “espresso” is then shared in 3 of 4 coffees.

Symmetric difference

Fourth, the difference between the coffees. What’s the difference between a americano and latte? It’s this:

let result = latte.symmetricDifference(americano)
print(result)
// Output: [“milk”, “water”]

You could say that milk and water aren’t shared between these two coffees. It’s called symmetric difference because items are selected from both sets. An asymmetric difference would just be subtracting!

Practical Use Cases

The above examples aren’t practical, unless you’re making an app that points out differences between Italian coffees. What are some real-world use cases for comparing sets?

When you’ve collected data per weekday, you could use a union to combine the data for Tuesday and Wednesday in one set.
When your social media app facilitates unfriending of users, you could use subtract to remove followers for a particular user.
When you want to know which followers users A and B have in common, you could use an intersection.
When you’re at war and you want to find spies that aren’t double agents, you could use a symmetric difference. (Think about it!)

Just like other functions in Swift, the comparative family of functions for Set has variants that perform operations in place or by returning a new value. In-place algorithms don’t take up extra space, but they mutate your original values. In Swift, most functions that operate in-place use strong verbs, such as formIntersection(), sort() and subtract().