
How To Find Strings with Regular Expressions in Swift


By Aasif Khan | Last Updated on April 7th, 2022 6:42 am | 5-min read

How do you use regular expressions in Swift? In this tutorial, you’ll learn how to find strings of text with regular expressions. We’ll work our way from problem to solution. Instead of giving you a ready-made code example, I’ll walk you through coding it on your own.

When you’re coding your app from scratch, you’ll need to write your own bits of code. You can find tutorials of what you’re trying to accomplish on Google, but what if your code needs to be more complex than that?

Ultimately, the complete code of your app is made out of smaller pieces of code. When you know how to write small pieces of code on your own, you can assemble those smaller pieces into a bigger app. That’s all there is to it!

Why Code Swift from Scratch?

Let’s look at an example of coding Swift from scratch. You want to change a button’s text. So, you Google that, find out how, and add the code to your app:

button.setTitle("Create a new account", for: .normal)

Easy, right?

But as you continue to build your app from scratch you encounter increasingly complex problems. How do you tackle those challenges?

Let’s say you are building a hypothetical social media app. And you want to extract hashtags from a bit of text.

Your function needs to turn this:

“I made this wonderful pic last #christmas… #instagram #nofilter #snow #fun”

into this:

["christmas", "instagram", "nofilter", "snow", "fun"]

How do you code that from scratch?

Coding The Function to Find Text in a String

First things first. What does the function we’re about to write actually do? It extracts bits of text (hashtags) from a string. Always ask yourself what the end goal of your code is before you start coding.

You then determine the input and output of the function. What data goes in, and what comes out? For our hashtags function, the data that goes in is a string that contains hashtags and the data that comes out is an array of strings with individual hashtags.

We can now determine the definition of our function, like this:

func hashtags(text: String) -> [String] {

Here’s what happens in the function definition:

  1. You define a function with the name hashtags and signature hashtags(text:)
  2. It has one input parameter called text of type String. This is the text that contains the hashtags. The input for our function!
  3. It has one return value of type [String], which is an array of strings. This is the output of our function!

See how we turned the function’s end goal, input and output into the Swift function definition? Awesome.

There’s one last thing we have to do before we can continue with the body of the hashtags() function. How does this function fit into the bigger whole of our app’s code?

An app isn’t just a pile of functions. Your app needs a structure, otherwise it becomes unreadable. Will you add your new function to a class, to a model, to a view controller, to an extension, to a …?

Always think about the structure of your code before you write it. When you’re done coding something, ask yourself: “Does it make sense to put this bit of code in that spot?”

We’ll decide to put our hashtags() function into a Swift extension of the String struct. We’ll effectively add our own function to the existing String type. That way we can use it on any instance of String anywhere in our code.

The function definition now becomes this:

extension String
{
    func hashtags() -> [String] {
        // Placeholder so the skeleton compiles; the real body comes next
        return []
    }
}

Note that the function definition doesn’t have that text parameter anymore. Because the function belongs to the String struct, it can be called on any string. That string instance is our input and we can reference it within the function by using self.
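If extensions and self are new to you, here is a minimal sketch of the same idea on a smaller scale. The shouted() function is a made-up name used only for illustration; it is not part of Swift or of this tutorial’s final code.

```swift
extension String {
    // Hypothetical example method: "self" is the string the method is called on
    func shouted() -> String {
        return self.uppercased() + "!"
    }
}

let greeting = "hello".shouted()
print(greeting) // HELLO!
```

Just like shouted(), our hashtags() function has no parameters, because the string it works on is already available as self.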

Finding Strings with Regular Expressions in Swift

Let’s take a look at our input text again. It’s the string that contains hashtags:

“I made this wonderful pic last #christmas… #instagram #nofilter #snow #fun”

How are we going to get those hashtags out?

Consider for a second how you would find those hashtags with pen and paper. You’d scan the line character by character, from left to right. Every time you encountered a # you’d mark the word until the next whitespace character. Right?

You do the same thing in code by using a regular expression or “regex”. You use regular expressions to find occurrences of text in a string using a search pattern.

This is our search pattern: #[a-z0-9]+

Don’t worry, it’s simpler than it looks! The stuff between [ and ] is a character class: it matches any single character from a to z or from 0 to 9. The + means “one or more characters from this class”.

In other words: look for text that starts with #, followed by one or more alphanumeric characters.

In Swift we need an instance of NSRegularExpression to search the string for hashtags, like this:

let regex = try? NSRegularExpression(pattern: "#[a-z0-9]+", options: .caseInsensitive)

In the above code, this happens:

  • Declare a constant called regex and assign it an instance of NSRegularExpression
  • The call is marked with try? so when NSRegularExpression(pattern:options:) throws an error, regex is nil
  • The search pattern #[a-z0-9]+ is provided as the first argument
  • The second argument is an option that makes the regular expression case-insensitive, so it will find hashtags with both uppercase and lowercase characters
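If you’d rather see why a pattern failed instead of silently getting nil, you could use do and catch in place of try?. The sketch below assumes a helper called countMatches(of:in:), which is a hypothetical name made up for this example. It relies on numberOfMatches(in:options:range:) from NSRegularExpression and takes the string length from NSString, for reasons explained further on.

```swift
import Foundation

// Hypothetical helper for illustration: counts how often a pattern occurs in a string
func countMatches(of pattern: String, in sample: String) -> Int {
    do {
        let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
        let range = NSRange(location: 0, length: (sample as NSString).length)
        return regex.numberOfMatches(in: sample, options: [], range: range)
    } catch {
        // The pattern could not be compiled, for example because of an unclosed bracket
        print("Invalid pattern: \(error.localizedDescription)")
        return 0
    }
}

print(countMatches(of: "#[a-z0-9]+", in: "#Snow and #fun")) // 2
```

Thanks to the .caseInsensitive option, #Snow is matched even though the pattern only lists lowercase letters.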

Can you also extract hashtags without a regular expression, with a different implementation? Yes, of course! The cool thing about programming is that you can solve a problem in a limitless number of ways.

Another way would be to iterate over every character in the string and, when you encounter a # character, append the characters that follow to a “current hashtag” until you encounter a space character. You save the hashtags in an array and return those at the end of the function.
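Here is a rough sketch of that character-by-character approach. The function name scanHashtags(in:) is made up for this example. One deliberate difference from the description above: instead of stopping only at a space, the sketch ends a hashtag at the first character that isn’t an ASCII letter or digit, so that it behaves like the #[a-z0-9]+ pattern.

```swift
// Hypothetical helper: extracts hashtags by scanning characters, without a regex
func scanHashtags(in text: String) -> [String] {
    var hashtags: [String] = []
    var current: String? = nil // nil means "not inside a hashtag right now"

    // Stores the hashtag collected so far (if any) and resets the state
    func finish() {
        if let tag = current, !tag.isEmpty {
            hashtags.append(tag.lowercased())
        }
        current = nil
    }

    for character in text {
        if character == "#" {
            finish()
            current = "" // start collecting a new hashtag
        } else if current != nil, character.isASCII, character.isLetter || character.isNumber {
            current?.append(character)
        } else {
            finish() // any other character ends the current hashtag
        }
    }

    finish() // the text may end in the middle of a hashtag
    return hashtags
}

print(scanHashtags(in: "Let it #Snow, #fun!")) // ["snow", "fun"]
```

It works, but it takes a lot more code than the regular expression, which is why we’ll stick with the regex in this tutorial.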

Iterating the Regex Search Results

Now that we have the regular expression, let’s put it to work. Like this:

let string = self as NSString

regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length))

In the above example, this happens:

  • First we type cast self to NSString, which is the bridged class of Swift’s String struct, and assign it to the constant string. This gives us NSString’s length property and substring(with:) function, which both work with the NSRange values that NSRegularExpression uses.
  • Then we call the matches(in:options:range:) function on regex, which returns the found matches in a string based on the regular expression
  • The first argument is self, which is the string instance that our hashtags() function is called on
  • The second argument options is empty
  • The third argument range is provided an instance of NSRange that contains the full range of our input string

Something funny is going on here… Why use NSString and NSRange? Without going into too much detail: it’s because NSRegularExpression is an Objective-C class from Foundation, and it describes positions in a string with NSRange.

As you know, in Swift you can use classes written in Objective-C. Some types are “bridged”, such as String and NSString. That means that in most cases you can treat a String instance as NSString and vice-versa, even though they are different types (String is a struct, NSString is a class). Swift and Objective-C become more interoperable thanks to bridging.

The matches(in:options:range:) function accepts a String instance for the in parameter, but wants an NSRange instance for the range parameter. Due to the way strings are handled differently in Swift and Objective-C, you will need to provide the NSRange instance with a string length from NSString, and not from String.

If you provide NSRange with self.count (i.e. the character count of the Swift String struct) instead of string.length (i.e. the length of Objective-C’s NSString class) you will run into trouble. Your code compiles and runs OK, but when the input string contains emoji, the two values will differ! The range then ends too early, so hashtags near the end of the text can be missed.

This is, roughly speaking, because the length of an NSString counts UTF-16 code units, while the count of a String counts the characters as a person would see them. A single emoji can take up two or more UTF-16 code units. Read more about strings in Swift here.

In other words, you just avoided a nasty string bug by properly casting to NSString. Don’t use the count of a String (without “NS”) with NSRange!
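To see the difference for yourself, here is a small sketch that compares both lengths for a string with one emoji in it. The variable names are made up for this example. The last lines show a bridging initializer of NSRange that produces the full range straight from String indices, which is an alternative to casting that this tutorial doesn’t use.

```swift
import Foundation

let emojiText = "👍 #fun"

// Swift's String counts user-visible characters
let characterCount = emojiText.count
print(characterCount) // 6

// NSString counts UTF-16 code units, and this emoji needs two of them
let utf16Length = (emojiText as NSString).length
print(utf16Length) // 7

// Alternative: let NSRange convert a range of String indices for you
let fullRange = NSRange(emojiText.startIndex..<emojiText.endIndex, in: emojiText)
print(fullRange.length) // 7
```

With a length of 6 the range would stop one unit short, and the regular expression would only see #fu instead of #fun.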

OK, now that’s out of the way, let’s use the result of matches(in:options:range:). Its result has type [NSTextCheckingResult]. An instance of NSTextCheckingResult has a property range, which returns the range of the found text. In other words, we can iterate over the found results and turn them into strings with the substring(with:) function. Like this:

regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length)).map {
    string.substring(with: $0.range)
}

In the above example we’re iterating over the results of matches(…) with a higher-order function called map(_:). Inside the closure we’re using the range property of $0 to create a substring of the original input text.

In other words, the expression string.substring(with: $0.range) contains an individual hashtag!
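If map(_:) feels like too much at once, it may help to look at the same steps written as a plain for loop first. This is only a sketch on a short sample string, with variable names made up for the example. It prints the range of every match next to the substring it belongs to.

```swift
import Foundation

let sampleText = "#snow #fun"
let sampleString = sampleText as NSString
var foundHashtags: [String] = []

if let regex = try? NSRegularExpression(pattern: "#[a-z0-9]+", options: .caseInsensitive) {
    let matches = regex.matches(in: sampleText, options: [], range: NSRange(location: 0, length: sampleString.length))

    for match in matches {
        // match is an NSTextCheckingResult; its range tells us where the text was found
        let hashtag = sampleString.substring(with: match.range)
        foundHashtags.append(hashtag)
        print("\(hashtag) at location \(match.range.location), length \(match.range.length)")
    }
}
// #snow at location 0, length 5
// #fun at location 6, length 4
```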

In a closure you can use shorthand names such as $0, $1 and $2 for the parameters of the closure. The map(_:) function iterates over a collection by calling a closure, and that closure’s only parameter is the current item in the collection. You can use it with the $0 shorthand. Read more about shorthands in closures here: The Ultimate Guide to Closures in Swift.
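As a quick illustration that is separate from the hashtags code, here are two ways to write the same map(_:) call on a small array. The first uses the $0 shorthand, the second gives the parameter a name. The array and constant names are made up for this example.

```swift
let rawTags = ["#Snow", "#Fun"]

// Shorthand: $0 is the current item in the collection
let withShorthand = rawTags.map { $0.lowercased() }

// The same closure with a named parameter
let withNamedParameter = rawTags.map { tag in tag.lowercased() }

print(withShorthand)      // ["#snow", "#fun"]
print(withNamedParameter) // ["#snow", "#fun"]
```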

Now that we have access to the individual hashtags, let’s transform them a bit. We want to remove the # symbol from the hashtag and turn the hashtag string to lowercase characters. Like this:

string.substring(with: $0.range).replacingOccurrences(of: "#", with: "").lowercased()

In the above code, this happens:

  • We take the result of substring(with:), which is of type String, and call replacingOccurrences(of:with:) on it, which replaces occurrences of # with an empty string, effectively removing the # character
  • We then chain that call with a call to lowercased(), which returns an instance of String that is converted to lowercase characters
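Here is that chain applied to a single hashtag, so you can see the effect in isolation. The constant names are made up for this example. Since a match of our pattern always starts with #, dropping the first character with dropFirst() should give the same result; that alternative is shown as a variation and isn’t used in the rest of this tutorial.

```swift
import Foundation

let rawHashtag = "#chRistmas"

// Remove the # symbol, then convert to lowercase
let cleaned = rawHashtag.replacingOccurrences(of: "#", with: "").lowercased()
print(cleaned) // christmas

// Variation: drop the leading # instead of replacing it
let viaDropFirst = String(rawHashtag.dropFirst()).lowercased()
print(viaDropFirst) // christmas
```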

Now, let’s put it all together!

Finding Hashtags: Putting It All Together

Phew, that’s quite some code! Here’s the entire hashtags() function within its extension:

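import Foundation

extension String
{
    func hashtags() -> [String]
    {
        // try? turns a thrown error into nil, so we only continue with a valid regex
        if let regex = try? NSRegularExpression(pattern: "#[a-z0-9]+", options: .caseInsensitive)
        {
            // Bridge to NSString so the length matches what NSRange expects
            let string = self as NSString

            return regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length)).map {
                // Extract the match, strip the # symbol and convert to lowercase
                string.substring(with: $0.range).replacingOccurrences(of: "#", with: "").lowercased()
            }
        }

        return []
    }
}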

What does it do?

  1. First we define our extension of the String struct, effectively allowing us to add functions to Swift’s default String struct.
  2. We define a function hashtags() that takes no input, and outputs an array of strings.
  3. We define an instance of NSRegularExpression with the regular expression #[a-z0-9]+ that searches for hashtags. We use optional binding to only continue if regex is not nil.
  4. We then cast self to NSString to avoid getting into trouble with the string’s length.
  5. We then call matches(in:options:range:) on regex. We provide it with self, which is the receiving String instance that we called hashtags() on. We also provide it with the full range of the input string.
  6. On the result of matches(…) we call higher-order function map(_:), which iterates over the individual search results of the regular expression.
  7. Within the closure of map(_:) we first extract the hashtag by calling substring(with:), then remove the # symbol and turn the hashtag into lowercase characters.
  8. Ultimately, we return the result of that entire expression, which works because the type of map(_:) is now inferred to be [String] or array-of-strings. Oh, and when the if let optional binding fails, we simply return an empty array.

That last bit may be confusing, because the code is pretty dense. The key is in the map(_:) function. As you know, it iterates over the collection of regular expression matches.

Within the closure of map(_:) we return the value of the expression string.substring(…).lowercased() which is of type String. That means that the return type of map(_:) is [String] – and that’s exactly what we need.

You could say that the map(_:) function turns a collection of NSTextCheckingResult into a collection of String!

When we try our function, here’s the result:

let text = "I made this wonderful pic last #chRistmas… #instagram #nofilter #snow #fun"
let hashtags = text.hashtags()
print(hashtags)
// Output: ["christmas", "instagram", "nofilter", "snow", "fun"]
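For comparison, here is a sketch of a variation that never casts to NSString. It is not the version built in this tutorial: the function name hashtagsUsingSwiftRanges() is made up so it doesn’t clash with hashtags(), and it assumes you’re happy to convert between NSRange and String indices with the bridging initializers that Foundation provides.

```swift
import Foundation

extension String {
    // Hypothetical variant of hashtags() that stays with Swift's String type
    func hashtagsUsingSwiftRanges() -> [String] {
        guard let regex = try? NSRegularExpression(pattern: "#[a-z0-9]+", options: .caseInsensitive) else {
            return []
        }

        // Convert the full range of String indices into an NSRange
        let fullRange = NSRange(startIndex..<endIndex, in: self)

        return regex.matches(in: self, options: [], range: fullRange).compactMap { match -> String? in
            // Convert the NSRange of the match back into a range of String indices
            guard let range = Range(match.range, in: self) else { return nil }
            // Drop the leading # and convert to lowercase
            return self[range].dropFirst().lowercased()
        }
    }
}

print("👍 #Snow #fun".hashtagsUsingSwiftRanges()) // ["snow", "fun"]
```

Both versions avoid the length bug described earlier; which one reads better is mostly a matter of taste.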

You can imagine that when I first wrote this hashtags() function, I didn’t write it as you’re seeing it now. I started with a function, then added it to an extension, then made that mistake with String and NSString. In the end I realised I could use map(_:) to make the function more expressive. I added lowercased() to make the resulting hashtags more uniform. The biggest take-away you can have from this tutorial is that functions are written step-by-step, refined and reshaped until they’re almost perfect, or at least good enough. No one gets it right on the first try!

Further Reading

So, there you have it! How to extract hashtags from a string using regular expressions in Swift, from scratch. We didn’t know our exact function implementation before we started, but we managed to see it through from idea to code.

This is what we did:

  1. We determined the end goal, input and output of our function, and wrote its function definition
  2. We picked regular expressions to extract the hashtags
  3. We iterated over the results of the regex, shaping it into what we needed – an array of String
  4. We took a step-by-step approach, refining the function as we went along

Awesome! Congrats on making it this far and good luck with your own from-scratch adventures. If you have a question, make sure to leave a comment below.


