Home About

[TLDR] Using Regex in Swift

Written on December 2, 2023

This is the TLDR version of the article. View the full version here

Assume the following example input.

let input = """
CREDIT    03/02/2022    Payroll                   $200.23
CREDIT    03/03/2022    Sanctioned Individual A   $2,000,000.00
DEBIT     03/03/2022    Totally Legit Shell Corp  $2,000,000.00
DEBIT     03/05/2022    Beanie Babies Forever     $57.33
"""

Before iOS 16

Use NSRegularExpression from Foundation.

let regex = try NSRegularExpression(pattern: #"(CREDIT|DEBIT)\s+"#)

Use methods on the regex object to perform regex operations on strings.

regex.enumerateMatches(
  in: input,
  range: NSRangeFromString(input)
) { result, flags, stopPointer in
  // Do something with result

  // Stop early if needed
  let shouldStop = false
  if shouldStop {
    stopPointer.pointee = false
  }
}

The focus of this article is on modern Swift regex APIs. To learn more about how to use NSRegularExpression fully, read the docs.

For iOS 16+

Use the new Regex type.

Create regex patterns from string literals.

let regex = try Regex(#"(CREDIT|DEBIT)\s+"#)

Regex is generic over what its output type is. The default is AnyRegexOutput. Specify an explicit output type on the regex to parse out captured groups.

let regex = try Regex(#"(CREDIT|DEBIT)\s+(\d+)"#, as: (Substring, Substring, Substring).self)

The output type should

Use String APIs to perform regex operations on strings.

let match = input.firstMatch(of: regex)
let allMatches = input.matches(of: regex)

Output will be a Regex.Match object. Treat it as a tuple object to get captured values out.

let wholeMatchedSubstring = match?.0        // "CREDIT    03"
let transactionTypeSubstring = match?.1     // "CREDIT"
let monthSubstring = match?.2               // "03"

let transactionTypes = allMatches.map(\.1)  // ["CREDIT", "CREDIT", "DEBIT", "DEBIT"]

Regex Literals

Use regex literals to define regex patterns quicker and more succinctly. To do so, wrap the raw regex pattern with forward slashes (/).

let regex = /(CREDIT|DEBIT)\s+(\d+)/

Forward slashes that are part of the pattern must be escaped with back slashes (\). To avoid this, use extended delimiters.

let regex = #/(CREDIT|DEBIT)\s+(\d+)/(\d+)/(\d+)/#

This produces a Regex object, same as using the normal Regex initializer. Except this time the compiler can validate the regex expression at compile time and figure out the right output generic type for us.

Usage of regex patterns generated through regex literals is the same as using any other Regex instance.

For regex patterns defined with extended delimiters, you can multiline them for better readability and to comment them. Newlines and white spaces are ignored for multiline regex literals.

let regex = #/
(CREDIT|DEBIT)     # transaction type
\s+
(\d+)/(\d+)/(\d+)  # date
/#

Regex literals also allow us to provide names for our capture groups.

let regex = #/
(?<transactionType> CREDIT|DEBIT)          # transaction type
\s+
(?<month> \d+)/(?<day> \d+)/(?<year> \d+)  # date
/#

For a regex with named capture groups, use the capture group name instead of the tuple indices.

let match = input.firstMatch(of: regex)
let month = match?.month  // "03"
let year = match?.year    // "2022"
let day = match?.day      // "02"

`RegexBuilder`

Use the first party RegexBuilder framework for an even better regex crafting experience.

Define a regex pattern using result builders

import RegexBuilder

let regex = Regex {
  // Transaction type
  Capture {
    ChoiceOf {
      "CREDIT"
      "DEBIT"
    }
  }

  OneOrMore {
    CharacterClass.whitespace
  }

  // Date
  Capture {
    OneOrMore {
      CharacterClass.digit
    }
  }
  "/"
  Capture {
    OneOrMore {
      CharacterClass.digit
    }
  }
  "/"
  Capture {
    OneOrMore {
      CharacterClass.digit
    }
  }
}

Make the regex more readable and clean by extrapolating repetitive code and composing regex patterns together.

let digitCapture = Regex {
  Capture {
    OneOrMore {
      CharacterClass.digit
    }
  }
}

let regex = Regex {
  // Transaction type
  Capture {
    ChoiceOf {
      "CREDIT"
      "DEBIT"
    }
  }

  OneOrMore {
    CharacterClass.whitespace
  }

  // Date
  digitCapture
  "/"
  digitCapture
  "/"
  digitCapture
}

The produced value is still just a Regex with its output type automatically set according to the result of the builder. Use it with the same String APIs as regex literals.

For quick and short regex patterns, pass the builder directly into the String API methods.

let match = input.firstMatch {
  Capture {
    ChoiceOf {
      "CREDIT"
      "DEBIT"
    }
  }
}
let transactionType = match?.1  // "CREDIT"

Use References to create named capture groups

let transactionTypeRef = Reference(Substring.self)
let dayRef = Reference(Substring.self)
let monthRef = Reference(Substring.self)
let yearRef = Reference(Substring.self)

func digitCapture(as ref: Reference<Substring>) -> some RegexComponent {
  Capture(as: ref) {
    OneOrMore {
      CharacterClass.digit
    }
  }
}

let regex = Regex {
  Capture(as: transactionTypeRef) {
    ChoiceOf {
      "CREDIT"
      "DEBIT"
    }
  }
  OneOrMore {
    CharacterClass.whitespace
  }
  digitCapture(as: monthRef)
  "/"
  digitCapture(as: dayRef)
  "/"
  digitCapture(as: yearRef)
}

Access the captured contents by using the reference objects like keys to the match dictionary.

let match = input.firstMatch(of: regex)
let transactionType = match?[transactionTypeRef]  // "CREDIT"
let year = match?[yearRef]                        // "2022"
let day = match?[dayRef]                          // "02"
let month = match?[monthRef]                      // "03"

The type of captured data above is Substring?. Use TryCapture blocks Capture to have those Substrings automatically transformed into custom data.

enum TransactionType: String {
  case credit = "CREDIT"
  case debit = "DEBIT"
}

let transactionTypeRef = Reference(TransactionType.self)
// Other references...

let regex = Regex {
  TryCapture(as: transactionTypeRef) {
    ChoiceOf {
      "CREDIT"
      "DEBIT"
    }
  } transform: { substring in
    TransactionType(rawValue: String(substring))
  }
  // More regex...
}

// transactionType is now a custom type, rather than a Substring.
let transactionType: TransactionType? = match?[transactionTypeRef]

Check the sources below for more info.

Sources