Tips for parsing Swift code with SwiftSyntax

I recently worked on a project to consolidate and standardize localization in a sizeable iOS application written in Swift. The app uses a third-party localization service that supports Over-The-Air (OTA) localization: the live app can fetch the latest localizations from the server, so there is no need to resubmit the app to the App Store to update them. As a result, there are no .strings files in the project. All localization keys and localized values are stored in a proprietary database file used by the third-party service's SDK.

The app has been in development for over 6 years, and presumably many keys in the database were no longer used. A growing number of unused keys lingering in the database increases localization costs when entering new markets, so one of the goals was to clean them up. The challenge was to figure out which localization keys the app currently uses. The only way to know for sure is to find all localization keys referenced in the code. And because the project has been in development for so long, there were a handful of different patterns for referencing localization keys.

Here is where SwiftSyntax enters the stage. I needed a reliable Swift code parsing tool to collect these localization keys, and SwiftSyntax emerged from my research as the best tool for the job.

What is SwiftSyntax and how to use it?

SwiftSyntax is a library for parsing, inspecting, generating, and transforming Swift source code. If you are not familiar with it, the basics of the tool are pretty well covered in these two articles:

SwiftSyntax by Mattt
An overview of SwiftSyntax by Luciano Almeida

To explain how SwiftSyntax works in a single sentence: it builds an abstract syntax tree of the Swift source code and then traverses that tree, letting you react whenever it encounters a node type that interests you.
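The "walk the tree and react to certain nodes" idea can be sketched without SwiftSyntax at all. The following is a toy model, not SwiftSyntax's API: the Node enum, the ToyVisitor base class, and the CalledFunctionCollector subclass are all names invented for this illustration.

```swift
// Toy syntax tree. These cases are invented for illustration only;
// real SwiftSyntax trees have many more node types.
enum Node {
    case functionCall(name: String, arguments: [Node])
    case stringLiteral(String)
    case block([Node])
}

// Base visitor: walks the tree depth-first and calls a hook for each node kind.
// Subclasses override only the hooks they care about.
class ToyVisitor {
    func visitFunctionCall(name: String, arguments: [Node]) {}
    func visitStringLiteral(_ value: String) {}

    final func walk(_ node: Node) {
        switch node {
        case let .functionCall(name, arguments):
            visitFunctionCall(name: name, arguments: arguments)
            for argument in arguments { walk(argument) }
        case let .stringLiteral(value):
            visitStringLiteral(value)
        case let .block(children):
            for child in children { walk(child) }
        }
    }
}

// Reacts to function-call nodes only and records the called names.
final class CalledFunctionCollector: ToyVisitor {
    private(set) var names: [String] = []

    override func visitFunctionCall(name: String, arguments: [Node]) {
        names.append(name)
    }
}

let tree = Node.block([
    .functionCall(name: "viewDidLoad", arguments: []),
    .functionCall(name: "NSLocalizedString",
                  arguments: [.stringLiteral("Hello World!"), .stringLiteral("")]),
])
let collector = CalledFunctionCollector()
collector.walk(tree)
print(collector.names) // prints ["viewDidLoad", "NSLocalizedString"]
```

SwiftSyntax follows the same shape, with the parser producing the tree and SyntaxVisitor playing the role of the base class.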

And to provide an example, let’s take a UIViewController that shows a localized string in its root view.

class ViewController: UIViewController {
    @IBOutlet private weak var label: UILabel!

    override func viewDidLoad() {
        super.viewDidLoad()
        label.text = NSLocalizedString("Hello World!", comment: "")
    }
}

The goal here is to collect the Hello World! string from this code. We can achieve that by implementing a subclass of the SyntaxVisitor class and overriding one of its visit(_:) methods. The SyntaxVisitor class is the component that walks the syntax tree, and the visit(_:) methods provide the hook for your code to run whenever a certain type of node is encountered. The NSLocalizedStringKeyCollector subclass below overrides the overload of visit(_:) that accepts a FunctionCallExprSyntax node as an argument.

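import SwiftSyntax

final class NSLocalizedStringKeyCollector: SyntaxVisitor {
    // Localization keys collected so far.
    var result: [String] = []

    override func visit(_ node: FunctionCallExprSyntax) -> SyntaxVisitorContinueKind {
        guard
            // The called expression must be a plain identifier named NSLocalizedString.
            let identifierExpr = node.calledExpression.as(IdentifierExprSyntax.self),
            identifierExpr.identifier.text == "NSLocalizedString",
            // The first argument must be a string literal.
            let firstArgumentExpr = node.argumentList.first?.expression.as(StringLiteralExprSyntax.self),
            // Its first segment holds the key text (for a literal without interpolation).
            let localizationKey = firstArgumentExpr.segments.first?.as(StringSegmentSyntax.self)?.description
        else {
            // Not a match; a matching call may still be nested inside this node.
            return .visitChildren
        }
        result.append(localizationKey)
        return .skipChildren
    }
}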

We can use the NSLocalizedStringKeyCollector class to walk the syntax tree and collect any localization keys specified as string literals passed directly into the NSLocalizedString function.

let code = readCodeAsStringFromFile() // reads our ViewController code from a swift file
let collector = NSLocalizedStringKeyCollector()
if let syntaxTree = try? SyntaxParser.parse(source: code) {
    collector.walk(syntaxTree)
    print(collector.result) // prints ["Hello World!"]
}

Tip 1: `Syntax.syntaxNodeType` will help you understand the tree structure

The implementation of the NSLocalizedStringKeyCollector digs into the children of the FunctionCallExprSyntax node, looking for a specific structure of nodes. To understand the different types of nodes your code is built from, you can either use autocomplete to explore an individual node’s properties or click through into the types themselves. The challenge is that a syntax node’s properties often have a generic Syntax or ExprSyntax type, and you need to know the specific node type of the property’s value. There are many types of nodes, and unless you are already very familiar with the full set, it’s going to be hard to guess which specific type you are dealing with.

...
guard
    let identifierExpr = node.calledExpression.as(IdentifierExprSyntax.self),
    identifierExpr.identifier.text == "NSLocalizedString",
    let firstArgumentExpr = node.argumentList.first?.expression.as(StringLiteralExprSyntax.self),
    let localizationKey = firstArgumentExpr.segments.first?.as(StringSegmentSyntax.self)?.description
else {
...

As this snippet shows, digging deeper into the tree structure takes several steps where we need to know the specific node type to “cast” into with the .as() method. This is where the syntaxNodeType property comes in handy: an easy way to find out the type of a node is to print the value of its syntaxNodeType property to the console. But digging through the tree structure node by node can be tedious. Luckily, there is a really cool tool, Swift Abstract Syntax Tree visualizer, which automates this process completely.
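For a quick look without leaving the project, a small visitor can record the syntaxNodeType of every node it passes. The following is a minimal sketch, assuming a SwiftSyntax release that matches the API used in this post and that provides SyntaxAnyVisitor with its visitAny(_:) and visitAnyPost(_:) hooks. NodeTypeRecorder is a hypothetical name, not part of the library.

```swift
import SwiftSyntax

// Sketch only: assumes SyntaxAnyVisitor is available and, like SyntaxVisitor
// in this post, can be created with a no-argument initializer.
final class NodeTypeRecorder: SyntaxAnyVisitor {
    /// One line per visited node, indented by its depth in the tree.
    private(set) var lines: [String] = []
    private var depth = 0

    // Called for every node, whatever its concrete type.
    override func visitAny(_ node: Syntax) -> SyntaxVisitorContinueKind {
        let indentation = String(repeating: "  ", count: depth)
        lines.append("\(indentation)\(node.syntaxNodeType)")
        depth += 1
        return .visitChildren
    }

    // Called after a node and all of its children have been visited.
    override func visitAnyPost(_ node: Syntax) {
        depth -= 1
    }
}
```

Walking a parsed tree with NodeTypeRecorder and printing lines.joined(separator: "\n") gives an indented outline of node types, which shows which type to pass to .as() at each step.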

Tip 2: Generalize the tree walking logic

I had a handful of syntax visitors, each built to find some kind of result, either a string or a certain node type. Each of them needed a method that would reset its result, walk the tree, and return the result. After creating a couple of these, I decided to create a generic SyntaxNodeProcessor component that extracts the common logic and provides a unified interface.

class SyntaxNodeProcessor<Result, SyntaxNodeType: SyntaxProtocol>: SyntaxVisitor {
    var result: Result?

    func resetResult() {
        result = nil
    }

    func process(_ node: SyntaxNodeType) -> Result? {
        resetResult()
        walk(node)
        return result
    }
}

Using this component doesn’t significantly decrease the amount of code, but it provides consistency and reduces the cognitive load when writing new visitor types. If you have a number of syntax visitors in your code, you might find the SyntaxNodeProcessor class useful as well.

The new implementation of the NSLocalizedStringKeyCollector based on the SyntaxNodeProcessor would look like the following.

final class NSLocalizedStringKeyCollector: SyntaxNodeProcessor<[String], SourceFileSyntax> {
    override func resetResult() {
        result = []
    }

    override func visit(_ node: FunctionCallExprSyntax) -> SyntaxVisitorContinueKind {
        guard
            let identifierExpr = node.calledExpression.as(IdentifierExprSyntax.self),
            identifierExpr.identifier.text == "NSLocalizedString",
            let firstArgumentExpr = node.argumentList.first?.expression.as(StringLiteralExprSyntax.self),
            let localizationKey = firstArgumentExpr.segments.first?.as(StringSegmentSyntax.self)?.description
        else {
            return .visitChildren
        }
        // result is optional in SyntaxNodeProcessor; resetResult() sets it to [] before each walk.
        result?.append(localizationKey)
        return .skipChildren
    }
}

And here is how the call site looks when the instance of the reimplemented NSLocalizedStringKeyCollector is used in a loop.

let collector = NSLocalizedStringKeyCollector()
for file in files {
    let code = readCodeAsString(from: file)
    if let syntaxTree = try? SyntaxParser.parse(source: code) {
        print(collector.process(syntaxTree) ?? []) // process(_:) returns an optional
    }
}

Conclusion

Parsing Swift code with SwiftSyntax turned out to be much easier than I expected. If you need to parse Swift code, you should consider SwiftSyntax. And if you end up using it, I hope you find the tips in this post useful.
