URL extraction for hostname, domain, subdomain and TLD in Swift

How difficult can it be to parse an HTTP link to identify and extract individual components like its subdomain or top-level domain?

Given the example http://super.duper.domain.co.uk, how would you know that super.duper is the subdomain and co.uk is the top-level domain?

You won't solve this with a regular expression.

You need to parse the URL manually, but more importantly, you need to know which top-level domains exist.

Mozilla initiated an open-source catalog of effective top-level domains called the Public Suffix List. The list is maintained as a community resource, and updates can occur as often as every few days.
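
To illustrate why the list matters, here is a minimal sketch of suffix matching. It is not how TLDExtract is implemented: `sampleSuffixes` and `splitHost` are hypothetical names, the suffix set is a tiny hand-picked subset, and the wildcard and exception rules of the real Public Suffix List are ignored.

```swift
import Foundation

// Hand-picked subset for illustration only. The real Public Suffix List
// contains thousands of entries plus wildcard and exception rules.
let sampleSuffixes: Set<String> = ["uk", "co.uk", "com", "org"]

/// Hypothetical helper: splits a host name into subdomain, domain and suffix
/// by looking for the longest known suffix.
func splitHost(_ host: String, suffixes: Set<String>) -> (subdomain: String?, domain: String, suffix: String)? {
    let labels = host.lowercased().split(separator: ".").map(String.init)
    guard labels.count >= 2 else { return nil }

    // Start at index 1 so at least one label remains for the domain.
    // Walking left to right tries the longest candidate suffix first,
    // so "co.uk" is matched before "uk".
    for start in 1..<labels.count {
        let candidate = labels[start...].joined(separator: ".")
        if suffixes.contains(candidate) {
            let domain = labels[start - 1]
            let sub = labels[..<(start - 1)].joined(separator: ".")
            let subdomain: String? = sub.isEmpty ? nil : sub
            return (subdomain, domain, candidate)
        }
    }
    return nil
}

// Usage: Foundation extracts the host, the suffix set decides where to split.
if let host = URLComponents(string: "http://super.duper.domain.co.uk")?.host,
   let parts = splitHost(host, suffixes: sampleSuffixes) {
    print(parts.subdomain ?? "-", parts.domain, parts.suffix)
}
```

Without the "co.uk" entry, the same code would report "co" as the domain and "uk" as the suffix, which is why the quality of the result depends on how current the list is.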

Kojiro Futamura created TLDExtract to solve this problem in Swift projects. It is a Swift library that bundles the Public Suffix List (PSL) and understands Punycode, which makes extracting this information a breeze.

import TLDExtract

let extractor = TLDExtract()

let urlString: String = "http://super.duper.domain.co.uk"
guard let result: TLDResult = extractor.parse(urlString) else { return }

print(result.rootDomain)        // Optional("domain.co.uk")
print(result.topLevelDomain)    // Optional("co.uk")
print(result.secondLevelDomain) // Optional("domain")
print(result.subDomain)         // Optional("super.duper")

The TLDExtract code base is solid, but its bundled PSL is outdated because the project stopped releasing updates.

The library lets you fetch PSL updates from within your app, but what if your app operates mainly offline? In addition, the implementation uses the synchronous Data(contentsOf:) API, which blocks the calling thread until the download finishes.
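
A non-blocking alternative could look roughly like the sketch below. It is an assumption about one possible approach, not the API of TLDExtract or of the fork: `fetchData`, `parseRules` and `fetchPublicSuffixRules` are hypothetical names, and the parsing only strips comments and blank lines.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLSession lives here on Linux
#endif

/// Hypothetical helper: wraps a URLSession data task in an async function,
/// so the caller suspends instead of blocking a thread.
func fetchData(from url: URL) async throws -> Data {
    try await withCheckedThrowingContinuation { continuation in
        URLSession.shared.dataTask(with: url) { data, _, error in
            if let data = data {
                continuation.resume(returning: data)
            } else {
                continuation.resume(throwing: error ?? URLError(.badServerResponse))
            }
        }.resume()
    }
}

/// Hypothetical helper: keeps the rule lines of a PSL file and drops
/// blank lines and "//" comments.
func parseRules(_ text: String) -> [String] {
    text.split(whereSeparator: \.isNewline)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty && !$0.hasPrefix("//") }
}

/// Downloads the list from publicsuffix.org and returns its rules.
func fetchPublicSuffixRules() async throws -> [String] {
    let url = URL(string: "https://publicsuffix.org/list/public_suffix_list.dat")!
    let data = try await fetchData(from: url)
    return parseRules(String(decoding: data, as: UTF8.self))
}
```

Called as `let rules = try await fetchPublicSuffixRules()` from a Task, the download no longer stalls the main thread, and the app can fall back to the bundled list if the request fails.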

I decided to create a fork to keep this project up to date. My key goals are:

  • Always up-to-date

    • GitHub Actions regularly creates new package versions that bundle the latest Public Suffix List (PSL), which makes the package a good fit for offline use

    • a modern async function fetches the latest PSL from the remote server on demand

  • Swift Package Manager (SPM) as the exclusive distribution channel

  • No package dependencies

I hope this helps the Swift community. Let me know if you have suggestions to improve the project further.
