URL extraction for hostname, domain, subdomain and tld in Swift

I am a Software Engineer working on open source and enterprise mobile SDKs for iOS and macOS developers written in Swift. From 🇩🇪 and happily living in 🇺🇸

How difficult can it be to parse an HTTP link to identify and extract individual components like its subdomain or top-level domain?

Given the example http://super.duper.domain.co.uk, how would you know that super.duper is the subdomain and co.uk is the top-level domain?

You won't solve this with a regular expression: nothing in the string itself tells you that co.uk is a suffix while domain.co is not.

You need to parse the URL manually, but more importantly, you need to know which top-level domains exist.

Mozilla initiated an open-source catalog of effective top-level domains called the Public Suffix List. The list is maintained as a community resource, and updates can occur as often as every few days.
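
To make the role of the list concrete, here is a minimal sketch of the longest-suffix matching that a PSL-based parser performs. It is not TLDExtract's implementation: `sampleSuffixes`, `HostParts`, and `splitHost` are hypothetical names, the suffix set is a four-entry stand-in for the real list, and wildcard and exception rules are ignored.

```swift
import Foundation

// Hypothetical, tiny stand-in for the Public Suffix List. The real list has
// thousands of entries plus wildcard and exception rules, ignored here.
let sampleSuffixes: Set<String> = ["com", "org", "uk", "co.uk"]

struct HostParts: Equatable {
    let subdomain: String?
    let secondLevelDomain: String
    let topLevelDomain: String
    var rootDomain: String { "\(secondLevelDomain).\(topLevelDomain)" }
}

/// Splits a URL's host using the longest matching suffix in `suffixes`.
/// Returns nil when the URL has no host or no known suffix applies.
func splitHost(of urlString: String, suffixes: Set<String>) -> HostParts? {
    guard let host = URLComponents(string: urlString)?.host?.lowercased() else {
        return nil
    }
    let labels = host.split(separator: ".").map(String.init)
    guard labels.count >= 2 else { return nil }

    // Walk from the longest candidate suffix to the shortest, always leaving
    // at least one label in front to serve as the second-level domain.
    for start in 1..<labels.count {
        let candidate = labels[start...].joined(separator: ".")
        guard suffixes.contains(candidate) else { continue }
        let sub = labels[..<(start - 1)].joined(separator: ".")
        return HostParts(subdomain: sub.isEmpty ? nil : sub,
                         secondLevelDomain: labels[start - 1],
                         topLevelDomain: candidate)
    }
    return nil
}

if let parts = splitHost(of: "http://super.duper.domain.co.uk",
                         suffixes: sampleSuffixes) {
    // Prints "super.duper domain co.uk"
    print(parts.subdomain ?? "-", parts.secondLevelDomain, parts.topLevelDomain)
}
```

Because both uk and co.uk are in the set, trying the longest candidate first is what keeps domain.co.uk from being split as co + uk.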

Kojiro Futamura created TLDExtract to solve this problem within Swift projects. It is a Swift library that bundles the Public Suffix List (PSL) and understands Punycode, which makes extracting this information a breeze.

```swift
import TLDExtract

let extractor = TLDExtract()

let urlString: String = "http://super.duper.domain.co.uk"
guard let result: TLDResult = extractor.parse(urlString) else { return }

print(result.rootDomain)        // Optional("domain.co.uk")
print(result.topLevelDomain)    // Optional("co.uk")
print(result.secondLevelDomain) // Optional("domain")
print(result.subDomain)         // Optional("super.duper")
```

TLDExtract's code base is very good, but its bundled PSL is outdated because the project stopped releasing updates.

The library let you fetch PSL updates from within your app, but what if your app mainly operates offline? In addition, the implementation used the synchronous Data(contentsOf: url) API, which blocks the calling thread until the download finishes.

I decided to create a fork to keep this project up-to-date. My key focus areas are:

  • Always up-to-date

    • leveraging GitHub Actions to regularly create new package versions that bundle the latest Public Suffix List (PSL), perfect for offline use

    • a modern async function that fetches the latest PSL from the remote server ad hoc

  • Swift Package Manager (SPM) as the exclusive distribution channel

  • No package dependencies
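
As an illustration of the non-blocking fetch mentioned in the list above, here is a rough sketch using only Foundation. It is not the fork's actual API: `parseSuffixRules`, `fetchSuffixRules`, and `SuffixListError` are hypothetical names, and the parser only strips comments and blank lines rather than interpreting wildcard or exception rules.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives here on Linux
#endif

/// Extracts rule lines from Public Suffix List text, skipping blank lines
/// and `//` comments. A rule ends at the first whitespace on its line.
func parseSuffixRules(_ text: String) -> Set<String> {
    var rules = Set<String>()
    for line in text.split(whereSeparator: \.isNewline) {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        if trimmed.isEmpty || trimmed.hasPrefix("//") { continue }
        if let rule = trimmed.split(whereSeparator: \.isWhitespace).first {
            rules.insert(String(rule))
        }
    }
    return rules
}

enum SuffixListError: Error { case badResponse }

/// Downloads the list without blocking the calling thread.
/// The callback-based data task is bridged to async/await by hand so the
/// sketch does not depend on newer URLSession conveniences.
func fetchSuffixRules(from url: URL) async throws -> Set<String> {
    let data: Data = try await withCheckedThrowingContinuation { continuation in
        URLSession.shared.dataTask(with: url) { data, response, error in
            let status = (response as? HTTPURLResponse)?.statusCode
            if let data = data, status == 200 {
                continuation.resume(returning: data)
            } else {
                continuation.resume(throwing: error ?? SuffixListError.badResponse)
            }
        }.resume()
    }
    return parseSuffixRules(String(decoding: data, as: UTF8.self))
}
```

From an async context this could be called as `try await fetchSuffixRules(from: url)`, with `url` pointing at the published list at https://publicsuffix.org/list/public_suffix_list.dat. On iOS 15 / macOS 12 and later, `URLSession.shared.data(from:)` can replace the manual continuation.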

I hope this helps the Swift community. Let me know if you have suggestions to improve the project further.

Dev blog post potpourri by senior software engineer Marco Eidinger
