How difficult can it be to parse an HTTP link to identify and extract individual components like its subdomain or top-level domain?
Given the example http://super.duper.domain.co.uk, how would you know that super.duper is the subdomain and co.uk is the top-level domain?
You won't solve this with a regular expression.
You need to parse the URL manually, but more importantly, you need to know which top-level domains exist.
Mozilla initiated an open-source catalog of effective top-level domains called the Public Suffix List. The list is maintained as a community resource, and updates can occur as often as every few days.
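To illustrate why the suffix data matters, here is a minimal sketch of longest-suffix matching against a tiny, hand-written set of suffixes. The names sampleSuffixes, HostParts, and splitHost are made up for this post and are not part of any library. Real PSL matching also has to handle wildcard and exception rules, which this sketch ignores.

```swift
import Foundation

// Hypothetical, hard-coded subset of suffixes, for illustration only.
// The real Public Suffix List is far larger and also contains
// wildcard and exception rules that this sketch does not handle.
let sampleSuffixes: Set<String> = ["com", "org", "uk", "co.uk"]

struct HostParts: Equatable {
    let subdomain: String?
    let secondLevelDomain: String
    let publicSuffix: String
}

/// Splits a host name by looking for the longest known suffix.
/// Returns nil when no suffix from the set matches.
func splitHost(_ host: String, suffixes: Set<String>) -> HostParts? {
    let labels = host.lowercased().split(separator: ".").map(String.init)
    // At least one label must remain in front of the suffix.
    guard labels.count >= 2 else { return nil }

    // Candidates run from longest to shortest, so the first hit wins.
    for start in 1..<labels.count {
        let candidate = labels[start...].joined(separator: ".")
        if suffixes.contains(candidate) {
            let sub = labels[..<(start - 1)].joined(separator: ".")
            return HostParts(subdomain: sub.isEmpty ? nil : sub,
                             secondLevelDomain: labels[start - 1],
                             publicSuffix: candidate)
        }
    }
    return nil
}

if let host = URL(string: "http://super.duper.domain.co.uk")?.host,
   let parts = splitHost(host, suffixes: sampleSuffixes) {
    print(parts.subdomain ?? "", parts.secondLevelDomain, parts.publicSuffix)
    // prints "super.duper domain co.uk"
}
```

Remove "co.uk" from the set and the same host splits into the suffix uk and the second-level domain co, which is exactly the mistake a pattern without the list would make.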
Kojiro Futamura created TLDExtract to solve the problem within Swift projects. It is a Swift library that bundles the Public Suffix List (PSL) and understands Punycode, which makes extracting this information a breeze.
import TLDExtract
let extractor = TLDExtract()
let urlString: String = "http://super.duper.domain.co.uk"
guard let result: TLDResult = extractor.parse(urlString) else { return }
print(result.rootDomain) // Optional("domain.co.uk")
print(result.topLevelDomain) // Optional("co.uk")
print(result.secondLevelDomain) // Optional("domain")
print(result.subDomain) // Optional("super.duper")
The code base of TLDExtract is very good, but the bundled PSL is outdated because the project stopped releasing updates.
The code allowed you to fetch PSL updates from within your app, but what if your app mainly operates offline? Also, the implementation used the synchronous Data(contentsOf: url) API, which blocks the calling thread.
I decided to create a fork to keep this project up-to-date. My key focus is:
Always up-to-date: GitHub Actions regularly create new package versions that bundle the latest Public Suffix List (PSL), perfect for offline use
A modern async function that performs a network request to fetch the latest PSL from the remote server ad hoc
Swift Package Manager (SPM) as the exclusive distribution channel
No package dependencies
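As a rough idea of what a non-blocking fetch can look like, here is a sketch that wraps URLSession in an async function and parses the downloaded text into rules. It is not the fork's actual implementation: fetchSuffixRules, parseSuffixRules, and FetchError are hypothetical names chosen for this example. The parsing relies only on the PSL file format, where each rule sits on its own line and comments start with //.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives in this module on Linux
#endif

// Hypothetical error type for this sketch.
enum FetchError: Error { case badResponse }

/// Turns PSL-formatted text into a list of rules,
/// skipping blank lines and `//` comment lines.
func parseSuffixRules(_ text: String) -> [String] {
    text.split(whereSeparator: \.isNewline)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty && !$0.hasPrefix("//") }
}

/// Downloads the list without blocking the calling thread,
/// unlike the synchronous Data(contentsOf:) initializer.
func fetchSuffixRules(from url: URL) async throws -> [String] {
    let data: Data = try await withCheckedThrowingContinuation { continuation in
        URLSession.shared.dataTask(with: url) { data, _, error in
            if let data = data {
                continuation.resume(returning: data)
            } else {
                continuation.resume(throwing: error ?? FetchError.badResponse)
            }
        }.resume()
    }
    guard let text = String(data: data, encoding: .utf8) else {
        throw FetchError.badResponse
    }
    return parseSuffixRules(text)
}

// The parser can be tried offline with a small inline sample.
let sample = "// comment line\ncom\n\nco.uk\n*.ck\n!www.ck\n"
print(parseSuffixRules(sample))
// prints ["com", "co.uk", "*.ck", "!www.ck"]

// Usage from an async context:
// let url = URL(string: "https://publicsuffix.org/list/public_suffix_list.dat")!
// let rules = try await fetchSuffixRules(from: url)
```

Keeping the parser separate from the download means the same code path can read the bundled copy of the list when the app is offline.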
I hope this helps the Swift community. Let me know if you have suggestions to improve the project further.