Given a list of URLs, the task is to sort the URLs in the list by their top-level domain.
A top-level domain (TLD) is one of the domains at the highest level in the hierarchical Domain Name System of the Internet. Examples: org, com, edu.
This is mostly useful when scraping web pages and the collected URLs have to be sorted by top-level domain. The snippets are short enough to reuse as-is in a larger script.
Input : ["https://www.isb.edu", "www.google.com", "http://cyware.com", "https://www.gst.in", "https://www.coursera.org", "https://www.create.net", "https://www.ontariocolleges.ca"]
Output : ['https://www.ontariocolleges.ca', 'www.google.com', 'http://cyware.com', 'https://www.isb.edu', 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
Explanation : The TLDs of the output list are in sorted order: ['.ca', '.com', '.com', '.edu', '.in', '.net', '.org']
Below are some ways to do the above task.
Method 1: Using sorted
Split each URL on '.' and take the last piece as its TLD, then pass that function to sorted as the key.
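# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Function that returns the TLD of a URL, used as the sort key
def tld(url):
    return url.split('.')[-1]

# Using sorted with tld as the key function
Output = sorted(Input, key=tld)

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)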
Output:
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com', 'https://www.gst.in', 'https://www.coursera.org', 'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com', 'http://cyware.com', 'https://www.isb.edu', 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
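Splitting the whole string on '.' only works when the URL ends with the host name. A URL with a path such as "/learn/python.html" would yield "html" as its key. The following is a minimal sketch of a more tolerant key function, assuming the standard-library urllib.parse module is acceptable here. The helper name tld_of and the sample URLs with paths and ports are illustrative and not part of the original task. It still returns only the last label, so multi-part suffixes such as "co.uk" are not treated specially.

```python
from urllib.parse import urlparse

def tld_of(url):
    """Return the last label of the host name in url (hypothetical helper)."""
    # urlparse only recognises the host when a scheme or "//" is present,
    # so prefix "//" for bare inputs such as "www.google.com".
    if "://" not in url:
        url = "//" + url
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1]

# Illustrative URLs: some carry a path or a port.
urls = [
    "https://www.coursera.org/learn/python.html",
    "www.google.com",
    "http://cyware.com:8080/news",
    "https://www.isb.edu",
]

print(sorted(urls, key=tld_of))
# ['www.google.com', 'http://cyware.com:8080/news', 'https://www.isb.edu',
#  'https://www.coursera.org/learn/python.html']
```

The path and port are ignored because only the parsed host name is split.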
Method 2: Using Lambda
A more concise way to sort the URLs by top-level domain is to pass the same key inline as a lambda, which avoids defining a separate function.
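# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Using sorted with a lambda that returns the TLD
Output = sorted(Input, key=lambda x: x.split('.')[-1])

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)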
Output:
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com', 'https://www.gst.in', 'https://www.coursera.org', 'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com', 'http://cyware.com', 'https://www.isb.edu', 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
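Python compares strings case-sensitively, and uppercase letters sort before lowercase ones, while domain names themselves are case-insensitive. The sketch below shows how the lambda key could be lowercased to account for that. The mixed-case URL "HTTP://EXAMPLE.ORG" is a made-up input used only for illustration.

```python
# Illustrative list with one upper-case URL.
urls = ["https://www.isb.edu", "HTTP://EXAMPLE.ORG", "www.google.com", "https://www.gst.in"]

# Plain key: "ORG" sorts before "com" because of its upper-case letters.
case_sensitive = sorted(urls, key=lambda x: x.split(".")[-1])

# Lowercased key: TLDs are compared without regard to case.
case_insensitive = sorted(urls, key=lambda x: x.split(".")[-1].lower())

print(case_sensitive)
# ['HTTP://EXAMPLE.ORG', 'www.google.com', 'https://www.isb.edu', 'https://www.gst.in']
print(case_insensitive)
# ['www.google.com', 'https://www.isb.edu', 'https://www.gst.in', 'HTTP://EXAMPLE.ORG']
```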
Method 3: Using reversed
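Split each URL on '.' and reverse the resulting list, then use that list as the sort key. URLs are ordered by TLD first, and URLs that share a TLD are then ordered by the remaining pieces, compared from right to left.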
# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Internal function that returns the pieces of a URL in reverse order
def internal(string):
    return list(reversed(string.split('.')))

# Using sorted with internal as the key function
Output = sorted(Input, key=internal)

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)
Output:
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com', 'https://www.gst.in', 'https://www.coursera.org', 'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com', 'http://cyware.com', 'https://www.isb.edu', 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
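All three methods print the same result for the sample list, but the reversed key does not behave identically in general. Keying on the last piece alone leaves URLs with the same TLD in their original order, because sorted is stable. Keying on the reversed list also orders them by the remaining pieces. A small sketch of the difference follows, using made-up URLs and the hypothetical helper names last_label and reversed_labels.

```python
# Illustrative list: two ".com" URLs that appear out of alphabetical order.
urls = ["www.zeta.com", "www.alpha.org", "www.alpha.com"]

def last_label(url):
    # Key used in Methods 1 and 2: the TLD only.
    return url.split(".")[-1]

def reversed_labels(url):
    # Key used in Method 3: all pieces, TLD first.
    return url.split(".")[::-1]

by_tld_only = sorted(urls, key=last_label)
by_all_labels = sorted(urls, key=reversed_labels)

print(by_tld_only)
# ['www.zeta.com', 'www.alpha.com', 'www.alpha.org']
print(by_all_labels)
# ['www.alpha.com', 'www.zeta.com', 'www.alpha.org']
```

Which behavior is preferable depends on whether the original order within each TLD should be preserved.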