Web scraping is a technique for fetching data from websites. Many websites don’t let users save their data for personal use. One option is to copy and paste the data manually, which is both tedious and time-consuming. Web scraping automates the process of extracting data from websites, and it is done with the help of software known as web scrapers.
In this article, we are going to write a Python script that scrapes a railway station’s code using its city name.
Examples:
Input: new-delhi
Output: NDLS
Input: Patna
Output: PNBE
Modules needed
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in with Python. To install it, type the command below in the terminal.
pip install bs4
- requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built-in with Python. To install it, type the command below in the terminal.
pip install requests
Let’s see the stepwise execution of the script.
Step 1: Import all dependencies
Python3
# import modules
import requests
from bs4 import BeautifulSoup

Step 2: Create a function to get data from a URL
Python3
# user-defined function
# fetch the page and return its HTML as text
def getdata(url):
    r = requests.get(url)
    return r.text

Step 3: Merge the city name into the URL, pass the URL to the getdata() function, and parse the returned text as HTML with BeautifulSoup.
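The function above returns whatever the server sends, even an error page, and it can wait indefinitely on a server that never responds. A slightly more defensive variant might look like the sketch below. The name `getdata_safe` and the 10-second timeout are assumptions made for illustration and are not part of the original script.

```python
import requests


def getdata_safe(url, timeout=10):
    """Fetch a page and return its HTML text, or None on failure.

    Hypothetical variant of getdata(); the timeout value is an assumption.
    """
    try:
        r = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx/5xx responses
        r.raise_for_status()
    except requests.RequestException as exc:
        # Covers connection errors, timeouts and HTTP error statuses
        print(f"Request failed: {exc}")
        return None
    return r.text
```

A caller would then check for None before handing the result to BeautifulSoup, instead of parsing an error page by accident.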
Python3
# input by geek
station = "new-delhi"

# url

# pass the url
# into getdata function
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')

# display html code
print(soup)
Output: the HTML source of the station page is printed.
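The URL-building part of this step can be sketched as follows. The site's address is not given in the snippet above, so `BASE_URL` here is a placeholder on example.com and `build_station_url` is a hypothetical helper. Only the idea of merging the city name into a URL comes from the text.

```python
from urllib.parse import quote

# Placeholder base URL (an assumption); replace it with the real site's address.
BASE_URL = "https://example.com/stations/"


def build_station_url(city, base_url=BASE_URL):
    """Turn a city name such as 'New Delhi' into a URL ending in 'new-delhi'."""
    # Lowercase the name and join its words with hyphens: "New Delhi" -> "new-delhi"
    slug = "-".join(city.strip().lower().split())
    # Percent-encode anything that is not safe inside a URL path segment
    return base_url + quote(slug)


print(build_station_url("New Delhi"))  # https://example.com/stations/new-delhi
print(build_station_url("patna"))      # https://example.com/stations/patna
```

Normalising the input this way means "New Delhi", "new delhi" and "new-delhi" all lead to the same URL, which may or may not match how the real site names its pages.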
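Step 4: Extract the station code from the HTML document. The code finds the table with the class extrtable, collects the text of every <b> tag inside it, and prints the last one, which holds the station code.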
Python3
# traverse the station code
data = []
for item in soup.find("table", class_="extrtable").find_all('b'):
    data.append(item.get_text())

print(data[-1])
Output:
NDLS
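The chained call in Step 4 raises an AttributeError if the page has no table with the class extrtable, for example when the city name is misspelled. One possible way to guard against that is sketched below. `parse_station_code` is a hypothetical helper, and the sample HTML is made up for the demonstration rather than copied from the real page.

```python
from bs4 import BeautifulSoup


def parse_station_code(html):
    """Return the text of the last <b> tag in the 'extrtable' table, or None."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="extrtable")
    if table is None:
        # No matching table, e.g. an error page or an unknown station
        return None
    bold_tags = table.find_all("b")
    if not bold_tags:
        return None
    # As in Step 4, the last <b> tag is taken to hold the station code
    return bold_tags[-1].get_text(strip=True)


# Made-up sample markup (an assumption about the page's shape)
sample = """
<table class="extrtable">
  <tr><td>Station</td><td><b>New Delhi</b></td></tr>
  <tr><td>Code</td><td><b>NDLS</b></td></tr>
</table>
"""
print(parse_station_code(sample))                   # NDLS
print(parse_station_code("<p>Page not found</p>"))  # None
```

Returning None instead of raising lets the caller decide how to report a station that could not be found.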