Suppose you want to travel by Indian Railways and need to find the trains that run between two stations. Looking this up manually is tedious. In this article, we will write a script that fetches data from railyatri and prints the names and codes of the trains running between the specified stations.
Modules needed
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It does not come built in with Python. To install it, type the command below in the terminal.
pip install bs4
- requests: Requests lets you send HTTP/1.1 requests extremely easily. This module also does not come built in with Python. To install it, type the command below in the terminal.
pip install requests
Let’s see the stepwise execution of the script.
Step 1: Import all dependencies
Python3
# import modules
import requests
from bs4 import BeautifulSoup
Step 2: Create a function that fetches a URL and returns the page content
Python3
# user-defined function
# fetch the page and return its HTML as text
def getdata(url):
    r = requests.get(url)
    return r.text
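The function above returns whatever the server sends, including error pages, and it waits indefinitely if the server never responds. A slightly more defensive variant could look like the sketch below. The name `fetch_html` and the 10-second timeout are illustrative assumptions, not part of the original script.

```python
import requests


def fetch_html(url, timeout=10):
    """Fetch a page and return its HTML text.

    Hypothetical variant of getdata(); the default timeout is an arbitrary choice.
    """
    # Give up if the server does not respond within `timeout` seconds.
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text
```

It can be used as a drop-in replacement, for example `htmldata = fetch_html(url)`, wrapped in a `try`/`except requests.RequestException` block if the script should keep running after a failed request.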
Step 3: Now merge the station names and station codes into the URL, pass the URL to the getdata() function, and parse the returned HTML with BeautifulSoup.
Python3
# input by geek
from_Station_code = "NDLS"
from_Station_name = "DELHI"

To_station_code = "PNBE"
To_station_name = "PATNA"

# url
url = "https://www.railyatri.in/booking/trains-between-stations?from_code=" + from_Station_code + "&from_name=" + from_Station_name + "+JN+&journey_date=+Wed&src=tbs&to_code=" + \
    To_station_code + "&to_name=" + To_station_name + \
    "+JN+&user_id=-1603228437&user_token=355740&utm_source=dwebsearch_tbs_search_trains"

# pass the url
# into getdata function
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')

# display html code
print(soup)
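Concatenating the query string by hand is easy to get wrong, especially if a station name contains a space. As a possible alternative, the same URL can be assembled with `urllib.parse.urlencode` from the standard library. This is only a sketch: the helper name `build_url` is hypothetical, and the parameter names and fixed values are copied unchanged from the URL used in this article.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.railyatri.in/booking/trains-between-stations"


def build_url(from_code, from_name, to_code, to_name):
    """Build the search URL (hypothetical helper, same parameters as the article's URL)."""
    params = {
        "from_code": from_code,
        "from_name": from_name + " JN ",   # spaces are encoded as "+"
        "journey_date": " Wed",
        "src": "tbs",
        "to_code": to_code,
        "to_name": to_name + " JN ",
        "user_id": "-1603228437",
        "user_token": "355740",
        "utm_source": "dwebsearch_tbs_search_trains",
    }
    # urlencode escapes each value and joins the pairs with "&"
    return BASE_URL + "?" + urlencode(params)


print(build_url("NDLS", "DELHI", "PNBE", "PATNA"))
```

For these inputs the result is character for character the same URL as the hand-built string, so it can be passed to getdata() unchanged.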
Step 4: Now find the required tags in the HTML code and traverse the result.
Python3
# find the HTML tags
# with find_all()
# and convert into string
data_str = ""
for item in soup.find_all("div", class_="col-xs-12 TrainSearchSection"):
    data_str = data_str + item.get_text()
result = data_str.split("\n")

print("Train between " + from_Station_name + " and " + To_station_name)
print("")

# Display the result
for item in result:
    if item != "":
        print(item)
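To see what this step does without calling the website, the same extraction can be tried on a small piece of markup. The sample HTML below is invented for illustration and the real railyatri page may be structured differently; the helper name `extract_lines` is also hypothetical.

```python
from bs4 import BeautifulSoup

# Invented sample markup; only the class name is taken from the script above.
SAMPLE_HTML = """
<div class="col-xs-12 TrainSearchSection">
  <p>12345 SAMPLE EXPRESS</p>

  <p>67890 DEMO MAIL</p>
</div>
"""


def extract_lines(html):
    """Return the non-empty text lines found in the matching <div> blocks."""
    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for block in soup.find_all("div", class_="col-xs-12 TrainSearchSection"):
        for line in block.get_text().splitlines():
            line = line.strip()   # drop indentation left over from the markup
            if line:              # skip blank lines
                lines.append(line)
    return lines


print(extract_lines(SAMPLE_HTML))
```

Note that passing a string with two class names to `class_` matches only when the tag's class attribute is exactly that string, so the script will find nothing if the site adds a class or changes their order.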
Full implementation.
Python3
# import modules
import requests
from bs4 import BeautifulSoup

# user-defined function
# fetch the page and return its HTML as text
def getdata(url):
    r = requests.get(url)
    return r.text

# input by geek
from_Station_code = "GAYA"
from_Station_name = "GAYA"

To_station_code = "PNBE"
To_station_name = "PATNA"

# url
url = "https://www.railyatri.in/booking/trains-between-stations?from_code=" + from_Station_code + "&from_name=" + from_Station_name + "+JN+&journey_date=+Wed&src=tbs&to_code=" + \
    To_station_code + "&to_name=" + To_station_name + \
    "+JN+&user_id=-1603228437&user_token=355740&utm_source=dwebsearch_tbs_search_trains"

# pass the url
# into getdata function
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')

# find the HTML tags
# with find_all()
# and convert into string
data_str = ""
for item in soup.find_all("div", class_="col-xs-12 TrainSearchSection"):
    data_str = data_str + item.get_text()
result = data_str.split("\n")

print("Train between " + from_Station_name + " and " + To_station_name)
print("")

# Display the result
for item in result:
    if item != "":
        print(item)