
Scrape Google Search Results using Python BeautifulSoup

In this article, we are going to see how to scrape Google search results using Python and BeautifulSoup.

Modules Needed:

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in with Python. To install it, type the below command in the terminal.
pip install bs4
  • requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built-in with Python. To install it, type the below command in the terminal. (A quick check that both modules import correctly is sketched just after this list.)
pip install requests
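
If you want to confirm that both installations worked before moving on, a minimal sketch like the one below can be run. It only imports the two modules and prints their versions; the __version__ attributes used here are the standard version strings exposed by both packages.

Python3

# A minimal sanity check (not part of the original example):
# import both modules and print their versions.
import requests
import bs4

print("requests version:", requests.__version__)
print("bs4 version:", bs4.__version__)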

Approach:

  • Import the bs4 and requests libraries.
  • Make two strings with the default Google search URL, ‘https://google.com/search?q=’ and our customized search keyword.
  • Concatenate these two strings to get our search URL.
  • Fetch the URL data using requests.get(url), store it in a variable, request_result.
  • Create a string and store the result of our fetched request, using request_result.text.
  • Now we use BeautifulSoup to analyze the fetched page. We could parse the HTML ourselves, but BeautifulSoup comes with a lot of built-in features for scraping the web, so we first create a soup object from the request response.
  • We can then call soup.find_all('h3') to grab all the major headings of our search result, iterate through the returned object, and print each heading as a string. (A short note on URL-encoding the search keyword follows this list.)
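
One detail the steps above gloss over: a search keyword that contains spaces or special characters should be URL-encoded before it is appended to the base URL. The examples in this article simply concatenate the strings, which is fine for a single word such as "neveropen"; the sketch below uses quote_plus from the standard library's urllib.parse as one possible way to handle arbitrary keywords. The keyword shown is just an illustration.

Python3

# Sketch: building the search URL with URL-encoding.
# quote_plus is a standard-library helper, not used in the original examples.
from urllib.parse import quote_plus

base_url = "https://google.com/search?q="
keyword = "python web scraping"   # a keyword containing spaces

url = base_url + quote_plus(keyword)
print(url)   # https://google.com/search?q=python+web+scraping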

Example 1: Below is the implementation of the above approach.

Python3




# Import the bs4 (BeautifulSoup)
# and requests libraries.
import requests
import bs4
  
# Make two strings with the default Google search URL
# and our customized search keyword,
# then concatenate them to get the final URL.
text = "neveropen"
url = "https://google.com/search?q=" + text

# Fetch the URL data using requests.get(url),
# store it in a variable, request_result.
request_result = requests.get(url)
  
# Creating soup from the fetched request
soup = bs4.BeautifulSoup(request_result.text,
                         "html.parser")
print(soup)


Output:

Now we can use soup.find_all('h3') to grab all the major headings of our search result, iterate through the returned object, and print each heading as a string.

Python3




# soup.find_all('h3') grabs all
# major headings of our search result.
heading_object = soup.find_all('h3')
  
# Iterate through the object 
# and print it as a string.
for info in heading_object:
    print(info.getText())
    print("------")


Output:

Example 2: Below is an implementation that extracts a city's temperature using Google search:

Python




# import module
import requests 
import bs4 
  
# Taking the city name as an input
city = "Imphal"

# Generating the URL: a Google search
# for the temperature in the given city
url = "https://google.com/search?q=" + "temperature in " + city

# Sending HTTP request
request_result = requests.get(url)
  
# Pulling HTTP data from the internet
soup = bs4.BeautifulSoup(request_result.text,
                         "html.parser")

# Finding the temperature in Celsius.
# The temperature is stored inside the class "BNeawe".
temp = soup.find("div", class_="BNeawe").text

print(temp)
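
A word of caution on this example: the "BNeawe" class name is tied to Google's current markup, and soup.find() returns None when no matching element exists (for example, when Google serves a consent page or changes its layout), which would make the .text access raise an AttributeError. A small guard such as the sketch below, which is an addition rather than part of the original example, keeps the script from crashing in that case.

Python

# Sketch: guarding against a missing result element.
# soup.find() returns None when no matching tag exists,
# so check before reading .text.
temp_div = soup.find("div", class_="BNeawe")

if temp_div is not None:
    print(temp_div.text)
else:
    print("Temperature element not found; Google's markup may have changed.")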


Output:
