In this article, we will discuss how to scrape data such as names, ratings, descriptions, reviews, addresses, and contact numbers from Google Maps using Python.
Modules needed:
- Selenium: Selenium is usually used to automate testing. It also works for scraping, because browser automation can interact with the JavaScript behind clicks, scrolls, movement of data between multiple frames, etc.
- Web/Chrome Driver: Selenium requires a web driver to interface with the chosen browser. Web drivers interact with a remote web server through a wire protocol that is common to all of them. Install the web driver that matches your browser so that Selenium can connect to it.
Download ChromeDriver from https://chromedriver.chromium.org/home, then copy the file from your downloads folder to your C drive, i.e. C:\chromedriver_win32\chromedriver.exe. Make sure the driver is at this path; otherwise the script will throw a path error.
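Because a wrong driver location stops the script before any scraping happens, it can help to check the path first. The following is a minimal sketch using only the standard library; the helper name `check_driver_path` is hypothetical and not part of Selenium.

```python
from pathlib import Path

# Path used throughout this article; adjust it if your driver lives elsewhere.
DRIVER_PATH = r"C:\chromedriver_win32\chromedriver.exe"


def check_driver_path(path):
    """Return True if a file exists at the given driver path (hypothetical helper)."""
    return Path(path).is_file()


if __name__ == "__main__":
    if check_driver_path(DRIVER_PATH):
        print("ChromeDriver found at", DRIVER_PATH)
    else:
        print("ChromeDriver not found at", DRIVER_PATH)
```

The raw string prefix `r` keeps the backslashes in the Windows path from being read as escape sequences.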
Syntax:
For obtaining the title of the place:
title = browser.find_element_by_class_name("x3AX1-LfntMc-header-title-title")
print(title.text)
For obtaining the rating of the place:
stars = browser.find_element_by_class_name("aMPvhf-fI6EEc-KVuj8d")
print("The stars of restaurant are:", stars.text)
For obtaining the description of the place:
description = browser.find_element_by_class_name("uxOu9-sTGRBb-T3yXSc")
print(description.text)
For obtaining the address of the place:
address = browser.find_elements_by_class_name("CsEnBe")[0]
print("Address: ", address.text)
For obtaining the contact number of the place:
phone = browser.find_elements_by_class_name("CsEnBe")[-2]
print("Contact Number: ", phone.text)
For obtaining the reviews of the place:
review = browser.find_elements_by_class_name("OXD3gb")
print("------------------------ Reviews --------------------")
for j in review:
    print(j.text)
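The snippets above use the `find_element_by_class_name` helpers of Selenium 3. Newer Selenium 4 releases removed these helpers and the `executable_path` argument of `webdriver.Chrome`, so on a current installation the same lookups are written with `find_element(By.CLASS_NAME, ...)` and a `Service` object. Below is a minimal sketch of the equivalent calls; the function names `make_browser` and `get_title` are hypothetical, and the class name is the one used in this article.

```python
def make_browser(driver_path):
    """Start headless Chrome using the Selenium 4 Service object (hypothetical helper)."""
    # Imported here so the helper can be defined even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.add_argument("headless")
    service = Service(executable_path=driver_path)
    return webdriver.Chrome(service=service, options=options)


def get_title(browser):
    """Return the place title using the Selenium 4 locator style (hypothetical helper)."""
    from selenium.webdriver.common.by import By

    element = browser.find_element(By.CLASS_NAME, "x3AX1-LfntMc-header-title-title")
    return element.text
```

The other lookups translate the same way: `find_elements_by_class_name("CsEnBe")` becomes `find_elements(By.CLASS_NAME, "CsEnBe")`.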
Example 1:
Scrape the title and rating.
Python3
# Import the Selenium library
from selenium import webdriver

# Run the browser in the background (headless mode)
options = webdriver.ChromeOptions()
options.add_argument("headless")

# Create the webdriver object (raw string keeps the backslashes literal)
browser = webdriver.Chrome(
    executable_path=r"C:\chromedriver_win32\chromedriver.exe",
    options=options)

# Google Maps URLs of the places to scrape
url = [
    "https://www.google.com/maps/place/Papa+John's+Pizza/"
    "@40.7936551,-74.0124687,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x89c2580eaa74451b:0x15d743e4f841e5ed!8m2!3d40.7936551!4d-74.0124687",
    "https://www.google.com/maps/place/Lucky+Dhaba/"
    "@30.653792,76.8165233,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x390feb3e3de1a031:0x862036ab85567f75!8m2!3d30.653792!4d76.818712",
]

# Loop over the URLs and scrape each place
for i in range(len(url)):
    # Open the Google Maps URL
    browser.get(url[i])

    # Obtain the title of the place
    title = browser.find_element_by_class_name(
        "x3AX1-LfntMc-header-title-title")
    print(i + 1, "-", title.text)

    # Obtain the rating of the place
    stars = browser.find_element_by_class_name("aMPvhf-fI6EEc-KVuj8d")
    print("The stars of restaurant are:", stars.text)
    print("\n")

# Close the browser
browser.quit()
Output:
1 - Papa Johns Pizza
The stars of restaurant are: 3.6
2 - Lucky Da Dhaba
The stars of restaurant are: 3.8
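The rating arrives as text (for example "3.6"), so it has to be converted before it can be compared or averaged. Below is a minimal sketch; the helper name `parse_rating` is hypothetical, and the comma handling assumes that some locales may show the rating with a decimal comma.

```python
def parse_rating(text):
    """Convert rating text such as "3.6" to a float; return None if it is not a number."""
    try:
        # Assumption: a decimal comma ("3,6") may appear depending on the locale.
        return float(text.strip().replace(",", "."))
    except ValueError:
        return None


if __name__ == "__main__":
    # Rating strings taken from the output above
    for raw in ["3.6", "3.8"]:
        print(parse_rating(raw))
```

In the loop of Example 1, this would be called as `parse_rating(stars.text)`.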
Example 2:
Scrape the title and address.
Python3
# Import the Selenium library
from selenium import webdriver

# Run the browser in the background (headless mode)
options = webdriver.ChromeOptions()
options.add_argument("headless")

# Create the webdriver object (raw string keeps the backslashes literal)
browser = webdriver.Chrome(
    executable_path=r"C:\chromedriver_win32\chromedriver.exe",
    options=options)

# Google Maps URLs of the places to scrape
url = [
    "https://www.google.com/maps/place/Papa+John's+Pizza/"
    "@40.7936551,-74.0124687,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x89c2580eaa74451b:0x15d743e4f841e5ed!8m2!3d40.7936551!4d-74.0124687",
    "https://www.google.com/maps/place/Lucky+Dhaba/"
    "@30.653792,76.8165233,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x390feb3e3de1a031:0x862036ab85567f75!8m2!3d30.653792!4d76.818712",
]

# Loop over the URLs and scrape each place
for i in range(len(url)):
    # Open the Google Maps URL
    browser.get(url[i])

    # Obtain the title of the place
    title = browser.find_element_by_class_name(
        "x3AX1-LfntMc-header-title-title")
    print(i + 1, "-", title.text)

    # Obtain the address of the place (first element with this class)
    address = browser.find_elements_by_class_name("CsEnBe")[0]
    print("Address: ", address.text)
    print("\n")

# Close the browser
browser.quit()
Output:
1 - Papa Johns Pizza
Address: 6602 Bergenline Ave, West New York, NJ 07093, United States
2 - Lucky Da Dhaba
Address: shop no.2, Patiala Road, National Highway 64, Zirakpur, Punjab 140603
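The examples look up elements immediately after `browser.get()`. If the page builds its content with JavaScript, the element may not be present yet at that moment. One way to guard against this is to use Selenium's explicit waits. The following is a minimal sketch; the helper name `wait_for_elements` and the 10-second default are assumptions, not part of the original code.

```python
def wait_for_elements(browser, class_name, timeout=10):
    """Wait up to `timeout` seconds for elements with the class; return [] on timeout."""
    # Imported here so the helper can be defined even where Selenium is not installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        # until() polls until at least one matching element is present in the DOM
        return WebDriverWait(browser, timeout).until(
            EC.presence_of_all_elements_located((By.CLASS_NAME, class_name)))
    except TimeoutException:
        return []
```

With this helper, the address lookup could become `elements = wait_for_elements(browser, "CsEnBe")`, followed by a check that `elements` is not empty before reading `elements[0].text`.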
Example 3:
Scrape the title and contact number.
Python3
# Import the Selenium library
from selenium import webdriver

# Run the browser in the background (headless mode)
options = webdriver.ChromeOptions()
options.add_argument("headless")

# Create the webdriver object (raw string keeps the backslashes literal)
browser = webdriver.Chrome(
    executable_path=r"C:\chromedriver_win32\chromedriver.exe",
    options=options)

# Google Maps URLs of the places to scrape
url = [
    "https://www.google.com/maps/place/Papa+John's+Pizza/"
    "@40.7936551,-74.0124687,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x89c2580eaa74451b:0x15d743e4f841e5ed!8m2!3d40.7936551!4d-74.0124687",
    "https://www.google.com/maps/place/Lucky+Dhaba/"
    "@30.653792,76.8165233,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x390feb3e3de1a031:0x862036ab85567f75!8m2!3d30.653792!4d76.818712",
]

# Loop over the URLs and scrape each place
for i in range(len(url)):
    # Open the Google Maps URL
    browser.get(url[i])

    # Obtain the title of the place
    title = browser.find_element_by_class_name(
        "x3AX1-LfntMc-header-title-title")
    print(i + 1, "-", title.text)

    # Obtain the contact number (second-to-last element with this class)
    phone = browser.find_elements_by_class_name("CsEnBe")[-2]
    print("Contact Number: ", phone.text)
    print("\n")

# Close the browser
browser.quit()
Output:
1 - Papa Johns Pizza
Contact Number: +1 201-662-7272
2 - Lucky Da Dhaba
Contact Number: 095922 67185
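The address and contact number are picked by position (`[0]` and `[-2]`) from the list of `CsEnBe` elements. If a place has fewer of these elements than expected, the index raises an `IndexError` and the loop stops. The following is a minimal sketch of a guarded lookup; the helper `text_at` is hypothetical, and `FakeElement` only stands in for a Selenium element so the sketch runs without a browser.

```python
def text_at(elements, index, default=None):
    """Return elements[index].text, or `default` when the index is out of range."""
    try:
        return elements[index].text
    except IndexError:
        return default


class FakeElement:
    """Stand-in for a Selenium WebElement, used only for this demonstration."""

    def __init__(self, text):
        self.text = text


if __name__ == "__main__":
    # Placeholder list shaped like the result of find_elements_by_class_name("CsEnBe")
    found = [
        FakeElement("6602 Bergenline Ave"),
        FakeElement("+1 201-662-7272"),
        FakeElement("other info"),
    ]
    print(text_at(found, -2))                       # +1 201-662-7272
    print(text_at([], -2, default="not available"))  # not available
```

In Example 3, the lookup would read `text_at(browser.find_elements_by_class_name("CsEnBe"), -2)`.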
Example 4:
Scrape the title and reviews.
Python3
# Import the Selenium library
from selenium import webdriver

# Run the browser in the background (headless mode)
options = webdriver.ChromeOptions()
options.add_argument("headless")

# Create the webdriver object (raw string keeps the backslashes literal)
browser = webdriver.Chrome(
    executable_path=r"C:\chromedriver_win32\chromedriver.exe",
    options=options)

# Google Maps URLs of the places to scrape
url = [
    "https://www.google.com/maps/place/Papa+John's+Pizza/"
    "@40.7936551,-74.0124687,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x89c2580eaa74451b:0x15d743e4f841e5ed!8m2!3d40.7936551!4d-74.0124687",
    "https://www.google.com/maps/place/Lucky+Dhaba/"
    "@30.653792,76.8165233,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x390feb3e3de1a031:0x862036ab85567f75!8m2!3d30.653792!4d76.818712",
]

# Loop over the URLs and scrape each place
for i in range(len(url)):
    # Open the Google Maps URL
    browser.get(url[i])

    # Obtain the title of the place
    title = browser.find_element_by_class_name(
        "x3AX1-LfntMc-header-title-title")
    print(i + 1, "-", title.text)

    # Obtain the reviews of the place
    review = browser.find_elements_by_class_name("OXD3gb")
    print("------------------------ Reviews --------------------")
    for j in review:
        print(j.text)
    print("\n")

# Close the browser
browser.quit()
Output:
1 - Papa Johns Pizza
------------------------ Reviews --------------------
“The food is so good and they even make the pizza so fast, omg.”
“He deals with the money also helps making the pizza without plastic gloves.”
“This is my pizza place to go, no hassles at all!”
2 - Lucky Da Dhaba
------------------------ Reviews --------------------
“Best place for a small group of people, food quality is amazing”
“Nice staff and quick service good quantity and well cooked meal.”
“I ordered chicken biryani they served me chicken pulao not Biryani.”
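Printing is enough for a quick check, but the review text is easier to reuse once it is stored. The following is a minimal sketch that strips the surrounding quotation marks seen in the output above and builds CSV text with the standard library; the helper names `clean_review` and `reviews_to_csv` and the column names are hypothetical.

```python
import csv
import io


def clean_review(text):
    """Strip whitespace and surrounding straight or curly double quotes from a review."""
    return text.strip().strip("\u201c\u201d\"")


def reviews_to_csv(place, reviews):
    """Return CSV text with a header and one row per review: place name, review text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["place", "review"])
    for review in reviews:
        writer.writerow([place, clean_review(review)])
    return buffer.getvalue()


if __name__ == "__main__":
    # Review text taken from the output above
    sample = ["\u201cThis is my pizza place to go, no hassles at all!\u201d"]
    print(reviews_to_csv("Papa Johns Pizza", sample))
```

In Example 4, the list passed in would be `[j.text for j in review]`. The `csv` module quotes any field that contains a comma, so review text with commas stays in a single column.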
Code Implementation:
Python3
# Import the Selenium library
from selenium import webdriver

# Run the browser in the background (headless mode)
options = webdriver.ChromeOptions()
options.add_argument("headless")

# Create the webdriver object (raw string keeps the backslashes literal)
browser = webdriver.Chrome(
    executable_path=r"C:\chromedriver_win32\chromedriver.exe",
    options=options)

# Google Maps URLs of the places to scrape
url = [
    "https://www.google.com/maps/place/Papa+John's+Pizza/"
    "@40.7936551,-74.0124687,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x89c2580eaa74451b:0x15d743e4f841e5ed!8m2!3d40.7936551!4d-74.0124687",
    "https://www.google.com/maps/place/Lucky+Dhaba/"
    "@30.653792,76.8165233,17z/data=!3m1!4b1!4m5!3m4!"
    "1s0x390feb3e3de1a031:0x862036ab85567f75!8m2!3d30.653792!4d76.818712",
]

# Loop over the URLs and scrape each place
for i in range(len(url)):
    # Open the Google Maps URL
    browser.get(url[i])

    # Obtain the title of the place
    title = browser.find_element_by_class_name(
        "x3AX1-LfntMc-header-title-title")
    print(i + 1, "-", title.text)

    # Obtain the rating of the place
    stars = browser.find_element_by_class_name("aMPvhf-fI6EEc-KVuj8d")
    print("The stars of restaurant are:", stars.text)

    # Obtain the description of the place
    description = browser.find_element_by_class_name("uxOu9-sTGRBb-T3yXSc")
    print("Description: ", description.text)

    # Obtain the address of the place (first element with this class)
    address = browser.find_elements_by_class_name("CsEnBe")[0]
    print("Address: ", address.text)

    # Obtain the contact number (second-to-last element with this class)
    phone = browser.find_elements_by_class_name("CsEnBe")[-2]
    print("Contact Number: ", phone.text)

    # Obtain the reviews of the place
    review = browser.find_elements_by_class_name("OXD3gb")
    print("------------------------ Reviews --------------------")
    for j in review:
        print(j.text)
    print("\n")

# Close the browser
browser.quit()
Output: