Selenium is a powerful tool for controlling web browsers programmatically and automating browser tasks. It works with all major browsers and operating systems, and its scripts can be written in various languages, e.g. Python, Java, and C#. We will be working with Python.
In this article, we are going to see how to automate a browser search by voice. We select a word or sentence, say "search", and the selection is automatically searched in the browser, which then shows the results.
Requirements:
- pyautogui: PyAutoGUI is a cross-platform GUI automation Python module for human beings. It is used to programmatically control the mouse and keyboard.
- selenium: Selenium is a powerful tool for controlling web browsers through programs and performing browser automation.
- speech_recognition: This module converts spoken audio into text. Speech recognition is an important feature in applications such as home automation and artificial intelligence.
- chromedriver_autoinstaller: We use this module to install a ChromeDriver that matches the local Chrome browser, so that Selenium can open Chrome and show the results for the searched word. (The latest version of Chrome must be installed on your local device.)
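Before running the script, it can help to confirm that the modules listed above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list contains the import names used in this article.

```python
import importlib.util

# Import names used in this article (assumed to be installed with pip)
REQUIRED = ["pyautogui", "selenium", "speech_recognition",
            "chromedriver_autoinstaller", "pyttsx3"]


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Note that `sr.Microphone` additionally relies on the PyAudio package, which this check does not cover.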
Step-by-step Approach:
Step 1: Import required modules
Python3
# Web browser automation
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Installs a ChromeDriver that matches the local Chrome
import chromedriver_autoinstaller

# Speech recognition
import speech_recognition as sr

# Text-to-speech support
import pyttsx3

# Keyboard and mouse automation
import pyautogui
Step 2: Let's create the speech recognizer and open the microphone so that it can hear our voice as input and start the process. MyText stores our voice command as lowercase text.
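Python3
# Create the recognizer
r = sr.Recognizer()

# Listen on the default microphone
with sr.Microphone() as source2:
    # Calibrate for background noise
    r.adjust_for_ambient_noise(source2, duration=0.2)
    audio2 = r.listen(source2)

    # Convert the recorded audio to lowercase text
    MyText = r.recognize_google(audio2)
    MyText = str(MyText.lower())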
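Recognition can fail when the speech is unclear or the recognition service is unreachable, and the recognized text may not be a command at all. The following is a minimal sketch of how that could be handled; the helper names `normalize_command` and `listen_for_command` are hypothetical, and the two commands are the ones used later in this article.

```python
def normalize_command(text):
    """Map recognized speech to a known command, or None if it is not one."""
    if not text:
        return None
    cleaned = text.strip().lower()
    return cleaned if cleaned in ("search", "stop") else None


def listen_for_command():
    """Listen once on the microphone and return 'search', 'stop', or None."""
    # Imported here so normalize_command() works without a microphone setup
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.2)
        audio = recognizer.listen(source)
    try:
        return normalize_command(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        # The speech could not be understood
        return None
    except sr.RequestError:
        # The recognition service could not be reached
        return None
```

Returning None for anything that is not a known command lets the calling code ignore stray speech instead of acting on it.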
Step 3: Selecting some text and saying "search" now starts the process. pyautogui copies the selected text, and Selenium opens the browser, where the text is pasted into the search box to get the search results.
Python3
if MyText == "search":
    # Copy the currently selected text to the clipboard
    pyautogui.hotkey('ctrl', 'c')

    chrome_options = webdriver.ChromeOptions()

    # Install a ChromeDriver that matches the local Chrome
    chromedriver_autoinstaller.install()

    # Launch Chrome
    driver = webdriver.Chrome(options=chrome_options)

    # Adjust the size of the window
    driver.set_window_size(1920, 1080)
    driver.implicitly_wait(10)

    # Assumption: the XPath below targets Google's search form,
    # so that page is opened first
    driver.get("https://www.google.com")

    # Search box where the selected text gets pasted
    search_box = driver.find_element(By.XPATH, "/html/body//form[@role='search']/div[2]/div[1]//div[@class='a4bIc']/input[@role='combobox']")
    search_box.click()

    # pyautogui.hotkey() returns None, so paste with the keyboard
    # instead of passing its result to send_keys()
    pyautogui.hotkey('ctrl', 'v')

    # Submit the search
    pyautogui.press('enter')
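Locating the search box through a long XPath is brittle, because the page markup can change at any time. As an alternative sketch (not the approach used in this article), the text to search could be placed directly into a search URL. The URL pattern and the helper names `build_search_url` and `open_search` are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed search endpoint; swap in another search engine if needed
SEARCH_BASE = "https://www.google.com/search"


def build_search_url(query):
    """Return a search URL with `query` safely percent-encoded."""
    return SEARCH_BASE + "?" + urlencode({"q": query.strip()})


def open_search(driver, query):
    """Open the search results for `query` in an existing Selenium driver."""
    driver.get(build_search_url(query))
```

With this variant, the selected text would have to be read from the clipboard as a string first, which the code in this article does not do.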
Below is the full implementation:
Python3
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import chromedriver_autoinstaller
import speech_recognition as sr
import pyttsx3
import pyautogui

while True:
    try:
        r = sr.Recognizer()

        # Listen for a voice command
        with sr.Microphone() as source2:
            r.adjust_for_ambient_noise(source2, duration=0.2)
            audio2 = r.listen(source2)
            MyText = r.recognize_google(audio2)
            MyText = str(MyText.lower())

        if MyText == "search":
            # Copy the currently selected text to the clipboard
            pyautogui.hotkey('ctrl', 'c')

            chrome_options = webdriver.ChromeOptions()
            chromedriver_autoinstaller.install()
            driver = webdriver.Chrome(options=chrome_options)
            driver.set_window_size(1920, 1080)
            driver.implicitly_wait(10)

            # Assumption: the XPath below targets Google's search form,
            # so that page is opened first
            driver.get("https://www.google.com")

            search_box = driver.find_element(By.XPATH, "/html/body//form[@role='search']/div[2]/div[1]//div[@class='a4bIc']/input[@role='combobox']")
            search_box.click()

            # Paste the copied text and submit the search
            pyautogui.hotkey('ctrl', 'v')
            pyautogui.press('enter')

        elif MyText == "stop":
            break

    except Exception as e:
        # Report the problem (e.g. unrecognized speech) and keep listening
        print(f"Error: {e}")
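The loop above mixes listening, command handling, and browser control, which makes it hard to try out without a microphone. One possible way to separate these concerns is sketched below; the function `run_voice_loop` and its parameters are hypothetical and not part of any library used here.

```python
def run_voice_loop(listen, on_search):
    """Call `on_search` for each 'search' command until 'stop' is heard.

    `listen` is any callable returning the recognized text (or None).
    Returns the number of searches that were triggered.
    """
    searches = 0
    while True:
        try:
            command = listen()
        except Exception as error:
            # Assumption: a failed recognition should not end the loop
            print(f"Could not understand the command: {error}")
            continue
        if command == "stop":
            break
        if command == "search":
            on_search()
            searches += 1
    return searches
```

Because `listen` and `on_search` are passed in, the loop can be exercised with a scripted list of commands before wiring it to the microphone and the browser.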