Saturday, November 16, 2024

Get all text of the page using Selenium in Python

Selenium is a browser automation tool: a few lines of code can drive a real browser. It works with the major browsers and operating systems, and scripts can be written in several programming languages, including Python and Java.

Selenium's Python bindings provide a convenient API for driving WebDrivers such as Firefox, IE, Chrome, and Remote. The 3.141.0 release used in this article supports Python 3.5 and above; newer Selenium 4 releases require a more recent Python 3.

Installation:

Use pip to install the Selenium package by running this command at the command prompt:

pip install selenium

Once installation finishes, open a Python console and run these two commands to verify that Selenium is installed:

Python3




import selenium
 
print(selenium.__version__)


Output:

3.141.0
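
The same check can be done without importing the package itself. The sketch below uses the standard library's importlib.metadata (Python 3.8 and above); the helper name installed_version is made up for this illustration and is not part of Selenium.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if Selenium is installed, otherwise None
print(installed_version("selenium"))
```

Returning None instead of raising makes the helper convenient in setup scripts that decide whether to run pip install.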

Webdriver Manager for Python:

Previously, we had to download the chromedriver binary, unzip it somewhere on the PC, and pass its path to the driver like this:

webdriver.Chrome(executable_path=r"D:\PyCharm_Projects\SeleniumLearning\Drivers\ChromeDriverServer.exe")

The driver has to match the installed browser, so whenever a new version came out we had to download a new driver or face errors. The webdriver-manager package solves this by downloading a matching driver automatically:

Installation:

pip install webdriver-manager

To use it with the Chrome driver, write these lines:

Python3




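from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
 
# install() downloads a matching chromedriver and returns its path.
# Passing the path positionally is the Selenium 3 calling style;
# Selenium 4 expects a Service object instead.
driver = webdriver.Chrome(ChromeDriverManager().install())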
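
On Selenium 4 the driver path is wrapped in a Service object rather than passed directly. A minimal sketch, assuming Selenium 4 and webdriver-manager are both installed; the function name make_chrome_driver is only illustrative:

```python
def make_chrome_driver():
    """Start Chrome using a driver binary fetched by webdriver-manager."""
    # Imported inside the function so that defining it does not
    # require Selenium to be installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    # install() returns the path of the downloaded chromedriver binary
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service)
```

Calling make_chrome_driver() opens a browser window, so remember to close it when the script is done.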


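webdriver-manager supports other browsers in the same way. The snippets below use the same Selenium 3 calling style, passing the driver path directly. For example: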

Use with Chromium:

Python3




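from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
# Older webdriver-manager releases expose ChromeType here; newer ones
# moved it (4.x documents it under webdriver_manager.core.os_manager).
from webdriver_manager.utils import ChromeType
 
driver = webdriver.Chrome(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install())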


Use with Firefox:

Python3




from selenium import webdriver
from webdriver_manager.firefox import GeckoDriverManager
 
driver = webdriver.Firefox(executable_path=GeckoDriverManager().install())


Use with IE:

Python3




from selenium import webdriver
from webdriver_manager.microsoft import IEDriverManager
 
driver = webdriver.Ie(IEDriverManager().install())


Use with Edge:

Python3




from selenium import webdriver
from webdriver_manager.microsoft import EdgeChromiumDriverManager
 
driver = webdriver.Edge(EdgeChromiumDriverManager().install())


Get all text of the page using Selenium in Python

Let's see how to automate this task with Selenium in Python. This article shows how to get all the text of a page.

Approach:

  1. Import webdriver from the selenium module.
  2. This article automates the task in the Chrome browser, so import ChromeDriverManager from webdriver_manager.chrome. It downloads a matching driver automatically, so there is no need to fetch one by hand. The supported WebDriver implementations include Firefox, Chrome, IE, and Remote.
  3. Install the Chrome driver and pass it to a webdriver instance.
  4. The driver.get method navigates to the page at the given URL. WebDriver waits for the page to finish loading before returning control to the program, although content added later by JavaScript may still be missing at that point.
  5. WebDriver offers several ways to locate elements on a page. Older releases used the find_element_by_* methods; current releases use find_element with a By locator. The body of the page can be located by its XPath, /html/body, so we call find_element(By.XPATH, "/html/body") and read its text attribute.
  6. Finally, close the browser with the driver.close method, which closes only the current window or tab. The driver.quit method closes every window and ends the WebDriver session.
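
Steps 4 to 6 can be wrapped so that the browser is shut down even when a step raises an error. This is a sketch rather than the article's own method: browser_session and body_text are hypothetical helper names, and the driver argument is assumed to be any object with the get, find_element, and quit methods that Selenium drivers provide.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(driver):
    """Yield the driver and always call quit() on exit, even after an error."""
    try:
        yield driver
    finally:
        driver.quit()


def body_text(driver, url):
    """Open `url` and return the visible text of the page body."""
    driver.get(url)
    # "xpath" is the string value behind Selenium's By.XPATH constant
    return driver.find_element("xpath", "/html/body").text
```

With a real driver this would be used as `with browser_session(webdriver.Chrome()) as driver:` followed by `print(body_text(driver, url))`, where url is whichever page you want to read.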

Below is the implementation:

Python3




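# Importing necessary modules
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
 
# Start Chrome. This assumes chromedriver is on PATH, or a Selenium
# release (4.6+) whose bundled Selenium Manager can locate one.
driver = webdriver.Chrome()
 
# Target URL. The original URL is not given here, so this is a placeholder.
url = "https://www.example.com"
driver.get(url)
 
# Fixed pause to let late-loading content appear
time.sleep(5)
 
# Printing the whole body text
print(driver.find_element(By.XPATH, "/html/body").text)
 
# Closing the driver
driver.close()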
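
A fixed time.sleep either wastes time or is too short. As an alternative sketch, Selenium's WebDriverWait can poll until the body element is present; the function name wait_for_body_text and the 10-second default are assumptions made for this example.

```python
def wait_for_body_text(driver, timeout=10):
    """Return the body text once <body> is present; raises TimeoutException otherwise."""
    # Imported inside the function so that defining it does not
    # require Selenium to be installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # until() polls the condition and returns the located element
    body = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.XPATH, "/html/body"))
    )
    return body.text
```

Call it after driver.get(url). For pages that build their content with JavaScript, waiting for a specific element of interest is usually more reliable than waiting for the body itself.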


Output:

The script prints the visible text of the page's body element to the console.
