Wednesday, December 25, 2024
Google search engine
HomeLanguagesBeautifulSoup – Find tags by CSS class with CSS Selectors

BeautifulSoup – Find tags by CSS class with CSS Selectors

Prerequisites: Beautifulsoup

Beautifulsoup is a Python library used for web scraping. BeautifulSoup object is provided by Beautiful Soup which is a web scraping framework for Python. Web scraping is the process of extracting data from the website using automated tools to make the process faster. The BeautifulSoup object represents the parsed document as a whole. This powerful python tool can also be used to modify HTML webpages. This article depicts how beautifulsoup can be employed to find tag by CSS class with CSS Selectors. For this, find_all() method of the module is used.

Syntax:

find_all(class_=”class_name”)

Returns tags having a particular CSS class.

Approach:

  • Import module
  • Scrap data from a webpage.
  • Parse the string scraped to HTML.
  • Use find_all() function to get a list of tag with the given class name.
  • Print tags.

Example 1:  Finding all tags of a particular CSS class from an HTML file.

Python3




# importing module
from bs4 import BeautifulSoup
 
markup = """
 
<!DOCTYPE>
<html>
  <head><title>Example</title></head>
    <body>
        <div class="first"> Div with Class first
        </div>
        <p class="first"> Para with Class first
        </p>
 
 
        <div class="second"> Div with Class second
        </div>
        <span class="first"> Span with Class first
        </span>
    </body>
</html>
"""
 
# parsering string to HTML
soup = BeautifulSoup(markup, 'html.parser')
 
# printing tags with given class name
for i in soup.find_all(class_="first"):
    print(i.name)


Output:

div
p
span

Example 2: Finding all tags of a particular CSS class from a URL.

Python3




# importing module
from bs4 import BeautifulSoup
import requests
 
html = requests.get(URL)
 
# parsering string to HTML
soup = BeautifulSoup(html.content, "html5lib")
 
# printing tags with given class name
for i in soup.find_all(class_="article--container_content"):
    print(i.name)


Output:

div

RELATED ARTICLES

Most Popular

Recent Comments