Thursday, December 26, 2024

How to extract a div tag and its contents by id with BeautifulSoup?

Beautiful Soup is a Python library for web scraping that can also be used to modify HTML documents. This article shows how to use it to extract a div and its contents by the div's ID, using the library's find() method to locate the div.

Approach:

  • Import module
  • Scrape data from a webpage
  • Parse the scraped string as HTML
  • Find the div with its ID
  • Print its content
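
The steps above can be sketched as two small helpers. This is only an illustrative sketch: fetch_html and div_text_by_id are hypothetical names, not part of Beautiful Soup, and the URL in the comment is a placeholder. An inline HTML string stands in for the downloaded page so the sketch runs offline.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode('utf-8', errors='replace')


def div_text_by_id(html, div_id):
    """Parse the HTML and return the text of the div with this id, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    div = soup.find('div', id=div_id)
    return None if div is None else div.get_text(strip=True)


# Inline HTML stands in for fetch_html("https://example.com") (placeholder URL)
sample = '<html><body><div id="container">Div Content</div></body></html>'
print(div_text_by_id(sample, 'container'))
```

Returning None when no div matches keeps the caller from hitting an AttributeError on a missing tag.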

Syntax: find(tag_name, **kwargs)

Parameters:

  • The tag_name argument tells Beautiful Soup to only find tags with the given name. Text strings are ignored, as are tags whose names don't match.
  • The **kwargs arguments filter on tag attributes; here, id is used to match each tag's id attribute.
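
As a small sketch of these parameters, the keyword form id=... and the equivalent attrs dictionary form locate the same tag, and find() returns None when nothing matches. The markup and variable names below are made up for illustration.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption for this sketch)
html = '<div id="main"><p>Hello</p></div><div id="footer">Bye</div>'
soup = BeautifulSoup(html, 'html.parser')

by_keyword = soup.find('div', id='footer')           # keyword-argument filter
by_attrs = soup.find('div', attrs={'id': 'footer'})  # same filter as a dict
missing = soup.find('div', id='sidebar')             # no such id in the markup

print(by_keyword)              # the matching tag together with its contents
print(by_keyword == by_attrs)  # both forms find the same div
print(missing)                 # None, because no tag matched
```

Checking for None before reading attributes such as .string is a sensible habit when the ID may be absent from the page.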

Below is the implementation:

Example 1:

Python3

# Import the module
from bs4 import BeautifulSoup

markup = '''<html><body><div id="container">Div Content</div></body></html>'''
soup = BeautifulSoup(markup, 'html.parser')

# Find the div by its id
div_bs4 = soup.find('div', id="container")

# Print the div's text content
print(div_bs4.string)


Output:

Div Content

Example 2:

Python3

# Import the module
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html>
<html>
  <head><title>Example</title></head>
  <body>
    <p>
      Nested div
    </p>

    <div id="first"> Div with ID first
      <div id="second"> Div with id second
      </div>
    </div>
  </body>
</html>
"""

# Parse the string as HTML
soup = BeautifulSoup(markup, 'html.parser')

# Find the nested div by its id
div_bs4 = soup.find('div', id="second")

# Print the div's text content
print(div_bs4.string)


Output:

 Div with id second
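
Note that .string only returns text when a tag has a single child. For a div that contains other tags, such as the outer div with ID first, it returns None. The sketch below, using simplified markup invented for illustration, shows two alternatives: get_text() collects the text of the div and all of its descendants, and printing the tag itself gives the div together with its contents.

```python
from bs4 import BeautifulSoup

# Simplified nested markup (an assumption for this sketch)
markup = '<div id="first">Outer text<div id="second">Inner text</div></div>'
soup = BeautifulSoup(markup, 'html.parser')

outer = soup.find('div', id='first')

print(outer.string)                     # None: the div has more than one child
print(outer.get_text(' ', strip=True))  # text of the div and its descendants
print(outer)                            # the div tag with all of its contents
```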
