The BeautifulSoup object is provided by Beautiful Soup, a Python library for web scraping. Web scraping is the process of extracting data from websites with automated tools, which is much faster than collecting it by hand. The BeautifulSoup object represents the parsed document as a whole. For most purposes, you can treat it as a Tag object.
Syntax: BeautifulSoup(document, parser)
Parameters: The constructor is usually called with the two arguments explained below:
- document: The XML or HTML document to parse, given as a string or an open file handle.
- parser: The name of the parser used to parse the document, such as "html.parser", "lxml", or "html5lib".
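As a minimal sketch of the syntax above, the snippet below builds a BeautifulSoup object from a short HTML string and checks that it behaves like a Tag. The sample markup and variable names are made up for illustration, and the built-in "html.parser" is used here only so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical sample document, used only for illustration
html = "<html><body><p id='intro'>Hello</p></body></html>"

# document = html, parser = "html.parser" (ships with Python)
soup = BeautifulSoup(html, "html.parser")

# The BeautifulSoup object is itself a kind of Tag
is_tag = isinstance(soup, Tag)

# It has a special name because it stands for the whole document
root_name = soup.name

# Tag methods such as find() work directly on it
p_text = soup.find("p").get_text()

print(is_tag, root_name, p_text)
```

Because the object stands for the whole document rather than a real element, its name is the special value "[document]".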
The examples below show how the BeautifulSoup object is used.
Example 1: In this example, we parse a document into a BeautifulSoup object and print one of its tags.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <html>
        <h2> Heading 1 </h2>
        <h1> Heading 2 </h1>
    </html>
    ''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Print the tag
print(tag)
Output:
<h2> Heading 1 </h2>
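Printing a tag shows its full markup. As a small, hedged follow-up, the sketch below reads the tag's name and text separately. It reuses the same headings but swaps in the built-in "html.parser" so it runs without lxml; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    "<html><h2> Heading 1 </h2><h1> Heading 2 </h1></html>",
    "html.parser",
)

tag = soup.h2

# Name of the element, without angle brackets
tag_name = tag.name

# Text inside the tag, with surrounding whitespace stripped
tag_text = tag.get_text(strip=True)

print(tag_name, tag_text)
```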
Example 2: In this example, we parse a document into a BeautifulSoup object and then read a tag's attributes through its attrs property.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <h2 class="hello"> Heading 1 </h2>
    <h1> Heading 2 </h1>
    ''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get the attributes as a dictionary
attribute = tag.attrs

# Print the output
print(attribute)
Output:
{'class': ['hello']}
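The attrs dictionary can also be read one attribute at a time. The sketch below is an illustration under assumptions: the id attribute and the variable names are added for the example, and "html.parser" is used instead of lxml so that it runs without extra installs.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the id attribute is added for illustration
soup = BeautifulSoup(
    '<h2 class="hello" id="main"> Heading 1 </h2>',
    "html.parser",
)

tag = soup.h2

# class can hold several values, so it comes back as a list
classes = tag["class"]

# Single-valued attributes come back as plain strings
tag_id = tag["id"]

# get() returns None instead of raising KeyError when an attribute is missing
missing = tag.get("title")

print(classes, tag_id, missing)
```

Indexing with tag["name"] raises a KeyError for a missing attribute, so get() is the safer choice when an attribute may not be present.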