Prerequisite: Beautiful Soup Installation
Beautiful Soup is a Python library for parsing HTML and XML documents, and it is commonly used for web scraping. Web scraping is the process of extracting data from websites with automated tools instead of collecting it by hand. A tag may have any number of attributes. For example, the tag <b class="active"> has an attribute "class" whose value is "active". We can access a single attribute by treating the tag like a dictionary, or get all of a tag's attributes at once as a dictionary through its attrs property.
Syntax:
tag.attrs
Implementation:
Example 1: Program to extract all attributes of a tag using the attrs approach.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with a HTML page
soup = BeautifulSoup('''
<html>
    <h2 class="hello"> Heading 1 </h2>
    <h1> Heading 2 </h1>
</html>
''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get the attribute
attribute = tag.attrs

# Print the output
print(attribute)
Output:
{'class': ['hello']}
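Because attrs is an ordinary dictionary, the same approach extends to every tag in a document. The following is a minimal sketch, not part of the original example: the markup and the name attrs_by_tag are made up for illustration, and it uses the built-in "html.parser" instead of "lxml" so that it runs without an extra dependency.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption, not taken from the example above)
html = '<div id="main" class="box"><a href="/home" title="Home">Home</a><p>Text</p></div>'

# "html.parser" ships with Python; "lxml" would also work if it is installed
soup = BeautifulSoup(html, "html.parser")

# find_all(True) matches every tag; copy each tag's attrs into a plain dict
attrs_by_tag = {tag.name: dict(tag.attrs) for tag in soup.find_all(True)}

for name, attrs in attrs_by_tag.items():
    print(name, attrs)
```

A tag without attributes, such as the p tag here, simply yields an empty dictionary.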
Example 2: Program to extract a single attribute using the dictionary approach.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with a HTML page
soup = BeautifulSoup('''
<html>
    <h2 class="hello"> Heading 1 </h2>
    <h1> Heading 2 </h1>
</html>
''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get the attribute
attribute = tag['class']

# Print the output
print(attribute)
Output:
['hello']
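The dictionary approach raises a KeyError when the requested attribute is not present on the tag. A short sketch of the safer alternatives follows; the variable names are illustrative, and the parser is switched to the built-in "html.parser" as an assumption to keep the snippet dependency-free.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<h2 class="hello">Heading 1</h2>', "html.parser")
tag = soup.h2

# has_attr() checks for an attribute without raising
has_id = tag.has_attr("id")

# get() returns None (or a supplied default) when the attribute is missing
id_value = tag.get("id")
id_or_default = tag.get("id", "no-id")

# Square-bracket access raises KeyError for a missing attribute
try:
    tag["id"]
    raised = False
except KeyError:
    raised = True

print(has_id, id_value, id_or_default, raised)
```

Using get() is usually the more convenient choice when scraping pages where an attribute may or may not be present.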
Example 3: Program to extract multiple attribute values using the dictionary approach.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with a HTML page
soup = BeautifulSoup('''
<html>
    <h2 class="first second third"> Heading 1 </h2>
    <h1> Heading 2 </h1>
</html>
''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get the attribute
attribute = tag['class']

# Print the output
print(attribute)
Output:
['first', 'second', 'third']
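The list result above is specific to attributes that HTML defines as multi-valued, such as class. Other attributes, such as id, come back as a single string even when the value contains spaces. The sketch below illustrates the difference; the id value is made up for the demonstration, and "html.parser" is again an assumption chosen to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# The id value is illustrative only; it is there to show the contrast with class
soup = BeautifulSoup(
    '<h2 class="first second third" id="main title">Heading 1</h2>',
    "html.parser",
)
tag = soup.h2

# class is multi-valued, so Beautiful Soup splits it into a list
classes = tag["class"]

# id is not multi-valued, so the value stays a single string
tag_id = tag["id"]

# get_attribute_list() always returns a list, whatever the attribute
id_list = tag.get_attribute_list("id")

# Join the list to recover the class string as written in the markup
class_string = " ".join(classes)

print(classes, tag_id, id_list, class_string)
```

When the calling code should not care whether an attribute is multi-valued, get_attribute_list() gives a uniform list in both cases.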