How to Handle Duplicate Attributes in BeautifulSoup

26 July 2024

When scraping a page, you may find several tags of the same type that carry the same attribute value, for example multiple <p> tags with the same id. This article shows how to collect all such tags with BeautifulSoup and print a particular one of them.

The core of the technique is a single find_all call, which collects every tag that has the given name and attribute value into a list.

Syntax:

items = soup.find_all("#Tag name", {"id": "#Value of the id attribute shared by the tags"})

After running this call, remove the attributes from each returned tag and print the item you want from the list.
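To make the behaviour concrete, here is a minimal sketch. It assumes an inline HTML string rather than the file used later in the article, and the variable names are illustrative. It contrasts find, which stops at the first match, with find_all, which returns every tag sharing the duplicated id.

```python
from bs4 import BeautifulSoup

# Inline HTML used only for illustration; the article reads gfg.html from disk.
html = """
<div>
  <p id="vinayak">King</p>
  <p id="vinayak">Prince</p>
  <p id="vinayak">Queen</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() stops at the first match, even when the id is repeated.
first = soup.find("p", {"id": "vinayak"})

# find_all() returns every match in document order.
matches = soup.find_all("p", {"id": "vinayak"})

# Clearing attrs removes the id from each tag in place.
for tag in matches:
    tag.attrs = {}

print(len(matches))  # 3
print(first)         # <p>King</p> (same object as matches[0])
print(matches[1])    # <p>Prince</p>
```

Because find returns only one tag, find_all is the call to reach for whenever an attribute value is repeated.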

Approach:

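Import the modules (BeautifulSoup and os).
Next, build the directory path: take the absolute path of the Python file you are working in and remove its last segment. The file name is resolved against the current working directory, so run the script from the folder that contains it.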

Syntax:

base = os.path.dirname(os.path.abspath("#Name of Python file in which you are currently working"))

Then, open the HTML file from which you want to read the value.

Syntax:

html = open(os.path.join(base, "#Name of HTML file from which you wish to read value"))

Parse the HTML file in BeautifulSoup.
Next, find all the items that have the same tag and attributes. find_all returns them as a list, so no list has to be created beforehand.

Syntax:

items = soup.find_all("#Tag name", {"id": "#Value of the id attribute shared by the tags"})

Then, remove all the attributes from each tag.
Finally, print the required item from the list.
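The path-handling steps above can be sketched as follows. This is an illustration under stated assumptions: it runs inside a temporary directory and writes a small stand-in gfg.html there, so that nothing on disk is touched. The file names gfg4.py and gfg.html are simply the ones used in this article's example.

```python
import os
import tempfile

# Work in a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as tmp:
    old_cwd = os.getcwd()
    os.chdir(tmp)
    try:
        # Stand-in HTML file (hypothetical content for this sketch).
        with open("gfg.html", "w", encoding="utf-8") as f:
            f.write('<p id="vinayak">King</p>')

        # abspath() joins the name onto the current working directory;
        # dirname() then drops the last segment, leaving that directory.
        base = os.path.dirname(os.path.abspath("gfg4.py"))
        same_as_cwd = (base == os.getcwd())

        # Open the HTML file in that directory and read it; the with
        # block closes the file when it ends.
        with open(os.path.join(base, "gfg.html"), encoding="utf-8") as html:
            content = html.read()
    finally:
        os.chdir(old_cwd)

print(same_as_cwd)
print(content)
```

Note that gfg4.py does not need to exist for abspath to work, since the function only manipulates the path string. This is why the result depends on the directory the script is run from.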

Webpage in use:

HTML

<!DOCTYPE html>
<html>
 <head>
   <title>Geeks For Geeks</title>
 </head>
 <body>
  <div>
     <p id="vinayak">King</p>

     <p id="vinayak">Prince</p>

     <p id="vinayak">Queen</p>

  </div>
  <p id="vinayak">Princess</p>

 </body>
</html>

Program:

Python

# Import BeautifulSoup and os
from bs4 import BeautifulSoup as bs
import os

# Directory of the script: abspath resolves the name against the
# current working directory, dirname drops the last segment.
# Replace gfg4.py with the name of your Python file.
base = os.path.dirname(os.path.abspath("gfg4.py"))

# Open the HTML file to read and parse it with Beautiful Soup;
# the with block closes the file afterwards
with open(os.path.join(base, "gfg.html")) as html:
    soup = bs(html, "html.parser")

# Find all the <p> elements inside the first <div>
# that have the id "vinayak"
items = soup.div.find_all("p", {"id": "vinayak"})

# Remove the attributes from each tag
for item in items:
    item.attrs = {}

# Print the tag containing Prince
print(items[1])

# Print the tag containing Queen
print(items[2])

Output:

<p>Prince</p>

<p>Queen</p>
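The program searches from soup.div, so the fourth paragraph (Princess), which sits outside the <div>, is not part of the result. The sketch below illustrates the difference in scope and also shows get_text() as a way to read only the text without clearing the attributes. It assumes the same markup as above, inlined as a string for convenience, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Same markup as the article's gfg.html, inlined here for illustration.
html = """
<html><body>
<div>
  <p id="vinayak">King</p>
  <p id="vinayak">Prince</p>
  <p id="vinayak">Queen</p>
</div>
<p id="vinayak">Princess</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching from the <div> only sees its three descendants.
inside_div = soup.div.find_all("p", {"id": "vinayak"})

# Searching from the root also picks up the <p> outside the <div>.
everywhere = soup.find_all("p", {"id": "vinayak"})

# get_text() returns the text alone and leaves the tags untouched.
names = [tag.get_text() for tag in everywhere]

print(len(inside_div), len(everywhere))  # 3 4
print(names)  # ['King', 'Prince', 'Queen', 'Princess']
```

Choosing the starting point of the search is therefore a simple way to narrow down which of the duplicated tags are returned.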
