Friday, December 27, 2024

How to Remove tags using BeautifulSoup in Python?

Prerequisite: BeautifulSoup module

In this article, we will write a Python script that removes a tag from the parse tree and then completely destroys the tag and its contents. This is done with the decompose() method, which is built into the BeautifulSoup module.

Syntax:

Tag.decompose()

Tag.decompose() removes a tag from the parse tree of the given HTML document, then completely destroys the tag and its contents. The method modifies the tree in place and returns None.
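The description above can be sketched with a small nested document. This is only an illustration: the markup and the variable names (html, result, remaining_paragraphs) are assumptions made for the sketch and do not come from the examples in this article.

```python
from bs4 import BeautifulSoup

# Sample markup invented for this sketch
html = '<div><p>Keep me</p><section><h2>Title</h2><p>Drop me</p></section></div>'
soup = BeautifulSoup(html, 'html.parser')

# decompose() removes <section> together with everything nested inside it
result = soup.section.decompose()

# Collect the text of the paragraphs that are still in the tree
remaining_paragraphs = [p.get_text() for p in soup.find_all('p')]

print(result)                # None
print(remaining_paragraphs)  # ['Keep me']
print(soup)                  # <div><p>Keep me</p></div>
```

Both the heading and the paragraph inside the section disappear along with it, while the sibling paragraph is left untouched.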

Implementation:

Example 1:

Python3




# import module
from bs4 import BeautifulSoup

# HTML markup to parse
markup = '<a href="https://www.geeksforgeeks.org/">Welcome to <i>neveropen.com</i></a>'

# parse the markup
soup = BeautifulSoup(markup, 'html.parser')

# display the tag before decomposing
print("Before Decompose")
print(soup.a)

# decompose() removes the <a> tag in place
# and returns None, so new_tag is None
new_tag = soup.a.decompose()
print("After decomposing:")
print(new_tag)


Output:

Before Decompose

<a href="https://www.geeksforgeeks.org/">Welcome to <i>neveropen.com</i></a>

After decomposing:

None
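The None printed above is the return value of decompose(), not the state of the document. To see the effect on the tree, it may help to print the soup itself. The sketch below reuses the markup from Example 1 but decomposes only the nested <i> tag; the variable name remaining is an assumption made for illustration.

```python
from bs4 import BeautifulSoup

markup = '<a href="https://www.geeksforgeeks.org/">Welcome to <i>neveropen.com</i></a>'
soup = BeautifulSoup(markup, 'html.parser')

# Remove only the nested <i> tag; the enclosing <a> stays in the tree
soup.i.decompose()

# Serialize what is left of the document
remaining = str(soup)
print(remaining)  # <a href="https://www.geeksforgeeks.org/">Welcome to </a>
```

The link and its leading text survive, and a later lookup such as soup.i returns None because the tag is no longer part of the tree.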
 

Example 2: Scraping the HTML document at a given URL and decomposing it.

Python3




# import modules
from bs4 import BeautifulSoup
import requests

# URL of the page to scrape
# (placeholder; substitute your own)
url = "https://www.example.com/"

# fetch the page and parse its HTML
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')

# display the document before decomposing
print("Before Decomposing")
print(soup)

# decompose the entire soup; decompose()
# returns None, so result is None
result = soup.decompose()
print("After decomposing:")
print(result)


Output:

Before Decomposing

<!DOCTYPE html>

<!--[if IE 7]>

<html class="ie ie7" lang="en-US" prefix="og: http://ogp.me/ns#">

<![endif]-->

<!--[if IE 8]>

<html class="ie ie8" lang="en-US" prefix="og: http://ogp.me/ns#">

<![endif]-->

<!--[if !(IE 7) | !(IE 8)  ]><!-->

<html lang="en-US" prefix="og: http://ogp.me/ns#">

<!--<![endif]-->

<head>

<meta charset="utf-8"/>..

……

After decomposing:

None
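In practice, a scraper usually removes selected tags rather than the whole document. The following is a minimal sketch of that pattern, assuming the goal is to strip every script and style tag; it uses an inline sample document (an assumption, so that no network access is needed) and the illustrative variable name cleaned.

```python
from bs4 import BeautifulSoup

# Inline sample document used instead of a live URL (an assumption)
html = """<html><head><style>p {color: red;}</style></head>
<body><p>Visible text</p><script>alert('hi');</script></body></html>"""
soup = BeautifulSoup(html, 'html.parser')

# find_all() returns a list of matches collected up front,
# so each match can be decomposed inside the loop
for tag in soup.find_all(['script', 'style']):
    tag.decompose()

# Serialize the cleaned document
cleaned = str(soup)
print(cleaned)
```

After the loop, the paragraph is still present and neither a script nor a style tag can be found in the tree.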
 
