How to find an HTML tag that contains certain text using BeautifulSoup?

27 July 2024

In this article, we are going to see how to find an HTML tag that contains certain text using BeautifulSoup.

Methods used:

open(filename, mode): Opens the file named filename in the given mode (for example, "r" for reading) and returns a file object.

find_all(): Returns every tag in the parsed document that matches the given filters, such as a tag name and the text the tag must contain.
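As a minimal sketch of how find_all behaves, the snippet below parses a short inline HTML string instead of a file. The sample markup and the variable names are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML used in place of a file on disk
html = "<p>Hello</p><p>World</p><a href='#'>Hello</a>"
soup = BeautifulSoup(html, "html.parser")

# Filter by tag name only: every <p> tag is returned
paragraphs = soup.find_all("p")

# Filter by tag name and exact text: only <p> tags whose text is "Hello"
hello_paragraphs = soup.find_all("p", string="Hello")

print(paragraphs)
print(hello_paragraphs)
```

The string argument requires an exact match when it is given a plain Python string, so the anchor tag containing "Hello" is not returned by the second call because its tag name differs.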

The code below searches several different tags for a given text, referred to as the pattern in the program, and returns every tag whose text matches that pattern.

Approach:

We first import the re (regular expressions) and BeautifulSoup libraries. The re module is only needed when the text is given as a regular expression; an exact-match search does not use it. We then open the HTML file we want to parse with the open function and pass its contents to BeautifulSoup. Next, we call find_all with the name of the tag to look for and the text that tag must contain. Every tag of that name whose text matches is added to the returned list.

All matching tags are collected in a list, which is then printed. An empty list means that no tag of that name contains the text we searched for.
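The approach above, including the empty-list case, can be sketched as follows. The helper name tags_with_text and the inline HTML are hypothetical and are not part of the original program.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the parsed file
html = "<h1>Hello</h1><h1>Python Program</h1>"
soup = BeautifulSoup(html, "html.parser")


def tags_with_text(soup, tag_name, text):
    """Return a plain list of tag_name tags whose text is exactly text."""
    return list(soup.find_all(tag_name, string=text))


found = tags_with_text(soup, "h1", "Python Program")
missing = tags_with_text(soup, "h1", "Java Program")

print(found)

# An empty list is falsy, so it can be checked directly
if not missing:
    print("No <h1> tag contains 'Java Program'")
```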

Below is the HTML file used for demonstration, saved as gfg.html:

HTML

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>GFG </title>
</head>
<body>
    <a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
    <a href="Dummy Check Text">Geeks For Geeks</a>
    <a href="Dummywebsite.com">Dummy Text</a>
 
    <h1>Hello</h1>
    <h1>Python Program</h1>
 
   <span class = true>Geeks For Geeks</span>
   <span class = false>Geeks For Geeks</span>
 
   <li class = 1 >Python Program</li>
   <li class = 2 >Python Code</li>
 
   <table>
       <tr>GFG Website</tr>
   </table>
 
</body>
</html>

Below is the implementation:

Python3

# Python program to find an HTML tag
# that contains certain text using BeautifulSoup

# Importing libraries
from bs4 import BeautifulSoup
import re  # only needed for regular-expression patterns; unused by the exact matches below

# Opening and reading the HTML file; the with block closes it automatically
with open("gfg.html", "r", encoding="utf-8") as file:
    contents = file.read()

soup = BeautifulSoup(contents, 'html.parser')

# Text to search for (exact match)
pattern = 'Geeks For Geeks'

# Anchor tag
# string= is the current name of the older text= argument (Beautiful Soup 4.4.0+)
text1 = soup.find_all('a', string=pattern)
print(text1)

# Span tag
text2 = soup.find_all('span', string=pattern)
print(text2)

# Text to search for (exact match)
pattern2 = 'Python Program'

# Heading tag
text3 = soup.find_all('h1', string=pattern2)
print(text3)

# List tag
text4 = soup.find_all('li', string=pattern2)
print(text4)

# Text to search for (exact match)
pattern3 = 'GFG Website'

# Table (row) tag
text5 = soup.find_all('tr', string=pattern3)
print(text5)

Output:

[<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>, <a href="Dummy Check Text">Geeks For Geeks</a>]

[<span class="true">Geeks For Geeks</span>, <span class="false">Geeks For Geeks</span>]

[<h1>Python Program</h1>]

[<li class="1">Python Program</li>]

[<tr>GFG Website</tr>]
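The searches above match the full text of a tag exactly. As a hedged sketch of partial matching, a compiled regular expression from the re module can be passed instead of a plain string, and a function can be used when the text is spread across nested tags. The inline HTML and variable names below are assumptions made for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical inline HTML based on the demonstration file
html = """
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<a href="Dummywebsite.com">Dummy Text</a>
<li>Python Program</li>
<li>Python Code</li>
"""
soup = BeautifulSoup(html, "html.parser")

# A compiled pattern matches any tag whose text contains "Python"
python_items = soup.find_all("li", string=re.compile("Python"))

# Flags such as re.IGNORECASE work as usual
geeks_links = soup.find_all("a", string=re.compile("geeks", re.IGNORECASE))

# When a tag has several children, its .string is None, so string= finds nothing
nested = BeautifulSoup("<p>Read the <b>Python</b> docs</p>", "html.parser")
by_string = nested.find_all("p", string=re.compile("Python"))

# A function filter can test the combined text of the tag instead
by_function = nested.find_all(
    lambda tag: tag.name == "p" and "Python" in tag.get_text()
)

print(python_items)
print(geeks_links)
print(by_string)
print(by_function)
```

The function filter is the more general option because get_text() joins the text of all descendants, whereas string= only looks at tags that have a single text child.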
