Wednesday, December 4, 2024

Difference between find and find_all in BeautifulSoup – Python

BeautifulSoup is one of the most widely used Python libraries for navigating, searching, and extracting data from HTML and XML documents. Its two most common search methods are find() and find_all(). The key difference is that find() returns only the first match, while find_all() returns every match. Let's look at each in detail.

find() method

The find() method searches the document for the first tag that matches the given tag name and attributes (such as an id) and returns it as an object of type bs4.element.Tag. If nothing matches, it returns None.

Syntax: find_syntax = soup.find("tag_name", {"id": "id_value"}).get_text()
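Because find() returns None when nothing matches, chaining .get_text() directly onto it can raise an AttributeError. The following is a minimal sketch of guarding against that, assuming an inline HTML string; the helper text_or_default and the id "missing" are hypothetical names introduced only for illustration.

```python
from bs4 import BeautifulSoup

# Inline HTML that mirrors the paragraph tags used in this article
# (an assumption for illustration; the article reads gfg.html instead)
html = '<div><p id="vinayak">King</p><p id="vinayak1">Prince</p></div>'
soup = BeautifulSoup(html, "html.parser")

first = soup.find("p", {"id": "vinayak"})    # a bs4.element.Tag
missing = soup.find("p", {"id": "missing"})  # no such id, so None


def text_or_default(tag, default=""):
    """Hypothetical helper: return the tag's text, or a default if find() found nothing."""
    return tag.get_text() if tag is not None else default


print(text_or_default(first))
print(text_or_default(missing, "not found"))
```

Checking for None before calling .get_text() keeps a scraper from crashing when a page is missing the expected element.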

Example:

For instance, consider this simple HTML webpage containing several paragraph tags.

HTML




<!DOCTYPE html>
<html>
    
 <head>
   <title>Geeks For Geeks</title>
 </head>
    
 <body>
 <div>
     <p id="vinayak">King</p>
  
     <p id="vinayak1">Prince</p>
  
     <p id="vinayak2">Queen</p>
  
 </div>
 <p id="vinayak3">Princess</p>
  
  </body>
  
</html>


To obtain the text King, we use the find() method.

Python




# find() example

# Import BeautifulSoup and os
from bs4 import BeautifulSoup as bs
import os

# Directory that contains this script
base = os.path.dirname(os.path.abspath(__file__))

# Open the HTML file to search and parse it
# with Beautiful Soup; the with-block closes
# the file afterwards
with open(os.path.join(base, 'gfg.html')) as html:
    soup = bs(html, 'html.parser')

# Find the first <p> tag whose id is "vinayak"
# and extract its text
find_example = soup.find("p", {"id": "vinayak"}).get_text()

# Print the text obtained in the previous step
print(find_example)


Output:

King

find_all() method

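The find_all() method searches the document for all tags that match the given tag name and attributes (such as an id) and returns them as a bs4.element.ResultSet, a list-like collection of bs4.element.Tag objects. If nothing matches, the ResultSet is empty.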

Syntax:

for word in soup.find_all("tag_name"):

     find_all_syntax=word.get_text()

     print(find_all_syntax)
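Since the result of find_all() behaves like a list, it can be indexed, measured with len(), and narrowed with an attribute filter. The sketch below assumes an inline HTML string that mirrors the article's example page; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Inline copy of the example markup (an assumption for illustration)
html = """
<div>
  <p id="vinayak">King</p>
  <p id="vinayak1">Prince</p>
  <p id="vinayak2">Queen</p>
</div>
<p id="vinayak3">Princess</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Every <p> tag in document order
paragraphs = soup.find_all("p")
texts = [p.get_text() for p in paragraphs]

# List-style access to individual matches
second = paragraphs[1].get_text()
last = paragraphs[-1].get_text()

# Filter by attribute as well as tag name
by_id = soup.find_all("p", {"id": "vinayak2"})

# No match gives an empty result rather than None
no_match = soup.find_all("span")

print(texts, second, last, len(by_id), len(no_match))
```

Unlike find(), an unsuccessful find_all() is safe to loop over, because the loop body simply never runs.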

Example:

For instance, consider this simple HTML webpage containing several paragraph tags.

HTML




<!DOCTYPE html>
<html>
    
 <head>
   <title>Geeks For Geeks</title>
 </head>
    
 <body>
 <div>
     <p id="vinayak">King</p>
  
     <p id="vinayak1">Prince</p>
  
     <p id="vinayak2">Queen</p>
  
 </div>
 <p id="vinayak3">Princess</p>
  
  </body>
    
</html>


To obtain all the text, i.e., King, Prince, Queen, and Princess, we use the find_all() method.

Python




# find_all() example

# Import BeautifulSoup and os
from bs4 import BeautifulSoup as bs
import os

# Directory that contains this script
base = os.path.dirname(os.path.abspath(__file__))

# Open the HTML file to search and parse it
# with Beautiful Soup; the with-block closes
# the file afterwards
with open(os.path.join(base, 'gfg.html')) as html:
    soup = bs(html, 'html.parser')

# Loop over every <p> tag in the document
for word in soup.find_all('p'):

    # Extract the text from the current tag
    find_all_example = word.get_text()

    # Print the text obtained in the previous step
    print(find_all_example)


Output:

King
Prince
Queen
Princess

Table of differences between find and find_all

| S.No. | find | find_all |
|---|---|---|
| 1 | find stops at the first element that matches the search and returns it. | find_all scans the entire document and returns all the matches. |
| 2 | It returns only the first tag in the parsed HTML that satisfies the condition. | It returns every tag in the parsed HTML that satisfies the condition. |
| 3 | The return type of find is <class 'bs4.element.Tag'>, or None if nothing matches. | The return type of find_all is <class 'bs4.element.ResultSet'>, which is empty if nothing matches. |
| 4 | Only the first match can be printed as output. | Any match (second, third, last, etc.) or all the matches can be printed as output. |
| 5 | Prototype: find(name, attrs, recursive, string, **kwargs) | Prototype: find_all(name, attrs, recursive, string, limit, **kwargs). findAll is the older alias, and string was formerly named text. |
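The limit and recursive parameters from the prototypes, and the two return types, can be seen in a short sketch. It assumes an inline HTML fragment that mirrors the article's example page and parses it with html.parser; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Inline fragment mirroring the example page (an assumption for illustration)
html = """
<div>
  <p id="vinayak">King</p>
  <p id="vinayak1">Prince</p>
  <p id="vinayak2">Queen</p>
</div>
<p id="vinayak3">Princess</p>
"""
soup = BeautifulSoup(html, "html.parser")

# find() gives a single Tag
first = soup.find("p")

# limit caps the number of matches find_all() collects
first_two = soup.find_all("p", limit=2)

# recursive=False only inspects direct children of the object searched;
# here that is the top level of the fragment, so the <p> tags inside
# <div> are skipped
top_level = soup.find_all("p", recursive=False)

print(type(first).__name__, type(first_two).__name__)
print([p.get_text() for p in first_two])
print([p.get_text() for p in top_level])
```

In this sketch, find("p") and find_all("p", limit=1)[0] refer to the same first match, which is one way to think about how the two methods relate.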
