Saturday, November 16, 2024

Find the text of the given tag using BeautifulSoup

Web scraping is the process of using software bots, called web scrapers, to extract information from the HTML or XML content of a web page. Beautiful Soup is a Python library for scraping data. It works with a parser to provide ways of iterating over, searching, and modifying the parse tree that the parser builds. This makes it fairly easy to go through a web page and find the text of a given tag.

In this article, we will discuss how to find the text of a given tag.
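
Before fetching a live page, the core idea can be tried offline. The following is a minimal sketch that assumes a small, made-up HTML string (the snippet and the variable names are illustrative, not from the original article):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration
html = (
    "<html><head><title>Sample page</title></head>"
    "<body><h1>Hello, world</h1></body></html>"
)

# Build the parse tree with Python's built-in HTML parser
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag; .text is the text inside it
heading_text = soup.find("h1").text
print(heading_text)  # Hello, world
```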

Step-by-step Approach: 

  • First, import the required libraries.

Python3




from bs4 import BeautifulSoup
import requests


  • Now assign the URL.

Python3




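# assign URL
# (placeholder: the original URL is missing here,
# so substitute the page you want to scrape)
url = "https://example.com/"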


  • Fetch the raw HTML content from the URL.

Python3




html_content = requests.get(url).text
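
In practice, a request can hang or return an error page. One possible way to guard against that is sketched below. The helper name `fetch_html`, the 10-second timeout, and the injectable `get` parameter are assumptions made for this example, not part of the original steps.

```python
import requests


def fetch_html(url, timeout=10, get=requests.get):
    """Return the body of `url` as text, raising on HTTP errors.

    `get` defaults to requests.get; it is a parameter only so the
    function can be exercised without network access.
    """
    response = get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx status codes
    response.raise_for_status()
    return response.text
```

With this helper, the step above could be written as `html_content = fetch_html(url)`.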


  • Now parse the content.

Python3




# Now that the content is ready,
# parse it with BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
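
The object returned by `BeautifulSoup(...)` is the root of the parse tree. As a rough illustration of what that tree looks like, the sketch below parses a made-up HTML string (an assumption for this example) and walks a few of its nodes:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration
html = (
    "<html><head><title>Sample page</title></head>"
    "<body><p>First</p><p>Second</p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# Attribute access returns the first tag with that name
title_tag = soup.title

# Every tag knows its parent in the tree
parent_name = title_tag.parent.name

# .children iterates over the direct children of a tag
child_names = [child.name for child in soup.body.children]

print(title_tag.name, parent_name, child_names)
```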


  • After the content is parsed, search for a specific tag and print its text.

Python3




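# .text returns only the text inside the tag;
# printing the tag itself would include the markup
print(soup.find('title').text)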
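
It helps to see the difference between a tag and its text, and what happens when a tag is not found. The sketch below uses a made-up HTML string (an assumption for this example); `find()` returns `None` when nothing matches, so calling `.text` on the result without a check would raise an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration
html = "<html><head><title>  Sample page </title></head><body></body></html>"
soup = BeautifulSoup(html, "html.parser")

title_tag = soup.find("title")
print(title_tag)                       # the whole tag, markup included
print(title_tag.text)                  # only the text, whitespace kept
print(title_tag.get_text(strip=True))  # the text with whitespace trimmed

# There is no <h1> in this snippet, so find() returns None
missing = soup.find("h1")
heading = missing.get_text(strip=True) if missing is not None else ""
```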


Below is the complete program.

Python3




from bs4 import BeautifulSoup
import requests
 
 
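# Assign URL
# (placeholder: the original URL is missing here,
# so substitute the page you want to scrape)
url = "https://example.com/"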
 
# Fetch raw HTML content
html_content = requests.get(url).text
 
# Now that the content is ready,
# parse it with BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
 
# Find the first <title> tag and print its text
print(soup.find('title').text)


Output: the text of the page's <title> tag (the exact value depends on the URL used).

Similarly, to get the text of every occurrence of a given tag, use find_all():

Python3




from bs4 import BeautifulSoup
import requests
 
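# Assign URL
# (placeholder: the original URL is missing here,
# so substitute the page you want to scrape)
url = "https://example.com/"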
 
# Fetch raw HTML content
html_content = requests.get(url).text
 
# Now that the content is ready,
# parse it with BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
 
# Find every <p> tag and print the text of each one
texts = soup.find_all('p')
for text in texts:
    print(text.get_text())


Output: the text of each <p> tag on the page, one per line (the exact values depend on the URL used).
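
When the texts are needed as data rather than printed output, they can be collected into a list. The following is a sketch under the assumption of a small, made-up HTML string. It also trims whitespace and drops empty paragraphs, which is a design choice for this example and not something the original program does.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration
html = """
<body>
  <p>First paragraph.</p>
  <p>   </p>
  <p>Second <b>bold</b> paragraph.</p>
</body>
"""
soup = BeautifulSoup(html, "html.parser")

# get_text(" ", strip=True) trims each piece of text and joins
# the pieces (including text in nested tags) with a single space
paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]

# Drop paragraphs that contained only whitespace
paragraphs = [text for text in paragraphs if text]

print(paragraphs)
```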
