Monday, November 18, 2024

BeautifulSoup – Scraping List from HTML

Prerequisite: 

Python can be used to scrape information from a web page. It can also retrieve the data inside a specific tag. This article shows how list elements can be scraped from HTML.

Modules Needed:

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in with Python. To install it, run the command below in the terminal.
pip install bs4
  • requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built-in with Python. To install it, run the command below in the terminal.
pip install requests
  • lxml: The examples below pass "lxml" to BeautifulSoup as the parser. It is a separate package and must be installed before that parser can be used.
pip install lxml

Approach:

  • Import the modules
  • Get the HTML code using the requests module
  • Find all list tags using the find_all() method
  • Iterate through the list tags and get the text of each using the text property

Example 1: Scraping List from HTML Code

Python3

# Import Required Modules
from bs4 import BeautifulSoup
import requests
 
# HTML Code
html_content = """
<ul>
  <li>Coffee</li>
  <li>Tea</li>
  <li>Milk</li>
</ul>
"""
 
# Parse the html content
soup = BeautifulSoup(html_content, "lxml")
 
# Find all li tags
datas = soup.find_all("li")
 
# Iterate through all li tags
for data in datas:
    # Get text from each tag
    print(data.text)
 
print(f"Total {len(datas)} li tag found")


Output:

Coffee
Tea
Milk
Total 3 li tag found
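
When a page contains more than one list, find_all("li") returns the items of all of them together. The following is a minimal sketch of one way to restrict the search to a single list and collect the text into a Python list instead of printing it. The HTML, the ids `drinks` and `snacks`, and the helper name `list_items` are assumptions made for illustration and are not part of the example above. It also swaps in Python's built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with two lists; the ids are assumptions for illustration
html_content = """
<ul id="drinks">
  <li> Coffee </li>
  <li>Tea</li>
</ul>
<ol id="snacks">
  <li>Biscuit</li>
</ol>
"""


def list_items(html, list_id):
    """Return the stripped text of each li tag inside the list with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns the first tag whose id matches, or None if there is none
    container = soup.find(id=list_id)
    if container is None:
        return []
    # get_text(strip=True) removes leading and trailing whitespace
    return [li.get_text(strip=True) for li in container.find_all("li")]


print(list_items(html_content, "drinks"))  # ['Coffee', 'Tea']
print(list_items(html_content, "snacks"))  # ['Biscuit']
```

Returning a list rather than printing makes the result easier to reuse, for example to write the items to a file or to count them.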

Example 2: Scraping List from Web URL

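Python3

# Import Required Modules
from bs4 import BeautifulSoup
import requests
 
# Web URL (placeholder, an assumption: replace it with the page you want to scrape)
url = "https://www.example.com/"
 
# Get the HTML code of the page
response = requests.get(url)
html_content = response.text
 
# Parse the html content
soup = BeautifulSoup(html_content, "lxml")
 
# Find all li tags
datas = soup.find_all("li")
 
# Iterate through all li tags
for data in datas:
    # Get text from each tag
    print(data.text)
 
print(f"Total {len(datas)} li tag found")


Output:

The output depends on the page being scraped: the text of each li tag is printed, followed by the total number of li tags found.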
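
When fetching a live page, the request can fail or time out, so it can help to check the response before parsing it. The sketch below is one possible arrangement, not the only one: the function names `fetch_html`, `scrape_list_items`, and `scrape_url` and the 10-second timeout are assumptions chosen for illustration. It uses the built-in "html.parser" instead of lxml, and the demonstration at the end parses an inline string so that it runs without network access.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return response.text


def scrape_list_items(html):
    """Return the stripped text of every li tag in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [li.get_text(strip=True) for li in soup.find_all("li")]


def scrape_url(url):
    """Fetch a page and return its list items, or an empty list if the request fails."""
    try:
        html = fetch_html(url)
    except requests.RequestException as error:
        # RequestException is the base class for connection, timeout and HTTP errors
        print(f"Could not fetch {url}: {error}")
        return []
    return scrape_list_items(html)


# Parsing step demonstrated on inline HTML, so no network access is needed
sample = "<ul><li>Coffee</li><li>Tea</li></ul>"
print(scrape_list_items(sample))  # ['Coffee', 'Tea']
```

Calling `scrape_url` with a real address would return the text of every li tag on that page, or an empty list if the request fails. Keeping the download and the parsing in separate functions lets the parsing be checked on a fixed string, as done here.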
