Python can be used to scrape information from a web page, including the data contained within a specific tag. This article shows how list elements can be scraped from HTML.
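Modules Needed:
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in with Python. The examples in this article pass "lxml" as the parser, which is also a separate package (pip install lxml); Python's built-in "html.parser" can be used instead. To install bs4, run the command below in the terminal.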
pip install bs4
- requests: Requests lets you send HTTP/1.1 requests with very little code. This module also does not come built-in with Python. To install it, run the command below in the terminal.
pip install requests
Approach:
- Import the modules
- Get the HTML code using the requests module
- Find all list item (li) tags using the find_all() method
- Iterate through the li tags and get the text of each one using the text property
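The steps above can be sketched as a small helper. This is only an illustration: the function name extract_list_items is hypothetical, and the built-in "html.parser" backend is an assumption chosen so that no extra parser package is required.

```python
from bs4 import BeautifulSoup


def extract_list_items(html):
    """Return the stripped text of every <li> tag in an HTML string."""
    # "html.parser" ships with Python; backends such as "lxml" need a separate install
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns every matching tag in document order
    return [li.get_text(strip=True) for li in soup.find_all("li")]


sample = "<ul><li>Coffee</li><li>Tea</li><li>Milk</li></ul>"
items = extract_list_items(sample)
print(items)
```

Returning a list instead of printing inside the loop keeps the parsing step reusable, whether the HTML comes from a string or from a downloaded page.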
Example 1: Scraping List from HTML Code
Python3
# Import Required Modules
from bs4 import BeautifulSoup
import requests

# HTML Code
html_content = """
<ul>
    <li>Coffee</li>
    <li>Tea</li>
    <li>Milk</li>
</ul>
"""

# Parse the html content
soup = BeautifulSoup(html_content, "lxml")

# Find all li tag
datas = soup.find_all("li")

# Iterate through all li tags
for data in datas:
    # Get text from each tag
    print(data.text)

print(f"Total {len(datas)} li tag found")
Output:
Coffee
Tea
Milk
Total 3 li tag found
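When a page contains more than one list, calling find_all("li") on the whole document mixes their items together. A possible variation, sketched below, groups the items by the list they belong to. The helper name items_by_list, the sample markup, and the id attributes are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with two separate lists
html_content = """
<ul id="drinks"><li>Coffee</li><li>Tea</li></ul>
<ol id="steps"><li>Boil water</li><li>Pour</li></ol>
"""


def items_by_list(html):
    """Map each list's id attribute to the text of its direct <li> children."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    # Passing a list of names matches both unordered and ordered lists
    for list_tag in soup.find_all(["ul", "ol"]):
        # recursive=False skips <li> tags that belong to nested lists
        direct_items = list_tag.find_all("li", recursive=False)
        result[list_tag.get("id")] = [li.get_text(strip=True) for li in direct_items]
    return result


grouped = items_by_list(html_content)
print(grouped)
```

Lists without an id attribute would all share the key None in this sketch, so real pages may need a different key, such as the list's position in the document.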
Example 2: Scraping List from Web URL
Python3
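# Import Required Modules
from bs4 import BeautifulSoup
import requests

# Web URL (placeholder, an assumption; replace it with the page you want to scrape)
url = "https://example.com/"

# Get the HTML code of the page
response = requests.get(url, timeout=10)
response.raise_for_status()

# Parse the html content
soup = BeautifulSoup(response.text, "lxml")

# Find all li tag
datas = soup.find_all("li")

# Iterate through all li tags
for data in datas:
    # Get text from each tag
    print(data.text)

print(f"Total {len(datas)} li tag found")
Output:
The script prints the text of each li tag found on the page, followed by the total count, so the result depends on the page that is fetched.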