To print all the heading tags on a page with BeautifulSoup, we use the find_all() method. It is one of the most commonly used methods in BeautifulSoup: it searches the descendants of a tag (or of the whole document) and returns every element that matches the given filters.
Syntax: find_all(name, attrs, recursive, string, limit, **kwargs)
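To make the parameters in the syntax line concrete, here is a minimal sketch that runs find_all() against a small inline HTML string instead of a live page. The sample markup and the variable names are assumptions made for illustration, and the built-in "html.parser" is used so that no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = """
<body>
  <h1 id="top">Title</h1>
  <div>
    <h2 class="section">First</h2>
    <h2 class="section">Second</h2>
  </div>
  <h2>Third</h2>
</body>
"""

soup = BeautifulSoup(html, "html.parser")

# name: match every <h2> anywhere below the starting point
all_h2 = soup.find_all("h2")

# limit: stop searching after the first two matches
first_two = soup.find_all("h2", limit=2)

# attrs: keep only tags whose attributes match the given values
sections = soup.find_all("h2", attrs={"class": "section"})

# recursive=False: consider only direct children of <body>
direct = soup.body.find_all("h2", recursive=False)

# string: match on the text inside the tag
by_text = soup.find_all("h2", string="Third")

for label, result in [
    ("all", all_h2),
    ("limit=2", first_two),
    ("attrs", sections),
    ("recursive=False", direct),
    ("string", by_text),
]:
    print(label, [tag.text for tag in result])
```

Each call returns a list-like result of matching tags in document order, so the results can be looped over or indexed like an ordinary list.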
HTML defines six heading tags: h1, h2, h3, h4, h5, and h6. The ones used most often on webpages are h1, h2, and h3. To find all of them in a single call, we pass a list of tag names as the argument to the find_all() method.
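As a small sketch of the list argument, the snippet below searches an inline HTML string for the three common heading tags. It also shows one possible alternative, a compiled regular expression, for the case where all six heading levels are wanted. The sample markup and the pattern are assumptions chosen for illustration.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<h1>A</h1><p>text</p><h3>B</h3><h6>C</h6><header>D</header>"
soup = BeautifulSoup(html, "html.parser")

# A list matches any tag whose name appears in the list.
common = soup.find_all(["h1", "h2", "h3"])

# A compiled regular expression is tested against each tag name.
# Anchoring with ^ and $ keeps the match to exactly h1 through h6.
heading_pattern = re.compile(r"^h[1-6]$")
all_levels = soup.find_all(heading_pattern)

print([tag.name for tag in common])
print([tag.name for tag in all_levels])
```

The list form is easier to read when only a few tags matter. The regular expression is shorter when every heading level should be included.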
Steps:
- import the requests and BeautifulSoup libraries
- store the URL in a variable
- use the requests library to fetch the URL
- create a BeautifulSoup object from the response text
- create a list of heading tags (["h1", "h2", "h3"])
- iterate over all the matching heading tags using the find_all() method
Example:
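    # Python program to print all heading tags
    import requests
    from bs4 import BeautifulSoup

    # scraping a Wikipedia article
    # placeholder: the original does not give the URL, so set url_link yourself
    url_link = "<URL of the Wikipedia article>"
    response = requests.get(url_link)

    # the 'lxml' parser requires the lxml package to be installed
    soup = BeautifulSoup(response.text, 'lxml')

    # creating a list of all common heading tags
    heading_tags = ["h1", "h2", "h3"]

    # find_all() accepts a list and returns the matching tags in document order
    for tag in soup.find_all(heading_tags):
        print(tag.name + ' -> ' + tag.text.strip())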
Output:
    h2 -> Related Articles
    h2 -> Python3
    h2 -> Python3
    h2 -> Python3
    h2 -> Python3
    h2 -> Python3
    h2 -> Python3
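One possible refinement, sketched here under stated assumptions, is to separate parsing from fetching so that the heading extraction can be tried on any HTML string without a network call. The function names extract_headings and fetch_headings, the 10-second timeout, and the sample markup are all hypothetical choices for illustration, not part of the original program.

```python
import requests
from bs4 import BeautifulSoup

# The common heading tags searched for in this article.
HEADING_TAGS = ["h1", "h2", "h3"]


def extract_headings(html):
    """Return (tag name, stripped text) pairs for the common heading tags."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (tag.name, tag.get_text(strip=True))
        for tag in soup.find_all(HEADING_TAGS)
    ]


def fetch_headings(url, timeout=10):
    """Download a page and return its headings (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return extract_headings(response.text)


# Hypothetical sample markup, used only for illustration.
sample = "<h1> Title </h1><p>Intro</p><h2>Section</h2><h4>Skipped</h4>"
for name, text in extract_headings(sample):
    print(name + " -> " + text)
```

With a real page, fetch_headings(url_link) would return the same kind of list for the downloaded document. The h4 in the sample is skipped because it is not in the list of tags passed to find_all().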