Many web pages are long and deeply nested, which makes it hard to locate a specific piece of content by hand. To make searching, modifying, and iterating over such pages easier, Python offers several third-party libraries, such as Requests, lxml, Beautiful Soup, Selenium, and Scrapy. Among these, Beautiful Soup is a popular choice for parsing HTML and navigating the resulting tree. A common task is finding all the children of an element with Beautiful Soup. In this article, we will walk through the procedure for finding the children of an element.
Syntax:
unordered_list = soup.find("#Widget Name", {"id": "#Id name of element of which you want to find children"})
children = unordered_list.findChildren()
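The syntax above can be tried without any file on disk. The following is a minimal sketch, assuming the markup is supplied as an inline string (the variable name html_doc is an illustration, not part of the article's files). It uses find_all(), for which findChildren() is an older alias in Beautiful Soup 4.

```python
from bs4 import BeautifulSoup

# Inline markup used only for this illustration
html_doc = """
<ul id="list">Fruits
  <li>Apple</li>
  <li>Banana</li>
  <li>Mango</li>
</ul>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Locate the element by tag name and id
unordered_list = soup.find("ul", {"id": "list"})

# find_all() with no arguments returns every tag nested inside the element
children = unordered_list.find_all()

for child in children:
    print(child)
```

Parsing from a string like this is handy for quick experiments before pointing the same calls at a real HTML file.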
Consider the following HTML file:
HTML
<!DOCTYPE html>
<html>
<head>
    My First Heading
</head>
<body>
    <p id="para">Vinayak Rai</p>
    <ul id="list">Fruits
        <li>Apple</li>
        <li>Banana</li>
        <li>Mango</li>
    </ul>
</body>
</html>
Stepwise implementation:
Step 1: First, import the libraries Beautiful Soup and os.
Python3
from bs4 import BeautifulSoup as bs
import os
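Step 2: Now, get the directory that contains your Python file. Pass the name of your Python file to abspath, then use dirname to remove the last segment of the path. Note that abspath resolves a relative name against the current working directory, so run the script from its own folder.
Python3
base = os.path.dirname(os.path.abspath('#Name of your Python file'))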
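Because abspath resolves a relative file name against the current working directory, the result only points at the script's folder when the script is run from there. Below is a minimal sketch of one way to wrap the same two calls, assuming you pass in the script's own path (for example __file__). The helper name path_next_to_script is hypothetical and not part of any library.

```python
import os


def path_next_to_script(script_path, file_name):
    """Return the absolute path of file_name in the same folder as script_path."""
    base = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(base, file_name)


# Inside a real script you would typically pass __file__ instead of 'run.py'.
# 'run.py' and 'gfg.html' are the example names used in this article.
html_path = path_next_to_script("run.py", "gfg.html")
print(html_path)
```

Passing __file__ makes the lookup independent of the folder from which the script is launched.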
Step 3: Then, open the HTML file you want to parse.
Python3
html = open(os.path.join(base, '#Name of HTML file'))
Step 4: Parse the HTML with Beautiful Soup.
Python3
soup = bs(html, 'html.parser')
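Steps 3 and 4 leave the file handle open. The sketch below shows one possible variation that closes it automatically with a with block. It assumes a throwaway HTML file written to a temporary directory, so that the example is self-contained; in practice you would open your own file instead.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Sample markup written to a temporary file purely for demonstration
sample_html = '<ul id="list"><li>Apple</li></ul>'

with tempfile.TemporaryDirectory() as tmp_dir:
    html_path = os.path.join(tmp_dir, "gfg.html")
    with open(html_path, "w", encoding="utf-8") as out_file:
        out_file.write(sample_html)

    # The file is read while the soup is built and closed when the block ends
    with open(html_path, encoding="utf-8") as html_file:
        soup = BeautifulSoup(html_file, "html.parser")

print(soup.find("ul", {"id": "list"}))
```

The parsed soup object remains usable after the file has been closed, because Beautiful Soup reads the whole document while constructing the tree.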
Step 5: Further, locate the element whose children you want to find.
Python3
unordered_list = soup.find("#Widget Name", {"id": "#Id name of element of which you want to find children"})
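If no element matches, find() returns None, and calling a method on that result raises an AttributeError. The following is a minimal sketch of a guard around the lookup. The function name find_children_by_id is hypothetical and used only for illustration.

```python
from bs4 import BeautifulSoup


def find_children_by_id(soup, tag_name, element_id):
    """Return the tags nested inside the matching element, or [] if none matches."""
    element = soup.find(tag_name, {"id": element_id})
    if element is None:
        # Nothing matched, so there are no children to report
        return []
    return element.find_all()


soup = BeautifulSoup(
    '<ul id="list"><li>Apple</li><li>Banana</li><li>Mango</li></ul>',
    "html.parser",
)

print(find_children_by_id(soup, "ul", "list"))
print(find_children_by_id(soup, "ul", "missing"))
```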
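Step 6: Next, find all the children of the element. findChildren() is an older alias of find_all(), so it returns every tag nested inside the element, not only its direct children.
Python3
children = unordered_list.findChildren()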
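When the element contains nested tags, the default search descends into them as well. The sketch below, which assumes a small inline snippet with a nested b tag, contrasts the default behaviour with recursive=False, which limits the result to direct children.

```python
from bs4 import BeautifulSoup

# Inline markup with one nested tag, used only for this illustration
html_doc = '<ul id="list"><li>Apple <b>red</b></li><li>Banana</li></ul>'
soup = BeautifulSoup(html_doc, "html.parser")
unordered_list = soup.find("ul", {"id": "list"})

# Default: every descendant tag, in document order
all_descendants = unordered_list.find_all()

# recursive=False: only tags that are direct children of the element
direct_children = unordered_list.find_all(recursive=False)

print([tag.name for tag in all_descendants])
print([tag.name for tag in direct_children])
```

For the fruit list used in this article the two calls give the same result, since each li contains only text.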
Step 7: Finally, print all the children found in the previous step.
Python3
for child in children:
    print(child)
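Printing a child shows its full markup. If only the tag name or the text is needed, each child exposes them directly. A minimal sketch, assuming the same fruit list supplied as an inline string:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    '<ul id="list">Fruits<li>Apple</li><li>Banana</li><li>Mango</li></ul>',
    "html.parser",
)
children = soup.find("ul", {"id": "list"}).find_all()

# Collect the tag name and the stripped text of each child
names = [child.name for child in children]
texts = [child.get_text(strip=True) for child in children]

for name, text in zip(names, texts):
    print(name, text)
```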
Below is the full implementation:
Python
# Python program to find all the children
# of an element using Beautiful Soup

# Import the libraries BeautifulSoup and os
from bs4 import BeautifulSoup as bs
import os

# Get the directory that contains the Python file
# (use the same name in abspath as your Python file)
base = os.path.dirname(os.path.abspath('run.py'))

# Open the HTML file you want to parse
html = open(os.path.join(base, 'gfg.html'))

# Parse the HTML file with Beautiful Soup
soup = bs(html, 'html.parser')

# Locate the element whose children you want to find
unordered_list = soup.find("ul", {"id": "list"})

# Find the children of the element
children = unordered_list.findChildren()

# Print all the children of the element
for child in children:
    print(child)
Output:
<li>Apple</li>
<li>Banana</li>
<li>Mango</li>