In this article, we will see how BeautifulSoup can be used to select the nth child of an element. For this, the select() method of the module is used. select() relies on the SoupSieve package to run a CSS selector against the parsed document and returns a list of the matching elements.
Syntax: select("css_selector")
CSS SELECTOR:
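- nth-of-type(n): Selects the element that is the nth sibling of its type under its parent. For example, p:nth-of-type(n) selects the nth paragraph child of the parent.
- nth-child(n): Selects the element that is the nth child of its parent, counting children of every type. For example, p:nth-child(n) selects a paragraph only if it is the nth child of the parent.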
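The difference between the two selectors is easiest to see on a small fragment. The sketch below uses made-up markup (the `box` id and the tag contents are assumptions for illustration only) and compares what each selector returns.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the children of the <div> are h2, p, p, span, p
html = """
<div id="box">
  <h2>Title</h2>
  <p>first</p>
  <p>second</p>
  <span>note</span>
  <p>third</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The 2nd <p> among the <p> siblings
of_type = soup.select("#box > p:nth-of-type(2)")
# A <p> that is the 2nd child overall (every element type is counted)
nth_child = soup.select("#box > p:nth-child(2)")
# The 4th child is a <span>, so no <p> matches
no_match = soup.select("#box > p:nth-child(4)")

print([t.get_text() for t in of_type])    # ['second']
print([t.get_text() for t in nth_child])  # ['first']
print(no_match)                           # []
```

Only element children are counted; the whitespace text nodes between the tags do not affect the position.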
Approach:
- Import module
- Scrape data from a webpage.
- Parse the scraped string as HTML.
- Use the find() function to get the tag with the given class name, id, or tag name.
- Use select("css_selector") to find the nth child.
- Print the child.
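The steps above can be wrapped in a small helper. This is only a sketch: `select_nth` is a hypothetical function name, not part of BeautifulSoup, and the markup is a made-up stand-in for scraped page content.

```python
from bs4 import BeautifulSoup


def select_nth(parent, tag_name, n, of_type=False):
    """Hypothetical helper: build the selector and run it against parent."""
    pseudo = "nth-of-type" if of_type else "nth-child"
    return parent.select(f"{tag_name}:{pseudo}({n})")


# Made-up markup standing in for the scraped page
html = '<ul class="menu"><li>Home</li><li>Articles</li><li>Quizzes</li></ul>'

# Parse the string as HTML
soup = BeautifulSoup(html, "html.parser")

# Get the parent tag by class name
parent = soup.find(class_="menu")

# Find the <li> that is the 2nd child of the parent and print it
second = select_nth(parent, "li", 2)
print(second)  # [<li>Articles</li>]
```

Using an f-string keeps the selector readable compared with concatenating `str(n)` by hand.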
Example 1:
Python3
# importing module
from bs4 import BeautifulSoup

markup = """
<html>
<head>
<title>GEEKS FOR GEEKS EXAMPLE</title>
</head>
<body>
<p class="1"><b>Geeks for Geeks</b></p>
<p class="coding">A Computer Science portal for Lazyroar.
<h1>Heading</h1>
<b class="gfg">Programming Articles</b>,
<b class="gfg">Programming Languages</b>,
<b class="gfg">Quizzes</b>;
</p>
<p class="coding">practice</p>
</body>
</html>
"""

# parsing string to HTML
soup = BeautifulSoup(markup, 'html.parser')

# first tag with class "coding"
parent = soup.find(class_="coding")

# assign n
n = 2

# print the 2nd <b> of parent
print(parent.select("b:nth-of-type(" + str(n) + ")"))
print()

# print the <b> which is the 2nd child of the parent
print(parent.select("b:nth-child(" + str(n) + ")"))
Output:
[<b class="gfg">Programming Languages</b>]

[<b class="gfg">Programming Articles</b>]
Explanation:
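- select("b:nth-of-type(n)") selects the nth <b> child of the parent. With n = 2, this is the second <b>, "Programming Languages".
- select("b:nth-child(n)") selects a <b> that is the nth child of the parent. Because <h1> is the first child here, the 2nd child is the first <b>, "Programming Articles".
- Both calls return an empty list [] if the parent has no matching nth child.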
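Because a missing match yields an empty list rather than an error, it is worth checking the result before indexing into it. The sketch below uses made-up markup (the `box` class is an assumption) and also shows select_one(), which returns None instead of a list when nothing matches.

```python
from bs4 import BeautifulSoup

# Made-up markup: the parent has only two children
html = "<div class='box'><b>one</b><b>two</b></div>"
soup = BeautifulSoup(html, "html.parser")
parent = soup.find(class_="box")

# There is no 5th child, so select() returns an empty list
matches = parent.select("b:nth-child(5)")
print(matches)  # []

if not matches:
    print("parent has no 5th child")

# select_one() returns the first match, or None when there is none
first = parent.select_one("b:nth-child(5)")
print(first)  # None
```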
Example 2:
Python3
# importing modules
from bs4 import BeautifulSoup
import requests

# fetch the website
# (sample_website must hold the URL of the page to scrape)
page = requests.get(sample_website)

# parsing string to HTML
soup = BeautifulSoup(page.content, 'html.parser')

# first tag with class "wrapper"
parent = soup.find(class_="wrapper")

# assign n
n = 1

# print the 1st <b> of parent
print(parent.select("b:nth-of-type(" + str(n) + ")"))
print()

# print the <b> which is the 1st child of the parent
print(parent.select("b:nth-child(" + str(n) + ")"))
Output: