Wednesday, September 25, 2024

BeautifulSoup CSS selector – Selecting nth child

In this article, we will see how BeautifulSoup can be used to select the nth child of an element. For this, the select() method is used. select() relies on the SoupSieve package to run a CSS selector against the parsed document and returns a list of all matching elements.

Syntax: select("css_selector")

CSS SELECTOR:

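  • nth-of-type(n): Selects an element that is the nth sibling of its type under its parent (counting from 1). For example, p:nth-of-type(n) matches the nth <p> among the parent's <p> children.
  • nth-child(n): Selects an element that is the nth child of its parent, counting siblings of every type (counting from 1). For example, p:nth-child(n) matches a <p> only if it is the parent's nth child.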
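
To make the difference concrete, here is a minimal sketch on a made-up fragment. The markup, the id "box" and the variable names are illustrative assumptions, not part of the examples in this article.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the <div> has the children <h2>, <p>, <p>, <span>, <p>
html = """
<div id="box">
  <h2>Title</h2>
  <p>first paragraph</p>
  <p>second paragraph</p>
  <span>note</span>
  <p>third paragraph</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# nth-of-type counts only siblings with the same tag name
second_p = soup.select("#box > p:nth-of-type(2)")

# nth-child counts every element sibling, whatever its tag name
second_child_p = soup.select("#box > p:nth-child(2)")

# The 4th child is a <span>, so no <p> can match here
fourth_child_p = soup.select("#box > p:nth-child(4)")

print([t.get_text() for t in second_p])        # ['second paragraph']
print([t.get_text() for t in second_child_p])  # ['first paragraph']
print(fourth_child_p)                          # []
```

The <h2> occupies the first child position, which is why the second child is the first paragraph while the second <p> of its type is the second paragraph.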

Approach:

  1. Import the module.
  2. Scrape data from a webpage.
  3. Parse the scraped string as HTML.
  4. Use the find() function to get the parent tag by class name, id, or tag name.
  5. Use select("css_selector") on that tag to find the nth child.
  6. Print the child.
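
The steps above can be sketched as one small helper. The function name select_nth, its parameters and the sample markup are hypothetical and used only for illustration; an inline string stands in for the scraped page so the sketch runs without network access.

```python
from bs4 import BeautifulSoup


def select_nth(html, parent_class, tag_name, n):
    """Return (nth-of-type matches, nth-child matches) inside the
    first tag that has the class parent_class."""
    soup = BeautifulSoup(html, "html.parser")   # step 3: parse the string
    parent = soup.find(class_=parent_class)     # step 4: locate the parent
    if parent is None:
        # find() returns None when nothing has that class
        return [], []
    # step 5: run both selectors against the parent tag
    of_type = parent.select(f"{tag_name}:nth-of-type({n})")
    child = parent.select(f"{tag_name}:nth-child({n})")
    return of_type, child


# Stand-in for scraped data (step 2)
html = '<ul class="menu"><li>a</li><li>b</li><li>c</li></ul>'

of_type, child = select_nth(html, "menu", "li", 3)
print(of_type)  # step 6: print the result
print(child)
```

Because every child of the list is an <li>, both selectors pick the same element here. They diverge only when siblings of other types are mixed in.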

Example 1:

Python3

# importing module
from bs4 import BeautifulSoup
  
markup = """
<html>
    <head>
        <title>GEEKS FOR GEEKS EXAMPLE</title>
    </head>
    <body>
        <p class="1"><b>Geeks for Geeks</b></p>
  
        <p class="coding">A Computer Science portal for Lazyroar.
            <h1>Heading</h1>
            <b class="gfg">Programming Articles</b>,
            <b class="gfg">Programming Languages</b>,
            <b class="gfg">Quizzes</b>;
        </p>
  
        <p class="coding">practice</p>
  
    </body>
</html>
    """
  
# parse the markup string with Python's built-in HTML parser
soup = BeautifulSoup(markup, 'html.parser')
  
# find() returns the first tag that has the class "coding"
parent = soup.find(class_="coding")
  
# assign n
n = 2
  
# print the 2nd <b> among the <b> tags of parent
print(parent.select(f"b:nth-of-type({n})"))
print()
  
# print the <b> which is the 2nd child of parent
print(parent.select(f"b:nth-child({n})"))


Output:

[<b class="gfg">Programming Languages</b>]

[<b class="gfg">Programming Articles</b>]

Explanation:

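  • select("b:nth-of-type(n)") selects the nth <b> among the parent's <b> children. In Example 1 the 2nd <b> is "Programming Languages".
  • select("b:nth-child(n)") selects a <b> that is the nth child of the parent, counting all element children. In Example 1 the <h1> is the 1st child, so the 2nd child is the <b> "Programming Articles".
  • Both calls return an empty list [] if no element matches.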
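
A short sketch of the empty-result case, assuming a made-up fragment with only two children. It also shows select_one(), which returns None instead of an empty list when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with only two <b> children
html = "<div class='box'><b>one</b><b>two</b></div>"
soup = BeautifulSoup(html, "html.parser")
parent = soup.find(class_="box")

# There is no 5th child, so select() returns an empty list
missing = parent.select("b:nth-child(5)")

# select_one() returns the first match, or None when there is none
first_or_none = parent.select_one("b:nth-child(5)")

if not missing:
    print("no 5th child")
```

Checking the result before indexing into it avoids an IndexError on pages whose structure differs from what was expected.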

Example 2:

Python3

# importing module
from bs4 import BeautifulSoup
import requests
  
# assign website (sample_website is a placeholder: set it to the URL of the page to scrape)
page = requests.get(sample_website)
  
# parse the response body as HTML
soup = BeautifulSoup(page.content, 'html.parser')
parent = soup.find(class_="wrapper")
  
# assign n
n = 1
  
# print every <b> inside parent that is the 1st <b> among its siblings
print(parent.select(f"b:nth-of-type({n})"))
print()
  
# print every <b> inside parent that is the 1st child of its own parent
print(parent.select(f"b:nth-child({n})"))


Output: The result depends on the markup of the page that was fetched.
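
On a real page, select() called on a tag searches all of its descendants, so a selector such as b:nth-of-type(1) can match at several nesting levels. The sketch below uses a made-up fragment as a stand-in for the fetched page. The markup is an assumption, and the :scope > prefix is one possible way to restrict the match to direct children of the tag that select() was called on.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a fetched page (in Example 2 this would be page.content)
html = """
<div class="wrapper">
  <b>top-level</b>
  <p>text with a <b>nested</b> tag</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
parent = soup.find(class_="wrapper")

if parent is None:
    # find() returns None when no tag has the class, so guard before select()
    all_levels, direct_only = [], []
else:
    # Matches every <b> that is the first <b> among its own siblings, at any depth
    all_levels = parent.select("b:nth-of-type(1)")
    # ":scope >" limits the match to direct children of parent
    direct_only = parent.select(":scope > b:nth-of-type(1)")

print([t.get_text() for t in all_levels])   # ['top-level', 'nested']
print([t.get_text() for t in direct_only])  # ['top-level']
```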
