Image Scraping with Python

28 July 2024

1

Scraping Is a very essential skill for everyone to get data from any website. In this article, we are going to see how to scrape images from websites using python. For scraping images, we will try different approaches.

Method 1: Using BeautifulSoup and Requests

bs4: Beautiful Soup(bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in with Python. To install this type the below command in the terminal.

pip install bs4

requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built-in with Python. To install this type the below command in the terminal.

pip install requests

Approach:

Import module
Make requests instance and pass into URL
Pass the requests into a Beautifulsoup() function
Use ‘img’ tag to find them all tag (‘src ‘)

Implementation:

Python3

import requests 
from bs4 import BeautifulSoup 
    
def getdata(url): 
    r = requests.get(url) 
    return r.text 
    
htmldata = getdata("https://www.geeksforgeeks.org/") 
soup = BeautifulSoup(htmldata, 'html.parser') 
for item in soup.find_all('img'):
    print(item['src'])

Output:

https://media.geeksforgeeks.org/wp-content/cdn-uploads/20201018234700/GFG-RT-DSA-Creative.png
https://media.geeksforgeeks.org/wp-content/cdn-uploads/logo-new-2.svg

Method 2: Using urllib and BeautifulSoup

urllib : It is a Python module that allows you to access, and interact with, websites with their URL. To install this type the below command in the terminal.

pip install urllib

Approach:

Import module
Read URL with urlopen()
Pass the requests into a Beautifulsoup() function
Use ‘img’ tag to find them all tag (‘src ‘)

Implementation:

Python3

from urllib.request import urlopen
from bs4 import BeautifulSoup
  
htmldata = urlopen('https://www.geeksforgeeks.org/')
soup = BeautifulSoup(htmldata, 'html.parser')
images = soup.find_all('img')
  
for item in images:
    print(item['src'])

Output:

https://media.geeksforgeeks.org/wp-content/cdn-uploads/20201018234700/GFG-RT-DSA-Creative.png
https://media.geeksforgeeks.org/wp-content/cdn-uploads/logo-new-2.svg

Image Scraping with Python

Python3

Python3

Java Program for Longest Common Subsequence

Maximum height of Tree when any Node can be considered as Root

Print Fibonacci sequence using 2 variables

LEAVE A REPLY Cancel reply

Most Popular

Verizon will basically pay you to buy the new, awesome Barbie phone

8 Best VPNs for Apple TV in 2024: Fast & Secure by Penka Hristovska

Samsung offers free screen replacements for users still suffering green line issues

7 Best Free Antiviruses for Mac in 2024: Are They Any Good? by Katarina Glamoslija

Recent Comments

EDITOR PICKS

Verizon will basically pay you to buy the new, awesome Barbie phone

8 Best VPNs for Apple TV in 2024: Fast & Secure by Penka Hristovska

Samsung offers free screen replacements for users still suffering green line issues

POPULAR POSTS

Verizon will basically pay you to buy the new, awesome Barbie phone

8 Best VPNs for Apple TV in 2024: Fast & Secure by Penka Hristovska

Samsung offers free screen replacements for users still suffering green line issues

POPULAR CATEGORY

ABOUT US

FOLLOW US