ImageGrab and PyTesseract
ImageGrab is a module of Pillow, the Python imaging library, that captures the contents of the screen. PyTesseract is a Python wrapper for the Tesseract Optical Character Recognition (OCR) engine. Together they can be used to read the text shown in a section of the screen.
Installation –
Pillow (a newer version of PIL)
pip install Pillow
PyTesseract
pip install pytesseract
OpenCV and NumPy (also imported by the code below)
pip install opencv-python numpy
Apart from this, a tesseract executable needs to be installed.
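Before running OCR, it can help to confirm that the tesseract executable is reachable. The sketch below uses only the standard library. The helper name find_tesseract is hypothetical, and it assumes the executable is called tesseract and is on the PATH; if it is installed elsewhere, the full path has to be supplied by hand.

```python
import shutil


def find_tesseract(name="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    # shutil.which performs the same lookup the shell does for a command name.
    return shutil.which(name)


path = find_tesseract()
if path is None:
    print("tesseract not found on PATH; set the path manually")
else:
    print("tesseract found at", path)
```

When a path is found, it can be assigned to pytesseract.pytesseract.tesseract_cmd, which is the setting the code in this article uses.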
Implementation of code
The following functions were primarily used in the code –
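pytesseract.image_to_string(image, lang=**language**) – Runs OCR on the image and returns the recognized text as a string, using the model for the given language (for example 'eng').
cv2.cvtColor(image, **colour conversion**) – Used to convert the image to grayscale. An array built from a Pillow image is in RGB order, so cv2.COLOR_RGB2GRAY is the matching conversion code (cv2.COLOR_BGR2GRAY is meant for images loaded by OpenCV itself).
ImageGrab.grab(bbox=**Coordinates of the area of the screen to be captured**) – Captures the given area of the screen. Called inside a loop, it captures the same area repeatedly.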
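The bbox tuple is ordered (left, upper, right, lower) in pixels, so the last two values are coordinates and not a width and a height. A small helper makes the conversion explicit. The name region_to_bbox is hypothetical and not part of Pillow.

```python
def region_to_bbox(left, top, width, height):
    """Convert a position plus size into a (left, upper, right, lower) tuple."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (left, top, left + width, top + height)


# A 700 x 600 pixel region whose top-left corner is at (700, 300).
print(region_to_bbox(700, 300, 700, 600))  # (700, 300, 1400, 900)
```

The result can be passed directly as ImageGrab.grab(bbox=...).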
The objectives of the code are:
- To use a loop to repeatedly capture a part of the screen.
- To convert the captured image into grayscale.
- To use PyTesseract to read the text in it.
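These three steps can be sketched as a loop that does not depend on any particular library. In the sketch below, poll_text and its parameters are hypothetical names; capture and read_text stand in for the screen grab and the OCR call. As a deliberate design choice it stops after max_frames captures and reports text only when it changes, which avoids printing the same string on every pass.

```python
import time


def poll_text(capture, read_text, max_frames=10, delay=0.0):
    """Capture and read up to max_frames times, yielding text only when it changes."""
    last = None
    for _ in range(max_frames):
        frame = capture()                # step 1: grab a frame
        text = read_text(frame).strip()  # steps 2 and 3: convert and run OCR
        if text and text != last:
            yield text
            last = text
        time.sleep(delay)


# Demonstration with stand-in functions, so no screen or OCR engine is needed.
fake_frames = iter(["Hello", "Hello", "World", ""])
for line in poll_text(lambda: next(fake_frames), lambda f: f, max_frames=4):
    print(line)
```

With the real libraries, capture could be a function that calls ImageGrab.grab(bbox=...), and read_text a function that converts the frame to grayscale and passes it to pytesseract.image_to_string.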
Code: Python code to use ImageGrab and PyTesseract
# cv2.cvtColor takes a numpy ndarray as an argument
import numpy as np

import pytesseract

# importing OpenCV
import cv2

from PIL import ImageGrab


def imToString():

    # Path of tesseract executable
    pytesseract.pytesseract.tesseract_cmd = '**Path to tesseract executable**'
    while True:

        # ImageGrab: capture the screen image in a loop.
        # bbox is used to capture a specific area.
        cap = ImageGrab.grab(bbox=(700, 300, 1400, 900))

        # Convert the image to grayscale so that it is more easily
        # read by the OCR, and obtain the output string.
        # The Pillow image is in RGB order, hence COLOR_RGB2GRAY.
        tesstr = pytesseract.image_to_string(
            cv2.cvtColor(np.array(cap), cv2.COLOR_RGB2GRAY),
            lang='eng')
        print(tesstr)


# Calling the function
imToString()
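A note on channel order: an array built from a Pillow image is in RGB order, while images loaded by OpenCV itself are in BGR order. Grayscale conversion weights the three channels unequally, so the conversion code should match the actual order; for an ImageGrab capture that is cv2.COLOR_RGB2GRAY. The pure-Python sketch below illustrates the effect using the standard luma weights (0.299, 0.587, 0.114) that OpenCV documents for RGB to gray. The function name luma is hypothetical.

```python
def luma(r, g, b):
    """Weighted grayscale value of one pixel, rounded to an integer."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


red_pixel = (255, 0, 0)  # pure red, in RGB order

# Channels read in the right order.
print(luma(*red_pixel))            # 76

# The same pixel misread as BGR: red is weighted as if it were blue.
print(luma(*reversed(red_pixel)))  # 29
```

For black text on a white background the difference is negligible, but coloured text or backgrounds can come out noticeably lighter or darker than intended.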
Output
The above code captures a fixed section of the screen in a loop and prints the text it reads from each capture. It keeps running until the program is interrupted.