Face detection is one of the most in-demand subfields of computer vision. Thanks to deep learning, computer vision has advanced significantly in the last few years, and this trend is likely to continue. More and more people use computer vision without even noticing it. For example, smartphone cameras can detect faces and even identify the phone’s owner from their face.
The goal of this tutorial is to show you how to perform the first step of face detection: creating a face detection app using the basic functions of the dlib library. dlib is a powerful toolkit. Originally written in C++, it can be used from C++ as well as from Python 3.x. It is also cross-platform, so you can use it on Linux, macOS, and Windows. As its basic frontal face detection engine, dlib uses histogram of oriented gradients (HOG) features combined with a linear classifier; HOG is one of the most widely used approaches to object detection.
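To give a feel for what a histogram of oriented gradients is, here is a minimal pure-Python sketch that builds the orientation histogram for a single small cell of a grayscale image. It is a simplified illustration, not dlib's implementation: the function name, the 9-bin layout, and the toy images are assumptions made for this example, and real HOG pipelines add steps such as block normalization and interpolation between bins.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees)."""
    height, width = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    # Skip the border so every pixel has both neighbours available.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal intensity change
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical intensity change
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Toy 6x6 cells: one with a vertical edge, one with a horizontal edge.
vertical_edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
horizontal_edge = [[0] * 6 for _ in range(3)] + [[255] * 6 for _ in range(3)]

print(cell_orientation_histogram(vertical_edge))
print(cell_orientation_histogram(horizontal_edge))
```

The vertical edge puts all of its gradient energy into the first bin (around 0 degrees), while the horizontal edge fills the bin around 90 degrees. A detector then learns which patterns of such histograms across many cells look like a face.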
Installation on Ubuntu
These instructions were tested on Ubuntu 16.04 and should work with newer versions too; installation on other operating systems is different. Regardless of which platform you use, it is also possible to build dlib from source (this process is described here). If you are using Ubuntu, you can install package binaries to save time. In this case, you first need to get the most recent versions of packages and their dependencies, then install the most important dependencies for dlib:
build-essential: required for building dlib as well as any other software,
CMake: to manage the dlib building process,
OpenBLAS: to improve performance with linear algebra optimizations.
$ sudo apt-get update && sudo apt-get upgrade
$ sudo apt-get install build-essential cmake
$ sudo apt-get install libopenblas-dev liblapack-dev
It is recommended, but not required, to run the Python script inside a virtual environment so that its libraries do not mix with those of other Python projects. Using a virtual environment is generally good practice for Python development. It’s easy to create a virtual environment for Python 3 and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
Once you see the name of the virtual environment in front of the working directory name in your prompt, you can be sure that you are inside the virtual environment. From there, you can install numpy, which is required to represent images as numpy arrays, and dlib, which detects faces in the images.
$ pip install numpy dlib
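As an optional sanity check, the short sketch below reports whether the interpreter is running inside a virtual environment and whether the two packages can be found. The helper names are made up for this example; the check relies only on the standard library, so it works even before dlib is installed.

```python
import importlib.util
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print('Inside a virtual environment:', in_virtual_environment())
print('Missing packages:', missing_packages(['numpy', 'dlib']))
```

An empty list of missing packages means both numpy and dlib are visible to the interpreter that will run the script.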
Python code
I have used argparse to get the path of the input image and to set how long the image is displayed. argparse is a flexible way to run the same code many times with different parameters. I have also used sleep from the time library to specify how long (in seconds) the image is shown.
>>> import argparse
>>> from time import sleep
>>> import dlib
The next code snippet parses the arguments with argparse. The first argument is required; it provides the path to the image to be passed to the face detector. The second argument specifies how long, in seconds, the image is shown. It is optional because it has a default value.
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--img_path', '-p', required=True, help='Path to image')
>>> parser.add_argument('--delay', '-d', type=int, default=4)
>>> args = parser.parse_args()
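To see how the required argument and the default value behave without touching the command line, you can pass a list of strings to parse_args directly. This is a small sketch for experimentation; the build_parser helper and the example paths are made up for this illustration.

```python
import argparse


def build_parser():
    """Create the same parser as in the tutorial script."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--img_path', '-p', required=True, help='Path to image')
    parser.add_argument('--delay', '-d', type=int, default=4)
    return parser


# Only the image path is given, so --delay falls back to its default.
default_args = build_parser().parse_args(['-p', 'images/image.jpg'])
print(default_args.img_path, default_args.delay)

# Both arguments are given; the delay string is converted to an int.
custom_args = build_parser().parse_args(['--img_path', 'images/image.jpg', '--delay', '3'])
print(custom_args.img_path, custom_args.delay)
```

Leaving out the image path entirely makes argparse print a usage message and exit, which is what marks the argument as required.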
In the next two lines, the HOG-based frontal face detector and the popup window are defined. Both are provided by the dlib library.
>>> detector = dlib.get_frontal_face_detector()
>>> popup_window = dlib.image_window()
Here, the image is loaded as a numpy.ndarray of shape [H,W,C], where H is the image height (in pixels), W is the image width (also in pixels), and C is the number of color channels. Because dlib.load_rgb_image returns the image in RGB format, C is 3 here, even if the source file is grayscale.
>>> image_array = dlib.load_rgb_image(args.img_path)
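Finally, the next line of code performs the face detection using dlib’s built-in HOG-based frontal face detector. The second argument of the detector specifies how many times the image is upsampled before detection; upsampling makes everything bigger, which allows the detector to find smaller faces at the cost of extra processing time. The line after that prints the number of detected faces.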
>>> faces = detector(image_array, 1)
>>> print(f'{len(faces)} faces detected in the image')
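The effect of the upsampling argument can be estimated with simple arithmetic. The sketch below assumes that each upsampling step roughly doubles the image in both dimensions and that the detector looks for faces of about 80 by 80 pixels or larger, as the dlib example programs describe. The function names are made up for this illustration, and the numbers are approximations rather than exact values produced by dlib.

```python
def approx_upsampled_size(height, width, times):
    """Approximate image size after upsampling `times` times (2x per step)."""
    scale = 2 ** times
    return height * scale, width * scale


def approx_min_face_size(times, base_size=80):
    """Approximate smallest detectable face, in pixels of the original image.

    base_size=80 is an assumption taken from the dlib example comments.
    """
    return base_size / 2 ** times


for times in range(3):
    size = approx_upsampled_size(480, 640, times)
    smallest = approx_min_face_size(times)
    print(f'upsample={times}: image ~{size}, smallest face ~{smallest:.0f} px')
```

Each extra step lets the detector find faces about half as large, but the image it has to scan has about four times as many pixels, so it is usually worth starting with 0 or 1 and increasing only if small faces are missed.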
Now, let’s visualize everything we’ve done. First, display the image inside the popup window, then draw bounding boxes over the areas detected as human faces. After that, pause the script for the number of seconds given by the --delay argument. Once that time has passed, the overlay is cleared, and the image window closes when the script exits:
>>> popup_window.set_image(image_array)
>>> popup_window.add_overlay(faces)
>>> sleep(args.delay)
>>> popup_window.clear_overlay()
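If a popup window is not available, for example on a machine without a display, the bounding boxes can be printed as text instead. Each detected face is a rectangle object with left(), top(), right(), and bottom() methods. The sketch below is one way to format them; the describe_faces helper and the StandInRectangle class are made up for this example, and the stand-in exists only so that the snippet runs without dlib installed.

```python
def describe_faces(faces):
    """Format each detected rectangle as one line of text."""
    lines = []
    for index, face in enumerate(faces, start=1):
        lines.append(
            f'Face {index}: left={face.left()}, top={face.top()}, '
            f'right={face.right()}, bottom={face.bottom()}'
        )
    return lines


class StandInRectangle:
    """Hypothetical stand-in for a dlib rectangle, used only for this demo."""

    def __init__(self, left, top, right, bottom):
        self._coords = (left, top, right, bottom)

    def left(self):
        return self._coords[0]

    def top(self):
        return self._coords[1]

    def right(self):
        return self._coords[2]

    def bottom(self):
        return self._coords[3]


for line in describe_faces([StandInRectangle(10, 20, 110, 120)]):
    print(line)
```

In the tutorial script, the same helper could be called as describe_faces(faces) right after the detector has run.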
Detecting faces
Now the script is ready, so let’s run it. We have to specify the command-line argument for the image path and, optionally, for how long the image is shown. Don’t forget to activate the virtual environment!
$ python dlib_face_detection.py --img_path images/image.jpg --delay 3
This is what the final result should look like: