In this article, we will see how to compute the area of a set of bounding boxes in PyTorch. The area can be computed with the box_area() function of the torchvision.ops module.
box_area() method
This method accepts bounding boxes as input and returns the area of each box. The input must be a torch Tensor of shape [N, 4], where N is the number of bounding boxes for which the area will be computed. The boxes are expected to be in the format (x_min, y_min, x_max, y_max), where 0 ≤ x_min < x_max and 0 ≤ y_min < y_max, so the area of each box is (x_max - x_min) * (y_max - y_min). A single bounding box stored as a 1D tensor of four values must first be turned into a 2D tensor of shape [1, 4] with unsqueeze.
Syntax: torchvision.ops.box_area(boxes)
Parameter:
- boxes: A Tensor of shape [N, 4] holding the bounding boxes in (x_min, y_min, x_max, y_max) format.
Return: A Tensor of shape [N] containing the area of each box.
Stepwise Implementation
Step 1: Import the required libraries.
Python
import torch
import torchvision
from torchvision.io import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_area
Step 2: Read the input image from your computer.
Python
img = read_image('img.png')
Step 3: Define a bounding box and convert it into a torch tensor.
Python
b_box = [80, 70, 500, 200]
b_box = torch.tensor(b_box, dtype=torch.int)
Step 4: Unsqueeze the bounding box to make it a 2D tensor. Perform this step only when computing the area of a single bounding box; otherwise skip it.
Python
b_box = b_box.unsqueeze(0)
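To make the effect of this step concrete, the short sketch below prints the tensor shape before and after unsqueeze(0). The name batched is only used here for illustration.

```python
import torch

# A single box as a 1D tensor of four values
b_box = torch.tensor([80, 70, 500, 200], dtype=torch.int)
print(b_box.shape)    # torch.Size([4])

# unsqueeze(0) adds a leading dimension, giving the [N, 4] layout with N = 1
batched = b_box.unsqueeze(0)
print(batched.shape)  # torch.Size([1, 4])
```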
Step 5: Compute the area of the bounding box defined above and store it in a variable for further use.
Python
area = box_area(b_box)
Step 6: Use the computed area as the label text.
Python
label = [f"b_box area = {area.item()}"]
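Note that item() only works on a tensor holding exactly one value. The sketch below shows the single-box label next to one possible way of building labels for several boxes with tolist(). The area tensors are hard-coded stand-ins for the output of box_area(), so the snippet runs on its own.

```python
import torch

# Stand-in for the box_area() result of one box
area = torch.tensor([54600])
label = [f"b_box area = {area.item()}"]
print(label)   # ['b_box area = 54600']

# Stand-in for the box_area() result of several boxes;
# tolist() converts every element to a plain Python number
areas = torch.tensor([54600, 29400])
labels = [f"b_box area = {a}" for a in areas.tolist()]
print(labels)  # ['b_box area = 54600', 'b_box area = 29400']
```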
Step 7: Draw the bounding box on the image and attach the label defined above.
Python
img = draw_bounding_boxes(img, b_box, labels=label,
                          width=4, colors=(255, 0, 0))
Step 8: Transform the image into a PIL image.
Python
img = torchvision.transforms.ToPILImage()(img)
Step 9: Display the output image.
Python
img.show()
The image below is used for demonstration:
Example 1:
In this example, we compute the area of a single bounding box and set the computed area as its label.
Python
# Import the required libraries
import torch
import torchvision
from torchvision.io import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_area

# read input image from your computer
img = read_image('img.png')

# bounding box coordinates are xmin, ymin, xmax, ymax
b_box = [80, 70, 500, 200]

# convert the bounding box to a torch tensor
b_box = torch.tensor(b_box, dtype=torch.int)

# unsqueeze the bounding box to make it a 2D tensor
b_box = b_box.unsqueeze(0)

# compute the bounding box area
area = box_area(b_box)

# use the computed area as the label
label = [f"b_box area = {area.item()}"]

# draw the bounding box defined above on the image
# and attach the label defined above
img = draw_bounding_boxes(img, b_box, labels=label,
                          width=4, colors=(255, 0, 0))

# transform the image to a PIL image
img = torchvision.transforms.ToPILImage()(img)

# display the result
img.show()
Output:
Example 2:
In this example, we compute the areas of multiple bounding boxes and set each computed area as the label of its box.
Python
# Import the required libraries
import torch
import torchvision
from torchvision.io import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_area

# read input image from your computer
img = read_image('img.png')

# create boxes in (xmin, ymin, xmax, ymax) format
b_box1 = [80, 70, 500, 200]
b_box2 = [80, 230, 500, 300]
b_box3 = [580, 70, 720, 300]
b_box = [b_box1, b_box2, b_box3]

# convert the bounding boxes to a torch tensor of shape [3, 4]
b_box = torch.tensor(b_box, dtype=torch.int)

# compute the area of each bounding box
area = box_area(b_box)

# build one label per box; tolist() yields plain numbers
# so the label shows the value rather than a tensor repr
labels = [f"b_box area = {n}" for n in area.tolist()]

# draw the bounding boxes defined above on the image
img = draw_bounding_boxes(img, b_box, labels=labels, width=4,
                          colors=["orange", "white", "red"])

# transform the image to a PIL image
img = torchvision.transforms.ToPILImage()(img)

# display the result
img.show()
Output:
