In this article, we will see how to compute the area of a set of bounding boxes in PyTorch, using the box_area() function of the torchvision.ops module.
box_area() function
This function accepts bounding boxes as input and returns the area of each one. The input must be a torch Tensor of shape [N, 4], where N is the number of bounding boxes whose area will be computed. The boxes are expected in the format (x_min, y_min, x_max, y_max), where 0 ≤ x_min < x_max and 0 ≤ y_min < y_max. A single bounding box stored as a 1D tensor of four values must first be turned into a 2D tensor of shape [1, 4] with unsqueeze(0).
Syntax: torchvision.ops.box_area(boxes)
Parameter:
- boxes: A Tensor of shape [N, 4] containing the bounding boxes in (x_min, y_min, x_max, y_max) format.
Return: A Tensor of shape [N] containing the area of each box.
Stepwise Implementation
Step 1: Import the required libraries.
Python
import torch
import torchvision
from torchvision.io import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_area
Step 2: Read the input image from your computer.
Python
img = read_image('img.png')
Step 3: Define a bounding box and convert it into a torch tensor.
Python
b_box = [80, 70, 500, 200]
b_box = torch.tensor(b_box, dtype=torch.int)
Step 4: Unsqueeze the bounding box to make it a 2D tensor. This step is needed only when computing the area of a single bounding box; otherwise, skip it.
Python
b_box = b_box.unsqueeze(0)
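To see why this step matters, it can help to print the tensor shape before and after the call. A minimal sketch, where the name b_box_2d is introduced only for illustration:

```python
import torch

# A single box as a 1D tensor of four coordinates
b_box = torch.tensor([80, 70, 500, 200], dtype=torch.int)
print(b_box.shape)      # torch.Size([4])

# unsqueeze(0) adds a leading dimension, giving the [N, 4] layout with N = 1
b_box_2d = b_box.unsqueeze(0)
print(b_box_2d.shape)   # torch.Size([1, 4])
```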
Step 5: Compute the area of the bounding box defined above and store it in a variable for later use.
Python
area = box_area(b_box)
Step 6: Use the computed area as the label text.
Python
label = [f"b_box area = {area.item()}"]
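The call to item() works because the result for a single box is a one-element tensor, which item() converts to a plain Python number. For several boxes, tolist() is one way to get plain numbers for the labels. The sketch below uses hand-written tensors as stand-ins for the values box_area() would return; the names single and areas are illustrative.

```python
import torch

# Stand-in for the one-element result of box_area() on a single box
area = torch.tensor([54600])
single = area.item()                 # plain Python int
label = [f"b_box area = {single}"]
print(label)

# Stand-in for the result on three boxes: item() would raise an error here,
# so convert the whole tensor to a list of Python numbers instead
areas = torch.tensor([54600, 29400, 32200])
labels = [f"b_box area = {a}" for a in areas.tolist()]
print(labels)
```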
Step 7: Draw the bounding box on the image and attach the label defined above to it.
Python
img = draw_bounding_boxes(img, b_box, labels=label, width=4, colors=(255, 0, 0))
Step 8: Transform the image into a PIL image.
Python
img = torchvision.transforms.ToPILImage()(img)
Step 9: Display the output image.
Python
img.show()
The following image is used for demonstration:
Example 1:
In this example, we compute the area of a single bounding box and set the computed area as its label.
Python
# Import the required libraries
import torch
import torchvision
from torchvision.io import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_area

# Read the input image from your computer
img = read_image('img.png')

# Bounding box coordinates are x_min, y_min, x_max, y_max
b_box = [80, 70, 500, 200]

# Convert the bounding box to a torch tensor
b_box = torch.tensor(b_box, dtype=torch.int)

# Unsqueeze the bounding box to make it a 2D tensor
b_box = b_box.unsqueeze(0)

# Compute the bounding box area
area = box_area(b_box)

# Use the computed area as the label
label = [f"b_box area = {area.item()}"]

# Draw the bounding box and its label on the image
img = draw_bounding_boxes(img, b_box, labels=label, width=4, colors=(255, 0, 0))

# Transform the image to a PIL image
img = torchvision.transforms.ToPILImage()(img)

# Display the result
img.show()
Output:
Example 2:
In this example, we compute the area of multiple bounding boxes and set each computed area as the label of its box.
Python
# Import the required libraries
import torch
import torchvision
from torchvision.io import read_image
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_area

# Read the input image from your computer
img = read_image('img.png')

# Create the boxes (x_min, y_min, x_max, y_max)
b_box1 = [80, 70, 500, 200]
b_box2 = [80, 230, 500, 300]
b_box3 = [580, 70, 720, 300]
b_box = [b_box1, b_box2, b_box3]

# Convert the bounding boxes to a torch tensor of shape [3, 4]
b_box = torch.tensor(b_box, dtype=torch.int)

# Compute the area of each bounding box
area = box_area(b_box)

# Use the computed areas as labels; tolist() yields plain numbers
# so the labels do not read "tensor(...)"
labels = [f"b_box area = {n}" for n in area.tolist()]

# Draw the bounding boxes and their labels on the image
img = draw_bounding_boxes(img, b_box, labels=labels, width=4, colors=["orange", "white", "red"])

# Transform the image to a PIL image
img = torchvision.transforms.ToPILImage()(img)

# Display the result
img.show()
Output: