Matrix multiplication is an integral part of scientific computing, and it becomes expensive when the matrices are large. One of the easiest ways to compute the product of two matrices is to use the functions provided by PyTorch. This article covers how to perform matrix multiplication using PyTorch.
PyTorch and tensors:
PyTorch is an open-source library for neural-network-based deep learning projects, developed by Facebook’s AI research team. It can be used in place of NumPy when GPU acceleration is needed. One of the most important classes in the library is Tensor, an n-dimensional array similar to the arrays provided by the NumPy package. PyTorch offers many methods that operate on a Tensor, which makes computations faster and easier. A Tensor can hold only elements of a single data type.
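As a small illustration of the points above, the sketch below creates tensors from Python lists and inspects their data type and shape. The variable names are made up for this example, and the dtypes in the comments assume PyTorch's default settings have not been changed.

```python
import torch

# A tensor built from a nested Python list; all elements share one dtype.
ints = torch.tensor([[1, 2], [3, 4]])
print(ints.dtype)    # torch.int64
print(ints.shape)    # torch.Size([2, 2])

# Mixing ints and floats promotes every element to one floating-point dtype.
mixed = torch.tensor([1, 2.5, 3])
print(mixed.dtype)   # torch.float32 (the default floating-point dtype)

# Move the tensor to a GPU only if one is available; otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
on_device = ints.to(device)
print(on_device.device)
```

The same code runs on a machine without a GPU because the device is chosen at run time.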
Matrix multiplication with PyTorch:
The PyTorch functions below expect their inputs to be tensors. The options available for matrix multiplication are:
- torch.mm()
- torch.matmul()
- torch.bmm()
- @ operator
torch.mm():
This method computes the matrix product of an m×n Tensor and an n×p Tensor, giving an m×p Tensor. It accepts only two-dimensional tensors, not one-dimensional ones, and it does not support broadcasting. Broadcasting is the way tensors of different shapes are treated in an operation: the smaller Tensor is expanded to match the shape of the larger one. The syntax of the function is given below.
torch.mm(Tensor_1, Tensor_2, out=None)
The first two parameters are the tensors to multiply. The third parameter, out, is optional: a Tensor passed there receives the output values.
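The following sketch shows the optional out argument in use and what happens when torch.mm() is given a one-dimensional input. The tensor names and the flag mm_rejected_1d are illustrative only; the example uses float tensors so that the pre-allocated result has a matching data type.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])             # 2x2
b = torch.tensor([[5., 6., 7.], [8., 9., 10.]])    # 2x3

# Pre-allocate a 2x3 result tensor and let torch.mm write into it.
result = torch.empty(2, 3)
torch.mm(a, b, out=result)
print(result)

# torch.mm does not broadcast: a 1D second argument is rejected.
vec = torch.tensor([1., 2.])
try:
    torch.mm(a, vec)
    mm_rejected_1d = False
except RuntimeError as err:
    mm_rejected_1d = True
    print("torch.mm raised:", err)
```

For a matrix-vector product, torch.matmul() (described later) accepts the one-dimensional argument directly.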
Example1: Matrices of the same dimension
Here both inputs are 3×3 matrices, so the output is also a 3×3 matrix.
Python3
import torch

mat_1 = torch.tensor([[1, 2, 3],
                      [4, 3, 8],
                      [1, 7, 2]])
mat_2 = torch.tensor([[2, 4, 1],
                      [1, 3, 6],
                      [2, 6, 5]])

# multiply the two 3x3 matrices
print(torch.mm(mat_1, mat_2))
Output:
tensor([[10, 28, 28],
        [27, 73, 62],
        [13, 37, 53]])
Example2: Matrices of different dimensions
Here mat_1 is a 2×2 matrix and mat_2 is a 2×3 matrix, so the output is a 2×3 matrix.
Python3
import torch

mat_1 = torch.tensor([[1, 2],
                      [4, 3]])
mat_2 = torch.tensor([[2, 4, 1],
                      [1, 3, 6]])

# multiply a 2x2 matrix by a 2x3 matrix
print(torch.mm(mat_1, mat_2))
Output:
tensor([[ 4, 10, 13],
        [11, 25, 22]])
torch.matmul():
This method multiplies two vectors (one-dimensional tensors), two 2D matrices, or a mix of the two. It also supports broadcasting and batched operations. The operation performed depends on the dimensions of the input tensors. The general syntax is given below.
torch.matmul(Tensor_1, Tensor_2, out=None)
The table below lists the possible combinations of argument dimensions and the operation performed for each.
argument_1 | argument_2 | Action taken |
1-dimensional | 1-dimensional | The scalar (dot) product is calculated |
2-dimensional | 2-dimensional | General matrix multiplication is done |
1-dimensional | 2-dimensional | A 1 is prepended to the shape of tensor-1 to match the dimensions of tensor-2; the prepended dimension is removed after the multiplication |
2-dimensional | 1-dimensional | Matrix-vector product is calculated |
At least 1-dimensional | At least 1-dimensional | If at least one argument is N-dimensional (N>2), batched matrix multiplication is done, with the batch dimensions broadcast |
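The cases in the table can be checked by looking at the shape of each result. The sketch below uses tensors of ones with arbitrarily chosen sizes; the names v, m, batch and shapes are illustrative only.

```python
import torch

v = torch.ones(3)            # 1D, shape (3,)
m = torch.ones(3, 4)         # 2D, shape (3, 4)
batch = torch.ones(5, 2, 3)  # 3D, shape (5, 2, 3)

shapes = {
    "1D @ 1D": torch.matmul(v, v).shape,      # dot product -> 0-dimensional
    "2D @ 2D": torch.matmul(m.T, m).shape,    # (4, 3) x (3, 4) -> (4, 4)
    "1D @ 2D": torch.matmul(v, m).shape,      # (3,) x (3, 4) -> (4,)
    "2D @ 1D": torch.matmul(m.T, v).shape,    # (4, 3) x (3,) -> (4,)
    "3D @ 2D": torch.matmul(batch, m).shape,  # m is broadcast over the batch -> (5, 2, 4)
    "3D @ 1D": torch.matmul(batch, v).shape,  # v is broadcast over the batch -> (5, 2)
}

for case, shape in shapes.items():
    print(case, tuple(shape))
```

Inspecting shapes this way is a quick check before running a larger computation, since a shape that cannot be multiplied or broadcast raises an error immediately.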
Example1: Arguments of the same dimension
Python3
import torch

# both arguments 1D
vec_1 = torch.tensor([3, 6, 2])
vec_2 = torch.tensor([4, 1, 9])

print("Single dimensional tensors :", torch.matmul(vec_1, vec_2))

# both arguments 2D
mat_1 = torch.tensor([[1, 2, 3],
                      [4, 3, 8],
                      [1, 7, 2]])
mat_2 = torch.tensor([[2, 4, 1],
                      [1, 3, 6],
                      [2, 6, 5]])

out = torch.matmul(mat_1, mat_2)

print("\n3x3 dimensional tensors :\n", out)
Output:
Single dimensional tensors : tensor(36)

3x3 dimensional tensors :
 tensor([[10, 28, 28],
        [27, 73, 62],
        [13, 37, 53]])
Example2: Arguments of different dimensions
Python3
import torch

# first argument 1D and second argument 2D
mat1_1 = torch.tensor([3, 6, 2])
mat1_2 = torch.tensor([[1, 2, 3],
                       [4, 3, 8],
                       [1, 7, 2]])

out_1 = torch.matmul(mat1_1, mat1_2)
print("\n1D-2D multiplication :\n", out_1)

# first argument 2D and second argument 1D
mat2_1 = torch.tensor([[2, 4, 1],
                       [1, 3, 6],
                       [2, 6, 5]])
mat2_2 = torch.tensor([4, 1, 9])

# assigning to output tensor
out_2 = torch.matmul(mat2_1, mat2_2)
print("\n2D-1D multiplication :\n", out_2)
Output:
1D-2D multiplication :
 tensor([29, 38, 61])

2D-1D multiplication :
 tensor([21, 61, 59])
Example3: N-dimensional argument (N>2)
Python3
import torch

# creating Tensors using randn()
mat_1 = torch.randn(2, 3, 3)
mat_2 = torch.randn(3)

# printing the matrices
print("matrix A :\n", mat_1)
print("\nmatrix B :\n", mat_2)

# output
print("\nOutput :\n", torch.matmul(mat_1, mat_2))
Output:
matrix A :
 tensor([[[ 0.5433,  0.0546, -0.5301],
         [ 0.9275, -0.0420, -1.3966],
         [-1.1851, -0.2918, -0.7161]],

        [[-0.8659,  1.8350,  1.6068],
         [-1.1046,  1.0045, -0.1193],
         [ 0.9070,  0.7325, -0.4547]]])

matrix B :
 tensor([ 1.8785, -0.4231,  0.1606])

Output :
 tensor([[ 0.9124,  1.5358, -2.2177],
        [-2.1448, -2.5191,  1.3208]])
torch.bmm():
This method provides batched matrix multiplication for the case where both tensors are three-dimensional: a b×n×m Tensor multiplied by a b×m×p Tensor gives a b×n×p Tensor. The first dimension (the batch size b) must be the same for both tensors. This method does not support broadcasting. The syntax is given below.
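torch.bmm(Tensor_1, Tensor_2, deterministic=False, out=None)
The “deterministic” parameter takes a boolean value. False selects a faster, non-deterministic calculation; True selects a slower but deterministic one. This parameter was documented in older PyTorch releases, where it applied only to sparse-dense multiplication on CUDA. Newer releases document the signature as torch.bmm(input, mat2, *, out=None), so check the documentation for the installed version before passing it.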
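To make the batching behaviour concrete, the sketch below compares torch.bmm() with a loop that applies torch.mm() to each pair of matrices, and then shows that a batch-size mismatch is rejected rather than broadcast. The names batched, looped and bmm_broadcasts are illustrative only, and the seed is fixed just to make runs repeatable.

```python
import torch

torch.manual_seed(0)           # fixed seed so repeated runs agree
a = torch.randn(2, 3, 3)       # batch of two 3x3 matrices
b = torch.randn(2, 3, 4)       # batch of two 3x4 matrices

batched = torch.bmm(a, b)      # shape (2, 3, 4)

# The same result computed one matrix pair at a time with torch.mm.
looped = torch.stack([torch.mm(a[i], b[i]) for i in range(a.shape[0])])
print(torch.allclose(batched, looped, atol=1e-5))

# Batch sizes must match exactly; bmm does not broadcast a batch of 1.
try:
    torch.bmm(a, torch.randn(1, 3, 4))
    bmm_broadcasts = True
except RuntimeError:
    bmm_broadcasts = False
print("bmm broadcasts batch dimension:", bmm_broadcasts)
```

When the batch dimensions do need broadcasting, torch.matmul() is the function to use instead.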
Example:
In the example below, mat_1 has dimension 2×3×3 and mat_2 has dimension 2×3×4, so the output has dimension 2×3×4.
Python3
import torch

# 3D matrices
mat_1 = torch.randn(2, 3, 3)
mat_2 = torch.randn(2, 3, 4)

print("matrix A :\n", mat_1)
print("\nmatrix B :\n", mat_2)
print("\nOutput :\n", torch.bmm(mat_1, mat_2))
Output:
matrix A :
 tensor([[[-0.0135, -0.9197, -0.3395],
         [-1.0369, -1.3242,  1.4799],
         [-0.0182, -1.2917,  0.6575]],

        [[-0.3585, -0.0478,  0.4674],
         [-0.6688, -0.9217, -1.2612],
         [ 1.6323, -0.0640,  0.4357]]])

matrix B :
 tensor([[[ 0.2431, -0.1044, -0.1437, -1.4982],
         [-1.4318, -0.2510,  1.6247,  0.5623],
         [ 1.5265, -0.8568, -2.1125, -0.9463]],

        [[ 0.0182,  0.5207,  1.2890, -1.3232],
         [-0.2275, -0.8006, -0.6909, -1.0108],
         [ 1.3881, -0.0327, -1.4890, -0.5550]]])

Output :
 tensor([[[ 0.7954,  0.5231, -0.7752, -0.1756],
         [ 3.9031, -0.8274, -5.1288, -0.5915],
         [ 2.8488, -0.2372, -3.4850, -1.3212]],

        [[ 0.6532, -0.1637, -1.1251,  0.2633],
         [-1.5532,  0.4309,  1.6527,  2.5167],
         [ 0.6492,  0.8870,  1.4994, -2.3371]]])
Note: the matrices vary from run to run because they are filled with random values.
@ operator:
The @ operator is Python's matrix multiplication operator; applied to tensors, it behaves like torch.matmul(). On 1D tensors it computes the dot product, and on 2D matrices it performs normal matrix multiplication. If both tensors have the same number of dimensions, the multiplication is carried out without any broadcasting or prepending. If the tensors differ in their number of dimensions, the appropriate broadcasting is carried out first and then the multiplication is done. This operator applies to N-dimensional tensors also.
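Because the 1D case is easy to confuse with element-wise multiplication, the short sketch below contrasts the two and checks that @ gives the same result as torch.matmul() in a broadcast case. The variable names are illustrative only.

```python
import torch

a = torch.tensor([3, 6, 2])
b = torch.tensor([4, 1, 9])

dot = a @ b            # 1D @ 1D: dot product, a 0-dimensional tensor
elementwise = a * b    # element-wise product, shown for contrast

print(dot)             # tensor(36)
print(elementwise)     # holds the values 12, 6, 18

# @ gives the same result as torch.matmul, including broadcasting
# a 2D matrix over a batch of matrices.
batch = torch.ones(2, 3, 3)
single = torch.ones(3, 4)
same = torch.equal(batch @ single, torch.matmul(batch, single))
print(same, tuple((batch @ single).shape))
```

Element-wise multiplication uses the * operator (or torch.mul()), which keeps the input shape instead of reducing it.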
Example:
Python3
import torch

# single dimensional matrices
oneD_1 = torch.tensor([3, 6, 2])
oneD_2 = torch.tensor([4, 1, 9])

# two dimensional matrices
twoD_1 = torch.tensor([[1, 2, 3],
                       [4, 3, 8],
                       [1, 7, 2]])
twoD_2 = torch.tensor([[2, 4, 1],
                       [1, 3, 6],
                       [2, 6, 5]])

# N-dimensional matrices (N>2)

# 2x3x3 dimensional matrix
ND_1 = torch.tensor([[[-0.0135, -0.9197, -0.3395],
                      [-1.0369, -1.3242, 1.4799],
                      [-0.0182, -1.2917, 0.6575]],

                     [[-0.3585, -0.0478, 0.4674],
                      [-0.6688, -0.9217, -1.2612],
                      [1.6323, -0.0640, 0.4357]]])

# 2x3x4 dimensional matrix
ND_2 = torch.tensor([[[0.2431, -0.1044, -0.1437, -1.4982],
                      [-1.4318, -0.2510, 1.6247, 0.5623],
                      [1.5265, -0.8568, -2.1125, -0.9463]],

                     [[0.0182, 0.5207, 1.2890, -1.3232],
                      [-0.2275, -0.8006, -0.6909, -1.0108],
                      [1.3881, -0.0327, -1.4890, -0.5550]]])

print("1D matrices output :\n", oneD_1 @ oneD_2)
print("\n2D matrices output :\n", twoD_1 @ twoD_2)
print("\nN-D matrices output :\n", ND_1 @ ND_2)
print("\n Mixed matrices output :\n", oneD_1 @ twoD_1 @ twoD_2)
Output:
1D matrices output :
 tensor(36)

2D matrices output :
 tensor([[10, 28, 28],
        [27, 73, 62],
        [13, 37, 53]])

N-D matrices output :
 tensor([[[ 0.7953,  0.5231, -0.7751, -0.1757],
         [ 3.9030, -0.8274, -5.1287, -0.5915],
         [ 2.8487, -0.2372, -3.4850, -1.3212]],

        [[ 0.6531, -0.1637, -1.1250,  0.2633],
         [-1.5532,  0.4309,  1.6526,  2.5166],
         [ 0.6491,  0.8869,  1.4995, -2.3370]]])

 Mixed matrices output :
 tensor([218, 596, 562])