Given a Matrix, the following article extracts a specifies number of rows that has a maximum sum.
Input : test_list = [[3, 4, 5, 6], [1, 4, 6], [199], [2, 3, 4, 5, 6], [7, 3, 1]], K = 3 Output : [[199], [2, 3, 4, 5, 6], [3, 4, 5, 6]] Explanation : 199 > 20 > 18, 3 maximum elements rows are extracted.
Input : test_list = [[3, 4, 5, 6], [1, 4, 6], [199], [2, 3, 4, 5, 6], [7, 3, 1]], K = 2 Output : [[199], [2, 3, 4, 5, 6]] Explanation : 199 > 20, 2 maximum elements rows are extracted.
Method 1 : Using sorted(), reverse, slice and sum()
In this, we perform the task of sorting using sorted() and getting sum using sum(). Reverse key is used to perform reversal of rows for getting maximum summation rows at top and then top K(specific number of rows) rows are sliced.
Python3
# Initializing list test_list = [[ 3 , 4 , 5 , 6 ], [ 1 , 4 , 6 ], [ 199 ], [ 2 , 3 , 4 , 5 , 6 ], [ 7 , 3 , 1 ]] # Printing original list print ( "The original list is : " + str (test_list)) # Initializing K K = 3 # sorted gets reverse sorted matrix by sum # K rows extracted using slicing res = sorted (test_list, key = lambda row: sum (row), reverse = True )[:K] # Printing result print ( "The filtered rows : " + str (res)) |
Output:
The original list is : [[3, 4, 5, 6], [1, 4, 6], [199], [2, 3, 4, 5, 6], [7, 3, 1]]
The filtered rows : [[199], [2, 3, 4, 5, 6], [3, 4, 5, 6]]
Time Complexity: O(nlogn+mlogm)
Auxiliary Space: O(k)
Method 2 : Using sort(), reverse, slicing and sum()
In this, we perform the task of in-place sorting using sort(), using reverse as key. Slicing is done using slice operation. The sum(), is used to take summation, and an external function is called to compute sum of rows of the list.
Python3
# row sum util. def row_sum(row): return sum (row) # initializing list test_list = [[ 3 , 4 , 5 , 6 ], [ 1 , 4 , 6 ], [ 199 ], [ 2 , 3 , 4 , 5 , 6 ], [ 7 , 3 , 1 ]] # printing original list print ( "The original list is : " + str (test_list)) # initializing K K = 3 # sort() used to sort # K rows extracted using slicing test_list.sort(key = row_sum, reverse = True ) res = test_list[:K] # printing result print ( "The filtered rows : " + str (res)) |
Output:
The original list is : [[3, 4, 5, 6], [1, 4, 6], [199], [2, 3, 4, 5, 6], [7, 3, 1]]
The filtered rows : [[199], [2, 3, 4, 5, 6], [3, 4, 5, 6]]
Time Complexity: O(nlogn+mlogm)
Auxiliary Space: O(k)
Method 3: Using the heapq module and maintaining a heap of size K.
Approach:
- Import the heapq module.
- Define the function row_sum() to return the sum of a row.
- Initialize a heap of size K with an initial value of None.
- Iterate through each row in the test_list.
- Calculate the row sum for the current row using the row_sum() function.
- If the heap is not full or if the current row sum is greater than the smallest element in the heap, push the current row and its sum onto the heap.
- If the heap is full and the current row sum is less than or equal to the smallest element in the heap, continue to the next row.
- Once all rows have been processed, pop the K largest elements from the heap.
- Reverse the resulting list and return it as the filtered rows.
Python3
import heapq def row_sum(row): return sum (row) # Initializing list and K test_list = [[ 3 , 4 , 5 , 6 ], [ 1 , 4 , 6 ], [ 199 ], [ 2 , 3 , 4 , 5 , 6 ], [ 7 , 3 , 1 ]] K = 3 heap = [() for i in range (K)] # Iterating over list for row in test_list: row_sum_val = row_sum(row) if not all (heap) or row_sum_val > heap[ 0 ][ 0 ]: heapq.heappushpop(heap, (row_sum_val, row)) res = [row for sum_val, row in heapq.nlargest(K, heap)] res.reverse() # Printing result print ( "The filtered rows: " + str (res)) |
The filtered rows: [[3, 4, 5, 6], [2, 3, 4, 5, 6], [199]]
The time complexity for this approach is O(N*log(K)), where N is the total number of elements in the list. The auxiliary space complexity is O(K), which is the size of the heap.
Method 4: Using numpy library
Approach:
- Import the numpy library.
- Convert test_list to a numpy array using the numpy.array() function.
- Use the numpy.sum() function to calculate the sum of each row of the array.
- Use the numpy.argsort() function to get the indices of the rows sorted in descending order of row sums.
- Use slicing to get the first K rows from the sorted indices.
- Use the numpy.take() function to get the rows corresponding to the sorted indices.
- Convert the resulting array back to a list using the numpy.ndarray.tolist() method.
Python3
# import numpy library import numpy as np def row_sum(row): return sum (row) # Initializing list test_list = [[ 3 , 4 , 5 , 6 ], [ 1 , 4 , 6 ], [ 199 ], [ 2 , 3 , 4 , 5 , 6 ], [ 7 , 3 , 1 ]] # Initializing K K = 3 # Finding the length of the longest row max_row_length = max ( len (row) for row in test_list) # pad the shorter rows with zeros for i in range ( len (test_list)): row = test_list[i] row + = [ 0 ] * (max_row_length - len (row)) # Converting test_list to a numpy array arr = np.array(test_list) # Calculating the sum of each row row_sums = np. sum (arr, axis = 1 ) # Getting the indices of the rows sorted in descending order of row sums sorted_indices = np.argsort(row_sums)[:: - 1 ] # Getting the first K rows from the sorted indices k_sorted_indices = sorted_indices[:K] # getting the rows corresponding to the sorted indices result_arr = np.take(arr, k_sorted_indices, axis = 0 ) # Converting the resulting array back to a list result_list = [row[:max_row_length] for row in result_arr.tolist()] # Printing the result print ( "The filtered rows : " + str (result_list)) |
OUTPUT: The filtered rows: [[3, 4, 5, 6], [2, 3, 4, 5, 6], [199]]
Time complexity: O(NlogN + KlogK), where N is the total number of elements in test_list.
Auxiliary space: O(N)