The cumulative sum of a column in a Pandas DataFrame can be computed with the built-in cumsum() method, which returns the running total along the chosen axis.
Syntax: cumsum(axis=None, skipna=True, *args, **kwargs)
Parameters:
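axis: {0 or 'index', 1 or 'columns'}. The axis along which the sum is accumulated: 0 runs down each column, 1 runs across each row.
skipna: Exclude NA/null values. If an entire row/column is NA, the result will be NA.
Returns: A Series or DataFrame of the same size as the input, containing the cumulative sum.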
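Because cumsum() is also available on a Series, the running total of a single column can be obtained by selecting that column first. The snippet below is a minimal sketch on illustrative data; the column name A_cumsum is a hypothetical name chosen for this example.

```python
import pandas as pd

# Illustrative data, not taken from any real dataset
df = pd.DataFrame({"A": [2, 3, 8, 14], "B": [1, 2, 4, 3]})

# Running total of one column; the result is a Series
a_cumsum = df["A"].cumsum()

# Store it next to the original data ("A_cumsum" is a hypothetical column name)
df["A_cumsum"] = a_cumsum
print(df)
```

Selecting the column before calling cumsum() avoids computing running totals for columns that are not needed.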
Example 1:
Python3
import pandas as pd
import numpy as np

# Create a dataframe
df1 = pd.DataFrame({"A": [2, 3, 8, 14],
                    "B": [1, 2, 4, 3],
                    "C": [5, 3, 9, 2]})

# Computing sum over Index axis
print(df1.cumsum(axis=0))
Output:
    A   B   C
0   2   1   5
1   5   3   8
2  13   7  17
3  27  10  19
Time complexity: O(nm), where n is the number of rows and m is the number of columns in the DataFrame.
Auxiliary space: O(nm), since a new DataFrame is created to store the result of the cumsum operation, which has the same dimensions as the input DataFrame.
Example 2:
Python3
import pandas as pd
import numpy as np

# Create a dataframe
df1 = pd.DataFrame({"A": [None, 3, 8, 14],
                    "B": [1, None, 4, 3],
                    "C": [5, 3, 9, None]})

# Computing sum over Index axis
print(df1.cumsum(axis=0, skipna=True))
Output:
      A    B     C
0   NaN  1.0   5.0
1   3.0  NaN   8.0
2  11.0  5.0  17.0
3  25.0  8.0   NaN
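For contrast, the effect of skipna can be seen by running the same data with skipna=False. The following is a small sketch, not part of the original example; the variable name strict is chosen here for illustration. With skipna=False, a missing value is not skipped, so every entry from the first NaN onward in a column becomes NaN.

```python
import pandas as pd

# Same data as Example 2
df = pd.DataFrame({"A": [None, 3, 8, 14],
                   "B": [1, None, 4, 3],
                   "C": [5, 3, 9, None]})

# With skipna=False, NaN propagates through the rest of the running total
strict = df.cumsum(axis=0, skipna=False)
print(strict)
```

Column A starts with NaN, so it is NaN throughout; column B is NaN from the second row onward; column C is only affected in its last row.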
Example 3:
Python3
import pandas as pd
import numpy as np

# Create a dataframe
df1 = pd.DataFrame({"A": [2, 3, 8, 14],
                    "B": [1, 2, 4, 3],
                    "C": [5, 3, 9, 2]})

# Computing sum over the columns axis (across each row)
print(df1.cumsum(axis=1))
Output:
    A   B   C
0   2   3   8
1   3   5   8
2   8  12  21
3  14  17  19
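One way to sanity-check a row-wise cumulative sum is to compare its last column with the plain row totals from sum(axis=1); the two should agree. The sketch below uses the same data as Example 3, and the variable names row_cumsum and row_totals are chosen here for illustration.

```python
import pandas as pd

# Same data as Example 3
df = pd.DataFrame({"A": [2, 3, 8, 14],
                   "B": [1, 2, 4, 3],
                   "C": [5, 3, 9, 2]})

row_cumsum = df.cumsum(axis=1)   # running total across each row
row_totals = df.sum(axis=1)      # total of each row

# The last column of the running total equals the row total
print((row_cumsum["C"] == row_totals).all())
```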
