In this article, we will see how to calculate the rolling median in pandas.
A rolling metric is usually calculated on time series data. It shows how the values change over time by aggregating the last 'n' occurrences at each point. This 'n' is known as the window size. The aggregation is usually the mean (simple average). However, we can also use the median as the aggregation for certain kinds of analyses.
Before we begin, let us install the pandas library using pip:
pip install pandas
The pandas.core.window.rolling.Rolling.median() function calculates the rolling median. The pandas.core.window.rolling.Rolling object is obtained by calling the rolling() method on a dataframe or series.
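As a minimal sketch of the two steps described above, the snippet below first creates the Rolling object and then calls median() on it. The series values and the variable names (s, roller, result) are made up for illustration only.

```python
import pandas as pd

# A small, made-up series used only for illustration
s = pd.Series([3, 1, 4, 1, 5])

# Step 1: rolling() returns a Rolling object; nothing is computed yet
roller = s.rolling(window=3)
print(type(roller).__name__)  # Rolling

# Step 2: median() aggregates each window of 3 values
result = roller.median()
print(result.tolist())  # [nan, nan, 3.0, 1.0, 4.0]
```

In practice the two steps are usually chained into one expression, such as s.rolling(window=3).median().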
Example 1:
In this example, we use the pandas.core.window.rolling.Rolling.median() function to calculate the rolling median of the given dataframe. We calculate the rolling median for window sizes 1, 2, 3, and 4 and add each result to the original dataframe as a new column so that we can compare them. As we can observe in the output, for a window size of 'n', the first n-1 rows are NaN, because fewer than n values are available at those positions. With a window size of 4, for example, the value for the 5th record is the median of records 2 to 5, and the value for the 10th record is the median of records 7 to 10 (counting records from 1). The window size is set through the window parameter of the rolling() method.
Python
# Import the `pandas` library
import pandas as pd

# Create the pandas dataframe
df = pd.DataFrame({
    "value": [101, 94, 112, 100, 134, 124,
              119, 127, 143, 128, 141]
})

# Calculate the rolling median for window = 1
w1_roll_median = df.rolling(window=1).median()

# Calculate the rolling median for window = 2
w2_roll_median = df.rolling(window=2).median()

# Calculate the rolling median for window = 3
w3_roll_median = df.rolling(window=3).median()

# Calculate the rolling median for window = 4
w4_roll_median = df.rolling(window=4).median()

# Add the rolling median series to the original
# dataframe for comparison
df['w1_roll_median'] = w1_roll_median
df['w2_roll_median'] = w2_roll_median
df['w3_roll_median'] = w3_roll_median
df['w4_roll_median'] = w4_roll_median

# Print the dataframe
print(df)
Output:
    value  w1_roll_median  w2_roll_median  w3_roll_median  w4_roll_median
0     101           101.0             NaN             NaN             NaN
1      94            94.0            97.5             NaN             NaN
2     112           112.0           103.0           101.0             NaN
3     100           100.0           106.0           100.0           100.5
4     134           134.0           117.0           112.0           106.0
5     124           124.0           129.0           124.0           118.0
6     119           119.0           121.5           124.0           121.5
7     127           127.0           123.0           124.0           125.5
8     143           143.0           135.0           127.0           125.5
9     128           128.0           135.5           128.0           127.5
10    141           141.0           134.5           141.0           134.5
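To see which records feed a single entry, one value can be recomputed by hand. The sketch below reuses the Example 1 data with a window size of 4 and compares a manual slice-and-median against the rolling result for the 5th record (index 4). The names values, window, pos, rolled, and manual are illustrative and not part of the original example.

```python
import pandas as pd

# Same data as Example 1, held in a Series for brevity
values = pd.Series([101, 94, 112, 100, 134, 124, 119, 127, 143, 128, 141])
window = 4

rolled = values.rolling(window=window).median()

# The 5th record sits at index 4; its window covers indices 1..4
# (records 2 to 5 when counting from 1)
pos = 4
manual = values.iloc[pos - window + 1 : pos + 1].median()
print(manual, rolled.iloc[pos])  # 106.0 106.0

# The first window - 1 rows have no full window, so they are NaN
print(int(rolled.isna().sum()))  # 3
```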
Example 2:
In this example, we take the stock price of Tata Motors for the last 3 weeks. The rolling median is calculated for a window size of 7, which corresponds to a one-week time frame. Therefore, each value in the w7_roll_median column represents the median stock price over a week. Since the window size is 7, the first 6 records are NaN, as discussed earlier.
Python
# Import the `pandas` library
import pandas as pd

# Create the pandas dataframe
df = pd.DataFrame({
    "value": [
        506.40, 487.85, 484.90, 489.70, 501.40, 509.65, 510.75,
        503.45, 507.05, 505.45, 519.05, 530.15, 509.70, 486.10,
        495.50, 488.65, 492.75, 460.20, 461.45, 458.60, 475.25,
    ]
})

# Calculate the rolling median for window = 7
w7_roll_median = df.rolling(window=7).median()

# Add the rolling median series to the original
# dataframe for comparison
df['w7_roll_median'] = w7_roll_median

# Print the dataframe
print(df)
Output:
     value  w7_roll_median
0   506.40             NaN
1   487.85             NaN
2   484.90             NaN
3   489.70             NaN
4   501.40             NaN
5   509.65             NaN
6   510.75          501.40
7   503.45          501.40
8   507.05          503.45
9   505.45          505.45
10  519.05          507.05
11  530.15          509.65
12  509.70          509.70
13  486.10          507.05
14  495.50          507.05
15  488.65          505.45
16  492.75          495.50
17  460.20          492.75
18  461.45          488.65
19  458.60          486.10
20  475.25          475.25
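The six leading NaN values can be confirmed with a quick count. The sketch below also shows one optional variation that the example above does not use: the min_periods parameter of rolling(), which lets a window produce a value before it is full. The names prices, strict, and partial are illustrative only.

```python
import pandas as pd

# Same prices as Example 2, held in a Series for brevity
prices = pd.Series([
    506.40, 487.85, 484.90, 489.70, 501.40, 509.65, 510.75,
    503.45, 507.05, 505.45, 519.05, 530.15, 509.70, 486.10,
    495.50, 488.65, 492.75, 460.20, 461.45, 458.60, 475.25,
])

# Default behaviour: a value is produced only once 7 prices are available
strict = prices.rolling(window=7).median()
print(int(strict.isna().sum()))  # 6

# Variation (not in the original example): accept partial windows,
# so the early rows use however many prices exist so far
partial = prices.rolling(window=7, min_periods=1).median()
print(int(partial.isna().sum()))  # 0
```

With min_periods=1 the early rows are medians over fewer than seven prices, so they are not directly comparable with the full-week values that follow.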