Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
Pandas Series.clip_upper() is used to clip values above a passed maximum value. A threshold is passed as a parameter, and every value in the series that is greater than the threshold is set equal to it.
Syntax: Series.clip_upper(threshold, axis=None, inplace=False)
Parameters:
threshold: numeric or list-like. Sets the maximum threshold value. If a list is passed, it sets a separate threshold for each value in the caller series (the list must be the same length as the series).
axis: 0 or 'index', 1 or 'columns'. The axis along which the threshold is aligned with the caller. For a Series, only 0 or 'index' applies.
inplace: If True, makes the changes in the caller series itself (overwrites it with the new values).
Return type: Series with updated values
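The behaviour described by these parameters can be sketched on a small, made-up series. Note that clip_upper() was deprecated in pandas 0.24.0 and removed in pandas 1.0, so this sketch assumes a recent pandas version and uses the equivalent Series.clip(upper=...) call. The values in ages are hypothetical and are not taken from the NBA data set.

```python
import pandas as pd

# Hypothetical ages, not taken from the NBA data set
ages = pd.Series([22, 25, 31, 28, 26])

# Equivalent of ages.clip_upper(26) on pandas versions without clip_upper()
clipped = ages.clip(upper=26)

# With the default inplace=False, a new Series is returned
# and the caller is left unchanged
print(clipped.tolist())
print(ages.tolist())
```

On older pandas versions that still ship clip_upper(), ages.clip_upper(26) should give the same result.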
To download the data set used in the following examples, click here.
In the following examples, the data frame contains data on some NBA players. An image of the data frame before any operations is attached below.
Example #1: Applying on series with single value
In this example, a maximum threshold value of 26 is passed as a parameter to the .clip_upper() method. The method is called on the Age column of the data frame and the new values are stored in the Age_new column. Before any operations, rows with null values are dropped using .dropna().
Python3
# importing pandas module
import pandas as pd

# making data frame
# (the file name is an assumption; the original source path is not shown)
data = pd.read_csv("nba.csv")

# removing null values to avoid errors
data.dropna(inplace=True)

# setting threshold value
threshold = 26.0

# applying method and passing to new column
data["Age_new"] = data["Age"].clip_upper(threshold)

# displaying top 10 rows
data.head(10)
Output:
As shown in the Output image, the Age_new column has a maximum value of 26. All values more than 26 were clipped and made equal to 26.
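For readers on pandas 1.0 or later, where clip_upper() is no longer available, the same workflow can be sketched with Series.clip(upper=...). This is a minimal sketch under stated assumptions: the small data frame below is made up for illustration and stands in for the NBA data set, and only the Age and Age_new column names come from the example above.

```python
import pandas as pd

# Made-up stand-in for the NBA data frame (hypothetical names and ages)
data = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Age": [25.0, 31.0, float("nan"), 27.0],
})

# Drop rows with null values, as in the example above
data = data.dropna()

# Single threshold applied to every value in the column
threshold = 26.0
data["Age_new"] = data["Age"].clip(upper=threshold)

print(data)
```

The row with the missing age is dropped, and no value in Age_new exceeds 26.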
Example #2: Applying on series with list type value
In this example, the top 10 rows of the data frame are extracted and stored using the .head() method. After that, a list of the same length is created and passed to the threshold parameter of the .clip_upper() method to set a separate threshold for each value in the Age series. The returned values are stored in a new column, "Clipped values".
Python3
# importing pandas module
import pandas as pd

# making data frame
# (the file name is an assumption; the original source path is not shown)
data = pd.read_csv("nba.csv")

# removing null values to avoid errors
data.dropna(inplace=True)

# returning top rows
new_data = data.head(10).copy()

# list for separate threshold values
threshold = [27, 23, 19, 30, 26, 22, 22, 41, 11, 33]

# applying method and returning to new column
new_data["Clipped values"] = new_data["Age"].clip_upper(threshold=threshold)

# display
new_data
Output:
As shown in the output image, each value in series had a different threshold value according to the passed list and hence the results were returned according to each element’s separate threshold value. All values more than their respective threshold values were clipped down to the threshold value.
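The element-wise behaviour can be checked on a small example. The sketch below assumes pandas 1.0 or later and therefore uses Series.clip(upper=...) in place of the removed clip_upper(). The ages are made up for illustration, while the threshold list is the one used in the example above.

```python
import pandas as pd

# Made-up ages standing in for the first 10 rows of the Age column
new_data = pd.DataFrame({"Age": [25, 25, 27, 22, 29, 29, 21, 25, 22, 20]})

# One threshold per row, same length as the series
threshold = [27, 23, 19, 30, 26, 22, 22, 41, 11, 33]

# Each age is capped at the threshold in the same position
new_data["Clipped values"] = new_data["Age"].clip(upper=threshold)

print(new_data)
```

Each clipped value is the smaller of the age and its threshold, so an age already below its threshold (for example 25 against 27) is left unchanged.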