Python is a great language for data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
The Pandas nsmallest() method returns the n smallest values from a DataFrame or a Series, in ascending order.
Syntax: DataFrame.nsmallest(n, columns, keep='first')
Parameters:
n: int, Number of values to select
columns: Column label (or list of labels) to check for the smallest values. Alternatively, select the column first and call nsmallest() on the resulting Series. [For example: data["age"].nsmallest(3) or data.nsmallest(3, "age")]
keep: Which occurrence to select if duplicate values exist. Options are 'first' (default), 'last', or 'all'.
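The parameters above can be tried on a small in-memory table. This is a minimal sketch: the DataFrame, its "name" and "age" columns, and the values are made up for illustration and are not the article's CSV file.

```python
import pandas as pd

# Hypothetical sample data (not the employees.csv file used in the examples)
data = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee", "Eve"],
    "age": [31, 25, 25, 40, 25],
})

# Two ways to choose the column
by_series = data["age"].nsmallest(2)   # Series holding only the age values
by_frame = data.nsmallest(2, "age")    # DataFrame holding the whole rows

# Three rows tie at age 25, so keep decides which of them are returned
first = data.nsmallest(2, "age", keep="first")
last = data.nsmallest(2, "age", keep="last")
all_ties = data.nsmallest(2, "age", keep="all")

print(first["name"].tolist())
print(last["name"].tolist())
print(all_ties["name"].tolist())
```

With keep='all', more than n rows can come back, because every row tied with the last selected value is included.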
Example #1: Extracting Least 5 values
In this example, the 5 smallest values of the Salary column are extracted so that they can be compared with the result of sorting the same data with the sort_values() function.
Rows containing NaN values are removed before calling the method.
Refer to sort_values() and dropna().
Python
# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("employees.csv")

# removing null values
data.dropna(inplace=True)

# extracting least 5
least5 = data.nsmallest(5, "Salary")

# display
least5
Example #2: Sorting by sort_values()
Python
# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("employees.csv")

# removing null values
data.dropna(inplace=True)

# sorting in ascending order
data.sort_values("Salary", ascending=True, inplace=True)

# displaying top 5 values
data.head()
Both approaches return the same five lowest Salary values in ascending order, so nsmallest() is a shorter way to get the result of sorting and then taking the first rows.
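The comparison between the two examples can also be checked in code. The sketch below swaps the CSV file for a small made-up DataFrame, so the names and salaries are assumptions for illustration only. It has no tied salaries, so both approaches pick the same rows.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv, including some missing values
data = pd.DataFrame({
    "First Name": ["Ann", "Bob", None, "Dee", "Eve", "Fay", "Gus"],
    "Salary": [52000, 48000, 61000, None, 39000, 75000, 44000],
})

# removing rows that contain null values
data = data.dropna()

# approach from Example #1
least5 = data.nsmallest(5, "Salary")

# approach from Example #2
sorted5 = data.sort_values("Salary", ascending=True).head(5)

# comparing the selected salaries and the row labels
same_values = least5["Salary"].tolist() == sorted5["Salary"].tolist()
same_rows = least5.index.tolist() == sorted5.index.tolist()
print(same_values, same_rows)
```

When several rows share the same salary, the two approaches may break ties differently, so comparing the values is the safer check.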