This article discusses how to sort a Pandas DataFrame in Python using the sort_values() method.
Creating a Data Frame for Demonstration
Python3
# importing pandas library
import pandas as pd

# creating and initializing a nested list
age_list = [['Afghanistan', 1952, 8425333, 'Asia'],
            ['Australia', 1957, 9712569, 'Oceania'],
            ['Brazil', 1962, 76039390, 'Americas'],
            ['China', 1957, 637408000, 'Asia'],
            ['France', 1957, 44310863, 'Europe'],
            ['India', 1952, 3.72e+08, 'Asia'],
            ['United States', 1957, 171984000, 'Americas']]

# creating a pandas dataframe
df = pd.DataFrame(age_list, columns=['Country', 'Year',
                                     'Population', 'Continent'])
df
Output: a DataFrame with seven rows and the columns Country, Year, Population and Continent.
Sorting Pandas Data Frame
To sort a DataFrame by the values in one or more columns, use the sort_values() method. It sorts in ascending order by default and in descending order when ascending = False is passed.
Sorting the Pandas DataFrame in Ascending Order
The code snippet sorts the DataFrame df in ascending order based on the ‘Country’ column. sort_values() returns a new, sorted DataFrame and leaves df itself unchanged, so assign the result to a variable if you want to keep it.
Python3
# Sorting by column 'Country'
df.sort_values(by=['Country'])
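Output: the rows in alphabetical order by Country: Afghanistan, Australia, Brazil, China, France, India, United States. The demonstration data is already in this order, so the result matches the original frame.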
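Because sort_values() returns a new object rather than changing the original, a small sketch can make the difference visible. The three-row frame below is a reduced, illustrative subset of the demonstration data, and the variable names sorted_df and renumbered are chosen only for this example. The ignore_index parameter assumes pandas 1.0 or later.

```python
import pandas as pd

# Small illustrative frame (a reduced subset of the demonstration data)
df = pd.DataFrame(
    {"Country": ["India", "Brazil", "China"],
     "Year": [1952, 1962, 1957]}
)

# sort_values() returns a new DataFrame; df itself is not modified
sorted_df = df.sort_values(by="Country")

print(df["Country"].tolist())         # original row order is unchanged
print(sorted_df["Country"].tolist())  # alphabetical order
print(sorted_df.index.tolist())       # rows keep their original index labels

# ignore_index=True (pandas 1.0+) renumbers the rows 0..n-1 after sorting
renumbered = df.sort_values(by="Country", ignore_index=True)
print(renumbered.index.tolist())
```

Assigning the result, as done here, is usually easier to follow than passing inplace = True, which modifies df directly and returns None.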
Sorting the Pandas DataFrame in Descending order
The DataFrame df will be sorted in descending order based on the “Population” column, with the country having the highest population appearing at the top of the DataFrame.
Python3
# Sorting by column "Population"
df.sort_values(by=['Population'], ascending=False)
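Output: the rows ordered from largest to smallest population: China, India, United States, Brazil, France, Australia, Afghanistan.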
Sorting Pandas Data frame by putting missing values first
Python3
# Sorting by column "Population"
# by putting missing values first
df.sort_values(by=['Population'], na_position='first')
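Output: the rows ordered by ascending Population: Afghanistan, Australia, France, Brazil, United States, India, China. The demonstration data contains no missing values, so na_position = 'first' has no visible effect here; rows with NaN in Population would otherwise be listed first.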
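To see na_position actually change the result, the sorted column has to contain missing values. The sketch below uses a hypothetical frame, df_missing, in which two Population entries have been replaced by NaN purely for illustration; those gaps are invented and are not part of the demonstration data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the NaN entries are invented for illustration only
df_missing = pd.DataFrame(
    {"Country": ["Brazil", "France", "India", "China"],
     "Population": [76039390, np.nan, 3.72e+08, np.nan]}
)

# Rows with a missing Population are placed at the top
first = df_missing.sort_values(by="Population", na_position="first")

# Default behaviour (na_position="last"): missing values go to the bottom
last = df_missing.sort_values(by="Population")

# na_position is independent of the sort direction:
# with ascending=False the missing values still end up last
desc = df_missing.sort_values(by="Population", ascending=False)

print(first)
print(last)
print(desc)
```

Only the non-missing values are ordered; the NaN rows are grouped together at whichever end na_position selects.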
Sorting Data frames by multiple columns
Python3
# Sorting by columns "Country" and then "Continent"
df.sort_values(by=['Country', 'Continent'])
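Output: Afghanistan, Australia, Brazil, China, France, India, United States. Every Country value is unique, so the second key, Continent, never has to break a tie.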
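A second sort key only changes the result when the first key has repeated values. As a sketch of that case, the example below sorts a five-row subset of the demonstration data by Year and then by Population, since several countries share the same Year. The choice of columns and the name by_year_pop are assumptions made for this illustration.

```python
import pandas as pd

# Five rows taken from the demonstration data; Year has repeated values
df = pd.DataFrame(
    [["Afghanistan", 1952, 8425333],
     ["Australia", 1957, 9712569],
     ["China", 1957, 637408000],
     ["France", 1957, 44310863],
     ["India", 1952, 3.72e+08]],
    columns=["Country", "Year", "Population"],
)

# Rows are ordered by Year first; Population breaks ties within each Year
by_year_pop = df.sort_values(by=["Year", "Population"])

print(by_year_pop["Country"].tolist())
# ['Afghanistan', 'India', 'Australia', 'France', 'China']
```

The two 1952 rows come first, ordered by population, followed by the three 1957 rows ordered the same way.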
Sorting Data frames by multiple columns but in a different order
Python3
# Sorting by columns "Country" in descending
# order and then "Continent" in ascending order
df.sort_values(by=['Country', 'Continent'],
               ascending=[False, True])
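Output: the rows in reverse alphabetical order by Country: United States, India, France, China, Brazil, Australia, Afghanistan.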
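Mixed sort directions are easier to see when the first key repeats. The sketch below groups a five-row subset of the demonstration data by Continent in ascending order and lists the countries within each continent from largest to smallest population. The column choice and the name mixed are assumptions made for this illustration; the ascending list must contain one entry per column named in by.

```python
import pandas as pd

# Five rows taken from the demonstration data; Continent has repeated values
df = pd.DataFrame(
    [["Afghanistan", 8425333, "Asia"],
     ["Brazil", 76039390, "Americas"],
     ["China", 637408000, "Asia"],
     ["India", 3.72e+08, "Asia"],
     ["United States", 171984000, "Americas"]],
    columns=["Country", "Population", "Continent"],
)

# Continent A-Z, then Population from largest to smallest within a continent
mixed = df.sort_values(by=["Continent", "Population"],
                       ascending=[True, False])

print(mixed["Country"].tolist())
# ['United States', 'Brazil', 'China', 'India', 'Afghanistan']
```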