Prerequisites: Basic Understanding Of Pandas
In this article, we will look at the differences between the Pandas head(), tail(), and sample() methods in Python.
Pandas is an open-source library built mainly for working with relational or labeled data easily and intuitively. It provides various data structures and operations for manipulating numerical data and time series. After reading a CSV file, the first step is usually to display some of the data in the dataset. Pandas provides three methods for this: head(), tail(), and sample().
Difference Between Head, Tail, And Sample
Before analyzing a dataset, you need to decide how to display it. Many programmers use only head() and check the first few rows, but that is not always sufficient. I recommend using all three methods to get a fuller picture of the data.
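As a rough illustration of looking at a dataset from all three angles at once, the sketch below wraps the three calls in a small helper. The function name `preview`, its parameters, and the `demo` frame are hypothetical and made up for this example; they are not part of Pandas.

```python
import pandas as pd


def preview(df: pd.DataFrame, n: int = 3, seed: int = 0) -> None:
    """Print the first n rows, the last n rows, and n random rows."""
    print("First rows:")
    print(df.head(n))
    print("Last rows:")
    print(df.tail(n))
    print("Random rows:")
    # sample() raises ValueError when n exceeds the row count, so cap it
    print(df.sample(min(n, len(df)), random_state=seed))


# Small made-up frame used only for this illustration
demo = pd.DataFrame({"id": range(10), "value": [x * x for x in range(10)]})
preview(demo)
```

The `seed` argument is passed to `random_state` only so that repeated runs print the same random rows; drop it if a different selection on every run is preferred.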
Sample
The sample() method displays randomly selected rows from your dataset. It can be called with or without the number of rows to return.
Example:
Python
import pandas as pd

data = {
    "Anime": ["One Piece", "Naruto", "Bleach", "Hunter X Hunter",
              "Attack On Titan", "Gintama", "Code Geass", "Death Note",
              "Black Lagoon", "Classroom Of Elite", "Cowboy Bepop",
              "Jujutsu Kaisen", "Blue Period"],
    "Episodes": [1009, 720, 366, 148, 74, 366, 50, 37, 24, 12, 26, 24, 12],
    "Year": [1999, 2002, 2004, 2011, 2013, 2006, 2007, 2008, 2006, 2016,
             1995, 2020, 2021]
}

df = pd.DataFrame(data)
Now that we have our data, try viewing the data using the sample method.
Syntax:
df.sample()   # returns one row
df.sample(n)  # returns n rows
Python3
print(df.sample())   # one randomly selected row
print(df.sample(6))  # six randomly selected rows
Output:
Notice that sample() returns randomly selected rows, so they do not appear in their original order, and the result usually changes each time the code runs.
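Besides a row count, sample() also accepts a `frac` argument (a fraction of the rows) and a `random_state` argument (a seed that makes the selection repeatable). The following is a minimal sketch on a shortened copy of the data, not a complete tour of the method's options.

```python
import pandas as pd

df = pd.DataFrame({
    "Anime": ["One Piece", "Naruto", "Bleach", "Hunter X Hunter",
              "Attack On Titan", "Gintama"],
    "Episodes": [1009, 720, 366, 148, 74, 366],
})

# Fixing random_state makes the "random" selection repeatable
first = df.sample(3, random_state=42)
second = df.sample(3, random_state=42)
print(first.equals(second))  # True

# frac selects a fraction of the rows instead of a fixed count
half = df.sample(frac=0.5, random_state=42)
print(len(half))  # 3

# sort_index() puts the sampled rows back in their original order
print(first.sort_index())
```

The sampled rows keep their original index labels, which is why sorting by index is enough to restore the original order.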
Head
In plain English, the head is the upper part of the body. Likewise, head() in Pandas returns rows from the top of the DataFrame, in their original order. Called with no argument, it returns the first 5 rows. Pass a value n to get the first n rows instead.
Syntax:
df.head()   # default: first 5 rows
df.head(n)  # first n rows
Python3
print(df.head())   # default: first 5 rows
print(df.head(8))  # first 8 rows, in order
Output:
Tail
tail() is the opposite of head(): it returns rows from the bottom of the DataFrame, in their original order.
Syntax:
df.tail()   # default: last 5 rows
df.tail(n)  # last n rows
Python3
print(df.tail())   # default: last 5 rows
print(df.tail(8))  # last 8 rows, in order
Output:
Conclusion
The major differences between sample(), head(), and tail() are these: called with no arguments, sample() returns only one row, whereas head() and tail() return 5 rows. sample() returns randomly selected rows in no particular order, whereas head() and tail() return rows in their original order.
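These differences can be checked directly. The sketch below uses only the Episodes column from the example data and compares head() and tail() against plain positional slices with `iloc`; the seed value passed to `random_state` is arbitrary.

```python
import pandas as pd

df = pd.DataFrame(
    {"Episodes": [1009, 720, 366, 148, 74, 366, 50, 37, 24, 12, 26, 24, 12]}
)

# Default row counts: head and tail give 5 rows, sample gives 1
print(len(df.head()), len(df.tail()), len(df.sample()))  # 5 5 1

# head(n) and tail(n) behave like positional slices, so order is preserved
print(df.head(8).equals(df.iloc[:8]))   # True
print(df.tail(8).equals(df.iloc[-8:]))  # True

# sample() keeps the original index labels but returns them in random order
picked = df.sample(6, random_state=1)
print(picked.index.tolist())
```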