Pandas is one of the most popular Python packages used in data science. It offers powerful and flexible data structures (DataFrame and Series) for manipulating and analyzing data. Visualization is one of the most effective ways to interpret that data.
Python has many popular plotting libraries that make visualization easy, such as Matplotlib, Seaborn, and Plotly. Pandas integrates well with Matplotlib: a DataFrame can be plotted directly with its plot() method. First, though, we need a DataFrame to plot. We can create one by passing a dictionary to the DataFrame() constructor of the Pandas library.
Plot a Dataframe using Pandas
Let’s create a simple Dataframe:
Python3
# importing required libraries
# In case pandas is not installed on your machine,
# use the command 'pip install pandas'.
import pandas as pd
import matplotlib.pyplot as plt

# A dictionary which represents the data
data_dict = {
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'age': [20, 20, 21, 20, 21, 20],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92]
}

# creating a DataFrame object
df = pd.DataFrame(data_dict)

# show the DataFrame
# by default, head() shows the
# first five rows from the top
df.head()
Output:
  name  age  math_marks  physics_marks  chem_marks
0   p1   20         100             90          93
1   p2   20          90            100          89
2   p3   21          91             91          99
3   p4   20          98             92          92
4   p5   21          92             98          94
Pandas Plotting
Many kinds of plots are available for interpreting data, and each serves a different purpose. Common examples include bar plots, scatter plots, and histograms.
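Histograms are mentioned above but not demonstrated in the sections that follow, so here is a minimal sketch of one. It assumes the same math_marks values used in the example DataFrame; the bin count (bins=5) and the color are arbitrary choices made for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Only the column needed for this sketch (same values as math_marks above)
df = pd.DataFrame({'math_marks': [100, 90, 91, 98, 92, 95]})

# kind='hist' groups the values into bins and plots the count in each bin.
# bins=5 is an arbitrary choice for this small data set.
ax = df['math_marks'].plot(kind='hist', bins=5, color='orange')

# label the x-axis and set the title
ax.set_xlabel('math_marks')
plt.title('Histogram')

# show the plot
plt.show()
```

A histogram is most useful for seeing how the values of a single column are distributed, which is why a single Series is plotted here.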
Scatter Plot
To draw a scatter plot of a DataFrame, call the plot() method with the following parameters:
kind='scatter', x='some_column', y='some_column', color='somecolor'
Python3
# scatter plot
df.plot(kind='scatter', x='math_marks', y='physics_marks', color='red')

# set the title
plt.title('ScatterPlot')

# show the plot
plt.show()
Output:
There are many ways to customize a plot; this is the most basic one.
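As a minimal sketch of such customization, the example below passes a few extra keyword arguments to plot() and then adjusts the returned Matplotlib Axes object. The specific values (marker size, transparency, figure size, title and axis labels) are illustrative assumptions, not recommendations.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same marks as in the example DataFrame
df = pd.DataFrame({
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
})

# plot() returns a Matplotlib Axes object that can be customized further
ax = df.plot(
    kind='scatter',
    x='math_marks',
    y='physics_marks',
    color='red',
    s=60,            # marker size
    alpha=0.7,       # marker transparency (0 = invisible, 1 = opaque)
    figsize=(6, 4),  # figure size in inches
    grid=True,       # draw grid lines
    title='Physics vs. Math marks',
)

# axis labels are set on the returned Axes
ax.set_xlabel('Math marks')
ax.set_ylabel('Physics marks')

# show the plot
plt.show()
```

Keeping a reference to the Axes (ax) is a design choice: it makes it clear which plot is being modified when several figures are open.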
Bar Plot
Similarly, to draw a bar plot, call the plot() method with the following parameters:
kind='bar', x='some_column', y='some_column', color='somecolor'
Python3
# bar plot
df.plot(kind='bar', x='name', y='physics_marks', color='green')

# set the title
plt.title('BarPlot')

# show the plot
plt.show()
Output:
Line Plot
A line plot of a single column is not always useful. To get more insight, we can plot multiple columns on the same graph by reusing the same axes. Note that ax takes a Matplotlib Axes object, not a string:
kind='line', x='some_column', y='some_column', color='somecolor', ax=some_axes
Python3
# Get the current axes
ax = plt.gca()

# line plot for math marks
df.plot(kind='line', x='name', y='math_marks', color='green', ax=ax)

# line plot for physics marks
df.plot(kind='line', x='name', y='physics_marks', color='blue', ax=ax)

# line plot for chemistry marks
df.plot(kind='line', x='name', y='chem_marks', color='black', ax=ax)

# set the title
plt.title('LinePlots')

# show the plot
plt.show()
Output:
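As an alternative to reusing the axes across several calls, plot() also accepts a list of column names for y, which draws all of them on one set of axes in a single call. The sketch below assumes the same data as above; the y-axis label is an illustrative addition.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same data as in the example DataFrame (age omitted, it is not plotted)
df = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})

# passing a list to y plots each column as its own line on the same axes;
# the colors are matched to the columns in order
ax = df.plot(
    kind='line',
    x='name',
    y=['math_marks', 'physics_marks', 'chem_marks'],
    color=['green', 'blue', 'black'],
    title='LinePlots',
)
ax.set_ylabel('marks')

# show the plot
plt.show()
```

Reusing the axes explicitly, as in the earlier example, remains the more flexible option when the lines come from different DataFrames.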
Box Plot
A box plot is mainly used to identify outliers. It also shows summary information such as the median, maximum, minimum, and quartiles. Let’s see how to plot it.
Python3
# box plot of all numeric columns
df.plot.box()

# show the plot
plt.show()
Output:
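To make the quantities in a box plot concrete, the sketch below computes the quartiles for each marks column and flags values that fall more than 1.5 times the interquartile range (IQR) outside the box. This is the whisker rule Matplotlib uses by default; the variable names are illustrative, and the age and name columns are left out for brevity.

```python
import pandas as pd

# Marks columns from the example DataFrame
df = pd.DataFrame({
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})

# quartiles per column: the box spans q1 to q3, with a line at the median
q1 = df.quantile(0.25)
median = df.quantile(0.50)
q3 = df.quantile(0.75)
iqr = q3 - q1

# points beyond 1.5 * IQR from the box are treated as outliers
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
is_outlier = (df < lower) | (df > upper)

# summary table: one row per column
summary = pd.DataFrame({
    'q1': q1,
    'median': median,
    'q3': q3,
    'outliers': is_outlier.sum(),
})
print(summary)
```

With this data, only chem_marks has values outside the bounds (89 and 99), so those are the points a box plot would draw as individual markers.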