Friday, November 15, 2024
Google search engine
HomeData Modelling & AIPlotly and cufflinks : Advanced Python Data Visualization Libraries

Plotly and cufflinks : Advanced Python Data Visualization Libraries

This article was published as a part of the Data Science Blogathon

Introduction

Data Visualization helps to bridge the gap between numbers and words – Brie E. Anderson

In the modern world of data, data is everywhere and it is only useful when we can seek some information from it. Data Visualization is the most beautiful part of data analysis through which we describe the data interactively and visually. It is that magical technique that can convey information to larger masses at a single glance and tell interesting stories from the data.

Data Visualization has a very important role in making critical business decisions where data is present in a very large amount and it becomes harder for businesses to extract decisive information from it.

data visualization | plotly
Source: Google Images

Data Visualization in Python

In the process of data analysis, after completing data cleaning and data manipulation, the next step is to bring meaningful insights and conclusions from the data which can be achieved through graphs and charts. Python provides several libraries for this purpose. Generally, we are taught about only these two libraries matplotlib and seaborn. These libraries provide features using which we can draw line charts, pie charts, bar plots, box plots, and many more plots. Now, you must have a question if we already have matplotlib and seaborn, then why we need other libraries for data visualization? This is the same question that arrived in my mind when I first heard about plotly and cufflinks. However, when I read about them and used them in my projects then it had completely blown my mind, the power these libraries provide in your hand is amazing.

What is plotlib and why we need it?

Plotly is an open-source and charting library that provides the facility of interactive plotting. The library is available for several programming languages such as Python, R, MATLAB, Arduino, and REST, among others.

Cufflink is also a python library that connects plotly with pandas so that we can create charts directly on data frames. It basically acts as a plugin.

Plotly charts are dynamic in nature, it allows us to hover over the values, zoom in zoom out of graphs, identifies outliers in the dataset. On the other hand, matplotlib and seaborn charts are static, we can’t zoom in or zoom-out, it does not provide detail of every value on the chart. The most important feature of plotly is that it allows us to build dynamic charts for the web directly from python which is not possible in the case of matplotlib. With plotly, we can also create animations and interactive graphs on geographical, scientific, statistical, and financial data.

what is plotly

Authoria Templates

Installing and running plotlib and cufflinks

First, we install these two libraries using the pip command.

pip install plotly
pip install cufflinks

After installing, we import all necessary modules in our python shell or jupyter notebook.

import pandas as pd
import numpy as np
import plotly
import datetime
from datetime import date
import cufflinks as cf

For enabling the offline mode for plotting locally in the jupyter notebook we write the below code.

from plotly import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
cf.go_offline()

Dataset

Before going for the implementation of plotly, we first need to understand our dataset and operations that can be applied to do data visualization using various charts.

We have taken the tata motors shares price dataset of 6 years period(2016-2021) from yahoo finance. The dataset has 7 columns that include the date, close price, open price, high, low, adj close, and volume. Then we have added two columns i.e. year and month in the dataset which will help in plotting the charts. The following code is written below:

Python Code:



Now, we will draw several charts on this dataset.

It is not possible to embed the plotly charts on this blog. Therefore, I have added screenshots of charts. However, if you want to play with charts you can run my code in your jupyter notebook.

Basic Charts

Line Chart

import plotly.express as px
fig = px.line(stock_data, x="Date", y="Adj Close",title = 'TATAMOTORS SHARE PRICE (2016-2021)')
fig.show()

OUTPUT

 

line chart | plotly

 

As we can see above, Plotly provides a set of options to play around with the visualization. On the top right corner of the plot, you can see 10 buttons that offer several functionalities. In the order of buttons from going left to right, we have one to download the plot as a png image, toggle buttons to switch between zoom mode and pan mode, a zoom in and zoom out button, and an autoscale button to reset the scale in the plot, a reset axis button, a button to toggle the spike lines in the graph, and pair of toggle button to switch between different mouse hover modes.

This line chart shows the tata motors share price from 2016 to 2021. We can zoom in for a particular year and a month and can see the share price of each day by hovering on it.

Bar Plot

stock=stock_data.groupby('Year')['Adj Close'].mean()
fig = px.bar(stock, x=stock.index, y=stock.values,title = 'TATAMOTORS SHARE PRICE (2016-2021)',labels={'y':'Price'})
fig.show()

OUTPUT

 

bar plot | plotly

This bar plot shows the mean share price of tata motors in several years.

Bubble Chart

df=stock_data.loc[stock_data['Year']==2021,['Date','Adj Close','Volume','Year','Month']]
fig = px.scatter(df, x="Date", y="Adj Close",
         size='Volume',color='Month',title = 'TATAMOTORS SHARE PRICE (2016-2021)')
fig.show()

OUTPUT

 

bubble chart | plotly

This bubble chart shows the tata motors share price across different months in the year 2021 and the size of the bubble represents the volume of shares traded.

Statistical Charts

Box Plot

fig = px.box(stock_data, x="Year", y="Adj Close",title = 'TATAMOTORS SHARE PRICE (2016-2021)')
fig.show()

OUTPUT

 

boxplot | plotly

This box plot gives the various descriptive stats of the share price of tata motors for different years. It provides the max, min, median, quantile 1, quantile 3, and max/min values excluding outlier.

Histogram

fig = px.histogram(stock_data, x="Adj Close",title = 'TATAMOTORS SHARE PRICE (2016-2021)')
fig.show()

OUTPUT

 

histogram | plotly

The histogram tells us about the count of the share price of tata motors over 6 years. We can that for most of the period the share price was between Rs 160 and Rs 200.

Heatmap

dff= stock_data[['Adj Close','Year','Month']]
dff=dff.groupby(['Year','Month']).mean()
dff=dff.unstack(level=0)
dff.head()
month=['January','February','March','April','May','June','July','August','September','October','November','December']
year=[2016,2017,2018,2019,2020]
import plotly.graph_objects as go
fig = go.Figure(data=go.Heatmap(
        z=dff,
        x=year,
        y=month,
        colorscale='Viridis'))
fig.update_layout(title="TATAMOTORS SHARE PRICE (2016-2021) -> (Jan to Dec)")
fig.show()

OUTPUT

 

heatmap

The heatmap shows the share price of tata motors across months and their respective years. By hovering we can get details of each block.

Financial Charts

Candlestick Chart

fig = go.Figure(data=[go.Candlestick(x=stock_data['Date'],
                open=stock_data['Open'],
                high=stock_data['High'],
                low=stock_data['Low'],
                close=stock_data['Adj Close'])])
fig.update_layout(title="TATAMOTORS SHARE PRICE (2016-2021)")
fig.show()

OUTPUT

 

candelstick chart

Candlestick chart gives the open, close, high, and low price of each day. There is also a slidebar under the chart that we can use to get the chart for any range in 6 years.

Area Chart

fig = px.area(y=stock_data["Adj Close"],x=stock_data['Date'],title = 'TATAMOTORS SHARE PRICE (2016-2021)',labels={'x':'Year','y':'Price'})
fig.update_layout(showlegend = False)
fig.show()

OUTPUT

 

areachart

 

Area Chart gives the area under the stock price. It is useful to see the trend and variation in share price over the years.

Plotly also provides scientific charts, 3-D charts, maps, and animations. You can visit plotly documentation here for more details.

Key Notes

In this blog first, we have seen the importance of data visualization. Then we discussed what’s the need for advanced charting libraries like Plotly and how it provides much more features than matplotlib and seaborn. Lastly, we implemented some interactive Plotly charts on tata motors’ share price data.

References

Featured Image – boostlabs

Google Image – pythonhow

Gif – authoria

plotly documentation – plotly 

About Me

Hi! I am Ashish Choudhary. I am pursuing B.Tech from the JC Bose University of Science & Technology. Data Science is my passion and feels proud to write interesting blogs related to it. Feel free to contact me on Linkedin.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
Ashish

24 Aug 2022

RELATED ARTICLES

Most Popular

Recent Comments