Active Product Sales Analysis using Matplotlib in Python

23 July 2024


Companies that sell online or run a dedicated e-commerce website want to understand what their customers need so that they can increase sales. One practical way to do this is to analyze transaction data and find out what time of day has the highest user activity.

In this post, we will use Python's Pandas and Matplotlib libraries to draw insights from the dataset. We use the Transaction Date column to find the busiest time (hour) of the day. You can access the entire dataset here.

Stepwise Implementation

Step 1:

First, we import the required libraries and then create a DataFrame from the dataset.

Python3

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
  
Order_Details = pd.read_csv('Order_details(masked).csv')

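Before moving on, it can help to confirm what was loaded. The following is a minimal sketch, not the article's own code: it reads a tiny in-memory CSV instead of the real file, and the Amount column and the sample rows are hypothetical. Only the Transaction Date column name comes from the dataset described above.

```python
import io

import pandas as pd

# Hypothetical two-row CSV standing in for 'Order_details(masked).csv';
# only the 'Transaction Date' column name comes from the article.
csv_text = io.StringIO(
    "Transaction Date,Amount\n"
    "2024-07-01 09:15:00,250\n"
    "2024-07-01 21:40:12,120\n"
)

# parse_dates converts the named column to datetime while reading.
orders = pd.read_csv(csv_text, parse_dates=["Transaction Date"])

# Quick checks before any analysis: size, column types, first rows.
print(orders.shape)
print(orders.dtypes)
print(orders.head())
```

If the date column shows up as object rather than datetime64, the values were not parsed and need an explicit conversion.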

Step 2:

Convert the Transaction Date column to the datetime format and store the result in a new column called Time. The datetime format follows the pattern YYYY-MM-DD HH:MM:SS by default and can be customized. Since we are interested in the hour of each transaction, we create an Hour column using the built-in dt.hour accessor:

Python3

# here we have taken Transaction 
# date column 
Order_Details['Time'] = pd.to_datetime(Order_Details['Transaction Date']) 
  
# After that we extracted hour  
# from Transaction date column 
Order_Details['Hour'] = (Order_Details['Time']).dt.hour
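Real data often contains malformed timestamps. As a hedged sketch on made-up rows, the example below shows one way to handle them: errors="coerce" turns values that cannot be parsed into NaT instead of raising an error. The format string is an assumption about how the dates are written and may need adjusting for the actual dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real 'Transaction Date' column.
sample = pd.DataFrame({
    "Transaction Date": [
        "2024-07-01 09:15:00",
        "2024-07-01 21:40:12",
        "not a date",  # malformed entry
    ]
})

# Assumed format; errors="coerce" maps unparseable values to NaT.
sample["Time"] = pd.to_datetime(
    sample["Transaction Date"],
    format="%Y-%m-%d %H:%M:%S",
    errors="coerce",
)

# .dt.hour gives the hour of day (0-23); NaT rows become NaN.
sample["Hour"] = sample["Time"].dt.hour

print(sample[["Transaction Date", "Hour"]])
print("Unparseable rows:", sample["Time"].isna().sum())
```

Counting the NaT rows shows how many transactions would be left out of the hourly analysis.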

Step 3:

Next, we need the “n” busiest hours. value_counts() returns the number of transactions for each hour, sorted from most to least frequent, so its first “n” entries are the busiest hours. We use tolist() to convert both the counts and the associated index values (the hours) to lists.

Python3

# n = 24 here; change n to see the top 'n' busiest hours
n = 24
hour_counts = Order_Details['Hour'].value_counts()
timemost1 = hour_counts.index.tolist()[:n]
timemost2 = hour_counts.values.tolist()[:n]
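As an alternative sketch of the same idea, head(n) can take the top “n” entries directly from the value_counts() result. The hours below are made up for illustration, and the variable names are hypothetical.

```python
import pandas as pd

# Made-up hours of six transactions, for illustration only.
hours = pd.Series([21, 21, 21, 9, 9, 14], name="Hour")

n = 2  # number of busiest hours to keep

# value_counts() sorts by frequency (descending), so head(n) is the top n.
top_hours = hours.value_counts().head(n)

busiest_hours = top_hours.index.tolist()  # the hours themselves
purchase_counts = top_hours.tolist()      # how many purchases in each

print(busiest_hours, purchase_counts)
```

Keeping the result as a Series until the last moment means the hours and their counts cannot get out of step.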

Step 4:

Next, we stack the hours (indices) and their frequencies together and print the result as a table.

Python3

tmost = np.column_stack((timemost1,timemost2)) 
  
print("Hour Of Day" + "\t" + "Number of Purchases\n") 
print('\n'.join('\t\t'.join(map(str, row)) for row in tmost)) 
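The same table can also be built with Pandas alone. This is a sketch on made-up data, not the article's method: rename_axis() and reset_index() turn the value_counts() result into a two-column DataFrame, and the column labels are our own choice.

```python
import pandas as pd

# Made-up hours of six transactions, for illustration only.
hours = pd.Series([21, 21, 21, 9, 9, 14], name="Hour")

# Turn the counts into a two-column table: hour and number of purchases.
table = (
    hours.value_counts()
    .rename_axis("Hour of Day")
    .reset_index(name="Number of Purchases")
)

# to_string(index=False) prints aligned columns without the row index.
print(table.to_string(index=False))
```

A DataFrame keeps the column labels attached to the data, which makes later sorting or exporting easier than with a stacked NumPy array.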

Step 5:

Before we can create the visualization, we need the counts ordered by hour of day rather than by frequency. To do so, we build the list of hours for the X-axis and sort the hourly frequencies by hour:

Python3

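timemost = Order_Details['Hour'].value_counts() 

# Hours 0-23 for the x-axis (range(0, 23) would stop at 22 and drop hour 23)
timemost1 = list(range(24)) 

# Order the counts by hour and give hours with no transactions a count of 0,
# so that the y-values line up with the 24 x-values
timemost2 = timemost.reindex(timemost1, fill_value=0) 
timemost2 = pd.DataFrame(timemost2) 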
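The x-values and y-values must have the same length for the plot to work. The sketch below uses made-up data with transactions in only three distinct hours to show why that matters: the raw counts have three entries, while reindexing against all 24 hours fills the gaps with zeros.

```python
import pandas as pd

# Made-up hours: transactions happened in only three distinct hours.
hours = pd.Series([9, 9, 14, 21, 21, 21], name="Hour")

raw_counts = hours.value_counts().sort_index()
print(len(raw_counts))  # one entry per hour that actually occurs

# Align against every hour of the day; missing hours get a count of 0.
all_hours = list(range(24))
hourly_counts = raw_counts.reindex(all_hours, fill_value=0)

print(len(hourly_counts))
print(hourly_counts.loc[[0, 9, 21]])
```

On a large dataset every hour may well be present, but the reindex step keeps the code safe when it is not.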

Step 6:

For data visualization, we will use Matplotlib, as it is one of the most convenient and commonly used plotting libraries. However, you are free to choose any other library, such as ggplot or Seaborn, to plot the data.

The commands below put the hours on the X-axis and the number of transactions on the Y-axis, and set other aspects of the line chart such as the figure size, title, fonts, and color.

Python3

plt.figure(figsize=(20, 10)) 
  
plt.title('Sales Happening Per Hour (Spread Throughout The Week)', 
          fontdict={'fontname': 'monospace', 'fontsize': 30}, y=1.05) 
  
plt.ylabel("Number Of Purchases Made", fontsize=18, labelpad=20) 
plt.xlabel("Hour", fontsize=18, labelpad=20) 
plt.plot(timemost1, timemost2, color='m') 
plt.grid() 
plt.show() 
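A bar chart is another reasonable way to show counts per hour, since each hour is a separate bucket. The following sketch uses made-up counts and saves the figure to a hypothetical file name instead of opening a window; the Agg backend is chosen only so that it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

hours = list(range(24))

# Made-up hourly purchase counts, for illustration only.
counts = [1, 0, 0, 0, 0, 1, 2, 4, 6, 8, 9, 10,
          11, 10, 9, 9, 10, 12, 14, 17, 20, 22, 15, 6]

fig, ax = plt.subplots(figsize=(12, 6))
ax.bar(hours, counts, color="m")

ax.set_xticks(hours)  # one tick per hour of the day
ax.set_xlabel("Hour")
ax.set_ylabel("Number Of Purchases Made")
ax.set_title("Sales Happening Per Hour")
ax.grid(axis="y")

fig.savefig("sales_per_hour.png")  # hypothetical output file name
```

To view the chart interactively instead, drop the matplotlib.use("Agg") line and call plt.show().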

The results indicate that sales peak prominently in the late evening hours. This insight can inform business decisions, such as promoting a product specifically during that time.
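The peak can also be read off numerically rather than from the chart. As a small sketch on made-up counts, idxmax() returns the hour with the most purchases, and dividing by the total gives that hour's share of all transactions.

```python
import pandas as pd

# Made-up purchase counts keyed by hour of day, for illustration only.
hourly_counts = pd.Series({9: 2, 14: 1, 21: 3})

# idxmax() returns the index label (the hour) of the largest count.
peak_hour = hourly_counts.idxmax()

# Share of all purchases that fall in the peak hour.
peak_share = hourly_counts.max() / hourly_counts.sum()

print(f"Peak hour: {peak_hour}:00 with {peak_share:.0%} of purchases")
```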
