Prediction of Wine type using Deep Learning

27 July 2024

Deep learning is usually applied to large data sets, but to understand its concepts we use the small wine quality data set here. The data set is freely available from the UCI Machine Learning Repository. The aim of this article is to get started with deep learning libraries such as Keras and to become familiar with the basics of neural networks.
About the Data Set:
Before loading the data, it is important to know what it contains. The data set consists of 12 variables. A few of them are as follows:

Fixed acidity: The total acidity of a wine is divided into two groups: the volatile acids and the nonvolatile, or fixed, acids. This variable is expressed in g/dm³ in the data set.
Volatile acidity: Volatile acidity measures the acids, mainly acetic acid, that are associated with wine turning into vinegar. It is expressed in g/dm³ in the data set.
Citric acid: Citric acid is one of the fixed acids in wines. It is expressed in g/dm³ in the data set.
Residual sugar: Residual sugar is the sugar remaining after fermentation stops, or is stopped. It is expressed in g/dm³ in the data set.
Chlorides: Chlorides can be an important contributor to saltiness in wine. This variable is expressed in g/dm³ in the data set.
Free sulfur dioxide: This is the part of the sulfur dioxide added to a wine that stays free rather than binding to other compounds. This variable is expressed in mg/dm³ in the data set.
Total sulfur dioxide: This is the sum of the bound and the free sulfur dioxide. This variable is expressed in mg/dm³ in the data set.

Step #1: Know your data.

Loading the data.

Python3

# Import Required Libraries 
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
 
# Read in white wine data
white = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv", sep =';')
 
# Read in red wine data
red = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv", sep =';')
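
The `sep =';'` argument matters because the wine quality files separate values with semicolons rather than commas. The sketch below illustrates this on a small in-memory excerpt; the four-column layout and the name `sample_csv` are assumptions made for illustration, not the full file.

```python
import io
import pandas as pd

# Hypothetical excerpt: a quoted header row and semicolon-separated values,
# reduced to four columns for brevity.
sample_csv = io.StringIO(
    '"fixed acidity";"volatile acidity";"alcohol";"quality"\n'
    '7.4;0.7;9.4;5\n'
    '7.8;0.88;9.8;5\n'
)

# With the default separator (a comma) each row would end up in a single
# column, so the separator is passed explicitly.
sample = pd.read_csv(sample_csv, sep=';')

print(sample.shape)
print(list(sample.columns))
```

If the shape of a freshly loaded frame shows only one column, a wrong separator is a likely cause.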

First rows of `red`.

Python3

# First rows of `red`
red.head()

Last rows of `white`.

Python3

# Last rows of `white`
white.tail()

Take a sample of five rows of `red`.

Python3

# Take a sample of five rows of `red`
red.sample(5)

Data description.

Python3

# Describe `white`
white.describe()

Check for null values in `red`.

Python3

# Double check for null values in `red`
pd.isnull(red)
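
`pd.isnull` returns a frame of booleans with the same shape as its input, which is hard to read for thousands of rows. A minimal sketch of summarizing it, using a small hypothetical frame (`demo`) in place of `red`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value, standing in for `red`
demo = pd.DataFrame({
    "alcohol": [9.4, np.nan, 9.8],
    "pH": [3.51, 3.20, 3.26],
})

# Number of missing values per column
null_counts = demo.isnull().sum()
print(null_counts)

# Single flag: does the frame contain any missing value at all?
has_nulls = bool(demo.isnull().values.any())
print(has_nulls)
```

Applied to the wine data, the same two calls would be `red.isnull().sum()` and `red.isnull().values.any()`.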

Step #2: Distribution of Alcohol.

Creating Histogram.

Python3

# Create Histogram
fig, ax = plt.subplots(1, 2)
 
ax[0].hist(red.alcohol, 10, facecolor ='red',
              alpha = 0.5, label ="Red wine")
 
ax[1].hist(white.alcohol, 10, facecolor ='white',
           ec ="black", lw = 0.5, alpha = 0.5,
           label ="White wine")
 
fig.subplots_adjust(left = 0, right = 1, bottom = 0, 
               top = 0.5, hspace = 0.05, wspace = 1)
 
ax[0].set_ylim([0, 1000])
ax[0].set_xlabel("Alcohol in % Vol")
ax[0].set_ylabel("Frequency")
ax[1].set_ylim([0, 1000])
ax[1].set_xlabel("Alcohol in % Vol")
ax[1].set_ylabel("Frequency")
 
fig.suptitle("Distribution of Alcohol in % Vol")
plt.show()
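
When `hist` is called with a bin count, each histogram chooses its bin edges from its own data, and the two groups may differ in size. For a numeric comparison, one option is to count both groups over the same bin edges and convert the counts to shares. The sketch below uses synthetic values, not the real alcohol columns, and every name in it is illustrative.

```python
import numpy as np

# Synthetic alcohol values in % vol (assumption: not the real data)
rng = np.random.default_rng(0)
red_alcohol = rng.uniform(8.0, 15.0, size=200)
white_alcohol = rng.uniform(8.0, 14.0, size=600)

# Ten equal-width bins shared by both groups
bin_edges = np.linspace(8.0, 15.0, 11)

red_counts, _ = np.histogram(red_alcohol, bins=bin_edges)
white_counts, _ = np.histogram(white_alcohol, bins=bin_edges)

# Relative frequencies make groups of different size comparable
red_share = red_counts / red_counts.sum()
white_share = white_counts / white_counts.sum()

print(red_share.round(3))
print(white_share.round(3))
```

With the real data, `red.alcohol` and `white.alcohol` would take the place of the synthetic arrays.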

Adding a `type` label and splitting the data set for training and testing.

Python3

# Add `type` column to `red` with value 1
red['type'] = 1
 
# Add `type` column to `white` with value 0
white['type'] = 0
 
# Concatenate `red` and `white`
# (`DataFrame.append` was removed in pandas 2.0, so `pd.concat` is used)
wines = pd.concat([red, white], ignore_index = True)
 
# Import `train_test_split` from `sklearn.model_selection`
from sklearn.model_selection import train_test_split

# Features: the first 11 columns
# (`.ix` was removed in pandas 1.0, so `.iloc` is used)
X = wines.iloc[:, 0:11]

# Target: the wine type (1 = red, 0 = white)
y = np.ravel(wines.type)
 
# Split the data set into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
           X, y, test_size = 0.34, random_state = 45)
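
Before training, it can help to check how the two classes are balanced, since the red file is considerably smaller than the white one (1,599 versus 4,898 rows). The sketch below shows the check on two hypothetical miniature frames; `red_demo` and `white_demo` are illustrative stand-ins, not the real data.

```python
import pandas as pd

# Hypothetical miniature frames standing in for `red` and `white`
red_demo = pd.DataFrame({"alcohol": [9.4, 9.8], "type": 1})
white_demo = pd.DataFrame(
    {"alcohol": [8.8, 9.5, 10.1, 9.9, 10.4, 11.0], "type": 0}
)

# Stack the two frames and renumber the rows
wines_demo = pd.concat([red_demo, white_demo], ignore_index=True)

# Share of each class in the combined frame
class_share = wines_demo["type"].value_counts(normalize=True)
print(class_share)
```

The share of the larger class is a useful baseline: a model that always predicts that class already reaches that accuracy. `train_test_split` also accepts a `stratify` argument, which could be set to `y` to keep the class proportions similar in both splits.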

Step #3: Structure of Network

Python3

# Import `Sequential` from `keras.models`
from keras.models import Sequential
 
# Import `Dense` from `keras.layers`
from keras.layers import Dense
 
# Initialize the constructor
model = Sequential()
 
# Add the first hidden layer; `input_shape` declares the 11 input features
model.add(Dense(12, activation ='relu', input_shape =(11, )))
 
# Add one hidden layer
model.add(Dense(9, activation ='relu'))
 
# Add an output layer
model.add(Dense(1, activation ='sigmoid'))
 
# Model output shape
model.output_shape
 
# Model summary
model.summary()
 
# Model config
model.get_config()
 
# List all weight tensors
model.get_weights()

# Compile the model for binary classification
model.compile(loss ='binary_crossentropy', 
  optimizer ='adam', metrics =['accuracy'])
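
The parameter counts reported by `model.summary()` can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. A minimal sketch, where `dense_params` is a helper name introduced here for illustration:

```python
# Layer sizes taken from the model above: 11 inputs -> 12 -> 9 -> 1
layer_sizes = [11, 12, 9, 1]


def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


params_per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

print(params_per_layer)       # [144, 117, 10]
print(sum(params_per_layer))  # 271
```

A network this small is quick to train, which suits a data set of a few thousand rows.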

Step #4: Training and Prediction

Python3

# Training Model
model.fit(X_train, y_train, epochs = 3,
           batch_size = 1, verbose = 1)
  
# Predict the probability that each test wine is red (type 1)
y_pred = model.predict(X_test)
print(y_pred)
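
Because the output layer uses a sigmoid, the predictions are probabilities between 0 and 1 rather than class labels. One common way to turn them into labels is a 0.5 threshold. The sketch below uses hypothetical probabilities and true labels (`y_prob`, `y_true`) in place of real model output.

```python
import numpy as np

# Hypothetical sigmoid outputs with shape (n_samples, 1),
# the shape a single-unit output layer produces
y_prob = np.array([[0.91], [0.08], [0.47], [0.63]])
y_true = np.array([1, 0, 1, 1])

# Probabilities at or above 0.5 become class 1 (red), the rest class 0 (white)
y_label = (y_prob.ravel() >= 0.5).astype(int)

# Fraction of predictions that match the true labels
accuracy = float(np.mean(y_label == y_true))

print(y_label)   # [1 0 0 1]
print(accuracy)  # 0.75
```

With the trained model, `y_pred` and `y_test` would take the place of the hypothetical arrays, and `model.evaluate(X_test, y_test)` reports the loss and accuracy directly.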
