TensorFlow is an open-source machine learning library developed by Google. One of its applications is the development of deep neural networks. The module tensorflow.nn provides support for many basic neural network operations.
An activation function is a function which is applied to the output of a neural network layer and then passed as the input to the next layer. Activation functions are an essential part of neural networks, as they provide non-linearity, without which the neural network reduces to a mere logistic regression model. The most widely used activation function is the Rectified Linear Unit (ReLU). ReLU is defined as f(x) = max(0, x). ReLU has become a popular choice in recent times due to the following reasons:
- Computationally cheap: ReLU is a very simple function and is therefore fast to compute.
- Fewer vanishing gradients: In machine learning, the update to a parameter is proportional to the partial derivative of the error function with respect to that parameter. If the gradient becomes extremely small, the updates are no longer effective and the network may stop learning altogether. ReLU does not saturate in the positive direction, whereas other activation functions such as the sigmoid and the hyperbolic tangent saturate in both directions. Therefore it suffers less from vanishing gradients, resulting in better training; the sketch below illustrates this difference.
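As a rough illustration of this point, here is a minimal sketch using plain NumPy (NumPy and the helper function names are not part of the original example) comparing the gradient of ReLU with that of the sigmoid. The ReLU gradient stays at exactly 1 for positive inputs, while the sigmoid gradient shrinks towards 0 at both extremes:

import numpy as np

def relu(x):
    # ReLU: max(0, x), applied element-wise
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient of ReLU: 1 for x > 0, 0 for x < 0
    return (x > 0).astype(np.float64)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-10.0, -1.0, 0.5, 10.0])
print(relu_grad(x))     # [0. 0. 1. 1.]  -> no saturation for large positive x
print(sigmoid_grad(x))  # tiny values at x = -10 and x = 10 -> gradients vanish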
The function tf.nn.relu() provides support for the ReLU in TensorFlow.
Syntax: tf.nn.relu(features, name=None)
Parameters:
features: A tensor of any of the following types: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, half, uint32, uint64.
name (optional): The name for the operation.
Return type: A tensor with the same type as that of features.
# Importing the TensorFlow library
import tensorflow as tf

# A constant vector of size 6
a = tf.constant([1.0, -0.5, 3.4, -2.1, 0.0, -6.5], dtype=tf.float32)

# Applying the ReLU function and
# storing the result in 'b'
b = tf.nn.relu(a, name='ReLU')

# Initiating a TensorFlow session
with tf.Session() as sess:
    print('Input type:', a)
    print('Input:', sess.run(a))
    print('Return type:', b)
    print('Output:', sess.run(b))
Output:
Input type: Tensor("Const_10:0", shape=(6, ), dtype=float32) Input: [ 1. -0.5 3.4000001 -2.0999999 0. -6.5 ] Return type: Tensor("ReLU_9:0", shape=(6, ), dtype=float32) Output: [ 1. 0. 3.4000001 0. 0. 0. ]
Leaky ReLU:
The ReLU function suffers from what is called the "dying ReLU" problem. Since the slope of the ReLU function on the negative side is zero, a neuron stuck on that side is unlikely to recover. This causes the neuron to output zero for every input, rendering it useless. A solution to this problem is the Leaky ReLU, which has a small slope on the negative side.
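To make the "dying ReLU" argument concrete, here is a minimal NumPy sketch (NumPy and the helper names are used purely for illustration and are not part of the original example) comparing the gradients of ReLU and Leaky ReLU for a negative input. ReLU passes back a zero gradient, so the weights feeding that neuron never change, whereas Leaky ReLU passes back a small but non-zero gradient alpha:

import numpy as np

ALPHA = 0.01  # slope on the negative side, matching the example below

def relu_grad(x):
    # dReLU/dx: 1 for x > 0, 0 for x <= 0
    return np.where(x > 0, 1.0, 0.0)

def leaky_relu_grad(x, alpha=ALPHA):
    # dLeakyReLU/dx: 1 for x > 0, alpha for x <= 0
    return np.where(x > 0, 1.0, alpha)

x = -2.1  # a pre-activation value stuck on the negative side
print(relu_grad(x))        # 0.0  -> weight update is zero, the neuron cannot recover
print(leaky_relu_grad(x))  # 0.01 -> a small update still flows, the neuron can recover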
The function tf.nn.leaky_relu() provides support for the Leaky ReLU in TensorFlow.
Syntax: tf.nn.leaky_relu(features, alpha, name=None)
Parameters:
features: A tensor of any of the following types: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, half, uint32, uint64.
alpha: The slope of the function for x < 0. Default value is 0.2.
name (optional): The name for the operation.
Return type: A tensor with the same type as that of features.
# Importing the TensorFlow library
import tensorflow as tf

# A constant vector of size 6
a = tf.constant([1.0, -0.5, 3.4, -2.1, 0.0, -6.5], dtype=tf.float32)

# Applying the Leaky ReLU function with
# slope 0.01 and storing the result in 'b'
b = tf.nn.leaky_relu(a, alpha=0.01, name='Leaky_ReLU')

# Initiating a TensorFlow session
with tf.Session() as sess:
    print('Input type:', a)
    print('Input:', sess.run(a))
    print('Return type:', b)
    print('Output:', sess.run(b))
Output:
Input type: Tensor("Const_2:0", shape=(6,), dtype=float32) Input: [ 1. -0.5 3.4000001 -2.0999999 0. -6.5 ] Return type: Tensor("Leaky_ReLU_1/Maximum:0", shape=(6,), dtype=float32) Output: [ 1. -0.005 3.4000001 -0.021 0. -0.065 ]