Welcome to this article on neural networks. Neural networks are among the most popular machine learning algorithms of the past decade. On many tasks, such as image and speech recognition, they have outperformed traditional machine learning algorithms in accuracy. With variants such as CNNs (Convolutional Neural Networks), RNNs (Recurrent Neural Networks), autoencoders and Boltzmann machines, artificial neural networks (ANNs) and deep learning in general keep gaining popularity among data scientists and machine learning enthusiasts.
Let’s understand what a neural network is.
Intelligence is one of the beautiful gifts from God to mankind that helped us achieve impossible things. This intelligence is an intangible attribute that humans benefit from.
Scientists and researchers have long been fascinated by how the brain works and have tried to reproduce human behaviour by mimicking the structure of the brain.
One of the most fascinating things about the brain is that it contains around 100 billion neurons, and every neuron can form thousands of links with other neurons. This gives a typical brain well over 100 trillion possible links.
The brain receives a stimulus from the outside world, processes that input, and then generates an output. As the task gets more complicated, multiple neurons form a complex network and pass information among themselves.
An Artificial Neural Network tries to mimic a similar behaviour. The network you see below is a neural network made of interconnected neurons.
Let’s have a look at what Deep Learning is.
Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain. Neural networks are among its most powerful and widely used algorithms.
At first glance, a neural network may seem like a black box: an input layer passes the data into the “hidden layers” and, after processing, the output layer provides the result.
A neural network (NN) is a set of layers of highly interconnected processing elements, called neurons or nodes, that apply a series of transformations to the data to build their own representation of it (commonly known as features). Neural networks are modeled after the human brain, with the goal of having machines mimic how the brain works.
A neural network is composed of 3 main types of layers:
- Input layer: It is used to pass in our input (an image, text or any other type of data suitable for a neural network).
- Hidden layers: These are the layers between the input and output layers. They are responsible for learning the mapping between input and output. They perform the operations that process and reduce the data to extract insights from it.
- Output layer: This layer is responsible for giving us the output of the NN given our inputs.
Most of the work in a neural network happens in the hidden layers.
It is in the hidden layers that the processing actually happens, through a system of connected neurons or nodes, each characterized by weights and a bias (commonly referred to as w and b).
At this stage, each neuron or node receives its inputs, calculates a weighted sum (multiplying each input by its weight and adding the results together) and then adds the bias.
It then applies a pre-set activation function to this value, which decides whether the neuron should ‘fire’ or not. Afterward, the neuron transmits the information downstream to other connected neurons in a process called the ‘forward pass’. At the end of this process, the last hidden layer is linked to the output layer, which has one neuron for each possible desired output.
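The forward pass described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sigmoid is picked as the activation function, and the layer sizes, weights and biases are made-up values, not taken from any trained model.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1). Assumed activation."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def layer_forward(inputs, layer):
    """Compute the output of every neuron in one layer.

    `layer` is a list of (weights, bias) pairs, one pair per neuron.
    """
    return [neuron_output(inputs, weights, bias) for weights, bias in layer]

def forward_pass(inputs, layers):
    """Feed the inputs through each layer in turn and return the final output."""
    activations = inputs
    for layer in layers:
        activations = layer_forward(activations, layer)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden_layer = [([0.5, -0.6], 0.1), ([-0.3, 0.8], 0.0), ([0.7, 0.2], -0.4)]
output_layer = [([1.0, -1.0, 0.5], 0.2)]

prediction = forward_pass([1.0, 2.0], [hidden_layer, output_layer])
print(prediction)  # a list with one value between 0 and 1
```

Each hidden neuron sees the same inputs but has its own weights and bias, and the output neuron takes the three hidden activations as its inputs.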
A perceptron, often described as the atomic unit of a neural network, uses a function to learn a binary classifier by mapping a vector of binary variables to a single binary output. It is trained with supervised learning.
The perceptron follows these steps:-
- Multiply each input by its respective weight w, a real number that expresses how important that input is to the output.
- Add the products together to get the weighted sum: ∑ wj xj.
- Apply the activation function: determine whether the weighted sum is greater than a threshold value, and output 1 if it is or 0 if it is not. The bias is the negative of this threshold (b = -threshold), so the test is equivalent to checking whether ∑ wj xj + b is greater than 0.
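As a minimal sketch of these steps, the perceptron below multiplies, sums and thresholds its inputs. The weights (-2, -2) and bias 3 are hypothetical values chosen by hand so that the perceptron behaves like a NAND gate; nothing here is learned from data.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum plus the bias is positive, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum + bias > 0 else 0

# Hand-picked (hypothetical) weights and bias that act like a NAND gate.
weights = [-2, -2]
bias = 3

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron([x1, x2], weights, bias))
```

Only the input (1, 1) gives a weighted sum of -4, which the bias of 3 cannot lift above 0, so it is the only input that produces 0.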
One of the best parts of this model is that we can vary the weights and the bias to obtain distinct models of decision-making. We can assign more weight to certain inputs so that, when they are positive, they push more strongly toward our desired output. Also, because the bias can be understood as a measure of how easy it is to output 1, we can raise or lower its value to make the desired output more or less likely.
One of the drawbacks of the perceptron is that it can only output 1 or 0. As a result, a small change in the weights or bias, even in a single perceptron, can flip the output from 1 to 0 or vice versa, since a perceptron is either activated (1) or not (0).
What we really want is to be able to change the behaviour of our network gradually by introducing very small modifications in the weights or bias. This is where a more modern type of neuron comes in handy: the sigmoid neuron.
The main difference between a sigmoid neuron and a perceptron is that the inputs and the output of a sigmoid neuron can be any continuous value between 0 and 1, whereas those of a perceptron are either 1 or 0.
The output is obtained by applying the sigmoid function to the weighted sum of the inputs plus the bias: output = sigmoid(∑ wj xj + b).
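To make the contrast concrete, here is a small sketch that nudges one weight by 0.02 and compares how a perceptron and a sigmoid neuron react. The inputs, weights and bias are hypothetical numbers picked so that the weighted sum sits right next to zero.

```python
import math

def weighted_input(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def perceptron(inputs, weights, bias):
    """Hard threshold: the output is either 0 or 1."""
    return 1 if weighted_input(inputs, weights, bias) > 0 else 0

def sigmoid_neuron(inputs, weights, bias):
    """Smooth output: any value strictly between 0 and 1."""
    z = weighted_input(inputs, weights, bias)
    return 1.0 / (1.0 + math.exp(-z))

inputs = [1.0, 1.0]
bias = 0.0
before = [0.5, -0.51]  # weighted sum is slightly below zero
after = [0.5, -0.49]   # the second weight nudged by 0.02

# The perceptron output jumps, the sigmoid neuron output barely moves.
print(perceptron(inputs, before, bias), perceptron(inputs, after, bias))
print(sigmoid_neuron(inputs, before, bias), sigmoid_neuron(inputs, after, bias))
```

The perceptron flips from 0 to 1, while the sigmoid neuron's output stays close to 0.5 in both cases.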
Activation Functions in Neural Networks
Here are some of the activation functions commonly used in neural networks.
1. Binary Step Function: –
The binary step function is a threshold-based classifier: it decides whether or not the neuron should be activated based on the value from the linear transformation. In simple words, if the input to the activation function reaches a threshold, the neuron is activated; otherwise it is deactivated, i.e. its output is not passed on to the next layer. With a threshold of 0, it looks like this mathematically:
f(x) = 1, if x >= 0
f(x) = 0, if x < 0
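The definition above translates directly into code. This is a minimal sketch with the threshold fixed at 0, as in the formula; the sample inputs are arbitrary.

```python
def binary_step(x):
    """Return 1 when x is at or above 0, otherwise 0."""
    return 1 if x >= 0 else 0

print([binary_step(x) for x in (-2.0, -0.1, 0.0, 0.1, 2.0)])  # [0, 0, 1, 1, 1]
```

Note how the output jumps from 0 to 1 between -0.1 and 0.0 even though the input barely changes.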
2. Sigmoid Function: –
The sigmoid is one of the most widely used non-linear activation functions. It transforms its input into a value in the range 0 to 1.
f(x) = 1 / (1 + e^(-x))
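A possible implementation of the sigmoid is sketched below. The split into two branches is an implementation choice assumed here, not something the formula requires: computing e^(-x) directly with `math.exp` overflows for very negative x, so the negative branch uses the algebraically equivalent form e^x / (1 + e^x).

```python
import math

def sigmoid(x):
    """Sigmoid of x, written so that math.exp never receives a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, e^x / (1 + e^x) is the same function without the overflow.
    e = math.exp(x)
    return e / (1.0 + e)

for x in (-1000.0, -5.0, 0.0, 5.0, 1000.0):
    print(x, sigmoid(x))
```

Large negative inputs are pushed toward 0, large positive inputs toward 1, and an input of 0 gives exactly 0.5.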
3. Tanh Function: –
The tanh function is very similar to the sigmoid function. The only difference is that it is symmetric around the origin. The range of values, in this case, is from -1 to 1. Thus the inputs to the next layers will not always be of the same sign. The tanh function is defined as-
tanh(x) = 2 * sigmoid(2x) - 1
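The relationship between tanh and sigmoid can be checked numerically. The sketch below builds tanh from the sigmoid and compares it with `math.tanh` from the Python standard library; the sample points are arbitrary.

```python
import math

def sigmoid(x):
    """Plain sigmoid, fine for the small inputs used here."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x):
    """tanh expressed as a rescaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # The two columns should agree up to floating-point rounding.
    print(x, tanh_from_sigmoid(x), math.tanh(x))
```

The outputs are negative for negative inputs and positive for positive inputs, which is the symmetry around the origin that the sigmoid lacks.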
4. ReLU: –
The ReLU function is another non-linear activation function that has gained popularity in the deep learning domain. ReLU stands for Rectified Linear Unit, and it is defined as f(x) = max(0, x). The main advantage of using the ReLU function over other activation functions is that it does not activate all the neurons at the same time.
This means that a neuron is deactivated only if the output of its linear transformation is less than 0.
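A short sketch of ReLU, written as max(0, x), shows this sparse activation. The list of pre-activation values is made up for illustration and stands in for the outputs of the linear transformation of five neurons.

```python
def relu(x):
    """Rectified Linear Unit: pass positive values through, clamp the rest to 0."""
    return max(0.0, x)

# Hypothetical outputs of the linear transformation for five neurons.
pre_activations = [-1.5, -0.2, 0.0, 0.7, 3.0]
outputs = [relu(z) for z in pre_activations]
active = sum(1 for y in outputs if y > 0)

print(outputs)  # [0.0, 0.0, 0.0, 0.7, 3.0]
print(active)   # 2
```

Only two of the five neurons pass a non-zero value on to the next layer.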
5. Leaky ReLU: –
The Leaky ReLU function is an improved version of the ReLU function. For the ReLU function, the gradient is 0 for x<0, which deactivates the neurons in that region. Leaky ReLU is defined to address this problem. Instead of defining the function as 0 for negative values of x, we define it as an extremely small linear component of x. Here is the mathematical expression:
f(x) = 0.01 * x, if x < 0
f(x) = x, if x >= 0
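The expression can be sketched as follows, together with its gradient. The parameter name `slope` and the helper `leaky_relu_gradient` are illustrative names introduced here; the default of 0.01 matches the formula above.

```python
def leaky_relu(x, slope=0.01):
    """Leaky ReLU: identity for x >= 0, a small linear component for x < 0."""
    return x if x >= 0 else slope * x

def leaky_relu_gradient(x, slope=0.01):
    """Derivative of leaky_relu: 1 for x >= 0, `slope` for x < 0."""
    return 1.0 if x >= 0 else slope

for x in (-5.0, -0.5, 0.0, 3.0):
    print(x, leaky_relu(x), leaky_relu_gradient(x))
```

For negative inputs the gradient is small but not zero, so the neuron can still be adjusted during training instead of staying deactivated.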