Neural networks were invented in the 1960s and caught wind around the 1980s.
Everybody thought that deep learning and neural networks were this new thing that was going to impact the world and solve all its problems, and then it slowly died off over the next decades.
Why didn't neural networks survive for long?
So what happened?
Was it not a good invention?
The reason is that the technology back then was not up to the standard needed to facilitate neural networks.
For deep learning and neural networks to work properly, you need:
Lots of data
Processing power to process that data and facilitate neural networks.
How did storage look back then?
A 5 MB hard drive in 1956
A hard drive in 1980
Right now: $20 for a 256 GB drive
Still, what is deep learning?
The idea behind deep learning is to mimic how the human brain operates and recreate it.
The human brain is one of the most creative, powerful tools on this planet for learning and adapting. If a computer could copy that, we could just leverage it.
Example of brain neurons:
These neurons are connected to thousands of neighbouring neurons.
These neurons are responsible for all the activities we perform.
How do we recreate this in a computer?
We create an artificial structure, called an artificial neural network,
where we have nodes, i.e. neurons.
We have neurons for input values in the input layer;
these are the values you know about a certain situation,
i.e. the information you start your predictions from when modeling something.
Output layer: the value we want to predict.
Fraudulent transaction vs real transactions
Spam vs Ham or whatever
In between these two layers, there are hidden layers.
For example,
if you hear the roar of a lion or a whistle from someone,
that's the input, which can be represented as the input layer.
Information is collected through the senses and passed to the input layer of the brain.
This information goes through billions and billions of neurons before a command is sent by the brain.
The brain processes this sound information along with other noises;
this is the hidden layers.
There are billions and billions of neurons processing the information.
There are lots of hidden layers in the brain.
Finally, the information passes through the hidden layers to the output layer,
which may make you turn around to see what it is, or maybe make you run away.
Next, we will discuss:
The Neuron
The Activation Function
How do Neural Networks work?
How do Neural Networks learn?
Gradient Descent
Stochastic Gradient Descent
Backpropagation
The Neuron:
The very first step to creating artificial neural networks is to recreate a neuron.
So how do we do that?
Let's see what a neuron looks like first.
This is what a neuron looks like. It has:
Body
Dendrites
Axon
Neurons by themselves are like ants:
They can't build an anthill by themselves.
They can't establish a colony.
They can barely carry food for survival.
But the game changes when there are lots of ants; millions of ants can do anything.
Similarly, How do neurons work together then?
That's what the dendrites and axon are for.
Dendrites are receivers of the signal.
Axons are transmitters of the signal.
Axon of A Neuron ==passes spikes to ==> Dendrites of neighbouring neurons.
Now let's see how we can represent this on a computer.
On the right side of the image,
just like how a biological neuron is composed:
Dendrites (input layer)
Nucleus (processes the information)
Axon (output layer)
Similarly, in an artificial neuron on the left side of the image:
We have the input layer X; these are the independent variables.
A linear function & an activation function to process the information.
The output layer y.
Whatever inputs you put in are for one row of the dataset, and the output you get back is for that exact same row.
What are independent variables?
Age of a person
Race
Salary of a person
location
We need to standardize them:
so each has a mean of zero and a variance of one.
Or normalize them:
subtract the minimum value and divide by the range (max − min) to get values between 0 and 1.
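The two scalings above can be sketched with NumPy (the array values here are made up for illustration):

```python
import numpy as np

x = np.array([20.0, 35.0, 50.0, 65.0])  # e.g. ages of people (hypothetical values)

# Standardize: subtract the mean, divide by the standard deviation -> mean 0, variance 1
standardized = (x - x.mean()) / x.std()

# Normalize: subtract the minimum, divide by the range (max - min) -> values in [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())

print(standardized.mean())  # ~0
print(normalized.min(), normalized.max())  # 0.0 1.0
```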
What can be our output value?
Binary value, i.e. 0 or 1
Male or female
Categorical value
Continuous value, i.e. price
Synapses:
Tiny gaps between neurons that allow them to communicate with each other.
In artificial neural networks, the synapses between nodes are assigned weights.
Comparing to the artificial neuron node:
here, w1, w2, w3, …, wn are the weights assigned to the synapses.
Weights are crucial for neural networks:
Weights are how neural networks learn.
By adjusting the weights, the neural network decides, in every single case, which signals are important to a certain neuron and which are not,
which signals get passed along and which get dropped,
and to what strength or extent signals get passed along.
They are the parameters that get tuned/adjusted during the process of learning.
When you train an artificial neural network, you are basically adjusting all of the weights in all of the synapses across the whole network.
Thus the birth of Gradient Descent and Backpropagation.
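Gradient descent and backpropagation are covered later in these notes, but a minimal sketch of what "adjusting a weight" means might look like this (single weight, squared-error loss; the data point and learning rate are made-up values):

```python
x, y_true = 2.0, 4.0   # one input and its target output (hypothetical)
w = 0.5                # initial weight
lr = 0.1               # learning rate

for _ in range(50):
    y_pred = w * x                      # forward pass (no activation, for simplicity)
    grad = 2 * (y_pred - y_true) * x    # d(loss)/dw for loss = (y_pred - y_true)**2
    w -= lr * grad                      # gradient descent step: move against the gradient

print(round(w, 3))  # w approaches 2.0, since 2.0 * 2.0 = 4.0
```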
So signals go into the neuron; what happens then?
What processing happens inside the neuron?
All of the values X1, X2, X3, …, Xn are multiplied by their weights W1, W2, W3, …, Wn and added up.
We calculate the weighted sum.
Then an activation function is applied
to that weighted sum.
Depending on the function, the neuron will either pass the signal on to the next neuron or it won't.
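The processing inside a single artificial neuron can be sketched as follows (the input values and weights are illustrative, and sigmoid is just one possible activation choice):

```python
import math

def neuron(inputs, weights):
    # Step 1: weighted sum of the inputs
    z = sum(x * w for x, w in zip(inputs, weights))
    # Step 2: apply an activation function to the weighted sum (sigmoid here)
    return 1 / (1 + math.exp(-z))

# Example: three inputs with their synapse weights (made-up values)
output = neuron([1.0, 0.5, -0.5], [0.4, 0.3, 0.2])
print(output)  # a value between 0 and 1
```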
What is Activation Function?
Threshold function (yes-or-no type):
if value < 0, return 0
if value >= 0, return 1
Sigmoid function
It's a function which is used in logistic regression.
It is smooth and useful in the output layer, especially when trying to predict probability.
Rectifier Function.
Hyperbolic Tangent Function.
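The four activation functions listed above can be written directly in plain Python:

```python
import math

def threshold(z):
    # 1 if the weighted sum is non-negative, else 0
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    # smooth curve between 0 and 1; the same function used in logistic regression
    return 1.0 / (1.0 + math.exp(-z))

def rectifier(z):
    # ReLU: 0 for negative inputs, identity for positive ones
    return max(0.0, z)

def tanh(z):
    # hyperbolic tangent: smooth curve between -1 and 1
    return math.tanh(z)

for f in (threshold, sigmoid, rectifier, tanh):
    print(f.__name__, f(-1.0), f(0.0), f(1.0))
```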
Assuming your dependent variable is binary, i.e. 0 or 1, which activation function would you use?
There are two options:
Threshold activation (either 0 or 1)
Sigmoid activation (between 0 and 1, interpretable as a probability)
How many hidden layers does a network need to have to be considered a deep learning neural network?
Shallow vs. Deep:
A neural network with 1-2 hidden layers is typically considered shallow.
Networks with 3 or more hidden layers are generally called deep.
Typical ranges:
Many practical deep learning models have anywhere from 3 to 150 layers.
Some very large models can have hundreds or even thousands of layers.
It depends on the task:
Simpler problems might only need a few layers.
More complex tasks often benefit from more layers.
Diminishing returns:
Adding more layers doesn't always improve performance.
At some point, extra layers can lead to overfitting or training difficulties.
The number of neurons per layer, type of layers, and other architectural choices are also important.
These are the notes I took from the Udemy lecture & the CSCE 598 Deep Learning class @ ULL.
https://www.udemy.com/course/deeplearning/learn/lecture/6743222#overview
#deeplearning #neuralnetworks #machinelearning #chapter1