What are Auto Encoders?
A directed neural network. An autoencoder encodes its own input: it takes an input, passes it through hidden layers, and produces an output, aiming for the output to be identical to the input.
It is not a pure unsupervised learning algorithm; it is a self-supervised learning algorithm.
In a Boltzmann Machine we didn't have to compare against any kind of labels.
In a Self-Organizing Map we didn't have to compare against anything.
Here we are comparing against certain values, namely the inputs themselves.
It sits on the verge between supervised and unsupervised: we have inputs, they get encoded and then decoded, and the reconstructions are compared to the inputs during training.
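To make this concrete, here is a minimal sketch in NumPy (the layer sizes, random weights, and sigmoid activation are illustrative assumptions, not values from these notes): an input vector is encoded into a smaller hidden vector, decoded back, and the reconstruction is compared to the original input.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 3                          # e.g. 6 ratings squeezed into 3 hidden nodes
W_enc = rng.normal(0, 0.1, (n_hidden, n_visible))   # encoder weights
b_enc = np.zeros(n_hidden)                          # encoder bias
W_dec = rng.normal(0, 0.1, (n_visible, n_hidden))   # decoder weights
b_dec = np.zeros(n_visible)                         # decoder bias

x = rng.random(n_visible)             # one observation (e.g. one user's ratings)
z = sigmoid(W_enc @ x + b_enc)        # encode: hidden representation
y = sigmoid(W_dec @ z + b_dec)        # decode: reconstruction of the input
print("reconstruction error:", np.linalg.norm(x - y))
```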
What are Auto Encoders used for?
Feature detection: once you encode your data, the hidden nodes represent certain features that are important.
Building powerful recommender systems.
Encoding data: take data with lots and lots of values and encode it into a smaller representation. We can then store the encoded representation and keep only the decoder part to reconstruct it when needed, which helps optimize space.
How are biases represented?
How to Train an Auto Encoder:
Steps Summary:
Input Data Setup:
The array represents observations (users) and features (movies).
Ratings form the input vector x.
Data Input:
The user's ratings are passed as the input vector x.
Encoding Step:
The input vector x is encoded into a lower-dimensional vector z through a function f, such as a sigmoid function: z = f(Wx + b), where W is the weight matrix and b is the bias.
Decoding Step:
z is decoded back into a reconstruction y, which aims to match the original input vector x.
Loss Calculation:
Compute the reconstruction error: d(x, y) = ||x − y||
The goal is to minimize this error.
Backpropagation:
Errors are propagated backward through the network to adjust weights W and biases b.
Weight Updates:
Two methods for updating weights:
Reinforcement learning: update weights after each observation (i.e., online/stochastic updates).
Batch learning: update weights only after processing a whole batch of observations.
Epoch Completion:
One full pass of the dataset through the network is an epoch.
Repeat the process for multiple epochs to improve performance (a minimal training sketch follows these steps).
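The following is a minimal NumPy sketch of these training steps, assuming a sigmoid activation, squared reconstruction error, per-observation weight updates, and a synthetic ratings matrix; the layer sizes and learning rate are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)

# Toy ratings matrix: rows = observations (users), columns = features (movies).
X = rng.random((100, 6))

n_visible, n_hidden = X.shape[1], 3
W1 = rng.normal(0, 0.1, (n_hidden, n_visible)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_visible, n_hidden)); b2 = np.zeros(n_visible)
lr, n_epochs = 0.5, 50

for epoch in range(n_epochs):                 # one epoch = one full pass over the data
    total_error = 0.0
    for x in X:                               # update after each observation
        z = sigmoid(W1 @ x + b1)              # encoding step
        y = sigmoid(W2 @ z + b2)              # decoding step
        total_error += np.linalg.norm(x - y)  # reconstruction error d(x, y) = ||x - y||

        # Backpropagation of the squared reconstruction error
        delta2 = (y - x) * y * (1 - y)          # error at the output layer
        delta1 = (W2.T @ delta2) * z * (1 - z)  # error at the hidden layer

        # Weight updates (gradient descent)
        W2 -= lr * np.outer(delta2, z); b2 -= lr * delta2
        W1 -= lr * np.outer(delta1, x); b1 -= lr * delta1
    if epoch % 10 == 0:
        print(f"epoch {epoch}: mean reconstruction error {total_error / len(X):.4f}")
```

Batch learning would instead accumulate the gradients over a batch of observations and apply them once per batch.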
How to use Overcomplete Hidden Layers in Autoencoders for Feature Extraction:
Overcomplete hidden layers are an underlying concept in most variations of autoencoders.
Suppose we have 4 input nodes, 2 hidden nodes, and 4 output nodes.
What if we wanted more nodes in the hidden layer than in the input layer?
Why would we want more nodes in our hidden layer?
For example, for feature extraction.
Having a larger number of hidden nodes allows us to extract more features.
But we have a problem:
The autoencoder can cheat.
Its goal is simply to get the outputs to equal the inputs.
As soon as we give it at least as many hidden nodes as inputs, it can assign one hidden node to each input and copy the values straight through to the output.
The information just flies through.
We are left with extra nodes that are not being used.
This could be the end state of the model, and such a model is useless: it does not extract any new information for us (a small illustration of this follows).
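To make the cheating concrete, here is a small sketch (using linear activations and hand-built weights, both assumptions made purely for illustration) where an overcomplete autoencoder reproduces its input exactly by wiring each input straight to one hidden node, so nothing useful is learned.

```python
import numpy as np

n_inputs, n_hidden = 4, 6            # more hidden nodes than inputs (overcomplete)

# Hand-built "cheating" weights: the first 4 hidden nodes just copy the 4 inputs,
# the 2 extra hidden nodes are never used.
W_enc = np.zeros((n_hidden, n_inputs))
W_enc[:n_inputs, :n_inputs] = np.eye(n_inputs)
W_dec = W_enc.T

x = np.array([0.2, 0.9, 0.5, 0.7])   # an arbitrary input
z = W_enc @ x                        # hidden layer: a plain copy of x plus two unused zeros
y = W_dec @ z                        # output: identical to the input
print(z)                             # [0.2 0.9 0.5 0.7 0.  0. ]
print(np.allclose(x, y))             # True -- perfect reconstruction, nothing learned
```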
Sparse AutoEncoders:
Sparse autoencoders are used everywhere; the term is sometimes even used interchangeably with autoencoders. A sparse autoencoder is an autoencoder whose hidden layer is larger than the input layer, but with a regularization technique applied that introduces sparsity.
A regularization technique basically means something that helps prevent overfitting or stabilizes the algorithm.
In this case, if the information just flew straight through, that would be overfitting.
How is it different?
It puts a constraint, or penalty, on the loss function that doesn't allow the autoencoder to use all of its hidden layer every single time. At any given time, the autoencoder can only use a certain number of nodes from its hidden layer.
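One common way to implement such a penalty is an L1 term on the hidden activations, sketched below; these notes do not specify which penalty is used (a KL-divergence sparsity constraint is another common choice), and the sigmoid activation, layer sizes, and lambda value here are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 8           # overcomplete hidden layer
W1 = rng.normal(0, 0.1, (n_hidden, n_visible)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_visible, n_hidden)); b2 = np.zeros(n_visible)
lam, lr = 0.1, 0.5                   # sparsity strength and learning rate (illustrative)

x = rng.random(n_visible)
z = sigmoid(W1 @ x + b1)
y = sigmoid(W2 @ z + b2)

# Loss = reconstruction error + sparsity penalty on the hidden activations.
loss = 0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.abs(z))

# Backprop: the penalty adds an extra term to the hidden-layer error,
# pushing most activations toward zero so only a few nodes are "on" at a time.
delta2 = (y - x) * y * (1 - y)
delta1 = (W2.T @ delta2 + lam * np.sign(z)) * z * (1 - z)

W2 -= lr * np.outer(delta2, z); b2 -= lr * delta2
W1 -= lr * np.outer(delta1, x); b1 -= lr * delta1
```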
Denoising Auto Encoders:
It's another regularization technique to combat the problem that, when we have more nodes in the hidden layer than in the input layer, the autoencoder simply copies the values across without finding any meaning.
We take our input and randomly set some of its values to 0. We then put this modified data through the autoencoder and compare the outputs with the original values, not with the modified inputs.
This makes it a stochastic type of autoencoder.
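A minimal sketch of that corruption step, assuming a 30% corruption probability and a small sigmoid network with random weights purely for illustration:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n = 6
W1, b1 = rng.normal(0, 0.1, (3, n)), np.zeros(3)
W2, b2 = rng.normal(0, 0.1, (n, 3)), np.zeros(n)

x = rng.random(n)                       # original input
mask = rng.random(n) > 0.3              # randomly keep ~70% of the values
x_noisy = x * mask                      # the rest are set to 0

z = sigmoid(W1 @ x_noisy + b1)          # encode the CORRUPTED input
y = sigmoid(W2 @ z + b2)                # reconstruction
error = np.linalg.norm(x - y)           # compare against the ORIGINAL input, not x_noisy
print(error)
```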
Contractive Autoencoders:
Another regularization technique, like sparse and denoising autoencoders. It leverages the whole training process: it adds a penalty to the loss function as the error propagates back through the network, which does not allow the autoencoder to simply copy the values across.
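The penalty is typically the squared Frobenius norm of the Jacobian of the hidden activations with respect to the inputs. Below is a sketch of computing that term for a sigmoid encoder; backpropagation of the penalty is omitted for brevity, and the layer sizes and lambda are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 8
W1, b1 = rng.normal(0, 0.1, (n_hidden, n_visible)), np.zeros(n_hidden)
W2, b2 = rng.normal(0, 0.1, (n_visible, n_hidden)), np.zeros(n_visible)
lam = 0.1                                   # penalty strength (illustrative)

x = rng.random(n_visible)
z = sigmoid(W1 @ x + b1)
y = sigmoid(W2 @ z + b2)

# Contractive penalty: squared Frobenius norm of the Jacobian dz/dx.
# For a sigmoid encoder, dz_j/dx_i = z_j * (1 - z_j) * W1[j, i].
jacobian_penalty = np.sum((z * (1 - z)) ** 2 * np.sum(W1 ** 2, axis=1))

loss = 0.5 * np.sum((x - y) ** 2) + lam * jacobian_penalty
print(loss)
```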
What are Stacked AutoEncoders?
If we add a second hidden layer to our autoencoder, we get two stages of encoding and one stage of decoding. This is a very powerful algorithm. These (directed) models can surpass the results achieved by deep belief networks (undirected networks), which was a very important breakthrough.
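A minimal forward-pass sketch of that shape, with two encoding stages and one decoding stage (the layer sizes and sigmoid activation are illustrative assumptions):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_in, n_h1, n_h2 = 6, 4, 2                                  # illustrative layer sizes

W1, b1 = rng.normal(0, 0.1, (n_h1, n_in)), np.zeros(n_h1)   # first encoding stage
W2, b2 = rng.normal(0, 0.1, (n_h2, n_h1)), np.zeros(n_h2)   # second encoding stage
W3, b3 = rng.normal(0, 0.1, (n_in, n_h2)), np.zeros(n_in)   # decoding stage

x = rng.random(n_in)
h1 = sigmoid(W1 @ x + b1)      # first encoding
h2 = sigmoid(W2 @ h1 + b2)     # second encoding
y = sigmoid(W3 @ h2 + b3)      # decode straight back to the input size
print(np.linalg.norm(x - y))   # reconstruction error, minimized during training
```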
Deep autoencoders:
Deep autoencoders are not the same thing as stacked autoencoders.
They are RBMs that are stacked, pre-trained layer by layer, unrolled, and fine-tuned with backpropagation. That is where the directionality comes from. Deep autoencoders come from RBMs.