In preparation for building a more full-scale, regularized auto-encoder for motor anomaly detection (which you can find here), I wanted to experiment with implementing one from scratch. This model is trained to replicate images from the MNIST dataset, while gaining an inherent understanding of how each class typically presents itself (e.g., what a 7 usually looks like). Learn more below.
Auto-encoders are neural networks used for unsupervised learning, particularly for dimensionality reduction and feature learning. An auto-encoder consists of two main components: an encoder and a decoder.
- Encoder: The encoder compresses the input data into a lower-dimensional representation (latent space). This compressed representation captures the most important features of the input data.
- Decoder: The decoder takes the compressed representation and attempts to reconstruct the original input data from it. The goal is to minimize the difference between the input data and the reconstructed output, as measured by some loss function "L" (see the sketch below).
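To make this concrete, here is a minimal sketch of what such a model might look like in PyTorch. The layer sizes and the latent dimension of 32 are illustrative assumptions on my part, not necessarily the values used in the notebook:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal fully-connected auto-encoder for flattened 28x28 MNIST images."""

    def __init__(self, latent_dim: int = 32):  # latent size is an assumption
        super().__init__()
        # Encoder: compress the 784-pixel image into a small latent vector
        self.encoder = nn.Sequential(
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: reconstruct the image from the latent vector
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 28 * 28),
            nn.Sigmoid(),  # squash outputs to [0, 1], matching pixel range
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)     # lower-dimensional representation
        return self.decoder(z)  # attempted reconstruction of the input
```

Here the loss "L" would typically be the mean squared error between the input and its reconstruction.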
Auto-encoders are widely used in various applications, including anomaly detection, image denoising, and data compression. In the process of encoding and decoding, the model learns to identify patterns and structures in the data.
This implementation uses a Jupyter notebook for easy prototyping, along with:
- PyTorch, for model architecture, training loops and tensor operations
- Matplotlib, for visualization
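For reference, a training loop for a model like this could look roughly like the following. The batch size, learning rate, and epoch count are assumptions for illustration, not the notebook's exact settings:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_data = datasets.MNIST(root="data", train=True, download=True,
                            transform=transforms.ToTensor())
loader = DataLoader(train_data, batch_size=128, shuffle=True)

model = AutoEncoder(latent_dim=32)  # the sketch defined above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

for epoch in range(10):
    total = 0.0
    for images, _ in loader:  # labels unused, since training is unsupervised
        x = images.view(images.size(0), -1)  # flatten to (batch, 784)
        loss = criterion(model(x), x)        # reconstruction vs. input
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total += loss.item()
    print(f"epoch {epoch}: avg loss {total / len(loader):.4f}")
```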
The model provided clear feedback during training:
Below are some reconstruction results from MNIST:
As mentioned above, auto-encoders can also be used to denoise images. This is because their "understanding" of the features in an image can help them infer what pixel values should be. I wanted to experiment with this myself, so I applied a strong Gaussian blur to MNIST images and fed them into the model to be reconstructed.
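A rough sketch of that experiment, assuming torchvision's GaussianBlur for the blurring step (the kernel size and sigma are my guesses at what "strong" means here):

```python
import torch
import matplotlib.pyplot as plt
from torchvision.transforms import GaussianBlur

# Reusing `model` and `train_data` from the training sketch above
blur = GaussianBlur(kernel_size=7, sigma=2.0)  # "strong" blur; values are assumptions

image, _ = train_data[0]   # a (1, 28, 28) tensor
blurred = blur(image)

model.eval()
with torch.no_grad():
    recon = model(blurred.view(1, -1)).view(28, 28)

# Show the original, the blurred input, and the model's reconstruction
fig, axes = plt.subplots(1, 3)
for ax, (img, title) in zip(axes, [(image.squeeze(), "original"),
                                   (blurred.squeeze(), "blurred"),
                                   (recon, "reconstructed")]):
    ax.imshow(img, cmap="gray")
    ax.set_title(title)
    ax.axis("off")
plt.show()
```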
As seen above, the model is typically able to slightly denoise the image, using its understanding of various digits to infer pixel values (brightness). This was really cool to see.