Recently the term deep learning has become a buzzword in the field of technology. People are excited about using deep learning to develop products that solve problems (including those in the field of Artificial Intelligence, a.k.a. AI) which seemed impossible (or too futuristic) in the past.
A few days ago, news that a computer system trained with deep learning had beaten a top human Go player caught my attention. Until now, Go (unlike chess) was a game in which computers couldn’t reliably beat strong human players. So I was intrigued. What is this deep learning? I set out to figure it out using some papers, lectures and this book.
“If you can’t explain it simply, you don’t understand it well enough.”
So this post is an attempt to explain some of these concepts and to ensure that I understand them well.
Let’s start with concisely stating what deep learning is and then go into further details.
Deep Learning is an approach to Machine Learning based on the knowledge of
- learning algorithms
- applied mathematics
Deep Learning has recently regained popularity thanks to the increased computing power of CPUs and general-purpose GPUs, the availability of huge datasets (quite a bit of the data labeled) and new techniques for training deeper neural networks.
Let’s take a step back to understand how deep learning fits with AI.
- Classical algorithms in Computer Science map representations to output.
- The goal of AI is to learn how to generate these representations in order to convert them to output, without human intervention.
- There are many factors that humans use to understand / classify the observed input data into output.
- Usually, humans are required to disentangle these features and remove the redundant ones to specify them as representations in classical algorithms.
Deep Learning organizes these features in a hierarchy where simpler concepts combine to generate a complex one.
Deep learning uses multilayered neural networks to understand inputs and extract abstract concepts (a.k.a. features) from them. These abstract concepts are combined hierarchically to generate complex representations that in turn generate output.
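To make the idea of hierarchy concrete, here is a minimal sketch in plain Python of a two-layer network that computes XOR. The hidden layer extracts two simple features from the raw input ("a OR b" and "a AND b"), and the output layer combines them into the more complex concept "a XOR b". Everything here is an assumption made for illustration: the names `step`, `layer` and `xor_net` are hypothetical, and the weights are picked by hand. A real deep learning system would learn such weights from data rather than have a human specify them.

```python
def step(x):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """One fully connected layer: each unit is a thresholded weighted sum."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Hand-picked weights (an assumption for illustration, not learned from data).
# Hidden layer: two simple features of the raw input (a, b).
HIDDEN_W = [[1, 1],   # unit 0 fires when a OR b
            [1, 1]]   # unit 1 fires when a AND b
HIDDEN_B = [-0.5, -1.5]

# Output layer: combines the simple features into XOR = OR and not AND.
OUT_W = [[1, -1]]
OUT_B = [-0.5]


def xor_net(a, b):
    """Forward pass: raw input -> simple features -> complex concept."""
    hidden = layer([a, b], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUT_W, OUT_B)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

No single thresholded unit can compute XOR directly from the raw inputs, which is why the intermediate layer of simpler features is needed here.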
It is also worth noting that Deep Learning does not attempt to simulate the human brain. Instead it takes the knowledge we have about learning hierarchical representations automatically from data and uses it to build computer systems that solve tasks requiring intelligence.