by Dr Andy Corbett

Lesson

2. What are Deep Neural Networks?

In this video you will...
  • ✅ See a first-hand graphical representation of the building blocks that compose every neural network.
  • ✅ Learn how to fit components together to construct wide and deep MLPs.
  • ✅ Review the backpropagation process, the essential mechanism by which networks are trained.

We seem to be talking about AI and deep neural networks all the time. But can any of us actually write one down? That's what we shall do in this video. We will write down the prototype neural network: the multi-layer perceptron (MLP).

The MLP is modeled on the human brain, and as such is composed of many small neurons, or component models. Each of these comes with parameters, which need to be trained so that the model converges on its target.

What do we mean by component models?


Let's pull back for a moment and ask ourselves: how can we interpret models? Suppose that there has been a new law forcing our bananas to be straighter: the curvy ones are getting the chop! We would need a model to predict the curvature, or alternatively the eccentricity, from easily observed factors such as the weight.

So suppose I have the weights and arc lengths of a bunch of bananas

\mathbf{x} = [(119g, 29cm), (124g, 32cm), (101g, 27cm), (132g, 34cm), \dots],

and for these we have recorded their eccentricities: y = [0.21, 0.30, 0.25, 0.18, \dots]. We have a supervised learning task. What is the simplest model we could use to explore this relationship? A linear model:

y = \mathbf{m}\cdot\mathbf{x} + c.

It is interpretable: we have a physical understanding of the parameters \mathbf{m} and c as the scaling factor and the baseline eccentricity.
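To make this concrete, here is a minimal sketch of fitting that linear model by least squares, using the illustrative banana measurements above. The use of NumPy, and the variable names, are our own choices rather than anything prescribed by the lesson.

```python
import numpy as np

# Features: (weight in g, arc length in cm); targets: eccentricity.
# These are the illustrative values from the text.
x = np.array([[119, 29], [124, 32], [101, 27], [132, 34]], dtype=float)
y = np.array([0.21, 0.30, 0.25, 0.18])

# Append a column of ones so the intercept c is learned alongside m.
X = np.hstack([x, np.ones((len(x), 1))])
params, *_ = np.linalg.lstsq(X, y, rcond=None)
m, c = params[:2], params[2]

print("m =", m, "c =", c)   # scaling factors and baseline eccentricity
print("fit:", X @ params)   # y = m.x + c for each banana
```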

But let's say the problem becomes more complicated. In that case we may need to add other functions to our model, such as a logistic function \sigma:

y = \sigma(\mathbf{m}\cdot\mathbf{x} + c).
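As a sketch, this logistic-wrapped model is only a few lines of NumPy; `sigma` and `model` are illustrative names, not part of the lesson's notation.

```python
import numpy as np

def sigma(z):
    # Logistic function: squashes any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def model(x, m, c):
    # y = sigma(m . x + c): as we shall see, this is exactly one neuron.
    return sigma(x @ m + c)
```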

Did someone mention something about features?


Okay, the features. Well, they are the columns of \mathbf{x}, which for our slippery example are weight and arc length.

But perhaps we need much more complexity.

A linear model supports no interaction between these features. A deep neural network is quite the opposite: many models of the form y = \sigma(\mathbf{m}\cdot\mathbf{x} + c) are stacked on top of each other, connected in every possible way, so as to combine the features in ways that might predict the answer.
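Here is one way to sketch that stacking, under the assumption of fully connected layers with sigmoid activations; the layer widths and the random weights are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer widths: 2 input features -> two hidden layers of 8 units -> 1 output.
sizes = [2, 8, 8, 1]
layers = [(rng.normal(size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(x):
    for W, b in layers:
        # Each layer is many copies of y = sigma(m.x + c), every unit
        # connected to every output of the layer below.
        x = sigma(W @ x + b)
    return x

# One banana, with weight and arc length scaled to modest ranges first.
print(mlp(np.array([1.19, 0.29])))
```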

In fact, since neural networks are quite robust to the type of input data, we could even swap these hand-measured features for a picture of the banana. That's the power of deep learning.

In this video we'll take a look at how these large conglomerate models are formed and optimised at a high level, but with all the details in plain sight.
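As a preview of that optimisation step, here is a hedged end-to-end sketch in PyTorch: the architecture, learning rate, and step count are illustrative choices, and the `loss.backward()` call is where backpropagation happens.

```python
import torch
from torch import nn

# The banana measurements from the text, standardised so the sigmoids
# do not saturate on raw gram/centimetre scales.
x = torch.tensor([[119., 29.], [124., 32.], [101., 27.], [132., 34.]])
x = (x - x.mean(0)) / x.std(0)
y = torch.tensor([[0.21], [0.30], [0.25], [0.18]])

model = nn.Sequential(
    nn.Linear(2, 8), nn.Sigmoid(),   # a hidden layer of stacked neurons
    nn.Linear(8, 1), nn.Sigmoid(),   # output squashed into (0, 1)
)
opt = torch.optim.SGD(model.parameters(), lr=0.5)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()  # backpropagation: gradients flow back through each layer
    opt.step()       # gradient descent: nudge every parameter downhill

print(model(x))  # fitted eccentricities for the four bananas
```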