by Dr Andy Corbett

Lesson

2. What are Deep Neural Networks?

In this video you will...
  • ✅ See a first-hand graphical representation of the building blocks that compose every neural network.
  • ✅ Learn how to fit components together to construct wide and deep MLPs.
  • ✅ Review the backpropagation process, the essential mechanism by which networks are trained.

We seem to be talking about AI and deep neural networks all the time. But can any of us actually write one down? That's what we shall do in this video. We will write down the prototype neural network: the multi-layer perceptron (MLP).

The MLP is modeled on the human brain, and as such is composed of many small neurons, or component models. Each of these comes with parameters, which need to be trained so that the model converges on its target.

What do we mean by component models?


Let's pull back for a moment and ask ourselves: how can we interpret models? Suppose that there has been a new law forcing our bananas to be straighter: the curvy ones are getting the chop! We would need a model to predict the curvature, or alternatively the eccentricity, from easily observed factors such as the weight.

So suppose I have the weights and arc lengths of a bunch of bananas

\mathbf{x} = [(119g, 29cm), (124g, 32cm), (101g, 27cm), (132g, 34cm), \dots],

and for these we have recorded their eccentricities: y = [0.21, 0.30, 0.25, 0.18, \dots]. We have a supervised learning task. What is the simplest model we could use to explore this relationship? A linear model:

y = \mathbf{m}\cdot\mathbf{x} + c.

It is interpretable: we have a physical understanding of the parameters \mathbf{m} and c as the scaling factor and the baseline eccentricity.
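To make this concrete, here is a minimal sketch of fitting that linear model by least squares, using the illustrative banana measurements above. The use of NumPy, and the variable names, are our own choices rather than anything prescribed by the lesson.

```python
import numpy as np

# Features: (weight in g, arc length in cm); targets: eccentricity.
# These are the illustrative values from the text.
x = np.array([[119, 29], [124, 32], [101, 27], [132, 34]], dtype=float)
y = np.array([0.21, 0.30, 0.25, 0.18])

# Append a column of ones so the intercept c is learned alongside m.
X = np.hstack([x, np.ones((len(x), 1))])
params, *_ = np.linalg.lstsq(X, y, rcond=None)
m, c = params[:2], params[2]

print("m =", m, "c =", c)   # scaling factors and baseline eccentricity
print("fit:", X @ params)   # y = m.x + c for each banana
```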

But let's say the problem becomes more complicated. In that case we may need to add other functions to our model, such as a logistic function \sigma:

y = \sigma(\mathbf{m}\cdot\mathbf{x} + c).
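As a sketch, this logistic-wrapped model is only a few lines of NumPy; `sigma` and `model` are illustrative names, not part of the lesson's notation.

```python
import numpy as np

def sigma(z):
    # Logistic function: squashes any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def model(x, m, c):
    # y = sigma(m . x + c): as we shall see, this is exactly one neuron.
    return sigma(x @ m + c)
```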

Did someone mention something about features?


Okay, the features. Well, they are the columns of \mathbf{x}, which for our slippery example are weight and arc length.

But perhaps we need much more complexity.

A linear model supports no interaction between these features. A deep neural network is quite the opposite: many models of the form y = \sigma(\mathbf{m}\cdot\mathbf{x} + c) are stacked on top of each other, connected in every possible way, so as to combine the features in ways that might predict the answer.
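Here is one way to sketch that stacking, under the assumption of fully connected layers with sigmoid activations; the layer widths and the random weights are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer widths: 2 input features -> two hidden layers of 8 units -> 1 output.
sizes = [2, 8, 8, 1]
layers = [(rng.normal(size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(x):
    for W, b in layers:
        # Each layer is many copies of y = sigma(m.x + c), every unit
        # connected to every output of the layer below.
        x = sigma(W @ x + b)
    return x

# One banana, with weight and arc length scaled to modest ranges first.
print(mlp(np.array([1.19, 0.29])))
```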

In fact, since neural networks are quite robust to the type of input data, we could even swap these hand-measured features for a picture of the banana. That's the power of deep learning.

In this video we'll take a look at how these large conglomerate models are formed and optimised at a high level, but with all the details in plain sight.
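As a preview of that optimisation step, here is a hedged end-to-end sketch in PyTorch: the architecture, learning rate, and step count are illustrative choices, and the `loss.backward()` call is where backpropagation happens.

```python
import torch
from torch import nn

# The banana measurements from the text, standardised so the sigmoids
# do not saturate on raw gram/centimetre scales.
x = torch.tensor([[119., 29.], [124., 32.], [101., 27.], [132., 34.]])
x = (x - x.mean(0)) / x.std(0)
y = torch.tensor([[0.21], [0.30], [0.25], [0.18]])

model = nn.Sequential(
    nn.Linear(2, 8), nn.Sigmoid(),   # a hidden layer of stacked neurons
    nn.Linear(8, 1), nn.Sigmoid(),   # output squashed into (0, 1)
)
opt = torch.optim.SGD(model.parameters(), lr=0.5)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()  # backpropagation: gradients flow back through each layer
    opt.step()       # gradient descent: nudge every parameter downhill

print(model(x))  # fitted eccentricities for the four bananas
```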