by Dr Freddy Wordingham


11. Deploying as a serverless image

In the last lesson we wrote the stack code to classify an image using a website.

This time, we're going to deploy the app to the cloud so that we can share it with everyone!

⚠️ Your app will be available to everyone on the internet. Be careful where you share the URL until you've finished protecting your site. Before you send it to your grandma to show her that the AI classifies pictures of her as an aircraft, you should at least consider:

  • CORS: Control which domains can interact with your server, reducing the attack surface.
  • HTTPS: Ensures data integrity and confidentiality.
  • Authentication & Authorization: Make sure only permitted users can perform actions.
  • Rate Limiting: Prevent abuse and maintain availability.
  • Input Sanitization: Stop injection attacks by validating and sanitizing user inputs.
  • Security Headers: Help the browser enforce security policies.
  • Logging & Monitoring: Early detection of malicious activities.
  • Validation & Encoding: Confirm the integrity of data being sent or received.
  • Encryption: Protect data at rest or in transit.
  • Update Dependencies: Keep all libraries and components up-to-date to minimize vulnerabilities.
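The CORS item above boils down to an allowlist check: the server only vouches for origins it trusts. Here's a stdlib-only sketch of the idea (the domains are placeholders, not part of this project):

```python
# Sketch of the allowlist check behind CORS: the server compares the
# request's Origin header against a set of trusted domains and only
# echoes back Access-Control-Allow-Origin for trusted ones.
ALLOWED_ORIGINS = {"https://example.com", "https://www.example.com"}

def cors_headers(origin: str) -> dict:
    """Return the CORS response headers for a given Origin header."""
    if origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": origin}
    return {}  # No header: the browser blocks the cross-origin response

print(cors_headers("https://example.com"))
print(cors_headers("https://attacker.example.net"))
```

In a real FastAPI app you would reach for middleware rather than rolling this yourself, but the decision it makes is exactly this one.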

🗄 Serverless


Wouldn't it be amazing if we could just upload our code to a magical server, which was always online, behaved in the same way each time it handled a request, and could handle as many requests as we could ever need? And was completely free?

It would!

We don't have that. But we do have "serverless" architecture, which is pretty close - just not (quite) free.

To be clear, there are still physical machines out there running our code! We just don't have to think too much about how they work, or how to manage them. We just upload our code, and it runs.

☁️ Serverless is a cloud computing execution model where the cloud provider dynamically manages the allocation and provisioning of servers. A serverless application runs in stateless compute containers that are event-triggered, ephemeral (may last for a single invocation), and fully managed by the cloud provider. Pricing is based on the number of executions rather than pre-purchased compute capacity.

It's not that there are no servers, it's that you don't have to worry about them. You can focus on your application and not the infrastructure.
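Because pricing is per execution, you can estimate the bill with simple arithmetic. The prices below are illustrative placeholders, not real AWS figures - check the current pricing page before relying on them:

```python
# Back-of-the-envelope Lambda cost model: you pay per request plus per
# GB-second of compute. These rates are illustrative placeholders only.
PRICE_PER_REQUEST = 0.0000002    # USD, illustrative
PRICE_PER_GB_SECOND = 0.0000167  # USD, illustrative

def monthly_cost(invocations: int, avg_seconds: float, memory_gb: float) -> float:
    """Estimate the monthly bill for a given workload."""
    compute = invocations * avg_seconds * memory_gb * PRICE_PER_GB_SECOND
    requests = invocations * PRICE_PER_REQUEST
    return compute + requests

# 100,000 invocations of a 1-second, 4 GB function
print(round(monthly_cost(100_000, 1.0, 4.0), 2))
```

The point is the shape of the model: idle time costs nothing, and the bill scales with how much your function actually runs.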

We've already made our app Lambda-compatible, with the code at the end of our main.py script:

# Define the Lambda handler
handler = Mangum(app)


# Prevent Lambda showing errors in CloudWatch by handling warmup requests correctly
def lambda_handler(event, context):
    if "source" in event and event["source"] == "aws.events":
        print("This is a warm-up invocation")
        return {}
    else:
        return handler(event, context)

This allows us to serve our FastAPI application using an AWS Lambda.

The Mangum wrapper is what makes this possible, and the lambda_handler function acts as the entry point to our Docker image.
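The dispatch logic can be exercised without AWS at all. Here's a minimal sketch with a stub standing in for the Mangum-wrapped app (the stub's response is made up for illustration):

```python
# Stub standing in for Mangum(app) so the dispatch can run locally.
def handler(event, context):
    return {"statusCode": 200, "body": "classified"}

def lambda_handler(event, context):
    # Scheduled warm-up pings arrive with "source": "aws.events";
    # real HTTP requests fall through to the app handler.
    if event.get("source") == "aws.events":
        return {}
    return handler(event, context)

print(lambda_handler({"source": "aws.events"}, None))  # warm-up: {}
print(lambda_handler({"path": "/classify"}, None))     # real request
```

Warm-up events return an empty dict immediately, so they never reach the app and never show up as errors in CloudWatch.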

🐳 Docker


Now there's one more thing we need to think about before we can deploy our code to the cloud. We've been using Python to write our code, and we've been using a few libraries to help us out.

When we run our code on our own machine, we have a specific version of Python installed, and we have specific versions of the libraries we're using installed. If we run our code on a different computer, we might have a different version of Python, or different versions of the libraries we're using. This means that our code might behave differently on different computers.

In order to make sure that our code works in the same way each time it is run, we need to make sure that the environment it is run in is the same each time. This is where Docker comes in.

We can package our code, and all of its dependencies into a single "image", which can then be run on any machine that has Docker installed.

Images are read-only templates that contain a set of instructions for creating a container that can run on the Docker platform. It's a single file with everything required to run an application, including code, runtime, system tools, system libraries, and settings. Images are often created with the build command, and they'll produce a container when started with run.

Containers are a runnable instance of an image. You can create, start, stop, move, or delete a container using the Docker API or CLI. You can connect a container to one or more networks, attach storage to it, or even create a new image based on its current state. By default, a container is relatively well isolated from other containers and its host machine. You can control how isolated a container's network, storage, or other underlying subsystems are from other containers or from the host machine.

Dockerfile

The set of instructions for creating a Docker image is called a "Dockerfile":

touch Dockerfile

It's a text file that contains all of the commands a user could call on the command line to assemble an image. This might include:

  • Which base image to use (this might be an almost-empty operating system with some basic utilities installed)
  • Which files to copy into the image
  • Which commands to run when the image is built
  • Which ports to expose
  • Which commands to run when the image is started

Here's what our Dockerfile looks like:

FROM public.ecr.aws/lambda/python:3.9

# Install dependencies
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt

# Copy app code
COPY . .

# Lambda handler
CMD ["main.lambda_handler"]

  1. We're using a pre-built image from Amazon, which has Python 3.9 installed.
  2. We copy our requirements.txt file into the image, and install the dependencies.
  3. We copy the rest of our code into the image.
  4. We tell Docker which function to run when the image is started.

In our Dockerfile we use the CMD command to tell Docker to run our lambda_handler function in main.py when the image is started.

🏗 Building the Image

We won't have to build the image manually when we deploy, but we can check that everything is working by building the image and running it locally.

However, on AWS the app is run by Lambda, while locally we serve it with Uvicorn. So we need to make a couple of changes to the Dockerfile.

The base image needs to be:

FROM python:3.9

And the CMD command in our Dockerfile needs to run Uvicorn instead of the Lambda handler:

# Expose the port the app runs on
EXPOSE 8000

# Command to run the application using Uvicorn
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

To build the image run:

docker build -t image-classification .

The -t flag tells Docker to tag the image with the name "image-classification".

And then run the container with:

docker run -p 8000:8000 image-classification

The -p flag tells Docker to map port 8000 on the host machine to port 8000 in the container.

If you visit http://localhost:8000/ in your browser, you should see the app running in the container!
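If you'd rather check from a script than a browser, a tiny stdlib smoke test does the job. Returns the HTTP status if something is listening, or None if nothing is:

```python
# Optional smoke test for the local container: fetch the given URL and
# report the HTTP status code, or None if nothing is listening.
import urllib.error
import urllib.request

def check(url: str, timeout: float = 2.0):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except (urllib.error.URLError, OSError):
        return None

print(check("http://localhost:8000/"))  # 200 if the container is up
```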

⚠️ You need to change the first and last lines of the Dockerfile back to:

FROM public.ecr.aws/lambda/python:3.9
# Lambda handler
CMD ["main.lambda_handler"]

before you deploy to AWS Lambda!

☁️ Serverless -less


We're going to use a tool called serverless to deploy our code to AWS Lambda. Serverless is a framework that makes it easy to deploy serverless applications to the cloud. I wish they had called it "serverlessless", but they didn't. So now the term is even more confusing than it needs to be.

We need to tell serverless how to build our image, and how to deploy it. We do this by creating a serverless.yml file:

service: image-classification-service

provider:
  name: aws
  memorySize: 4096
  region: eu-west-2
  timeout: 60
  ecr:
    images:
      image-classification:
        path: ./
        platform: linux/amd64

functions:
  AppFunction:
    image:
      name: image-classification
    url: true
    events:
      - schedule: rate(1 minute)

  1. We give our service a name.
  2. We tell serverless which cloud provider we're using.
  3. Which region we want to deploy to, eu-west-2 (London) in this case.
  4. How much working memory to allocate to our function (4096MB = 4GB).
  5. How long to wait for our function to finish before timing out (60 seconds).
  6. Elastic Container Registry (ECR) is a service that allows us to store Docker images. We tell serverless to build a single image, called "image-classification", from the Dockerfile in the current directory. We also tell it to build the image for the linux/amd64 platform.
  7. We tell serverless to create a function called "AppFunction", and to use the image we just built.
  8. And we want to poke our function every minute to keep it "warm" so that it responds quickly to requests.

☃️ Cold starts are when a function is invoked for the first time, or after it has been idle for a while. The function needs to be loaded into memory, and the container needs to be started. This can take a few seconds, which is not ideal for a web app. We can keep our function "warm" by invoking it every minute, which means it should respond quickly to requests.
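It's worth checking what that rate(1 minute) schedule adds up to. A quick calculation (using a 31-day month, and comparing against the roughly one million free requests AWS includes per month at the time of writing):

```python
# How many extra invocations does rate(1 minute) warming add per month?
warmups_per_day = 24 * 60          # one ping per minute
warmups_per_month = warmups_per_day * 31

print(warmups_per_month)  # 44640
```

So warming costs tens of thousands of invocations a month - a small fraction of the free tier, but not zero.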

⚠️ Make sure that you've got the serverless framework installed:

npm install -g serverless

And then deploy the app to the cloud!

serverless deploy

After a while (might be a few minutes), a URL will be printed out in the terminal. You can open this URL in your browser, and you should see the app running in the cloud!

If you wish to remove your app from the cloud, you can run:

serverless remove

⏳️ Limits


AWS Lambdas are very cheap per invocation (fractions of a penny cheap), and AWS offers a reasonable number of free invocations per month.

Nonetheless, if you're concerned you can add a rate limit to your application by modifying the serverless.yml file:

functions:
  AppFunction:
    memorySize: 4096 # Memory size in MB
    timeout: 60 # Timeout in seconds
    image:
      name: image-classification
    url: true
    events:
      - schedule: rate(1 minute) # Ping the function every 1 minute to keep it warm
      - http:
          path: api/v1/resource
          method: get
          cors: true
          throttling:
            maxRequestsPerSecond: 10
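The behaviour behind a 10-requests-per-second throttle is commonly modelled as a token bucket: the bucket refills at the allowed rate, and each request spends one token. A stdlib sketch of the idea (the numbers are illustrative, not how API Gateway is implemented internally):

```python
# Token-bucket sketch of a 10-requests-per-second throttle.
class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=10, capacity=10)
# 15 requests arriving in the same instant: the first 10 pass, the rest are dropped
results = [bucket.allow(now=0.0) for _ in range(15)]
print(results.count(True))  # 10
```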

📑 APPENDIX


🎽 How to Run

🧱 Build Frontend

Navigate to the frontend/ directory:

cd frontend

Install any missing frontend dependencies:

npm install

Build the files for distributing the frontend to clients:

npm run build

🖲 Run the Backend

Go back to the project root directory:

cd ..

Activate the virtual environment, if you haven't already:

source .venv/bin/activate

Install any missing packages:

pip install -r requirements.txt

If you haven't already, train a CNN:

python scripts/train.py

Continue training an existing model:

python scripts/continue_training.py

Serve the web app:

python -m uvicorn main:app --port 8000 --reload

🚀 Deploy

Deploy to the cloud:

serverless deploy

Remove from the cloud:

serverless remove

🗂️ Updated Files

Project structure
.
├── .venv/
├── .gitignore
├── .serverless/
├── resources
│   └── dog.jpg
├── frontend
│   ├── build/
│   ├── node_modules/
│   ├── public/
│   ├── src
│   │   ├── App.css
│   │   ├── App.test.tsx
│   │   ├── App.tsx
│   │   ├── ImageUpload.tsx
│   │   ├── index.css
│   │   ├── index.tsx
│   │   ├── logo.svg
│   │   ├── react-app-env.d.ts
│   │   ├── reportWebVitals.ts
│   │   ├── setupTests.ts
│   │   └── Sum.tsx
│   ├── .gitignore
│   ├── package-lock.json
│   ├── package.json
│   ├── README.md
│   └── tsconfig.json
├── output
│   ├── activations_conv2d/
│   ├── activations_conv2d_1/
│   ├── activations_conv2d_2/
│   ├── activations_dense/
│   ├── activations_dense_1/
│   ├── model.h5
│   ├── sample_images.png
│   └── training_history.png
├── scripts
│   ├── classify.py
│   ├── continue_training.py
│   └── train.py
├── Dockerfile
├── main.py
├── README.md
├── requirements.txt
└── serverless.yml
Dockerfile
FROM public.ecr.aws/lambda/python:3.9

# Install dependencies
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt

# Copy app code
COPY . .

# Lambda handler
CMD ["main.lambda_handler"]
serverless.yml
service: image-classification-service

provider:
  name: aws
  memorySize: 4096
  region: eu-west-2
  timeout: 60
  ecr:
    images:
      image-classification:
        path: ./
        platform: linux/amd64

functions:
  AppFunction:
    image:
      name: image-classification
    url: true
    events:
      - schedule: rate(1 minute)