by Dr Andy Corbett

Lesson

Support Vector Machines

9. A Code Walkthrough for SVMs

📂 Resources

Download the resources for this lesson here.

In this lesson we shall put our theory into practice and walk through a scikit-learn implementation of an SVM classifier.

📑 Learning Objectives
  • A first look at coding a linear SVM with scikit-learn.
  • Exploring hyperparameters and kernel choices.
  • Visualising the output.

Data generation and set-up


Let's attempt to solve a binary classification problem programmatically. First, we import the necessary Python packages and functions.

import matplotlib.pyplot as plt
import numpy as np
from sklearn import svm
from sklearn.utils import shuffle

Now we generate some synthetic data to analyse. For this example, let's warm up with a couple of disjoint blobs.


# Set global constants
NUM_DATA = 200  # Number of data points in each cluster
SPREAD = 0.13  # Spread of the blobs (used, after reduction, as the variance)
REDUCTION = 0.2  # Reduction factor applied to the spread
CPARAM = 10  # Regularisation hyper-parameter C of the SVM
SEED = 31  # Random seed for reproducibility
AX_MIN, AX_MAX = 0.5, 3.5  # Plot axis limits (chosen here to frame the blobs)
SPACE = 0.1  # Subplot spacing used when plotting
np.random.seed(SEED)


# Generate two 'piles' of data
def get_blobs(num_samples, variance):
    """Generate two 2D normal distributions in the NE and SW quadrants."""
    cov = np.asarray([[variance, 0], [0, variance]])  # Isotropic covariance
    mean_ne = np.asarray([2.5, 2.5])  # Centre of the north-east blob
    mean_sw = np.asarray([1.5, 1.5])  # Centre of the south-west blob
    ne = np.random.multivariate_normal(mean=mean_ne, cov=cov, size=num_samples)
    sw = np.random.multivariate_normal(mean=mean_sw, cov=cov, size=num_samples)
    return ne, sw

ne, sw = get_blobs(NUM_DATA, SPREAD * REDUCTION)
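As an optional aside, it can be reassuring to confirm that the sampled blobs really do land where we intended; a quick check of the empirical means (an extra snippet, not needed for the rest of the walkthrough):

# Optional check: empirical means should lie near (2.5, 2.5) and (1.5, 1.5)
print(f'NE blob mean: {ne.mean(axis=0)}')
print(f'SW blob mean: {sw.mean(axis=0)}')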

Then let's prepare the data for analysis.

# Shuffle data points for good measure
X = np.concatenate((ne, sw))
y = np.asarray(len(ne)*[1,] + len(sw)*[-1,])  # Labels: +1 for NE, -1 for SW
X, y = shuffle(X, y, random_state=SEED)
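Before fitting anything, a brief shape and label-balance check can catch wiring mistakes early; a minimal sketch using the arrays defined above:

# Optional check: 2 * NUM_DATA points in 2D, with equal counts of +1 and -1 labels
print(f'X shape: {X.shape}, y shape: {y.shape}')
values, counts = np.unique(y, return_counts=True)
print(f'Label counts: {dict(zip(values.tolist(), counts.tolist()))}')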

SVM Model


Now we'll fit scikit-learn's SVM classifier to the (X, y) data.

# Fit an SVM Classifier
clf = svm.SVC(kernel='linear', C=CPARAM)
clf.fit(X, y)
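The two hyperparameters doing the work here are the kernel and the regularisation strength C. As an aside, switching to a non-linear kernel is a one-line change; the sketch below uses scikit-learn's default gamma setting and a model name of our own choosing, and we won't use it further in this lesson.

# Illustrative only: the same data fitted with an RBF (non-linear) kernel
clf_rbf = svm.SVC(kernel='rbf', C=CPARAM, gamma='scale')
clf_rbf.fit(X, y)
print(f'RBF model support vectors per class: {clf_rbf.n_support_}')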

As a demonstration, let's see how the model classifies a point from inside each of these blobs.

# Test the output
a = [1, 1]
b = [3, 3]
pred_a, pred_b = clf.predict([a, b])
print('Example predictions:')
print(f'\t {a} is classified as {pred_a}')
print(f'\t {b} is classified as {pred_b}')
print(f'Number of support vectors: {clf.n_support_} in classes {clf.classes_}')
print(f'Hyperplane coefficients: {clf.coef_}')
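Beyond the hard labels, clf.decision_function returns the signed value of the hyperplane equation for each point: its sign gives the predicted class, and dividing by the norm of the coefficients gives the signed distance to the decision boundary. A short sketch using the same two test points:

# Signed decision values: positive for class +1, negative for class -1
scores = clf.decision_function([a, b])
distances = scores / np.linalg.norm(clf.coef_)
print(f'Decision values: {scores}')
print(f'Signed distances to the boundary: {distances}')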

In this example, we'll simply inspect the output graphically by plotting the decision boundary alongside the classification margin of the SVM. This is the luxury that two dimensions offer!

# x-coordinates along which to draw the boundary and margins
x0 = np.linspace(AX_MIN, AX_MAX, 200)

# Hyperplane equation: c[0]*x + c[1]*y + intercept_[0] = 0
c = clf.coef_[0]
slope = -c[0] / c[1]
y0 = slope * x0 - (clf.intercept_[0] / c[1])

# Margin boundaries: shift the boundary by 1/||w|| on either side
margin = 1 / np.sqrt(np.sum(clf.coef_**2))
y_neg = y0 - np.sqrt(1 + slope**2) * margin
y_pos = y0 + np.sqrt(1 + slope**2) * margin
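Because the data are cleanly separable, the support vectors should sit (to numerical precision) on the margin lines, i.e. at distance 1/||w|| from the decision boundary. Here is one optional way to verify that numerically:

# Optional check: each support vector lies a distance `margin` from the boundary
sv_dist = np.abs(clf.decision_function(clf.support_vectors_)) / np.linalg.norm(clf.coef_)
print(f'Margin width 1/||w||: {margin:.4f}')
print(f'Support vector distances: {np.round(sv_dist, 4)}')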

Now we can generate a plot. In particular, we can single out the support vectors, which govern the final solution, by calling clf.support_vectors_.

# Plot results
fig, ax = plt.subplots(1, 1, figsize=[6, 6])
plt.subplots_adjust(wspace=SPACE, hspace=SPACE)
ax.tick_params(direction='in')
ax.set_xlim(AX_MIN, AX_MAX)
ax.set_ylim(AX_MIN, AX_MAX)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

# Plot data and ground truth
gold = ax.scatter(sw[:, 0], sw[:, 1], s=10, color='goldenrod')
blue = ax.scatter(ne[:, 0], ne[:, 1], s=10, color='navy')
gt, = ax.plot(
        x0, -x0 + 4, color='lightcoral', linestyle='--', linewidth=2,
)
svs = ax.scatter(
        clf.support_vectors_[:, 0],
        clf.support_vectors_[:, 1],
        s=80,
        facecolors='none',
        zorder=10,
        edgecolors='fuchsia',
)
pred, = ax.plot(x0, y0, "g-")
mar, = ax.plot(x0, y_neg, "g--")
ax.plot(x0, y_pos, "g--")
ax.legend(
    [svs, mar, pred, gt],
    ['Support vectors', r'$y=\pm1$', '$y=0$', 'GT'],
    loc='upper left',
    framealpha=1.,
)
plt.show()

[Figure: Strict Margin Classification]

Figure 1. The output of our code demo, displaying an SVM classifying two clusters whilst observing a hard margin boundary.
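As an aside, if your scikit-learn version is 1.1 or later, the decision regions can also be shaded directly with the library's DecisionBoundaryDisplay helper, avoiding the by-hand boundary construction above; a minimal sketch:

# Optional alternative: shade the decision regions with scikit-learn's helper
from sklearn.inspection import DecisionBoundaryDisplay

fig2, ax2 = plt.subplots(figsize=[6, 6])
DecisionBoundaryDisplay.from_estimator(clf, X, ax=ax2, alpha=0.3)
ax2.scatter(X[:, 0], X[:, 1], c=y, s=10)
plt.show()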

Only a few support vectors are needed to define the trained model. The support vectors also highlight the most sensitive points: those which decide the classification.
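One way to make that concrete is to refit the classifier on the support vectors alone and confirm that the learned hyperplane barely changes; a quick sketch (using a new model name of our own):

# Refit using only the support vectors; the hyperplane should be (almost) unchanged
sv_idx = clf.support_
clf_sv = svm.SVC(kernel='linear', C=CPARAM)
clf_sv.fit(X[sv_idx], y[sv_idx])
print(f'Original coefficients: {clf.coef_}')
print(f'Support-vector-only coefficients: {clf_sv.coef_}')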