SVM – Visualizing the kernel function

In the last post we looked at SVM kernels and what they actually do. To catch up on the rest of this topic, follow the link below:

Link: https://codingmachinelearning.wordpress.com/2016/07/25/support-vector-machines-kernel-explained/

In this post we will continue from where we previously left off. We have generated a toy data set that looks like a doughnut, on which a linear decision boundary does not perform well. Look at the data set below.
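One way to generate such a doughnut-shaped toy data set is scikit-learn's `make_circles` helper. This is only a sketch; the original post may have generated the points differently, and the sample size and noise level here are assumptions.

```python
import numpy as np
from sklearn.datasets import make_circles

# 200 points in two concentric rings; `factor` sets the
# ratio of the inner ring's radius to the outer ring's.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

print(X.shape)  # (200, 2) -- points in 2-dim
print(set(y))   # {0, 1}   -- inner ring vs outer ring
```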

[Figure: the doughnut-shaped toy data set in two dimensions]

This is how our data set looks. We will define a kernel for this data set and see how the data can be projected up to a 3-dim surface so that the points become linearly separable.

A kernel is just a function that takes an n-dim vector as input and gives an (n+k)-dim vector as output, where k is the number of dimensions by which we project the data up.

In our case we let k = 1. So our kernel takes a 2-dim input (x, y) and projects the data to 3-dim (x, y, z).

We define the kernel as follows: K(x, y) = (x, y, x² + y²)

That is, if we have the point (x, y) = (1, 2), then K(x, y) = (1, 2, 5), and in this way every point in the data set gets transformed. If we now visualise the data in 3-dim space, we get the following transformation of the data.
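The transformation above can be sketched in a few lines of NumPy. The function name `kernel_transform` is my own; it simply applies the mapping K(x, y) = (x, y, x² + y²) to every row of a point array.

```python
import numpy as np

def kernel_transform(points):
    """Map 2-dim points (x, y) to 3-dim points (x, y, x**2 + y**2)."""
    x, y = points[:, 0], points[:, 1]
    return np.column_stack([x, y, x**2 + y**2])

pts = np.array([[1.0, 2.0], [0.5, -0.5]])
print(kernel_transform(pts))
# the first row is (1, 2, 5), matching the worked example above
```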

[Figure: the data set after the kernel transformation, plotted in 3-dim space]
This is how the kernel transformation works. In this case we had an intuitive idea of what the kernel function should be, but coming up with the best kernel for a given problem is close to impossible. So in practice we typically choose the Gaussian (RBF) kernel or the polynomial kernel. Remember, a kernel is just a mathematical function that projects the data up.

Once this is done, we can see a clear separation of the data set in the higher-dimensional space, so we can apply SVM to get a linear hyperplane that separates the points. Once we have the hyperplane, we can project it back down to the lower dimension, where it looks like a circle, as we saw in the previous post.

Scikit-learn lets you define your own kernel, but this is rarely done in practice. We can instead use one of the built-in options by setting the kernel to 'rbf' or 'poly'.
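For completeness, here is a sketch of what a custom kernel looks like in scikit-learn: `SVC` accepts a callable that returns the Gram matrix between two sets of points. For our feature map (x, y) → (x, y, x² + y²), the corresponding inner product is k(a, b) = a·b + |a|²|b|². The function name `ring_kernel` and the data-generation settings are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

def ring_kernel(A, B):
    """Gram matrix for the feature map (x, y) -> (x, y, x^2 + y^2):
    k(a, b) = a . b + |a|^2 * |b|^2."""
    sq_a = np.sum(A**2, axis=1)          # |a|^2 for each row of A
    sq_b = np.sum(B**2, axis=1)          # |b|^2 for each row of B
    return A @ B.T + np.outer(sq_a, sq_b)

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
clf = SVC(kernel=ring_kernel).fit(X, y)

# the lifted data is (near) linearly separable, so training
# accuracy should be very high on this toy set
print(clf.score(X, y))
```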

Code for generating the above plots is available in my GitHub account; the link is given below.

Link: https://github.com/vsuriya93/coding-machine-learning/blob/master/SVM/Kernel_transform.py

In the next post we will pick up from here: now that we have projected the data up, how do we get a large-margin classifier? We will use scikit-learn to build the classifier and plot the decision boundary.

To summarize:

  • Kernels are mathematical functions
  • They project data from n-dim to (n+k)-dim space
  • k is the number of dimensions to project up
  • We can manually define kernels – but this is rarely done
