Exploring a generative learning model
GDA is a generative learning algorithm: it learns P(x|y) rather than the direct mapping from features (x) to labels (y) that discriminative algorithms learn. In this type of algorithm, we model the distribution of the features given their source (the labels), assuming the features of each class come from a Gaussian distribution.
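Concretely, once P(x|y) and the class prior P(y) are modelled, Bayes' rule recovers P(y|x), the quantity a discriminative model learns directly:

$$
P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}
$$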
What is a Gaussian Distribution?
It is the classic distribution over a single scalar random variable x, parameterized by a mean (mu) and a standard deviation (sigma), and it traces out the familiar bell curve. Its probability density function is as follows:
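$$
p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)
$$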
What is a Multivariate Gaussian?
A multivariate Gaussian generalizes the Gaussian, defined over a one-dimensional random variable, to multiple random variables at the same time: it is a distribution over a vector-valued random variable rather than a univariate one. Its probability density function is as follows:
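$$
p(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)
$$

where x and mu are d-dimensional vectors and Sigma is the d-by-d covariance matrix.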
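Below is a minimal Python sketch of the fitting procedure described next; the synthetic data, the random seed, and the predict helper are illustrative assumptions, and equal class priors are assumed:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative 2-D data: 100 points per class, as in the post's dataset.
rng = np.random.default_rng(0)
X_a = rng.multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.3], [0.3, 1.0]], size=100)
X_b = rng.multivariate_normal(mean=[3.0, 3.0], cov=[[1.0, -0.2], [-0.2, 1.5]], size=100)

# Fit a multivariate Gaussian to each class: the maximum-likelihood
# estimates are simply the sample mean and sample covariance.
mu_a, cov_a = X_a.mean(axis=0), np.cov(X_a, rowvar=False)
mu_b, cov_b = X_b.mean(axis=0), np.cov(X_b, rowvar=False)

def predict(x):
    """Assign x to the class whose fitted Gaussian gives it higher density."""
    p_a = multivariate_normal.pdf(x, mean=mu_a, cov=cov_a)
    p_b = multivariate_normal.pdf(x, mean=mu_b, cov=cov_b)
    return "A" if p_a >= p_b else "B"

print(predict([0.5, 0.2]))  # lies near class A's mean -> "A"
print(predict([2.8, 3.1]))  # lies near class B's mean -> "B"
```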
In the above code, GDA is used to perform two-class classification by modelling class A and class B separately. Once we fit a multivariate Gaussian to each class individually, we can obtain the probability of any new data point from each class's probability density function.
A two-dimensional dataset with 100 data points from class A and 100 data points from class B; the red points mark the class means.
After fitting Gaussians to both classes independently, we can get an approximation of the decision boundary, as shown in the contour plot.
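The boundary itself is the set of points where the two prior-weighted class densities are equal, i.e. where P(x|A)P(A) = P(x|B)P(B); when the two fitted covariance matrices differ, this traces out a quadratic curve.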
A better visualization of how GDA fits Gaussians to a dataset is shown below. Note: this plot is not of the provided dataset; it is only meant to give a picture of how GDA fits Gaussians to a distribution.