
Gaussian Mixtures | Introduction

Expectation-Maximization for Gaussian Mixture


A Gaussian Mixture is a type of model for the probability distribution of the data points in a data set. It is a probabilistic model which assumes that the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters.
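
Concretely, the model assumes each point x is drawn from a density of the form p(x) = Σₖ πₖ 𝒩(x | μₖ, Σₖ), where the mixing weights πₖ are non-negative and sum to one, and μₖ and Σₖ are the mean and covariance of the k-th Gaussian component.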


Gaussian Mixture uses the EM (Expectation-Maximization) algorithm to estimate these parameters. The algorithm starts from an initial estimate of the density contours, which in SciKit Learn is controlled by the init_params parameter; the options are kmeans, k-means++, random, and random_from_data. The algorithm then repeatedly updates the density contours until convergence. Each iteration consists of an E-step and an M-step (a minimal sketch of one iteration follows the list below):


  • E-step: compute the membership weights of every data point with respect to each of the current Gaussians (how strongly each Gaussian "owns" each point)

  • M-step: use the membership weights to re-estimate the density contours, i.e. the parameters (weights, means, and covariances) of the Gaussians
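
For readers who prefer to see the two steps in code, here is a minimal NumPy/SciPy sketch of one EM iteration; the function name em_step and the variable names are illustrative choices, not taken from any library.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, weights, means, covs):
    """One EM iteration (illustrative sketch).
    X: (n, d) data; weights: (K,) mixing weights; means: (K, d); covs: (K, d, d)."""
    n, d = X.shape
    K = len(weights)

    # E-step: membership weights (responsibilities) of each Gaussian for each point.
    resp = np.zeros((n, K))
    for k in range(K):
        resp[:, k] = weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: re-estimate mixing weights, means, and covariances from the responsibilities.
    Nk = resp.sum(axis=0)                      # effective number of points per component
    new_weights = Nk / n
    new_means = (resp.T @ X) / Nk[:, None]
    new_covs = np.zeros_like(covs)
    for k in range(K):
        diff = X - new_means[k]
        new_covs[k] = (resp[:, k, None] * diff).T @ diff / Nk[k]
    return new_weights, new_means, new_covs
```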


EM steps are repeated until a stopping condition is reached, such as the improvement between iterations falling below a tolerance or a maximum number of iterations being hit. Mixture models can be thought of as an extension of k-means clustering that incorporates information about the covariance structure of the data as well as the centers of the latent Gaussians.
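
As a rough usage sketch with SciKit Learn's GaussianMixture (the toy data and parameter values below are illustrative choices, not recommendations):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy two-cluster data, used only to demonstrate the API.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-2.0, scale=0.5, size=(200, 2)),
               rng.normal(loc=+2.0, scale=1.0, size=(200, 2))])

gm = GaussianMixture(
    n_components=2,
    init_params="kmeans",   # also: "k-means++", "random", "random_from_data"
    max_iter=100,           # cap on the number of EM iterations
    tol=1e-3,               # stop once the improvement per iteration drops below tol
)
gm.fit(X)
print(gm.converged_, gm.means_)
```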



Image from: Murphy, K. P. (2021). Figure 11.11. In *Machine Learning: A Probabilistic Perspective*. MIT Press.


The covariance structure is determined by the covariance_type parameter in SciKit Learn. In Gaussian Mixtures, the density contours are ellipsoids, and the covariance determines the directions and the lengths of the axes of these contours. SciKit Learn supports four types of covariance: full, tied, diagonal, and spherical. These are demonstrated in the next chapters.
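
As a quick preview, the sketch below fits the same kind of toy data with each covariance_type and prints the shape of the fitted covariances_ array, which reflects how much structure each option allows; the data generation is an illustrative placeholder.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(200, 2)),
               rng.normal(+2.0, 1.0, size=(200, 2))])

for cov_type in ["full", "tied", "diag", "spherical"]:
    gm = GaussianMixture(n_components=2, covariance_type=cov_type, random_state=0).fit(X)
    # full: (K, d, d), tied: (d, d), diag: (K, d), spherical: (K,)
    print(cov_type, np.shape(gm.covariances_))
```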
