Gaussian Mixtures | Conclusions

  • Gaussian Mixtures assume the dataset is described by a finite number of weighted Gaussians

  • Expectation-Maximization (EM) is used to determine the weights, means, and covariances of the Gaussians

  • The initial centroids are controlled by the init_params parameter in scikit-learn, which allows initialization via kmeans, k-means++, random, and random_from_data, as sketched below
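
As a minimal illustration, GaussianMixture can be fit with each of these initialization strategies (the synthetic two-blob dataset below is an assumption for demonstration, not from the post; k-means++ and random_from_data require scikit-learn 1.1 or later):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative synthetic data: two Gaussian blobs in 2D.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.8, size=(200, 2)),
])

# init_params controls how the initial centroids are chosen before EM runs.
for init in ("kmeans", "k-means++", "random", "random_from_data"):
    gm = GaussianMixture(n_components=2, init_params=init, random_state=0).fit(X)
    print(f"{init:>16}: means = {gm.means_.round(2).tolist()}")
```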

  • Each iteration consists of an E-step and an M-step (see the sketch after this list):

    • E-step: compute the membership weights (responsibilities) of every data point with respect to each Gaussian component

    • M-step: use the membership weights to re-estimate each component's weight, mean, and covariance
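
The two steps can be written out directly in NumPy/SciPy. The sketch below performs one full-covariance EM iteration; the function name em_step and the variable conventions are my own, written for clarity rather than numerical robustness:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, w, mu, S):
    """One EM iteration for a GMM with full covariances.
    X: (n, d) data; w: (k,) weights; mu: (k, d) means; S: (k, d, d) covariances."""
    n, k = X.shape[0], w.shape[0]

    # E-step: membership weights (responsibilities) of each point for each component.
    resp = np.column_stack([
        w[j] * multivariate_normal.pdf(X, mean=mu[j], cov=S[j]) for j in range(k)
    ])
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: re-estimate weights, means, and covariances from the responsibilities.
    nk = resp.sum(axis=0)              # effective number of points per component
    w_new = nk / n
    mu_new = (resp.T @ X) / nk[:, None]
    S_new = np.empty_like(S)
    for j in range(k):
        diff = X - mu_new[j]
        S_new[j] = (resp[:, j, None] * diff).T @ diff / nk[j]
    return w_new, mu_new, S_new
```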

  • Mixture models generalize K-Means clustering by incorporating information about the covariance structure of the data and the centers of the latent Gaussians

  • The covariance adjusts the directions and lengths of the axes of the ellipsoidal density contours
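
Concretely, the eigenvectors of a covariance matrix point along the axes of the ellipse and the square roots of its eigenvalues set the axis lengths; the matrix below is just an illustrative example:

```python
import numpy as np

# An example 2x2 covariance matrix (positive definite).
cov = np.array([[2.0, 1.2],
                [1.2, 1.0]])

# Eigenvectors give the directions of the ellipse axes,
# sqrt(eigenvalues) give the 1-sigma axis lengths.
eigvals, eigvecs = np.linalg.eigh(cov)
print("axis directions (columns):\n", eigvecs)
print("axis lengths (1-sigma):", np.sqrt(eigvals))
```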

  • scikit-learn exposes four values for the covariance_type parameter: full, tied, diag, and spherical
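
A quick way to see the difference is to fit the same data with each option and inspect the shape of the learned covariances (X is assumed to be the array from the first snippet):

```python
from sklearn.mixture import GaussianMixture

for cov_type in ("full", "tied", "diag", "spherical"):
    gm = GaussianMixture(n_components=2, covariance_type=cov_type,
                         random_state=0).fit(X)
    # covariances_ has a different shape per option:
    #   full (k, d, d), tied (d, d), diag (k, d), spherical (k,)
    print(f"{cov_type:>9}: covariances_.shape = {gm.covariances_.shape}")
```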
