Class MultivariateGaussianDistribution

java.lang.Object
smile.stat.distribution.MultivariateGaussianDistribution
All Implemented Interfaces:
Serializable, MultivariateDistribution, MultivariateExponentialFamily

public class MultivariateGaussianDistribution extends Object implements MultivariateDistribution, MultivariateExponentialFamily
Multivariate Gaussian distribution.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    final boolean
    diagonal
    True if the covariance matrix is diagonal.
    final double[]
    mu
    The mean vector.
    final Matrix
    sigma
    The covariance matrix.
  • Constructor Summary

    Constructors
    Constructor
    Description
    MultivariateGaussianDistribution(double[] mean, double variance)
    Constructor.
    MultivariateGaussianDistribution(double[] mean, double[] variance)
    Constructor.
    MultivariateGaussianDistribution(double[] mean, Matrix cov)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    double
    cdf(double[] x)
    Algorithm from Alan Genz (1992) Numerical Computation of Multivariate Normal Probabilities, Journal of Computational and Graphical Statistics, pp. 141-149.
    Matrix
    cov()
    The covariance matrix of the distribution.
    double
    entropy()
    Shannon's entropy of the distribution.
    static MultivariateGaussianDistribution
    fit(double[][] data)
    Estimates the mean and diagonal covariance by MLE.
    static MultivariateGaussianDistribution
    fit(double[][] data, boolean diagonal)
    Estimates the mean and covariance by MLE.
    int
    length()
    The number of parameters of the distribution.
    double
    logp(double[] x)
    The density at x in log scale, which may prevent underflow.
    MultivariateMixture.Component
    M(double[][] data, double[] posteriori)
    The M step in the EM algorithm, which depends on the specific distribution.
    double[]
    mean()
    The mean vector of the distribution.
    double
    p(double[] x)
    The probability density function (for a continuous distribution) or probability mass function (for a discrete distribution) evaluated at x.
    double[]
    rand()
    Generates a random multivariate Gaussian sample.
    double[][]
    rand(int n)
    Generates a set of random numbers following this distribution.
    double
    scatter()
    Returns the scatter of the distribution, which is defined as |Σ|.
    String
    toString()
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

    Methods inherited from interface smile.stat.distribution.MultivariateDistribution

    likelihood, logLikelihood
  • Field Details

    • mu

      public final double[] mu
      The mean vector.
    • sigma

      public final Matrix sigma
      The covariance matrix.
    • diagonal

      public final boolean diagonal
      True if the covariance matrix is diagonal.
  • Constructor Details

    • MultivariateGaussianDistribution

      public MultivariateGaussianDistribution(double[] mean, double variance)
      Constructor. The distribution will have a diagonal covariance matrix with the same variance in every dimension.
      Parameters:
      mean - mean vector.
      variance - variance.
    • MultivariateGaussianDistribution

      public MultivariateGaussianDistribution(double[] mean, double[] variance)
      Constructor. The distribution will have a diagonal covariance matrix in which each dimension has its own variance.
      Parameters:
      mean - mean vector.
      variance - variance vector.
    • MultivariateGaussianDistribution

      public MultivariateGaussianDistribution(double[] mean, Matrix cov)
      Constructor.
      Parameters:
      mean - mean vector.
      cov - covariance matrix.
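
The three constructors differ only in how the covariance is specified. As a minimal sketch in plain Java (not the library's code; `CovarianceForms` and its methods are hypothetical names introduced for illustration), the scalar and vector variance arguments correspond to the following dense covariance matrices:

```java
/** Hypothetical helper, not part of Smile: expands variance arguments into dense covariance matrices. */
final class CovarianceForms {
    /** Isotropic case: the same variance on every diagonal entry, zeros elsewhere. */
    static double[][] isotropic(int d, double variance) {
        double[][] cov = new double[d][d];
        for (int i = 0; i < d; i++) cov[i][i] = variance;
        return cov;
    }

    /** Diagonal case: variance[i] on the i-th diagonal entry, zeros elsewhere. */
    static double[][] diagonal(double[] variance) {
        int d = variance.length;
        double[][] cov = new double[d][d];
        for (int i = 0; i < d; i++) cov[i][i] = variance[i];
        return cov;
    }
}
```

The third constructor takes the full matrix directly, so off-diagonal covariances can be nonzero.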
  • Method Details

    • fit

      public static MultivariateGaussianDistribution fit(double[][] data)
      Estimates the mean and diagonal covariance by MLE.
      Parameters:
      data - the training data.
      Returns:
      the distribution.
    • fit

      public static MultivariateGaussianDistribution fit(double[][] data, boolean diagonal)
      Estimates the mean and covariance by MLE.
      Parameters:
      data - the training data.
      diagonal - true if the covariance matrix is constrained to be diagonal.
      Returns:
      the distribution.
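
To make the estimation step concrete, here is a minimal plain-Java sketch of Gaussian maximum likelihood estimation. It is not Smile's implementation: `GaussianMleSketch` is a hypothetical class, and it assumes the textbook MLE normaliser 1/n, whereas a library may divide by n - 1 instead.

```java
/** Hypothetical sketch of Gaussian maximum likelihood estimation; not Smile's implementation. */
final class GaussianMleSketch {
    /** Column means of an n-by-d data matrix. */
    static double[] mean(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mu = new double[d];
        for (double[] row : data)
            for (int j = 0; j < d; j++) mu[j] += row[j];
        for (int j = 0; j < d; j++) mu[j] /= n;
        return mu;
    }

    /**
     * Covariance estimate with the textbook MLE normaliser 1/n (an assumption).
     * If diagonal is true, off-diagonal entries are left at zero.
     */
    static double[][] covariance(double[][] data, double[] mu, boolean diagonal) {
        int n = data.length, d = mu.length;
        double[][] cov = new double[d][d];
        for (double[] row : data) {
            for (int i = 0; i < d; i++) {
                double di = row[i] - mu[i];
                if (diagonal) {
                    cov[i][i] += di * di;
                } else {
                    for (int j = 0; j < d; j++) cov[i][j] += di * (row[j] - mu[j]);
                }
            }
        }
        for (int i = 0; i < d; i++)
            for (int j = 0; j < d; j++) cov[i][j] /= n;
        return cov;
    }
}
```

With `diagonal` set to true only d variances are estimated, which is cheaper and more robust when there are few samples relative to the dimension.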
    • length

      public int length()
      Description copied from interface: MultivariateDistribution
      The number of parameters of the distribution. The "length" is in the sense of the minimum description length principle.
      Specified by:
      length in interface MultivariateDistribution
      Returns:
      the number of parameters of the distribution.
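
As an illustration of what "number of parameters" usually means for a d-dimensional Gaussian, the sketch below counts free parameters. The formula is the conventional one and is an assumption here, since this page does not state how `length()` is computed; `ParameterCountSketch` is a hypothetical name.

```java
/** Hypothetical sketch: conventional count of free parameters of a d-dimensional Gaussian. */
final class ParameterCountSketch {
    /** d for the mean, plus d (diagonal covariance) or d(d+1)/2 (full symmetric covariance). */
    static int count(int d, boolean diagonal) {
        return d + (diagonal ? d : d * (d + 1) / 2);
    }
}
```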
    • entropy

      public double entropy()
      Description copied from interface: MultivariateDistribution
      Shannon's entropy of the distribution.
      Specified by:
      entropy in interface MultivariateDistribution
      Returns:
      Shannon entropy
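
For reference, the standard closed form of the (differential) entropy of a d-dimensional Gaussian with covariance Σ is given below. It is stated in nats, assuming the natural logarithm; this page does not say which base the method uses.

```latex
H(X) = \frac{1}{2}\ln\!\left((2\pi e)^{d}\,|\Sigma|\right)
     = \frac{d}{2}\left(1 + \ln 2\pi\right) + \frac{1}{2}\ln|\Sigma|
```

Here |Σ| is the determinant of the covariance matrix, and the entropy does not depend on the mean.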
    • mean

      public double[] mean()
      Description copied from interface: MultivariateDistribution
      The mean vector of the distribution.
      Specified by:
      mean in interface MultivariateDistribution
      Returns:
      the mean vector.
    • cov

      public Matrix cov()
      Description copied from interface: MultivariateDistribution
      The covariance matrix of the distribution.
      Specified by:
      cov in interface MultivariateDistribution
      Returns:
      the covariance matrix.
    • scatter

      public double scatter()
      Returns the scatter of the distribution, which is defined as |Σ|, the determinant of the covariance matrix.
      Returns:
      the scatter of the distribution.
    • logp

      public double logp(double[] x)
      Description copied from interface: MultivariateDistribution
      The density at x in log scale, which may prevent underflow.
      Specified by:
      logp in interface MultivariateDistribution
      Parameters:
      x - a real vector.
      Returns:
      the log density.
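
The underflow issue is easy to reproduce. The following plain-Java sketch (not Smile's code; `DiagonalGaussianLogDensity` is a hypothetical class restricted to a diagonal covariance) evaluates the log density directly and shows that exponentiating it far from the mean gives zero in double precision:

```java
/** Hypothetical sketch of a diagonal-covariance Gaussian log density; not Smile's code. */
final class DiagonalGaussianLogDensity {
    /** log N(x | mu, diag(variance)), using the natural logarithm. */
    static double logp(double[] x, double[] mu, double[] variance) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - mu[i];
            sum += Math.log(2.0 * Math.PI * variance[i]) + diff * diff / variance[i];
        }
        return -0.5 * sum;
    }

    public static void main(String[] args) {
        double[] mu = {0.0, 0.0};
        double[] variance = {1.0, 1.0};
        double[] far = {40.0, 40.0};
        double lp = logp(far, mu, variance);
        System.out.println("logp = " + lp);           // finite, roughly -1601.84
        System.out.println("p    = " + Math.exp(lp)); // underflows to 0.0
    }
}
```

Working in log scale keeps such points comparable, which matters when summing log-likelihoods over many samples.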
    • p

      public double p(double[] x)
      Description copied from interface: MultivariateDistribution
      The probability density function (for a continuous distribution) or probability mass function (for a discrete distribution) evaluated at x.
      Specified by:
      p in interface MultivariateDistribution
      Parameters:
      x - a real vector.
      Returns:
      the density.
    • cdf

      public double cdf(double[] x)
      Algorithm from Alan Genz (1992) Numerical Computation of Multivariate Normal Probabilities, Journal of Computational and Graphical Statistics, pp. 141-149.

      The difference between the returned value and the true value of the CDF is less than 0.001 in 99.9% of cases. The maximum number of iterations is set to 10000.

      Specified by:
      cdf in interface MultivariateDistribution
      Parameters:
      x - a real vector.
      Returns:
      the probability.
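
Because the result is a numerical approximation, it can be useful to sanity-check it against an independent estimate. The sketch below is plain Monte Carlo, explicitly not the Genz algorithm, and is limited to a diagonal covariance; `MonteCarloCdfSketch`, the sample count, and the seed are all hypothetical choices.

```java
import java.util.Random;

/** Hypothetical Monte Carlo check of a Gaussian CDF; this is NOT the Genz algorithm. */
final class MonteCarloCdfSketch {
    /** Estimates P(X_1 <= x_1, ..., X_d <= x_d) for X ~ N(mu, diag(variance)). */
    static double cdf(double[] x, double[] mu, double[] variance, int samples, long seed) {
        Random rng = new Random(seed);
        int hits = 0;
        for (int s = 0; s < samples; s++) {
            boolean inside = true;
            for (int i = 0; i < x.length; i++) {
                double draw = mu[i] + Math.sqrt(variance[i]) * rng.nextGaussian();
                // No early exit, so every sample consumes exactly d random variates.
                if (draw > x[i]) inside = false;
            }
            if (inside) hits++;
        }
        return (double) hits / samples;
    }
}
```

For a standard bivariate Gaussian evaluated at the origin the true value is 0.25, since each independent coordinate is below zero with probability 0.5. The Monte Carlo error shrinks only as one over the square root of the sample count, which is why dedicated algorithms are preferred in practice.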
    • rand

      public double[] rand()
      Generates a random multivariate Gaussian sample.
      Returns:
      a random sample.
    • rand

      public double[][] rand(int n)
      Generates a set of random numbers following this distribution.
      Parameters:
      n - the number of random samples to generate.
      Returns:
      an array of random samples.
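
A common way to draw from a multivariate Gaussian is to transform independent standard normal variates with a Cholesky factor of the covariance matrix. The plain-Java sketch below illustrates that technique. It is an assumption that it resembles what the library does internally, and `CholeskySamplerSketch` is a hypothetical class.

```java
import java.util.Random;

/** Hypothetical sketch of Gaussian sampling via a Cholesky factor; not Smile's implementation. */
final class CholeskySamplerSketch {
    /** Lower-triangular L with L * L^T = cov, for a symmetric positive definite cov. */
    static double[][] cholesky(double[][] cov) {
        int d = cov.length;
        double[][] l = new double[d][d];
        for (int i = 0; i < d; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = cov[i][j];
                for (int k = 0; k < j; k++) sum -= l[i][k] * l[j][k];
                if (i == j) {
                    if (sum <= 0.0) throw new IllegalArgumentException("covariance is not positive definite");
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    /** Draws x = mu + L z, where z has independent standard normal entries. */
    static double[] sample(double[] mu, double[][] l, Random rng) {
        int d = mu.length;
        double[] z = new double[d];
        for (int i = 0; i < d; i++) z[i] = rng.nextGaussian();
        double[] x = mu.clone();
        for (int i = 0; i < d; i++)
            for (int k = 0; k <= i; k++) x[i] += l[i][k] * z[k];
        return x;
    }
}
```

Since z has identity covariance, x = mu + L z has covariance L L^T, which equals the requested matrix. The factor only needs to be computed once and can be reused for every draw.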
    • M

      public MultivariateMixture.Component M(double[][] data, double[] posteriori)
      Description copied from interface: MultivariateExponentialFamily
      The M step in the EM algorithm, which depends on the specific distribution.
      Specified by:
      M in interface MultivariateExponentialFamily
      Parameters:
      data - the input data for estimation
      posteriori - the posterior probability of each sample.
      Returns:
      the (unnormalized) weight of this distribution in the mixture.
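
For a Gaussian component, the M step amounts to posterior-weighted estimates of the mean and covariance, with the sum of the posteriors as the unnormalized weight. The sketch below shows this for a diagonal covariance in plain Java. It is not Smile's implementation: `GaussianMStepSketch` and its `Estimate` holder are hypothetical stand-ins and do not reproduce `MultivariateMixture.Component`.

```java
/** Hypothetical sketch of a Gaussian M step with a diagonal covariance; not Smile's implementation. */
final class GaussianMStepSketch {
    /** Result holder: unnormalized weight, weighted mean, weighted per-dimension variance. */
    static final class Estimate {
        final double weight;
        final double[] mean;
        final double[] variance;

        Estimate(double weight, double[] mean, double[] variance) {
            this.weight = weight;
            this.mean = mean;
            this.variance = variance;
        }
    }

    static Estimate mStep(double[][] data, double[] posteriori) {
        int d = data[0].length;
        double weight = 0.0;
        double[] mean = new double[d];
        // Posterior-weighted sum of the samples; the weight is the sum of posteriors.
        for (int i = 0; i < data.length; i++) {
            weight += posteriori[i];
            for (int j = 0; j < d; j++) mean[j] += posteriori[i] * data[i][j];
        }
        for (int j = 0; j < d; j++) mean[j] /= weight;

        // Posterior-weighted squared deviations around the new mean.
        double[] variance = new double[d];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < d; j++) {
                double diff = data[i][j] - mean[j];
                variance[j] += posteriori[i] * diff * diff;
            }
        }
        for (int j = 0; j < d; j++) variance[j] /= weight;
        return new Estimate(weight, mean, variance);
    }
}
```

In a full EM loop, the weights returned by all components would then be divided by their total to obtain the mixing proportions.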
    • toString

      public String toString()
      Overrides:
      toString in class Object