Inheritance diagram for nipy.algorithms.clustering.bgmm:
Bayesian Gaussian Mixture Model classes: contains the basic fields and methods of Bayesian GMMs. The high-level functions are, or should be, bound in C.
The base class BGMM relies on an implementation that performs Gibbs sampling.
A derived class VBGMM uses Variational Bayes inference instead.
A third class is introduced to take advantage of the old C bindings, but it is limited to diagonal covariance models.
Author : Bertrand Thirion, 2008-2011
Bases: nipy.algorithms.clustering.gmm.GMM
This class implements Bayesian GMMs.
This class contains the following fields:
k: int, the number of components in the mixture
dof: array of shape (k), the posterior dofs
Methods
Initialize the structure with the dimensions of the problem. Optionally provide the different terms.
Returns the averaged log-likelihood of the model for the dataset x
Parameters : | x: array of shape (n_samples,self.dim) :
tiny = 1.e-15: a small constant to avoid numerical singularities : |
---|
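The averaged log-likelihood described above can be sketched with NumPy alone. The helper name `average_log_like_sketch` and its input `like` (component-weighted likelihoods of shape (n_samples, k)) are assumptions made for illustration; this is not the library's implementation.

```python
import numpy as np

def average_log_like_sketch(like, tiny=1.e-15):
    """Mean over samples of log(sum_k like[i, k]), floored at `tiny`."""
    like = np.asarray(like, dtype=float)
    # Mixture likelihood per sample; the floor avoids log(0)
    mixture = np.maximum(like.sum(axis=1), tiny)
    return float(np.mean(np.log(mixture)))

# Hypothetical likelihoods for 3 samples and 2 components
like = np.array([[0.2, 0.3], [0.0, 0.0], [0.5, 0.5]])
avg_ll = average_log_like_sketch(like)
```

The second row has zero likelihood everywhere, which is the case the `tiny` constant is meant to protect against.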
Evaluate the Bayes Factor of the current model using Chib’s method
Parameters : | x: array of shape (nb_samples,dim) :
z: array of shape (nb_samples), type = np.int :
nperm=0: int :
verbose=0: verbosity mode : |
---|---|
Returns : | bf (float) the computed evidence (Bayes factor) : |
Notes
See: Siddhartha Chib, Marginal Likelihood from the Gibbs Output, Journal of the American Statistical Association, Vol. 90, 1995
Computation of bic approximation of evidence
Parameters : | like, array of shape (n_samples, self.k) :
tiny=1.e-15, a small constant to avoid numerical singularities : |
---|---|
Returns : | the bic value, float : |
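As a rough illustration of a BIC computation from the likelihood array, the sketch below uses the common convention "log-likelihood minus half the number of free parameters times log(n_samples)". The helper name `bic_sketch`, the explicit `n_free_params` argument and the sign convention are assumptions; the library may count parameters or scale the penalty differently.

```python
import numpy as np

def bic_sketch(like, n_free_params, tiny=1.e-15):
    """BIC-style approximation of the log-evidence (larger is better)."""
    like = np.asarray(like, dtype=float)
    n_samples = like.shape[0]
    # Total log-likelihood of the mixture, floored to avoid log(0)
    log_like = np.sum(np.log(np.maximum(like.sum(axis=1), tiny)))
    return float(log_like - 0.5 * n_free_params * np.log(n_samples))

# Hypothetical full-covariance mixture with k=2 components in dim=2:
# means (k*dim) + covariances (k*dim*(dim+1)/2) + weights (k-1)
k, dim = 2, 2
n_free = k * dim + k * dim * (dim + 1) // 2 + (k - 1)
like = np.full((4, k), 0.25)
bic_value = bic_sketch(like, n_free)
```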
Checks the shape of the different matrices involved in the model;
essentially checks that x.shape[1]==self.dim.
x is returned, possibly reshaped.
Compute the probability of the current parameters of self given x and z
Parameters : | x: array of shape (nb_samples, dim), :
z: array of shape (nb_samples), type = np.int, :
perm: array of shape (nperm, self.k), type = np.int, optional :
|
---|
Estimation of the model given a dataset x
Parameters : | x array of shape (n_samples,dim) :
niter=100: maximal number of iterations in the estimation process :
delta = 1.e-4: increment of data likelihood at which convergence is declared :
verbose=0: verbosity mode : |
---|---|
Returns : | bic : an asymptotic approximation of model evidence |
See bayes_factor(self, x, z, nperm=0, verbose=0)
Set the priors so that they are weakly informative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007)
Parameters : | x, array of shape (nb_samples,self.dim) :
nocheck: boolean, optional, :
|
---|
Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)
Parameters : | x array of shape (n_samples,dim) :
|
---|
Initialize z using a k-means algorithm, then update the parameters
Parameters : | x: array of shape (nb_samples,self.dim) :
|
---|
Estimation of self given x
Parameters : | x array of shape (n_samples,dim) :
z = None: array of shape (n_samples) :
niter=100: maximal number of iterations in the estimation process :
delta = 1.e-4: increment of data likelihood at which convergence is declared :
ninit=1: number of initializations performed :
verbose=0: verbosity mode : |
---|---|
Returns : | the best model is returned : |
Returns the likelihood of the model for the data x. The values are weighted by the component weights.
Parameters : | x array of shape (n_samples,self.dim) :
|
---|---|
Returns : | like, array of shape(n_samples,self.k) :
|
return the MAP labelling of x
Parameters : | x array of shape (n_samples,dim) :
like=None array of shape(n_samples,self.k) :
|
---|---|
Returns : | z: array of shape(n_samples): the resulting MAP labelling :
|
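A MAP labelling of this kind amounts to picking, for each sample, the component with the largest weighted likelihood. A minimal sketch, assuming `like` is already available as an array of shape (n_samples, k); the helper name is hypothetical.

```python
import numpy as np

def map_label_sketch(like):
    """Index of the most likely component for each row of `like`."""
    return np.argmax(np.asarray(like, dtype=float), axis=1)

# Hypothetical weighted likelihoods for 2 samples and 3 components
like = np.array([[0.1, 0.6, 0.3],
                 [0.7, 0.2, 0.1]])
z = map_label_sketch(like)
```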
Returns the likelihood of the mixture for x
Parameters : | x: array of shape (n_samples,self.dim) :
|
---|
Manually set the weights, means and precisions of the model
Parameters : | means: array of shape (self.k,self.dim) :
precisions: array of shape (self.k,self.dim,self.dim) :
weights: array of shape (self.k) : |
---|
compute the population, i.e. the statistics of allocation
Parameters : | z array of shape (nb_samples), type = np.int :
|
---|---|
Returns : | hist : array shape (self.k) count variable |
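The population count described above is a histogram of the allocation vector over the k components. The sketch below assumes integer labels in the range 0..k-1; the helper name is hypothetical.

```python
import numpy as np

def population_sketch(z, k):
    """Number of samples allocated to each of the k components."""
    # minlength keeps empty components in the output as zeros
    return np.bincount(np.asarray(z, dtype=int), minlength=k)

hist = population_sketch([0, 2, 2, 1, 2], k=4)
```

Passing `minlength=k` matters: without it, trailing components that received no samples would be dropped from the result.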
Compute the probability of the current parameters of self given the priors
sample the indicator and parameters
Parameters : | x array of shape (nb_samples,self.dim) :
niter=1: the number of iterations to perform :
mem=0: if mem, the best values of the parameters are computed :
verbose=0: verbosity mode : |
---|---|
Returns : | best_weights: array of shape (self.k) :
best_means: array of shape (self.k, self.dim) :
best_precisions: array of shape (self.k, self.dim, self.dim) :
possibleZ: array of shape (nb_samples, niter) :
|
Sample the indicator and parameters. The average values for weights, means and precisions are returned.
Parameters : | x = array of shape (nb_samples,dim) :
niter=1: number of iterations : |
---|---|
Returns : | weights: array of shape (self.k) :
means: array of shape (self.k,self.dim) :
precisions: array of shape (self.k,self.dim,self.dim) :
|
Notes
All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).
Fix: implement a permutation procedure for component identification
sample the indicator from the likelihood
Parameters : | like: array of shape (nb_samples,self.k) :
|
---|---|
Returns : | z: array of shape(nb_samples): a draw of the membership variable : |
Set the prior of the BGMM
Parameters : | prior_means: array of shape (self.k,self.dim) :
prior_weights: array of shape (self.k) :
prior_scale: array of shape (self.k,self.dim,self.dim) :
prior_dof: array of shape (self.k) :
prior_shrinkage: array of shape (self.k) : |
---|
Function to plot a GMM, still in progress. Currently works only in 1D and 2D.
Parameters : | x: array of shape(n_samples, dim) :
gd: GridDescriptor instance :
density: array of shape(prod(gd.n_bins)) :
|
---|
Function to plot a GMM – Currently, works only in 1D
Parameters : | x: array of shape(n_samples, dim) :
gd: GridDescriptor instance :
density: array of shape(prod(gd.n_bins)) :
mpaxes: axes handle to make the figure, optional, :
|
---|
Returns the log-likelihood of the mixture for x
Parameters : | x array of shape (n_samples,self.dim) :
|
---|---|
Returns : | ll: array of shape(n_samples) :
|
Same as initialize_and_estimate
Returns the likelihood of each data point for each component. The values are not weighted by the component weights.
Parameters : | x: array of shape (n_samples,self.dim) :
|
---|---|
Returns : | like, array of shape(n_samples,self.k) :
|
Notes
Hopefully faster
Returns the likelihood of each data point for each component. The values are not weighted by the component weights.
Parameters : | x: array of shape (n_samples,self.dim) :
|
---|---|
Returns : | like, array of shape(n_samples,self.k) :
|
update function (draw a sample of the GMM parameters)
Parameters : | x array of shape (nb_samples,self.dim) :
z array of shape (nb_samples), type = np.int :
|
---|
Given the allocation vector z, and the corresponding data x, resample the mean
Parameters : | x: array of shape (nb_samples,self.dim) :
z: array of shape (nb_samples), type = np.int :
|
---|
Given the allocation vector z, and the corresponding data x, resample the precisions
Parameters : | x array of shape (nb_samples,self.dim) :
z array of shape (nb_samples), type = np.int :
|
---|
Given the allocation vector z, resample the weights parameter
Parameters : | z array of shape (nb_samples), type = np.int :
|
---|
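In a Gibbs sampler with a Dirichlet prior on the weights, resampling the weights given z is typically a draw from a Dirichlet whose parameters are the prior parameters plus the per-component counts. The sketch below shows that standard conjugate update; the helper name and the explicit `prior_weights` and `rng` arguments are assumptions, and the library's own update may differ in detail.

```python
import numpy as np

def resample_weights_sketch(z, prior_weights, rng):
    """Draw mixture weights from Dirichlet(prior_weights + counts(z))."""
    prior_weights = np.asarray(prior_weights, dtype=float)
    # Per-component allocation counts, including empty components
    counts = np.bincount(np.asarray(z, dtype=int),
                         minlength=prior_weights.size)
    return rng.dirichlet(prior_weights + counts)

rng = np.random.default_rng(0)
weights = resample_weights_sketch([0, 0, 1, 2, 2, 2], np.ones(3), rng)
```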
Bases: nipy.algorithms.clustering.bgmm.BGMM
Subclass of Bayesian GMMs (BGMM) that implements Variational Bayes estimation of the parameters
Methods
Returns the averaged log-likelihood of the model for the dataset x
Parameters : | x: array of shape (n_samples,self.dim) :
tiny = 1.e-15: a small constant to avoid numerical singularities : |
---|
Evaluate the Bayes Factor of the current model using Chib’s method
Parameters : | x: array of shape (nb_samples,dim) :
z: array of shape (nb_samples), type = np.int :
nperm=0: int :
verbose=0: verbosity mode : |
---|---|
Returns : | bf (float) the computed evidence (Bayes factor) : |
Notes
See: Siddhartha Chib, Marginal Likelihood from the Gibbs Output, Journal of the American Statistical Association, Vol. 90, 1995
Computation of bic approximation of evidence
Parameters : | like, array of shape (n_samples, self.k) :
tiny=1.e-15, a small constant to avoid numerical singularities : |
---|---|
Returns : | the bic value, float : |
Checks the shape of the different matrices involved in the model;
essentially checks that x.shape[1]==self.dim.
x is returned, possibly reshaped.
Compute the probability of the current parameters of self given x and z
Parameters : | x: array of shape (nb_samples, dim), :
z: array of shape (nb_samples), type = np.int, :
perm: array of shape (nperm, self.k), type = np.int, optional :
|
---|
estimation of self given x
Parameters : | x array of shape (nb_samples,dim) :
z = None: array of shape (nb_samples) :
niter=100: maximal number of iterations in the estimation process :
delta = 1.e-4: increment of data likelihood at which convergence is declared :
verbose=0: verbosity mode :
|
---|
computation of evidence bound aka free energy
Parameters : | x array of shape (nb_samples,dim) :
like=None: array of shape (nb_samples, self.k), optional :
verbose=0: verbosity mode : |
---|---|
Returns : | ev (float) the computed evidence : |
Set the priors so that they are weakly informative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007)
Parameters : | x, array of shape (nb_samples,self.dim) :
nocheck: boolean, optional, :
|
---|
Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)
Parameters : | x array of shape (n_samples,dim) :
|
---|
Initialize z using a k-means algorithm, then update the parameters
Parameters : | x: array of shape (nb_samples,self.dim) :
|
---|
Estimation of self given x
Parameters : | x array of shape (n_samples,dim) :
z = None: array of shape (n_samples) :
niter=100: maximal number of iterations in the estimation process :
delta = 1.e-4: increment of data likelihood at which convergence is declared :
ninit=1: number of initializations performed :
verbose=0: verbosity mode : |
---|---|
Returns : | the best model is returned : |
Returns the likelihood of the model for the data x. The values are weighted by the component weights.
Parameters : | x: array of shape (nb_samples, self.dim) :
|
---|---|
Returns : | like: array of shape(nb_samples, self.k) :
|
return the MAP labelling of x
Parameters : | x array of shape (nb_samples,dim) :
like=None array of shape(nb_samples,self.k) :
|
---|---|
Returns : | z: array of shape(nb_samples): the resulting MAP labelling :
|
Returns the likelihood of the mixture for x
Parameters : | x: array of shape (n_samples,self.dim) :
|
---|
Manually set the weights, means and precisions of the model
Parameters : | means: array of shape (self.k,self.dim) :
precisions: array of shape (self.k,self.dim,self.dim) :
weights: array of shape (self.k) : |
---|
compute the population, i.e. the statistics of allocation
Parameters : | like array of shape (nb_samples, self.k): :
|
---|
Compute the probability of the current parameters of self given the priors
sample the indicator and parameters
Parameters : | x array of shape (nb_samples,self.dim) :
niter=1: the number of iterations to perform :
mem=0: if mem, the best values of the parameters are computed :
verbose=0: verbosity mode : |
---|---|
Returns : | best_weights: array of shape (self.k) :
best_means: array of shape (self.k, self.dim) :
best_precisions: array of shape (self.k, self.dim, self.dim) :
possibleZ: array of shape (nb_samples, niter) :
|
Sample the indicator and parameters. The average values for weights, means and precisions are returned.
Parameters : | x = array of shape (nb_samples,dim) :
niter=1: number of iterations : |
---|---|
Returns : | weights: array of shape (self.k) :
means: array of shape (self.k,self.dim) :
precisions: array of shape (self.k,self.dim,self.dim) :
|
Notes
All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).
Fix: implement a permutation procedure for component identification
sample the indicator from the likelihood
Parameters : | like: array of shape (nb_samples,self.k) :
|
---|---|
Returns : | z: array of shape(nb_samples): a draw of the membership variable : |
Set the prior of the BGMM
Parameters : | prior_means: array of shape (self.k,self.dim) :
prior_weights: array of shape (self.k) :
prior_scale: array of shape (self.k,self.dim,self.dim) :
prior_dof: array of shape (self.k) :
prior_shrinkage: array of shape (self.k) : |
---|
Function to plot a GMM, still in progress. Currently works only in 1D and 2D.
Parameters : | x: array of shape(n_samples, dim) :
gd: GridDescriptor instance :
density: array of shape(prod(gd.n_bins)) :
|
---|
Function to plot a GMM – Currently, works only in 1D
Parameters : | x: array of shape(n_samples, dim) :
gd: GridDescriptor instance :
density: array of shape(prod(gd.n_bins)) :
mpaxes: axes handle to make the figure, optional, :
|
---|
Returns the log-likelihood of the mixture for x
Parameters : | x array of shape (n_samples,self.dim) :
|
---|---|
Returns : | ll: array of shape(n_samples) :
|
Same as initialize_and_estimate
Returns the likelihood of each data point for each component. The values are not weighted by the component weights.
Parameters : | x: array of shape (n_samples,self.dim) :
|
---|---|
Returns : | like, array of shape(n_samples,self.k) :
|
Notes
Hopefully faster
Returns the likelihood of each data point for each component. The values are not weighted by the component weights.
Parameters : | x: array of shape (n_samples,self.dim) :
|
---|---|
Returns : | like, array of shape(n_samples,self.k) :
|
update function (draw a sample of the GMM parameters)
Parameters : | x array of shape (nb_samples,self.dim) :
z array of shape (nb_samples), type = np.int :
|
---|
Given the allocation vector z, and the corresponding data x, resample the mean
Parameters : | x: array of shape (nb_samples,self.dim) :
z: array of shape (nb_samples), type = np.int :
|
---|
Given the allocation vector z, and the corresponding data x, resample the precisions
Parameters : | x array of shape (nb_samples,self.dim) :
z array of shape (nb_samples), type = np.int :
|
---|
Given the allocation vector z, resample the weights parameter
Parameters : | z array of shape (nb_samples), type = np.int :
|
---|
Routine for the computation of determinants of symmetric positive matrices
Parameters : | H array of shape(n,n) :
|
---|---|
Returns : | dh: float, the determinant : |
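One common way to compute the determinant of a symmetric positive definite matrix is through its Cholesky factor: if H = L Lᵀ, then det(H) is the squared product of the diagonal of L. The sketch below shows that approach; it is an assumption that this matches the routine documented here, and the helper name is hypothetical.

```python
import numpy as np

def det_spd_sketch(H):
    """Determinant of a symmetric positive definite matrix via Cholesky."""
    # np.linalg.cholesky raises LinAlgError if H is not positive definite
    L = np.linalg.cholesky(np.asarray(H, dtype=float))
    return float(np.prod(np.diag(L)) ** 2)

H = np.array([[4.0, 1.0],
              [1.0, 3.0]])
dh = det_spd_sketch(H)
```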
Evaluate the probability of a certain discrete draw w from the Dirichlet density with parameters alpha
Parameters : | w: array of shape (n) :
alpha: array of shape (n) : |
---|
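For reference, the Dirichlet density at a point w on the simplex is Γ(Σα) / Πᵢ Γ(αᵢ) · Πᵢ wᵢ^(αᵢ−1). The sketch below evaluates it in the log domain for numerical stability; the helper name is hypothetical, and whether the library returns the density or its logarithm is not stated in the text above.

```python
import math
import numpy as np

def dirichlet_log_density_sketch(w, alpha):
    """Log of the Dirichlet(alpha) density evaluated at w."""
    w = np.asarray(w, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Log normalising constant: lgamma(sum alpha) - sum lgamma(alpha_i)
    log_norm = math.lgamma(alpha.sum()) - sum(math.lgamma(a) for a in alpha)
    return float(log_norm + np.sum((alpha - 1.0) * np.log(w)))

# With alpha = (1, 1, 1) the density is uniform on the simplex
density = math.exp(dirichlet_log_density_sketch([0.2, 0.3, 0.5],
                                                [1.0, 1.0, 1.0]))
```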
Returns the KL divergence between two Dirichlet distributions
Parameters : | w1: array of shape(n), :
w2: array of shape(n), :
|
---|
Returns the KL divergence between Gaussian densities
Parameters : | m1: array of shape (n), :
P1: array of shape(n,n), :
m2: array of shape (n), :
P2: array of shape(n,n), :
|
---|
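The KL divergence between two Gaussians has a closed form that is convenient to write with precision matrices. The sketch below computes KL(N(m1, inv(P1)) || N(m2, inv(P2))); the direction of the divergence and the helper name are assumptions, since the text above does not specify them.

```python
import numpy as np

def kl_gaussian_sketch(m1, P1, m2, P2):
    """KL(N(m1, inv(P1)) || N(m2, inv(P2))) from means and precisions."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    P1, P2 = np.asarray(P1, dtype=float), np.asarray(P2, dtype=float)
    n = m1.size
    diff = m2 - m1
    trace_term = np.trace(P2 @ np.linalg.inv(P1))   # tr(P2 Sigma1)
    quad_term = diff @ P2 @ diff                    # Mahalanobis term
    # log det(Sigma2) - log det(Sigma1) = log det(P1) - log det(P2)
    logdet_term = np.linalg.slogdet(P1)[1] - np.linalg.slogdet(P2)[1]
    return float(0.5 * (trace_term + quad_term - n + logdet_term))

# Two unit-variance 1D Gaussians whose means differ by 1
dkl = kl_gaussian_sketch([0.0], [[1.0]], [1.0], [[1.0]])
```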
Returns the KL divergence between two Wishart distributions of parameters (a1,B1) and (a2,B2)
Parameters : | a1: Float, :
B1: array of shape(n,n), :
a2: Float, :
B2: array of shape(n,n), :
|
---|---|
Returns : | dkl: float, the Kullback-Leibler divergence : |
Generate a sample from Wishart density
Parameters : | n: float, :
V: array of shape (n,n) :
|
---|---|
Returns : | W: array of shape (n,n) :
|
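For an integer number of degrees of freedom, a Wishart draw can be built as a sum of outer products of Gaussian vectors with covariance V. The sketch below uses that construction; restricting to integer degrees of freedom, the helper name and the explicit `rng` argument are all assumptions (the parameter above is documented as a float, so the library presumably uses a more general method).

```python
import numpy as np

def wishart_sample_sketch(dof, V, rng):
    """Draw W ~ Wishart(dof, V) for integer dof as X X^T, columns of X ~ N(0, V)."""
    V = np.asarray(V, dtype=float)
    L = np.linalg.cholesky(V)                      # V = L L^T
    X = L @ rng.standard_normal((V.shape[0], int(dof)))
    return X @ X.T

rng = np.random.default_rng(0)
V = np.array([[2.0, 0.5],
              [0.5, 1.0]])
W = wishart_sample_sketch(5, V, rng)
```

Under this construction the expected value of W is dof times V, which gives a simple sanity check over many draws.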
Generate a Gaussian sample with mean m and precision P
Parameters : | m: array of shape (n), the mean vector :
P: array of shape (n,n), the precision matrix : |
---|---|
Returns : | ng : array of shape(n): a draw from the gaussian density |
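Sampling from a Gaussian specified by its precision matrix can be done without inverting P: with the Cholesky factorisation P = L Lᵀ and a standard normal vector z, solving Lᵀ y = z yields y with covariance inv(P). The sketch below illustrates this; the helper name and the explicit `rng` argument are assumptions.

```python
import numpy as np

def normal_sample_from_precision_sketch(m, P, rng):
    """Draw one sample from N(m, inv(P)) using the Cholesky factor of P."""
    m = np.asarray(m, dtype=float)
    L = np.linalg.cholesky(np.asarray(P, dtype=float))  # P = L L^T
    z = rng.standard_normal(m.size)
    # y = inv(L^T) z has covariance inv(L^T) inv(L) = inv(P)
    return m + np.linalg.solve(L.T, z)

rng = np.random.default_rng(0)
m = np.array([1.0, -1.0])
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
ng = normal_sample_from_precision_sketch(m, P, rng)
```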
Returns an array of shape (nperm, k) representing the permutations of k elements
Parameters : | k: int, the number of elements to be permuted :
nperm=100: the maximal number of permutations; if gamma(k+1)>nperm, only nperm random draws are generated : |
---|---|
Returns : | p: array of shape(nperm,k): each row is a permutation of k elements : |
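The rule above (enumerate every permutation when k! fits within nperm, otherwise fall back to random draws) can be sketched as follows. The helper name and the optional `rng` argument are assumptions; note that gamma(k+1) equals k!.

```python
import itertools
import math
import numpy as np

def permutations_sketch(k, nperm=100, rng=None):
    """All k! permutations if k! <= nperm, else nperm random permutations."""
    if math.factorial(k) <= nperm:
        # Exhaustive enumeration, one permutation per row
        return np.array(list(itertools.permutations(range(k))))
    rng = np.random.default_rng() if rng is None else rng
    # Random draws; duplicates are possible in this branch
    return np.array([rng.permutation(k) for _ in range(nperm)])

p_all = permutations_sketch(3)                      # 3! = 6 <= 100
p_rand = permutations_sketch(6, nperm=10,
                             rng=np.random.default_rng(0))  # 6! = 720 > 10
```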
Generate samples from a multivariate distribution
Parameters : | probabilities: array of shape (nelements, nclasses): :
|
---|---|
Returns : | z array of shape (nelements): the draws, :
|
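Drawing one class per row of a probability table can be done by inverting each row's cumulative distribution with a uniform random number. The sketch below is one vectorised way to do that; the helper name, the row normalisation and the explicit `rng` argument are assumptions.

```python
import numpy as np

def sample_labels_sketch(probabilities, rng):
    """For each row of `probabilities`, draw one class index."""
    p = np.asarray(probabilities, dtype=float)
    cdf = np.cumsum(p, axis=1)
    cdf /= cdf[:, -1:]                 # normalise so each row ends at 1
    u = rng.random(p.shape[0])         # one uniform draw per row
    # The label is the number of CDF entries strictly below u
    z = (u[:, None] > cdf).sum(axis=1)
    return np.minimum(z, p.shape[1] - 1)

rng = np.random.default_rng(0)
# Degenerate rows make the expected labels unambiguous
probabilities = np.array([[1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0],
                          [0.0, 1.0, 0.0]])
z = sample_labels_sketch(probabilities, rng)
```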
Probability of x under normal(mu, inv(P))
Parameters : | mu: array of shape (n), :
P: array of shape (n, n), :
x: array of shape (n), :
|
---|---|
Returns : | (float) the density : |
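With a precision matrix P, the Gaussian density at x is sqrt(det(P) / (2π)ⁿ) · exp(−(x−μ)ᵀ P (x−μ) / 2), so no matrix inversion is needed. A minimal sketch under that formula; the helper name is hypothetical.

```python
import numpy as np

def normal_density_sketch(mu, P, x):
    """Density of x under N(mu, inv(P)), computed in the log domain."""
    mu = np.asarray(mu, dtype=float)
    x = np.asarray(x, dtype=float)
    P = np.asarray(P, dtype=float)
    diff = x - mu
    log_det_P = np.linalg.slogdet(P)[1]
    log_density = 0.5 * (log_det_P
                         - mu.size * np.log(2 * np.pi)
                         - diff @ P @ diff)
    return float(np.exp(log_density))

# Standard 1D normal evaluated at its mean
density = normal_density_sketch([0.0], [[1.0]], [0.0])
```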
Evaluation of the probability of W under Wishart(n,V)
Parameters : | n: float, :
V: array of shape (n,n) :
W: array of shape (n,n) :
dV: float, optional, :
dW: float, optional, :
piV: array of shape (n,n), optional :
|
---|---|
Returns : | (float) the density : |