| Title: | Dynamic Programming Based Gaussian Mixture Modelling Tool for 1D and 2D Data |
|---|---|
| Description: | Gaussian mixture modeling of one- and two-dimensional data, provided in original or binned form, with an option to estimate the number of model components. The method uses Gaussian Mixture Models (GMM) with initial parameters determined by a dynamic programming algorithm, leading to stable and reproducible model fitting. For more details see Zyla, J., Szumala, K., Polanski, A., Polanska, J., & Marczyk, M. (2026) <doi:10.1016/j.jocs.2026.102811>. |
| Authors: | Michal Marczyk [aut, ctb], Kamila Szumala [aut, cre], Joanna Zyla [aut, ctb] |
| Maintainer: | Kamila Szumala <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.0 |
| Built: | 2026-05-31 07:20:40 UTC |
| Source: | https://github.com/cran/dpGMM |
This data set is part of a mass spectrometry measurement. The first column represents X values. The second column represents the counts of X.
data(binned)
A matrix of X and Y (in histogram)
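As a small illustration of what the binned form means, the sketch below expands a toy histogram back into raw observations and checks that count-weighted statistics agree with statistics of the expanded data. The vectors `x` and `y` are made-up values, not the contents of `binned`.

```r
# Hypothetical toy histogram: bin positions and their counts.
x <- c(1.0, 1.5, 2.0)
y <- c(2, 5, 3)

# Repeat each X value according to its count to recover "original" form.
expanded <- rep(x, times = y)

# A count-weighted mean matches the plain mean of the expanded data.
w_mean <- weighted.mean(x, y)
same_mean <- isTRUE(all.equal(w_mean, mean(expanded)))
```

Working with (X, Y) pairs avoids materialising the expanded vector, which matters when the counts are large.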
The function performs the EM algorithm to find the local maximum likelihood for the estimated Gaussian mixture parameters.
EM_iter(X, alpha, mu, sig, Y = NULL, opts = NULL)
X |
Vector of 1D data for GMM decomposition. |
alpha |
Vector containing the weights (alpha) for each component in the statistical model. |
mu |
Vector containing the means (mu) for each component in the statistical model. |
sig |
Vector containing the standard deviation (sigma) for each component in the statistical model. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
opts |
Parameters of run saved in |
Returns a list of GMM parameter values that correspond to the local extremes for each component.
Vector of optimal alpha (weights) values.
Vector of optimal mu (means) values.
Vector of optimal sigma (standard deviations) values.
Log-likelihood statistic for the estimated number of components.
Value of the selected information criterion in local extreme of likelihood function.
runGMM and gaussian_mixture_vector
data("example")
opts <- GMM_1D_opts
Y <- matrix(1, 1, length(example$Dist))
rcpt <- EM_iter(example$Dist, 1, mean(example$Dist), sd(example$Dist), Y, opts)
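For readers who want to see what an EM pass for a 1D Gaussian mixture involves, here is a minimal base R sketch. It is a textbook formulation under the assumption that counts act as observation weights; `em_step_1d` is a hypothetical helper and is not the internal code of `EM_iter`, which may add variance constraints and stopping rules.

```r
# One EM pass for a 1D Gaussian mixture; `counts` weights each x (1 for raw data).
em_step_1d <- function(x, alpha, mu, sigma, counts = rep(1, length(x))) {
  K <- length(alpha)
  # E-step: weighted component densities, one column per component
  dens <- sapply(seq_len(K), function(k) alpha[k] * dnorm(x, mu[k], sigma[k]))
  dens <- matrix(dens, nrow = length(x), ncol = K)
  total <- rowSums(dens)
  resp <- dens / total                  # posterior membership probabilities
  # M-step: count-weighted updates of weights, means and standard deviations
  w <- resp * counts
  nk <- colSums(w)
  mu_new <- colSums(w * x) / nk
  sigma_new <- sqrt(colSums(w * outer(x, mu_new, "-")^2) / nk)
  # logL is the log-likelihood of the parameters passed in, not the updated ones
  list(alpha = nk / sum(counts), mu = mu_new, sigma = sigma_new,
       logL = sum(counts * log(total)))
}

set.seed(1)
x <- c(rnorm(300, -5, 1), rnorm(700, 4, 2))
fit <- list(alpha = c(0.5, 0.5), mu = c(-1, 1), sigma = c(3, 3))
logL_trace <- numeric(50)
for (i in seq_len(50)) {
  fit <- em_step_1d(x, fit$alpha, fit$mu, fit$sigma)
  logL_trace[i] <- fit$logL
}
```

The log-likelihood trace should never decrease, which is the property that makes EM converge to a local maximum. The result still depends on the starting values, which is why the choice of initial parameters matters.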
The function performs the EM algorithm to find the local maximum likelihood for the estimated Gaussian mixture parameters.
EM_iter_2D(X, Y, init, opts = NULL)
X |
Matrix of 2D data to decompose by GMM. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
init |
Vector of initial parameters for Gaussian components. |
opts |
Parameters of run stored in |
Function returns a list of GMM parameters for tested number of components:
Weights (alpha) of each component.
Means of decomposition.
Covariances of each component.
Estimated number of components.
Log-likelihood statistic for the estimated number of components.
The value of the selected information criterion which was used to calculate the number of components.
data("example2D")
X <- example2D[,1:2]
Y <- matrix(1, 1, nrow(X))
opts <- GMM_2D_opts
# It is necessary to define the initial conditions. Here we use random initialization.
alpha <- matrix(1, 1, opts$KS)/opts$KS
center <- as.matrix(X[sample(nrow(X), opts$KS),])
rownames(center) <- NULL
covar <- replicate(opts$KS, diag(apply(as.matrix(X), 2, sd)/opts$KS), simplify = "array")
init <- list(alpha = alpha, center = center, covar = covar, KS = opts$KS)
gmm <- EM_iter_2D(X, Y, init, opts)
This data set was randomly drawn from a 6-component GMM. The parameters of the distributions are as follows:
means <- c(-14.56, -14.16, -11.80, -8.77, -2.89, 2.31);
sigma <- c(2.06, 4.49, 4.42, 2.39, 3.92, 1.36);
alpha <- c(0.2012, 0.2898, 0.0334, 0.0092, 0.4278, 0.0384)
data(example)
A vector containing 1500 observations
Randomly generated data
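The following base R sketch shows one common way to draw 1500 points from a mixture with the parameters listed above: pick a component for each point, then sample from that component. It is only an illustration and is not claimed to be the procedure that produced `example`; the names `comp` and `sim` are introduced here.

```r
set.seed(42)
means <- c(-14.56, -14.16, -11.80, -8.77, -2.89, 2.31)
sigma <- c(2.06, 4.49, 4.42, 2.39, 3.92, 1.36)
alpha <- c(0.2012, 0.2898, 0.0334, 0.0092, 0.4278, 0.0384)
n <- 1500

# Pick a component for every observation, then draw from that component.
# sample() rescales `prob`, so the weights need not sum exactly to 1.
comp <- sample(seq_along(alpha), n, replace = TRUE, prob = alpha)
sim <- rnorm(n, mean = means[comp], sd = sigma[comp])
```

With weights as small as 0.0092, some components contribute only a handful of points, which makes such data a demanding test for estimating the number of components.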
This data set contains image information translated into X and Y coordinates, with a count for each (X, Y) pair.
data(example2D)
data.frame of X and Y coordinates and counts
Function which assigns each point of a 2D data matrix to a cluster by maximum probability.
find_class_2D(X, gmm)
X |
matrix of data to decompose by GMM. |
gmm |
Results of |
Returns a vector with the cluster assignment of each point of the X matrix.
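Assignment by maximum probability can be sketched in base R as follows: each point goes to the component with the largest weighted density. The helpers `dmvn2` and `assign_max_prob`, and the layout of `centers` (one row per component) and `covars` (a 2 x 2 x K array), are assumptions made for this illustration and do not describe the structure of the `gmm` argument.

```r
# Density of a bivariate normal evaluated at the rows of X (base R only).
dmvn2 <- function(X, center, covar) {
  d <- sweep(as.matrix(X), 2, center)       # centre the points
  q <- rowSums((d %*% solve(covar)) * d)    # squared Mahalanobis distance
  exp(-0.5 * q) / (2 * pi * sqrt(det(covar)))
}

# Assign every row of X to the component with the largest alpha_k * f_k(x).
assign_max_prob <- function(X, alpha, centers, covars) {
  score <- sapply(seq_along(alpha), function(k)
    alpha[k] * dmvn2(X, centers[k, ], covars[, , k]))
  max.col(matrix(score, nrow = nrow(X)), ties.method = "first")
}

X <- rbind(c(0, 0), c(0.2, -0.1), c(5, 5), c(4.8, 5.3))
centers <- rbind(c(0, 0), c(5, 5))
covars <- array(c(diag(2), diag(2)), dim = c(2, 2, 2))
cls <- assign_max_prob(X, alpha = c(0.5, 0.5), centers, covars)
cls  # 1 1 2 2
```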
Function to calculate cutoffs between the components of a normal mixture distribution using the probability distribution function.
find_thr_by_dist(input, sigmas.dev = 2.5, alpha, mu, sigma)
input |
output of
|
sigmas.dev |
Number of sigmas to secure thresholds on the ends of distributions. Equivalent to sigma.dev in merging GMMs. |
alpha |
Vector containing the weights (alpha) for each component in the statistical model. |
mu |
Vector containing the means (mu) for each component in the statistical model. |
sigma |
Vector containing the standard deviation (sigma) for each component in the statistical model. |
Returns a vector of thresholds.
data(example)
alpha <- c(0.45, 0.5, 0.05)
mu <- c(-14, -2, 5)
sigma <- c(2, 4, 1.5)
dist.plot <- generate_dist(example$Dist, alpha = alpha, mu = mu, sigma = sigma, 1e4)
thr <- find_thr_by_dist(dist.plot, 2.5, alpha = alpha, mu = mu, sigma = sigma)
Function to calculate cutoffs between the components of a normal mixture distribution based on the component parameters.
find_thr_by_params(alpha, mu, sigma, input, sigmas.dev = 2.5)
alpha |
Vector containing the weights (alpha) for each component in the statistical model. |
mu |
Vector containing the means (mu) for each component in the statistical model. |
sigma |
Vector containing the standard deviation (sigma) for each component in the statistical model. |
input |
output of
|
sigmas.dev |
Number of sigmas to secure thresholds on the ends of distributions. Equivalent to sigma.dev in merging GMMs. |
Returns a vector of thresholds.
data(example)
alpha <- c(0.45, 0.5, 0.05)
mu <- c(-14, -2, 5)
sigma <- c(2, 4, 1.5)
dist.plot <- generate_dist(example$Dist, alpha = alpha, mu = mu, sigma = sigma, 1e4)
thr <- find_thr_by_params(alpha = alpha, mu = mu, sigma = sigma, dist.plot)
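One simple way to think about a cutoff between two neighbouring components is the point where their weighted densities cross. The base R sketch below locates such crossings with `uniroot` for the parameters used in the example. It is an illustrative stand-in, not the algorithm of `find_thr_by_dist` or `find_thr_by_params`, and it ignores the `sigmas.dev` safeguard; `weighted_pdf` is a hypothetical helper.

```r
alpha <- c(0.45, 0.5, 0.05)
mu <- c(-14, -2, 5)
sigma <- c(2, 4, 1.5)

# Weighted density of component k.
weighted_pdf <- function(x, k) alpha[k] * dnorm(x, mu[k], sigma[k])

# Search for one crossing between each pair of neighbouring means.
thr <- sapply(seq_len(length(mu) - 1), function(k) {
  f <- function(x) weighted_pdf(x, k) - weighted_pdf(x, k + 1)
  uniroot(f, lower = mu[k], upper = mu[k + 1])$root
})
```

`uniroot` needs the density difference to change sign between the two means. That holds for these parameters but can fail when one component is completely dominated by its neighbour.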
Function to choose the optimal number of components of a 2D normal mixture distribution by minimizing the value of the information criterion.
gaussian_mixture_2D(X, Y = NULL, opts = NULL)
X |
Matrix of 2D data to decompose by GMM. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
opts |
Parameters of run saved in |
Function returns a list of GMM parameters for the optimal number of components:
Weights (alpha) of each component.
Means of decomposition.
Covariances of each component.
Estimated number of components.
Log-likelihood statistic for the estimated number of components.
The value of the selected information criterion which was used to calculate the number of components.
Assignment of points to the clusters.
data(example2D)
custom.settings <- GMM_2D_opts
exp <- gaussian_mixture_2D(example2D[,1:2], example2D[,3], opts = custom.settings)
Function to estimate the number of components of a normal mixture distribution by minimizing the value of the information criterion.
gaussian_mixture_vector(X, Y = NULL, opts = NULL)
X |
Vector of 1D data for GMM decomposition. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
opts |
Parameters of run saved in |
Function returns a list of GMM parameters for the estimated number of components:
A list of model component parameters - mean values (mu), standard deviations (sigma)
and weights (alpha) for each component.
The value of the selected information criterion which was used to calculate the number of components.
Log-likelihood statistic for the estimated number of components.
Estimated number of model components.
data <- generate_norm1D(1000, alpha = c(0.2,0.4,0.4), mu = c(-15,0,15), sigma = c(1,2,3))
custom.settings <- GMM_1D_opts
custom.settings$IC <- "AIC"
custom.settings$KS <- 10
exp <- gaussian_mixture_vector(data$Dist, opts = custom.settings)
Function to generate the PDF of the GMM components and their cumulative result on a finely spaced grid of points.
generate_dist(X, alpha, mu, sigma, precision)
X |
Vector of 1D data. |
alpha |
Vector of alphas (weights) for each distribution. |
mu |
Vector of means for each distribution. |
sigma |
Vector of sigmas for each distribution. |
precision |
Precision of point linespacing. |
List with following elements:
Numeric vector with equally spaced data of the given precision.
Matrix with PDF of each GMM component and cumulative distribution.
data <- generate_norm1D(1000, alpha = c(0.2, 0.4, 0.4), mu = c(-15, 0, 15), sigma = c(1, 2, 3))
dist <- generate_dist(data$Dist, alpha = c(0.2, 0.4, 0.4), mu = c(-15, 0, 15), sigma = c(1, 2, 3), precision = 1000)
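The idea of evaluating the component densities on a dense, evenly spaced grid can be sketched in base R as below. Two interpretations here are assumptions rather than documented behaviour: that `precision` corresponds to the number of grid points, and that the additional column is the sum of the weighted component densities. The vector `x_obs` is a made-up stand-in for the observed data.

```r
alpha <- c(0.2, 0.4, 0.4)
mu <- c(-15, 0, 15)
sigma <- c(1, 2, 3)
x_obs <- c(-18, -1, 22)   # hypothetical data; only its range is used

# Evenly spaced grid spanning the data range.
grid <- seq(min(x_obs), max(x_obs), length.out = 1000)

# One column per weighted component, plus a final column with their sum.
pdf_k <- sapply(seq_along(alpha), function(k) alpha[k] * dnorm(grid, mu[k], sigma[k]))
pdf_all <- cbind(pdf_k, mixture = rowSums(pdf_k))
```

A grid like this is convenient for plotting smooth curves over a histogram and for locating thresholds numerically.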
Generator of multiple 2D normal mixture distributions with model parameters drawn from the given ranges.
generate_dset2D( n = 1500, m = 1500, KS_range = 2:8, mu_range = c(-15, 15), cov_range = c(1, 5) )
n |
Number of points to generate. |
m |
Number of distributions to generate. |
KS_range |
Range of possible number of components of generated distribution. Default |
mu_range |
Range of means of components of generated distribution. Default |
cov_range |
Range of covariances of components of generated distribution. Default |
List with 2D GMM distributions where each list contains elements of generate_norm2D.
dset <- generate_dset2D(n = 1500, m = 10, KS_range = 2:5, mu_range = c(-10, 10), cov_range = c(1, 3))
Generator of a normal mixture distribution with given model parameters and a given number of points.
generate_norm1D(n, alpha, mu, sigma)
n |
Number of points to generate. |
alpha |
Vector of alphas (weights) for each distribution. |
mu |
Vector of means for each distribution. |
sigma |
Vector of sigmas for each distribution. |
List with following elements:
Numeric vector with generated data
Numeric vector with classification of each point to particular mixed distribution
data <- generate_norm1D(1000, alpha = c(0.2, 0.4, 0.4), mu = c(-15, 0, 15), sigma = c(1, 2, 3))
Generator of a 2D normal mixture distribution with given model parameters and a given number of points.
generate_norm2D(n, alpha, mu, cov)
n |
Number of points to generate. |
alpha |
Vector of alphas (weights) for each distribution. |
mu |
Matrix of means for each distribution. |
cov |
Vector of covariances for each distribution. |
List with following elements:
Numeric matrix with generated data.
Numeric vector with classification of each point to particular distribution.
data <- generate_norm2D(1500, alpha = c(0.2, 0.4, 0.4), mu = matrix(c(1, 2, 1, 3, 2, 2), nrow = 2), cov = c(0.01, 0.02, 0.03))
A list with parameters customizing a GMM for 1D and binned data. Each component of the
list is an effective argument for runGMM.
GMM_1D_opts
A list with the following components:
Maximum number of components of the model.
Criterion for early stopping of EM (1e-7, by default) given by the following formula:
Maximum number of iterations of EM algorithm. By default it is max_iter = 10 000.
Parameter for calculating minimum variance of each Gaussian component (0.25, by default) using the following formula:
. Lower value means smaller component variance allowed.
Information criterion used to select the number of model components. Possible methods are "AIC","AICc", "BIC" (default), "ICL-BIC" or "LR".
Parameter used to define close GMM components that need to be merged. For each component, the standard deviation is multiplied by sigmas.dev to estimate the distance from the component mean.
All other components within this distance are merged. By default it is sigmas.dev = 1. When sigmas.dev = 0 no components are merged.
Logical value. Determines whether to stop searching for the number of components early, based on the Likelihood Ratio Test. Used to speed up the function (TRUE, by default).
Significance level set for Likelihood Ratio Test (0.05, by default).
Logical value. Fit GMM for selected number of components given by KS (FALSE, by default).
Logical value. If TRUE (default), the figure visualizing GMM decomposition will be displayed.
Name of the RColorBrewer palette used in the figure. By default "Blues".
# display all default settings
GMM_1D_opts
# create a new settings object
custom.settings <- GMM_1D_opts
custom.settings$IC <- "AIC"
custom.settings
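To make the role of the information criterion concrete, the sketch below computes the standard textbook forms of AIC, AICc and BIC for a 1D mixture with K components. The helper `gmm_ic` and its parameter count are assumptions for illustration; the package's exact formulas may differ, and "ICL-BIC" and "LR" are not covered here.

```r
# Standard textbook definitions; k_par counts free parameters of a 1D GMM.
gmm_ic <- function(logL, K, n) {
  k_par <- 3 * K - 1   # K means, K standard deviations, K - 1 free weights
  c(AIC  = -2 * logL + 2 * k_par,
    AICc = -2 * logL + 2 * k_par + 2 * k_par * (k_par + 1) / (n - k_par - 1),
    BIC  = -2 * logL + k_par * log(n))
}

# Made-up log-likelihoods for K = 2 and K = 3 fitted to n = 1000 points.
ic2 <- gmm_ic(logL = -2510, K = 2, n = 1000)
ic3 <- gmm_ic(logL = -2500, K = 3, n = 1000)
best_by_bic <- if (ic3[["BIC"]] < ic2[["BIC"]]) 3 else 2
```

In this toy comparison the third component improves the log-likelihood by 10 but costs three extra parameters. AIC prefers K = 3, while the stronger BIC penalty prefers K = 2, which shows how the choice of criterion can change the selected model.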
A list with parameters customizing a GMM_2D. Each component of the
list is an effective argument for runGMM2D.
GMM_2D_opts
A list with the following components:
Criterion for early stopping of EM (1e-7, by default).
Maximum number of iterations of EM algorithm. By default it is max_iter = 50 000.
Regularizing coefficient for covariance.
Maximum dissimilarity between horizontal and vertical dispersion. By default it is max_var_ratio = 5.
Information criterion used to select the number of model components. Possible methods are "AIC","AICc", "BIC" (default), "ICL-BIC" or "LR".
Type of covariance defined for each model component. Possible "sphere","diag" or "full" (default).
Number of random initial conditions. By default it is init_nb = 10.
Maximum number of components of the model. By default it is KS = 5.
Logical value. Determines whether to stop searching for the number of components early, based on the Likelihood Ratio Test. Used to speed up the function (TRUE, by default).
Significance level set for Likelihood Ratio Test (0.05, by default).
Type of initial conditions. Could be "rand" (default),"DP" or "diag".
Logical value. Fit GMM for selected number of components given by KS (FALSE, by default).
Logical value. If TRUE, the GMM decomposition figure will be displayed (FALSE, by default).
# display all default settings
GMM_2D_opts
# create a new settings object
custom.settings <- GMM_2D_opts
custom.settings$IC <- "AIC"
custom.settings
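The early-stopping option based on the Likelihood Ratio Test can be pictured with the generic sketch below: a model with one more component is accepted only if the gain in log-likelihood is significant at the chosen level. The helper `lrt_stop`, its arguments, and the chi-square reference distribution are assumptions for illustration. The chi-square approximation is only a heuristic for mixture models, and the package may implement the test differently.

```r
# Generic likelihood ratio test between nested models.
# df = number of extra free parameters in the larger model.
lrt_stop <- function(logL_small, logL_big, df, signi = 0.05) {
  stat <- 2 * (logL_big - logL_small)
  p <- pchisq(stat, df = df, lower.tail = FALSE)
  # Stop adding components when the improvement is not significant.
  list(statistic = stat, p.value = p, stop = p > signi)
}

small_gain <- lrt_stop(-2500, -2499, df = 3)   # gain of 1 log-likelihood unit
large_gain <- lrt_stop(-2500, -2480, df = 3)   # gain of 20 log-likelihood units
```

Here `small_gain$stop` is `TRUE` and `large_gain$stop` is `FALSE`, so the search would continue only in the second case.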
Transforms an image into coordinate data.
img_to_coords(img)
img |
image in 2D array. |
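As a rough picture of what turning an image into coordinates can look like, the base R sketch below converts a small 2D array into a table of pixel positions with the pixel intensity as the count. The axis orientation, the column names, and the choice to drop zero-valued pixels are all assumptions of this sketch and may not match the output of `img_to_coords`.

```r
# Hypothetical 2 x 3 "image" with intensities used as counts.
img <- matrix(c(0, 3, 1,
                2, 0, 5), nrow = 2, byrow = TRUE)

# Row and column index of every non-zero pixel, paired with its intensity.
idx <- which(img > 0, arr.ind = TRUE)
coords <- data.frame(X = idx[, "col"], Y = idx[, "row"], count = img[idx])
```

The resulting three-column layout mirrors the binned 2D input used elsewhere in this manual, where the first two columns are coordinates and the third holds counts.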
The function plots the decomposed distribution together with a histogram of the data. The cut-offs are also marked.
This plot is also returned as regular output of runGMM.
plot_gmm_1D(X, dist, Y = NULL, threshold = NA, pal = "Blues")
X |
Vector of 1D data for GMM decomposition. |
dist |
Output of |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
threshold |
Vector with GMM cutoffs. |
pal |
Name of the RColorBrewer palette used in the figure. By default |
A ggplot object showing the histogram or density of the input
data together with the Gaussian mixture model decomposition.
Individual mixture components and the overall fitted density are
displayed as line plots, and optional cut-off thresholds are marked
as vertical dashed lines.
data(example)
alpha <- c(0.45, 0.5, 0.05)
mu <- c(-14, -2, 5)
sigma <- c(2, 4, 1.5)
dist.plot <- generate_dist(example$Dist, alpha, mu, sigma, 1e4)
thr <- find_thr_by_params(alpha, mu, sigma, dist.plot)
plot_gmm_1D(example$Dist, dist.plot, Y = NULL, threshold = thr, pal="Dark2")
The function plots a heatmap of binned data with the GMM decomposition marked.
This plot is also returned as regular output of runGMM2D.
plot_gmm_2D_binned(X, Y, gmm, opts)
X |
Matrix of 2D data to decompose by GMM. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
gmm |
Results of |
opts |
Parameters of run stored in |
A ggplot object showing the heatmap of binned two-dimensional
data with an overlay of the Gaussian mixture model decomposition. Mixture
component centers are indicated by points and covariance ellipses corresponding
to selected probability contours are drawn around each component.
data(example2D)
custom.settings <- GMM_2D_opts
res <- runGMM2D(example2D[,1:2], example2D[,3], opts = custom.settings)
plot_gmm_2D_binned(example2D[,1:2], example2D[,3], res$model, custom.settings)
The function plots the decomposed distribution together with the data.
This plot is also returned as regular output of runGMM2D.
plot_gmm_2D_orig(X, gmm, opts)
X |
Matrix of 2D data to decompose by GMM. |
gmm |
Results of |
opts |
Parameters of run stored in |
A ggplot object showing the scatter plot of two-dimensional
data with an overlay of the Gaussian mixture model decomposition. Mixture
component centers are indicated by points and covariance ellipses corresponding
to selected probability contours are drawn around each component.
custom.settings <- GMM_2D_opts
data <- generate_norm2D(1500, alpha = c(0.2, 0.4, 0.4), mu = matrix(c(1, 2, 1, 3, 2, 2), nrow = 2), cov = c(0.01, 0.02, 0.03))
res <- runGMM2D(data$Dist, opts = custom.settings)
plot_gmm_2D_orig(data$Dist, res$model, custom.settings)
The function returns a ggplot object with a diagnostic Quantile-Quantile plot of the fit for a single normal distribution and for the fitted GMM.
This plot is also returned as regular output of runGMM.
plot_QQplot(X, alpha, mu, sigma)
X |
Vector of 1D data for GMM decomposition. |
alpha |
Vector containing the weights (alpha) for each component in the statistical model. |
mu |
Vector containing the means (mu) for each component in the statistical model |
sigma |
Vector containing the standard deviation (sigma) for each component in the statistical model. |
An object extending ggplot that arranges two quantile-quantile
plots into a single figure. One panel shows a QQ plot of the input
data against a normal distribution, and the other shows a QQ plot
against data simulated from the fitted Gaussian mixture model.
data(example)
alpha <- c(0.45, 0.5, 0.05)
mu <- c(-14, -2, 5)
sigma <- c(2, 4, 1.5)
plot_QQplot(example$Dist, alpha, mu, sigma)
The function fits a GMM using the expectation-maximization (EM) algorithm, with initial conditions found by a dynamic programming-based approach. It works on original and binned data (e.g. obtained by creating a histogram of 1D data). Additionally, threshold values that allow data to be assigned to individual Gaussian components are provided. The function can estimate the number of GMM components using five different information criteria and can merge similar components.
runGMM(X, Y = NULL, opts = NULL)
X |
Vector of 1D data for GMM decomposition. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
opts |
Parameters of run saved in |
Function returns a list which contains:
A list of model component parameters - mean values (mu), standard deviations (sigma)
and weights (alpha) for each component. Output of gaussian_mixture_vector.
Estimated number of model components.
The value of the selected information criterion which was used to calculate the number of components.
Log-likelihood statistic for the estimated number of components.
Vector of thresholds between each component.
Assignment of original X values to individual components (clusters) by thresholds.
ggplot object (output of the plot_gmm_1D function). It contains GMM decomposition together with a histogram of the data.
ggplot object (output of the plot_QQplot function).
It presents diagnostic Quantile-Quantile plot for a single normal distribution and fitted GMM.
gaussian_mixture_vector, EM_iter
data(example)
custom.settings <- GMM_1D_opts
custom.settings$sigmas.dev <- 1.5
custom.settings$max_iter <- 1000
custom.settings$KS <- 10
mix_test <- runGMM(example$Dist, opts = custom.settings)
mix_test$QQplot
# example for binned data
data(binned)
custom.settings <- GMM_1D_opts
custom.settings$quick_stop <- TRUE
custom.settings$KS <- 40
custom.settings$col.pal <- "Dark2"
custom.settings$plot <- FALSE
binned_test <- runGMM(X = binned$V1, Y = binned$V2, opts = custom.settings)
binned_test$fig
Main function to perform GMM on 2D data. The function chooses the optimal number of components of a 2D normal mixture distribution by minimizing the value of the information criterion.
runGMM2D(X, Y = NULL, opts = NULL)
X |
Matrix of 2D data to decompose by GMM. |
Y |
Vector of counts, with the same length as "X". Applies only to binned data (Y = NULL, by default). |
opts |
Parameters of run stored in |
Function returns a list of GMM parameters for the estimated number of components:
Weights (alpha) of each component.
Means of decomposition.
Covariances of each component.
Estimated number of components.
Log-likelihood statistic for the estimated number of components.
The value of the selected information criterion which was used to calculate the number of components.
Assignment of points to the clusters.
Plot of decomposition.
data(example2D)
custom.settings <- GMM_2D_opts
custom.settings$fixed <- TRUE
custom.settings$KS <- 3
custom.settings$max_iter <- 5000
custom.settings$plot <- TRUE
res <- runGMM2D(example2D[,1:2], example2D[,3], opts = custom.settings)