giuseppedib/microclustering

Random partitions for microclustering

This demo requires Julia 0.6.0 with the following packages: Distributions, PyPlot, ProgressMeter.
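As a rough sketch of the setup, the three packages can be installed from the Julia REPL. This assumes the built-in package manager of Julia 0.6, where `Pkg` is available without an import; on Julia 1.x, `using Pkg` would be needed first. PyPlot wraps Python's matplotlib, so a working matplotlib installation is also assumed.

```julia
# Install the packages listed above (Julia 0.6 syntax).
Pkg.add("Distributions")
Pkg.add("PyPlot")
Pkg.add("ProgressMeter")
```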

The file demo.jl contains the following function

function run_demo(xi = 1, sigma = 0.3, zeta = 1,
                  n_train = 300,
                  n_test = 400,
                  n_pred = 30,
                  num_particles = 1000,
                  n_sigma = 10,
                  n_smcruns = 10)

It samples a partition of size n_train + n_test from the non-exchangeable random partition model with parameters (xi, sigma, zeta), which range over {1,2,3} x [0,1) x (0,+inf). These parameters characterize the Generalized Gamma Process with mean measure

equation
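A minimal usage sketch, assuming demo.jl is in the current working directory. All arguments of run_demo are optional positional arguments rather than keyword arguments, so overriding sigma also requires passing xi before it. The values below are illustrative picks from the ranges stated above, not recommended settings.

```julia
# Load the demo and run it with the default parameters.
include("demo.jl")
run_demo()

# Override the first three arguments (xi, sigma, zeta) by position;
# the remaining arguments keep their defaults.
run_demo(2, 0.5, 1)
```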

The function produces two plots: one showing the trajectories of the cluster sizes, and one showing the frequencies of clusters of a given size on a log-log scale. The following plots are produced with run_demo().

Sequential Monte Carlo (SMC) with num_particles particles is used to find the maximum likelihood estimates (MLE) of sigma and xi from the first n_train points of the simulated partition. The SMC algorithm runs n_smcruns times at each point of a grid over the parameter space: {1,2,3} for xi and n_sigma equidistant points in [0,0.9] for sigma. A plot of the log-likelihood estimates is produced.
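The grid search described above can be sketched as follows. This is not the code from demo.jl: `toy_loglik_estimate` is a hypothetical stand-in for a single SMC run, and averaging the log-likelihood estimates over the n_smcruns repetitions is an assumption about how the runs are combined. The sketch avoids version-specific helpers so that it should run on both Julia 0.6 and 1.x.

```julia
# Hypothetical stand-in for one SMC run; the real estimator lives in demo.jl.
# Returns a noisy log-likelihood estimate peaked at xi = 1, sigma = 0.3.
toy_loglik_estimate(xi, sigma) = -(xi - 1)^2 - 100 * (sigma - 0.3)^2 + 0.01 * randn()

function grid_search(n_sigma = 10, n_smcruns = 10)
    xis = [1, 2, 3]
    # n_sigma equidistant points in [0, 0.9]; requires n_sigma >= 2.
    sigmas = [0.9 * (i - 1) / (n_sigma - 1) for i in 1:n_sigma]
    best_xi, best_sigma, best_ll = xis[1], sigmas[1], -Inf
    for xi in xis, sigma in sigmas
        # Repeat the noisy estimate and average (assumed combination rule).
        runs = [toy_loglik_estimate(xi, sigma) for _ in 1:n_smcruns]
        ll = sum(runs) / n_smcruns
        if ll > best_ll
            best_xi, best_sigma, best_ll = xi, sigma, ll
        end
    end
    return best_xi, best_sigma, best_ll
end

xi_hat, sigma_hat, ll_hat = grid_search()
println("xi = ", xi_hat, ", sigma = ", sigma_hat)
```

Repeating the estimate at each grid point reduces the Monte Carlo noise in the comparison between grid points, at the cost of n_smcruns times as many SMC runs.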

The prediction step generates n_pred partitions of size n_train + n_test from the predictive distribution and plots the 95% credible intervals for the frequencies of clusters of a given size.
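One way such intervals could be computed is sketched below, assuming each simulated partition is summarized by a vector of its cluster sizes and that the interval is taken pointwise, as the 2.5% and 97.5% empirical quantiles across the simulated partitions. The function names and the toy data are hypothetical and are not taken from demo.jl.

```julia
# Count how many clusters have each size 1..max_size (larger clusters are ignored here).
function size_frequencies(cluster_sizes, max_size)
    freq = zeros(Int, max_size)
    for s in cluster_sizes
        if 1 <= s <= max_size
            freq[s] += 1
        end
    end
    return freq
end

# Empirical quantile with linear interpolation between order statistics.
function empirical_quantile(values, p)
    v = sort(values)
    h = (length(v) - 1) * p + 1
    lo = floor(Int, h)
    hi = min(lo + 1, length(v))
    return v[lo] + (h - lo) * (v[hi] - v[lo])
end

# Pointwise 95% interval for each cluster size, across simulated partitions.
function frequency_intervals(partitions, max_size)
    freqs = [size_frequencies(p, max_size) for p in partitions]
    lower = [empirical_quantile([f[k] for f in freqs], 0.025) for k in 1:max_size]
    upper = [empirical_quantile([f[k] for f in freqs], 0.975) for k in 1:max_size]
    return lower, upper
end

# Made-up cluster sizes for three partitions (illustration only).
partitions = [[1, 1, 2, 3], [1, 2, 2, 5], [1, 1, 1, 2]]
lower, upper = frequency_intervals(partitions, 3)
```

In the demo's setting there would be n_pred partitions instead of three, and the observed frequencies from the simulated data could be overlaid on the resulting band.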
