This repository has been archived by the owner on Jun 5, 2018. It is now read-only.

ariddell/horizont

On hiatus. Currently evaluating the best way to approach the dynamic topic model (DTM). Combining the Pólya-Gamma augmentation with the original DTM formulation is complicated and might not perform better than simpler models (e.g., truncated Pitman-Yor processes).

NOTE: The implementation of LDA has been broken out (and refined) into lda.

NOTE: If you're interested in implementing the dynamic topic model using Pólya-Gamma, most of the hard work has been done: https://github.com/HIPS/pgmult

horizont: Topic models in Python

horizont implements a number of topic models. Conventions from scikit-learn are followed.

The following models are implemented using Gibbs sampling.

  • Latent Dirichlet allocation (Blei et al., 2003; Pritchard et al., 2000)
  • (Coming soon) Logistic normal topic model
  • (Coming soon) Dynamic topic model (Blei and Lafferty, 2006)

Getting started

horizont.LDA implements latent Dirichlet allocation (LDA) using Gibbs sampling. The interface follows conventions in scikit-learn.

>>> import numpy as np
>>> from horizont import LDA
>>> X = np.array([[1,1], [2, 1], [3, 1], [4, 1], [5, 8], [6, 1]])
>>> model = LDA(n_topics=2, random_state=0, n_iter=100)
>>> doc_topic = model.fit_transform(X)  # estimate of document-topic distributions
>>> model.components_  # estimate of topic-word distributions
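
To make "LDA using Gibbs sampling" concrete, here is a minimal pure-Python sketch of a collapsed Gibbs sampler run on the same toy matrix. It is not horizont's implementation: the function name gibbs_lda, the hyperparameters alpha and eta, and their default values are assumptions chosen for illustration only.

```python
import random


def gibbs_lda(X, n_topics, n_iter=100, alpha=0.1, eta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a document-term count matrix X.

    X is a list of rows; X[d][w] is the count of word w in document d.
    alpha and eta are symmetric Dirichlet hyperparameters (illustrative values).
    Returns (doc_topic, topic_word), each a list of normalized rows.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(X), len(X[0])

    # Expand counts into one (document, word) pair per token.
    tokens = [(d, w) for d in range(n_docs) for w in range(n_words)
              for _ in range(X[d][w])]

    ndz = [[0] * n_topics for _ in range(n_docs)]   # document-topic counts
    nzw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    nz = [0] * n_topics                             # tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for d, w in tokens:
        k = rng.randrange(n_topics)
        z.append(k)
        ndz[d][k] += 1
        nzw[k][w] += 1
        nz[k] += 1

    for _ in range(n_iter):
        for i, (d, w) in enumerate(tokens):
            k = z[i]
            # Remove the token's current assignment from the counts.
            ndz[d][k] -= 1
            nzw[k][w] -= 1
            nz[k] -= 1
            # Unnormalized full conditional over topics for this token.
            weights = [(ndz[d][j] + alpha) * (nzw[j][w] + eta)
                       / (nz[j] + n_words * eta) for j in range(n_topics)]
            k = rng.choices(range(n_topics), weights=weights)[0]
            z[i] = k
            ndz[d][k] += 1
            nzw[k][w] += 1
            nz[k] += 1

    # Smoothed point estimates computed from the final sample.
    doc_topic = [[(ndz[d][k] + alpha) / (sum(ndz[d]) + n_topics * alpha)
                  for k in range(n_topics)] for d in range(n_docs)]
    topic_word = [[(nzw[k][w] + eta) / (nz[k] + n_words * eta)
                   for w in range(n_words)] for k in range(n_topics)]
    return doc_topic, topic_word


X = [[1, 1], [2, 1], [3, 1], [4, 1], [5, 8], [6, 1]]
doc_topic, topic_word = gibbs_lda(X, n_topics=2, n_iter=100, seed=0)
```

Here doc_topic plays the role of the fit_transform result and topic_word the role of components_: one row per document or topic, each row summing to one. The estimates come from a single final sample, which keeps the sketch short; a more careful sampler would typically average over several samples.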

Requirements

Python 2.7 or Python 3.3+ is required. The following packages are also required:

GSL is required for random number generation inside the Pólya-Gamma random variate generator. On Debian-based systems, GSL may be installed with the command sudo apt-get install libgsl0-dev. horizont looks for GSL headers and libraries in /usr/include and /usr/lib/ respectively.
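
Before building, it can help to check whether GSL is visible on the system. The sketch below is an assumption-laden convenience, not part of horizont: the helper name gsl_status is hypothetical, it looks for the GSL header gsl/gsl_rng.h under /usr/include, and it asks ctypes.util.find_library for the shared library, which searches the system's standard library paths rather than /usr/lib alone.

```python
import ctypes.util
import os


def gsl_status(include_dir="/usr/include"):
    """Report whether the GSL header and shared library appear to be installed."""
    # gsl_rng.h is GSL's random number generator header.
    header = os.path.join(include_dir, "gsl", "gsl_rng.h")
    return {
        "header_found": os.path.isfile(header),
        # find_library returns None when no matching library is located.
        "library_found": ctypes.util.find_library("gsl") is not None,
    }


status = gsl_status()
print(status)
```

If either value is False, installing the GSL development package as described above is the likely fix.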

Cython is needed if compiling from source.

Important links

License

horizont is licensed under Version 3.0 of the GNU General Public License. See the LICENSE file for the text of the license or visit http://www.gnu.org/copyleft/gpl.html.

About

[hibernating] Dynamic topic models
