Project author: vlukiyanov

Project description:
PyTorch implementation of AVITM (Autoencoding Variational Inference For Topic Models)
Language: Python
Repository: git://github.com/vlukiyanov/pt-avitm.git
Created: 2018-12-29T20:20:28Z
Project community: https://github.com/vlukiyanov/pt-avitm

License: MIT License


pt-avitm

PyTorch implementation of a version of the Autoencoding Variational Inference For Topic Models (AVITM) algorithm. Compatible with PyTorch 1.0.0 and Python 3.6 or 3.7 with or without CUDA.

This follows (or attempts to; note this implementation is unofficial) the algorithm described in “Autoencoding Variational Inference For Topic Models” by Akash Srivastava and Charles Sutton (https://arxiv.org/abs/1703.01488).

Examples

You can find a number of examples in the examples directory; see also Usage below.

Usage

The simplest way to use the library is through its sklearn-compatible API, as below.

    import sklearn.datasets
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.pipeline import make_pipeline

    from ptavitm.sklearn_api import ProdLDATransformer

    # Load the raw 20 Newsgroups training documents as a list of strings.
    texts = sklearn.datasets.fetch_20newsgroups()['data']

    # Build bag-of-words counts, then fit the topic model on them.
    pipeline = make_pipeline(
        CountVectorizer(
            stop_words='english',
            max_features=2500,
            max_df=0.9
        ),
        ProdLDATransformer()
    )
    pipeline.fit(texts)
    result = pipeline.transform(texts)
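
One common next step is to pick the highest-weight topic for each document. The sketch below is not part of the library. It assumes, without confirmation from the source, that `result` can be iterated row by row as one sequence of per-topic weights per document. The helper `dominant_topics` and the stand-in data `example_result` are hypothetical names introduced only for illustration.

```python
from typing import List, Sequence


def dominant_topics(doc_topic: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each document row, the index of its largest topic weight."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in doc_topic]


# Stand-in for `result`: 3 documents and 4 topics, with made-up weights.
# ASSUMPTION: the real transform output has this (documents x topics) layout.
example_result = [
    [0.10, 0.70, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]

print(dominant_topics(example_result))  # prints [1, 0, 3]
```

If the assumption about the output layout holds, passing `result` in place of `example_result` should give one topic index per input text. Check the shape of `result` before relying on this.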

Other implementations of AVITM and similar