Project author: yohanesgultom

Project description:
Indonesian NLP experiments
Language: Java
Repository URL: git://github.com/yohanesgultom/nlp-experiments.git
Created: 2016-04-08T02:52:37Z
Project community: https://github.com/yohanesgultom/nlp-experiments

License:


Open NLP

POS tagging and named-entity recognition

Distribution

A binary distribution can be downloaded here (requires JRE 1.7 or later; Unix or Windows only)

Usage

See the README for the usage guide

Building

Prerequisites:

  • JDK 1.7 or later
  • Maven 3.3.9 or later

Building the program:

  1. $ cd java/nlp
  2. $ mvn clean package

NLTK

Prerequisites

POS Tagging

POS tagging with predefined training and test data:

  1. $ cd python
  2. $ python tagger.py ../data/pos-tagging/Indonesian_Manually_Tagged_Corpus_ID.tsv ../data/pos-tagging/Wikipedia.txt
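
To illustrate what training on tagged data and then tagging new text involves, here is a minimal sketch of a most-frequent-tag baseline in plain Python. It is not the code in tagger.py and it does not use NLTK; the toy sentences, the tag names, and the `default_tag` fallback are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Map each word to the tag it most often carries in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_sentence(model, tokens, default_tag="NN"):
    """Tag each token with its learned tag; unknown words get default_tag."""
    return [(tok, model.get(tok.lower(), default_tag)) for tok in tokens]


# Hypothetical toy training data; the real corpus is the TSV file passed to tagger.py.
train = [
    [("saya", "PRP"), ("makan", "VB"), ("nasi", "NN")],
    [("dia", "PRP"), ("makan", "VB"), ("roti", "NN")],
]
model = train_baseline(train)
tagged = tag_sentence(model, ["dia", "makan", "apel"])
print(tagged)
```

A baseline like this is useful mainly as a reference point: a trained tagger should score noticeably higher on the same test data.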

POS tagging by splitting the training data into training and test sets:

  1. $ cd python
  2. $ python tagger.py ../data/pos-tagging/Indonesian_Manually_Tagged_Corpus_ID.tsv 1000 sentences.tag
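
The split described above can be sketched in plain Python as follows. This is an illustration only, not the logic of tagger.py: it assumes the corpus holds one `word<TAB>tag` pair per line with a blank line between sentences, and that the numeric argument is the number of sentences held out for testing. Both are assumptions; check the README and the script for the actual format and meaning.

```python
def read_tagged_sentences(lines):
    """Group 'word<TAB>tag' lines into sentences separated by blank lines (assumed format)."""
    sentences, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        word, tag = line.split("\t")[:2]
        current.append((word, tag))
    if current:
        sentences.append(current)
    return sentences


def split_corpus(sentences, test_size):
    """Hold out the first test_size sentences for testing; train on the rest."""
    return sentences[test_size:], sentences[:test_size]


# Hypothetical sample standing in for the contents of the corpus file.
sample = [
    "saya\tPRP", "makan\tVB", "",
    "dia\tPRP", "tidur\tVB", "",
    "mereka\tPRP", "pergi\tVB",
]
sentences = read_tagged_sentences(sample)
train_sents, test_sents = split_corpus(sentences, 1)
print(len(train_sents), len(test_sents))
```

Holding out the first sentences keeps the split reproducible between runs; shuffling before the split is a common alternative when the corpus is ordered by topic or source.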