Project author: sagorbrur

Project description:
BNLP is a natural language processing toolkit for the Bengali language.
Primary language: Jupyter Notebook
Repository URL: git://github.com/sagorbrur/bnlp.git
Created: 2019-11-22T10:02:15Z
Project community: https://github.com/sagorbrur/bnlp

License: MIT License

Bengali Natural Language Processing (BNLP)

BNLP is a natural language processing toolkit for the Bengali language. It helps you tokenize Bengali text, embed Bengali words and documents, perform Bengali POS tagging and named entity recognition, and clean Bangla text for Bengali NLP tasks.

Features

Installation

PIP installer

  pip install bnlp_toolkit

Or upgrade:

  pip install -U bnlp_toolkit

  • Python: 3.8, 3.9, 3.10, 3.11
  • OS: Linux, Windows, Mac
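
Before installing, it can help to confirm that the running interpreter is one of the Python versions listed above. The snippet below is a minimal sketch using only the standard library; the names `SUPPORTED_VERSIONS` and `is_supported` are hypothetical helpers for illustration and are not part of BNLP.

```python
import sys

# Python versions listed in this README as supported by bnlp_toolkit.
SUPPORTED_VERSIONS = {(3, 8), (3, 9), (3, 10), (3, 11)}


def is_supported(version_info=None):
    """Return True if the (major, minor) pair is in the list above.

    Hypothetical helper for illustration; not part of BNLP.
    """
    if version_info is None:
        version_info = sys.version_info
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "listed" if is_supported() else "not listed"
    print(f"Python {major}.{minor} is {status} as supported in this README.")
```

A version that is not listed may still work; the check only mirrors the list given here.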

Build from source

  git clone https://github.com/sagorbrur/bnlp.git
  cd bnlp
  python setup.py install

Sample Usage

  from bnlp import BasicTokenizer
  tokenizer = BasicTokenizer()
  raw_text = "আমি বাংলায় গান গাই।"
  tokens = tokenizer(raw_text)
  print(tokens)
  # output: ["আমি", "বাংলায়", "গান", "গাই", "।"]
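
To illustrate what this kind of tokenization does, here is a rough standard-library sketch that splits text on whitespace and separates the Bengali danda (।) and a few common punctuation marks into their own tokens. It is an assumption-laden illustration only: `simple_tokenize` is a hypothetical function, it is not how `BasicTokenizer` is implemented, and it will not match BNLP's behavior on every input.

```python
import re

# Punctuation treated as standalone tokens in this sketch:
# the Bengali danda plus a few common ASCII marks.
_PUNCT = "।,;:!?."

# Either a run of characters that are neither whitespace nor punctuation,
# or a single punctuation character.
_TOKEN_RE = re.compile(
    "[^\\s" + re.escape(_PUNCT) + "]+|[" + re.escape(_PUNCT) + "]"
)


def simple_tokenize(text):
    """Hypothetical helper, NOT BNLP's BasicTokenizer.

    Splits on whitespace and emits each punctuation mark as its own token.
    """
    return _TOKEN_RE.findall(text)


if __name__ == "__main__":
    raw_text = "আমি বাংলায় গান গাই।"
    print(simple_tokenize(raw_text))
```

The pattern deliberately avoids `\w`, because Python's `re` module does not treat Bengali vowel signs (combining marks) as word characters, which would break words apart.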

Documentation

Full documentation is available here.

If you are using a previous version of bnlp, check the documentation archive.

Contributor Guide

See the CONTRIBUTING.md page for details.

Thanks To

  • Semantics Lab
  • All the developers who are contributing to enrich Bengali NLP.