Project author: anuragithub

Project description:
A simple method for optimizing the candidate model by searching through a clustered graph built on Levenshtein distance.
Primary language: Jupyter Notebook
Repository URL: git://github.com/anuragithub/OLDC.git
Created: 2019-09-19T21:30:22Z
Project community: https://github.com/anuragithub/OLDC


Optimized Levenshtein Distance search using Clustering

This is an attempt to optimize the word search for the candidate model when building auto-correct applications. The idea is to minimize the number of Levenshtein distance (LD) calculations, since each one is computationally expensive. A simple implementation of DBSCAN clustering groups the vocabulary into clusters, and a query is then compared against those clusters rather than against every word.
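The idea above can be sketched in plain Python. This is a minimal illustration and not the notebook's actual code: the function names (levenshtein, dbscan_words, build_index, candidates), the parameters (eps, min_pts, top_k), and the choice of each cluster's first member as its representative are all assumptions made for the example. It uses only the standard library to stay self-contained.

```python
from collections import deque


def levenshtein(a: str, b: str) -> int:
    """Edit distance using the two-row dynamic-programming table."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution
        prev = curr
    return prev[-1]


def dbscan_words(words, eps=1, min_pts=2):
    """Plain DBSCAN with Levenshtein distance as the metric.

    Returns one label per word: 0, 1, ... for clusters, -1 for noise.
    A word counts as its own neighbour, as in the usual formulation.
    """
    n = len(words)
    neighbours = [
        [j for j in range(n) if levenshtein(words[i], words[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: expand
    return labels


def build_index(words, eps=1, min_pts=2):
    """Group words by cluster label; each noise word is its own cluster."""
    clusters = {}
    for word, label in zip(words, dbscan_words(words, eps, min_pts)):
        key = label if label != -1 else ("noise", word)
        clusters.setdefault(key, []).append(word)
    return list(clusters.values())


def candidates(query, clusters, top_k=3):
    """Compare the query with one representative per cluster (assumed here
    to be the first member), then rank only the nearest cluster's words."""
    best = min(clusters, key=lambda c: levenshtein(query, c[0]))
    return sorted(best, key=lambda w: levenshtein(query, w))[:top_k]


words = ["cat", "bat", "hat", "house", "mouse", "louse", "zebra"]
clusters = build_index(words)
print(candidates("mousr", clusters))  # ['mouse', 'house', 'louse']
```

With this layout a query costs one distance calculation per cluster plus one per word in the chosen cluster, instead of one per word in the whole vocabulary. The trade-off is that the lookup is heuristic: DBSCAN clusters can be elongated chains, so the closest word is not guaranteed to sit in the cluster with the closest representative.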

Requirements

Use the package manager pip to install the requirements.

  1. pip install numpy

Usage

  1. jupyter notebook

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT