Project author: anuragithub

Project description:

  A simple methodology to optimize the candidate model by searching through an optimized clustered graph based on Levenshtein distance.

Primary language: Jupyter Notebook

Project homepage:

Project repository: git://github.com/anuragithub/OLDC.git

Created: 2019-09-19T21:30:22Z
Project community: https://github.com/anuragithub/OLDC
License:

Optimized Levenshtein Distance search using Clustering

This is an attempt to speed up the word search for the candidate model when building auto-correct applications. The idea is to minimize the number of Levenshtein distance (LD) calculations, since each one is computationally expensive. A simple implementation of DBSCAN clustering is used to group words into clusters, which are then queried.
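The idea can be illustrated with a minimal sketch. It is not the code from the notebook: the function names (levenshtein, dbscan_words, candidates), the parameters (eps, min_pts, max_dist), and the choice of the first cluster member as a representative are all assumptions made for illustration. It uses only the Python standard library. The clustering is done once over the vocabulary; at query time the query is compared against one representative per cluster, and exact distances are computed only inside the closest cluster.

```python
from collections import deque


def levenshtein(a, b):
    """Edit distance between a and b using the classic two-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def dbscan_words(words, eps=1, min_pts=2):
    """Minimal DBSCAN over words with LD as the metric (hypothetical helper).

    Returns one label per word; -1 marks noise.
    """
    n = len(words)
    # Neighbourhoods are computed once, offline; each word counts itself.
    neighbours = [[j for j in range(n) if levenshtein(words[i], words[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels


def candidates(query, words, labels, max_dist=2):
    """Return sorted (distance, word) pairs from the cluster nearest to query."""
    clusters = {}
    for word, label in zip(words, labels):
        if label != -1:
            clusters.setdefault(label, []).append(word)
    if not clusters:
        pool = words  # no clusters were formed: fall back to a full scan
    else:
        # Assumption: the first member of each cluster acts as its representative.
        best = min(clusters, key=lambda c: levenshtein(query, clusters[c][0]))
        pool = clusters[best]
    scored = ((levenshtein(query, w), w) for w in pool)
    return sorted((d, w) for d, w in scored if d <= max_dist)


words = ["hello", "hallo", "hullo", "help", "world", "word", "wood", "cat"]
labels = dbscan_words(words, eps=1, min_pts=2)
result = candidates("wrld", words, labels)
print(labels)
print(result)
```

With this sketch a query costs one LD computation per cluster plus one per word in the selected cluster, instead of one per word in the vocabulary. The trade-off is that the search is approximate: a close word that landed in another cluster, or that was marked as noise, will be missed.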

Requirements

Use the package manager pip to install the requirements.

pip install numpy

Usage

jupyter notebook

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT


