Project author: cvhariharan

Project description:
K-Gram spell corrector
Language: Python
Repository: git://github.com/cvhariharan/K-Gram-Spell-Corrector.git
Created: 2019-03-13T06:18:43Z
Project community: https://github.com/cvhariharan/K-Gram-Spell-Corrector



K-Gram Spell Corrector

This is a simple k-gram spell corrector with basic indexing. The index is used only to retrieve candidate words that share the same initial bigram as the query; similarity between the query and each candidate is then calculated using the Jaccard coefficient.
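To make the similarity measure concrete, here is a minimal sketch of k-gram extraction and the Jaccard coefficient. The function names `kgrams` and `jaccard` are hypothetical and are not taken from this project's code; the sketch also assumes plain, unpadded k-grams, which may differ from what the project does.

```python
def kgrams(word, k=2):
    """Return the set of contiguous k-character substrings of `word`.

    Assumption: no boundary padding is applied. Words shorter than k
    yield an empty set.
    """
    word = word.lower()
    return {word[i:i + k] for i in range(len(word) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# "night" -> {ni, ig, gh, ht}; "nacht" -> {na, ac, ch, ht}
# They share one bigram ("ht") out of seven distinct ones.
print(jaccard(kgrams("night"), kgrams("nacht")))  # 1/7, about 0.143
```

A coefficient of 1.0 means the two words have identical k-gram sets, and 0.0 means they share none.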

Create Index

    # wordsList is a list of correctly spelled words; the second argument is the value of k for the k-grams
    c = Correct(wordsList, 2)
    c.createIndex()  # creates an index.json file in the working directory and loads it
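The layout of index.json is not described here, so the following is only a sketch of one plausible way to group words by their initial bigram and round-trip the result through JSON. The helper `build_index` and the dictionary layout are assumptions for illustration, not the project's actual format.

```python
import json
from collections import defaultdict


def build_index(words, k=2):
    """Group words by their first k characters (hypothetical index layout)."""
    index = defaultdict(list)
    for word in words:
        word = word.lower()
        if len(word) >= k:  # words shorter than k have no initial k-gram
            index[word[:k]].append(word)
    return dict(index)


words = ["paleontology", "palace", "parent", "physics"]
index = build_index(words)

# Serialize to a JSON string (this is what could be written to a file)
# and parse it back to confirm the structure survives the round trip.
serialized = json.dumps(index, indent=2)
restored = json.loads(serialized)
print(restored["pa"])  # ['paleontology', 'palace', 'parent']
```

With this layout, looking up a misspelled word only touches the bucket for its first two letters instead of the whole word list.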

Load Index

    c.loadIndex()  # looks for index.json in the working directory and loads it; not needed if createIndex() was just called

Get Word Suggestions

    res = c.suggest("palontolody")  # returns the candidate words sorted in ascending order of Jaccard coefficient
    print(res[-1])  # the last word has the largest Jaccard coefficient, i.e. the best match
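The ranking step described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the project's `suggest` method: the standalone `suggest` function, its helpers, and the small hand-written `index` dictionary are all hypothetical.

```python
def kgrams(word, k=2):
    """Set of contiguous k-character substrings (no padding, an assumption)."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest(query, index, k=2):
    """Rank the words sharing the query's initial k-gram, worst match first."""
    candidates = index.get(query[:k], [])
    query_grams = kgrams(query, k)
    return sorted(candidates, key=lambda w: jaccard(query_grams, kgrams(w, k)))


# Hypothetical index: initial bigram -> correctly spelled words.
index = {"pa": ["palace", "parent", "paleontology"]}

ranked = suggest("palontolody", index)
print(ranked[-1])  # paleontology
```

Because candidates come only from the bucket for the query's first two letters, a typo in those letters would return no useful suggestions in this sketch.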