Project author: Abhinandan11

Project description:
This is an offline Wikipedia search engine that uses TF-IDF scoring to retrieve the top results from a given Wikipedia dump.
Language: Python
Project URL: git://github.com/Abhinandan11/wiki-search-engine.git
Created: 2018-08-04T06:27:14Z
Project community: https://github.com/Abhinandan11/wiki-search-engine

License:



wiki-search-engine

  1. Install PyStemmer

    https://github.com/snowballstem/pystemmer

  2. Create an inverted index of a dump

    python wiki_indexer.py <wiki_dump_file_name> <output_file_name>

  3. Search the indexed dump

    python query.py
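
For readers unfamiliar with the approach, the following is a minimal sketch of how an inverted index and TF-IDF ranking can fit together. It is not the code in wiki_indexer.py or query.py: the function names (tokenize, build_index, search), the in-memory dictionary layout, and the log-scaled weighting are all assumptions chosen for illustration, and stemming (which the project does with PyStemmer) is left out to keep the example dependency-free.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (no stemming here)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency in that document}.

    `docs` is a hypothetical in-memory mapping of doc_id -> text; a real
    Wikipedia dump would be parsed from XML and indexed in chunks on disk.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query, k=10):
    """Return up to k (doc_id, score) pairs ranked by summed TF-IDF over query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rare terms get a higher inverse document frequency.
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            # Log-scaled term frequency, one common TF-IDF variant.
            scores[doc_id] += (1 + math.log(tf)) * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    docs = {
        "a": "python search engine",
        "b": "python snake",
        "c": "search engine index engine",
    }
    index = build_index(docs)
    for doc_id, score in search(index, len(docs), "engine"):
        print(doc_id, round(score, 3))
```

In this sketch a term that appears in every document gets an IDF of zero and so contributes nothing to the ranking; other TF-IDF variants smooth the IDF to avoid that.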