Project author: yfgao0502

Project description:
Information Retrieval Based on NLP
Language: Python
Project URL: git://github.com/yfgao0502/Information-Retrieval.git
Created: 2017-10-18T05:31:16Z
Project community: https://github.com/yfgao0502/Information-Retrieval

Information-Retrieval

How to train the model and plot the dots:
Download the code to a local path and put the training data folder (crawled from the internet and full of HTML files) into that path. Note that every file in the folder must end with .html. Then run the following command in your terminal: python main_program.py. After training, you will be prompted to type a word, and a picture showing the 10 most similar words will be displayed.
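
Before running main_program.py, it can help to check the training data folder against the .html requirement and to preview what plain text an HTML file yields. The snippet below is a minimal sketch using only the Python standard library. It is not taken from this project: the names `html_to_text` and `non_html_files` are hypothetical, and the project's own conversion step may work differently.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def non_html_files(folder):
    """Return the names of files in `folder` that do not end with .html."""
    return sorted(
        p.name for p in Path(folder).iterdir()
        if p.is_file() and p.suffix != ".html"
    )
```

If `non_html_files("your_train_data_folder")` returns an empty list, the folder satisfies the naming requirement. The folder name here is a placeholder for wherever you put your data.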

Where can I find the files?
Note that all the files are saved under the path /trans_files. The folder ori_files holds the .txt files converted from your .html files. The folder train_files holds the .txt files used to train the model. The folder models holds the model files. The pictures folder holds the pictures saved during one run.
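
To confirm that a run produced output in each of these folders, a small helper can count the files in each one. This is a hedged sketch, not part of the project: the function name `count_outputs` is hypothetical, the folder name `pictures` is assumed from the description above, and the base path is assumed to be a `trans_files` folder relative to where you run the program.

```python
from pathlib import Path

# Folder names as described above; "pictures" is an assumed directory name.
SUBFOLDERS = ("ori_files", "train_files", "models", "pictures")


def count_outputs(base="trans_files"):
    """Return a dict mapping each expected subfolder to its file count.

    A missing subfolder is reported with a count of 0.
    """
    base = Path(base)
    counts = {}
    for name in SUBFOLDERS:
        folder = base / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[name] = 0
    return counts


if __name__ == "__main__":
    for name, n in count_outputs().items():
        print(f"{name}: {n} file(s)")
```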

Thank you for browsing. If you have problems running the code, or you lack training data, please feel free to contact me at yfgao0502@zju.edu.cn, and I will help you with the specifics.