Project author: ushashwat

Project description:
Semantic search using NLP on extracted text
Primary language: Jupyter Notebook
Project URL: git://github.com/ushashwat/Semantic-Search-Engine.git
Created: 2020-08-06T07:39:45Z
Project community: https://github.com/ushashwat/Semantic-Search-Engine

License:


Semantic Search Engine

Semantic search using NLP on extracted text

Data Extraction

The text corpus can be extracted from any website that allows web scraping. The BeautifulSoup library is used to parse the components of an HTML webpage and extract the text from its body.
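
The extraction step described above might look roughly like the following. This is a minimal sketch, not the notebook's actual code: the function names `extract_body_text` and `fetch_page_text`, the choice of the built-in `html.parser` backend, the `User-Agent` header, and the decision to drop `script`/`style`/`noscript` elements are all assumptions made for illustration.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def extract_body_text(html: str) -> str:
    """Return the readable text found in the <body> of an HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove elements whose contents are not readable prose (assumed choice).
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    # Fall back to the whole document if there is no <body> element.
    body = soup.body or soup
    # Strip surrounding whitespace from each text fragment and join with spaces.
    return " ".join(body.stripped_strings)


def fetch_page_text(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its body text (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_body_text(html)
```

Calling `fetch_page_text` with the URL of a page that permits scraping should return one string of body text, which can then serve as the corpus for the search step. Keeping the parsing in `extract_body_text` separate from the download makes it possible to test the parsing on saved HTML without network access.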

Acknowledgement

AllenNLP

Limitations

  • The semantic meaning of open-ended questions is not fully captured
  • Web scraping is not always reliable, and webpages with popups or redirects can cause issues