Project author: domantasm96

Project description:
Autoplius scraper + VIN number image decoding and reading + implemented captcha solver
Primary language: Jupyter Notebook
Repository URL: git://github.com/domantasm96/autoplius.git
Created: 2020-01-23T21:26:01Z
Project page: https://github.com/domantasm96/autoplius

License:



Autoplius ads scraper

Implemented tasks:

  • [X] URL gathering from the Autoplius sitemap
  • [X] Defined data fields
  • [X] Free proxy usage feature (disabled by default)
  • [X] Implemented VIN code decoder and image reader using OCR tools
  • [X] Implemented automatic Captcha solver (there is no need to use proxies at this point)
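
As an illustration of the sitemap step, here is a minimal sketch of pulling URLs out of a sitemap document. It assumes the sitemap follows the standard sitemaps.org XML schema; the function name `extract_urls` and the sample data are hypothetical and not taken from the project's notebooks.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed to apply here).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample; a real sitemap would be downloaded first.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/ads/1</loc></url>
  <url><loc>https://example.com/ads/2</loc></url>
</urlset>"""

print(extract_urls(SAMPLE))
```

The same function also works on a sitemap index, since the nested sitemap addresses are stored in `<loc>` elements as well.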

Tasks to do:

  • [] Create a virtual environment for easier installation of the required packages
  • [] Merge new ad links with the list of scraped URLs
  • [] Improve the automatic Captcha solver by not hardcoding crop values (if the Selenium window has a different size, the captcha is not cropped properly)
  • [] Make it faster by implementing asynchronous requests
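
One possible direction for the crop-values task is to derive the crop box from the captcha element's own position and size rather than from fixed numbers. The sketch below is an assumption about how that could look, not the project's code: `crop_box` is a hypothetical helper, and `scale` stands for the ratio between screenshot pixels and page pixels (1.0 on a standard display).

```python
def crop_box(location: dict, size: dict, scale: float = 1.0) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in screenshot pixels for an element.

    `location` holds the element's "x" and "y"; `size` holds its
    "width" and "height", all in page pixels.
    """
    left = location["x"] * scale
    top = location["y"] * scale
    right = left + size["width"] * scale
    bottom = top + size["height"] * scale
    return (round(left), round(top), round(right), round(bottom))


# Hypothetical element geometry, for illustration only.
print(crop_box({"x": 10, "y": 20}, {"width": 100, "height": 40}))
```

With Selenium, a `WebElement` exposes `location` and `size` as dictionaries of this shape, and the resulting tuple can be passed to Pillow's `Image.crop`. This assumes the page is not scrolled when the screenshot is taken, because `location` is measured from the top-left of the page. Selenium's `WebElement.screenshot_as_png` is another option that avoids manual cropping.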

If you have any recommendations or code improvements, feel free to contact me.

Please note that this is a self-learning side project. Treat https://en.autoplius.lt/ data with respect and use this code at your own risk.