Project author: egorfolley

Project description:
Python script for extracting images from docx files
Primary language: Jupyter Notebook
Repository URL: git://github.com/egorfolley/Png_text_from_Docx.git
Created: 2019-09-22T16:55:21Z
Project community: https://github.com/egorfolley/Png_text_from_Docx

License:


- Add the PDF file to the Data folder and update its name in the Python source file

- “Images_from_PDF” - folder where the pages initially extracted from the PDF are saved

- “Images” - folder where the final PNG images are stored

The algorithm searches for black pixels and crops each image from the first such point to the last.
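
The cropping idea can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes a page is already available as a list of rows of grayscale values (0 = black, 255 = white), and the names `dark_bounding_box`, `crop`, and the `threshold` parameter are hypothetical.

```python
def dark_bounding_box(pixels, threshold=50):
    """Return (left, top, right, bottom) enclosing every pixel darker than threshold.

    pixels is a list of rows of grayscale values (0 = black, 255 = white).
    right and bottom are exclusive. Returns None if no dark pixel is found.
    """
    left = top = right = bottom = None
    for y, row in enumerate(pixels):
        for x, value in enumerate(row):
            if value < threshold:
                if top is None:
                    top = y  # first dark pixel in scan order fixes the top edge
                bottom = y + 1  # last dark row seen so far
                left = x if left is None else min(left, x)
                right = x + 1 if right is None else max(right, x + 1)
    if top is None:
        return None
    return (left, top, right, bottom)


def crop(pixels, box):
    """Cut the rectangle described by box out of the pixel grid."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# Tiny made-up "page": white background with three dark pixels.
page = [
    [255, 255, 255, 255, 255],
    [255,   0,  10, 255, 255],
    [255, 255,   0, 255, 255],
    [255, 255, 255, 255, 255],
]
box = dark_bounding_box(page)
cropped = crop(page, box)
```

The threshold of 50 is an arbitrary choice for "black enough"; scanned pages with grey or noisy backgrounds would likely need a different value.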