Project author: Peder2911

Project description:
Text-mining & classification framework
Language: Python
Repository: git://github.com/Peder2911/Diverse_Folio_Isle.git
Created: 2018-07-24T10:48:15Z
Project community: https://github.com/Peder2911/Diverse_Folio_Isle

License: GNU General Public License v3.0



Diverse Folio Isle

This branch is the start of a near-total refactoring of the DFI framework, meant to make modules easier to write and to facilitate the use of Redis for IPC (and also to fix some weirdness left over from coding in the summer heat).

Stay tuned! :)

Requirements

You need to install the dfitools package to use this application.

Description

Diverse Folio Isle (DFI) is a framework for text mining and classification.

A manual is available on Read the Docs.

Usage

The program has a simple CLI that is accessed through DiverseFolioIsle.py. This script directs a three-step process in which the user specifies a sourcing script, a preprocessing script and a classification script. These scripts are as orthogonally modular as possible; some are even usable as standalone applications (like the UFT PDF scraper).

Modularity is meant to facilitate scientific comparison of text-mining techniques. For more on this, see my thesis.
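
To illustrate the idea of three interchangeable steps, here is a minimal sketch in plain Python. It does not use the DFI or dfitools API. Every name in it (run_pipeline, source_from_memory, tokenize_lowercase, classify_by_keyword, classify_by_length) is hypothetical, and the real scripts may exchange data quite differently, for example through files or Redis. The sketch only shows how keeping the steps independent lets one step be swapped while the others stay fixed.

```python
from typing import Callable, Iterable, List

# Hypothetical stage signatures, chosen for this sketch only.
Source = Callable[[], Iterable[str]]
Preprocess = Callable[[Iterable[str]], List[List[str]]]
Classify = Callable[[List[List[str]]], List[str]]


def source_from_memory() -> Iterable[str]:
    """Stand-in sourcing step: returns raw documents."""
    return ["Peace talks resumed today.", "Fighting broke out near the border."]


def tokenize_lowercase(docs: Iterable[str]) -> List[List[str]]:
    """Stand-in preprocessing step: lowercase, drop periods, split on whitespace."""
    return [doc.lower().replace(".", "").split() for doc in docs]


def classify_by_keyword(tokens: List[List[str]]) -> List[str]:
    """Toy classifier: flags documents that contain a conflict-related word."""
    conflict_words = {"fighting", "attack"}
    return ["conflict" if conflict_words & set(doc) else "other" for doc in tokens]


def classify_by_length(tokens: List[List[str]]) -> List[str]:
    """Alternative toy classifier: labels documents by token count."""
    return ["long" if len(doc) > 4 else "short" for doc in tokens]


def run_pipeline(source: Source, preprocess: Preprocess, classify: Classify) -> List[str]:
    """Chain the three steps; each step only sees the previous step's output."""
    return classify(preprocess(source()))


# Same sourcing and preprocessing, two different classification steps.
labels = run_pipeline(source_from_memory, tokenize_lowercase, classify_by_keyword)
lengths = run_pipeline(source_from_memory, tokenize_lowercase, classify_by_length)
print(labels)
print(lengths)
```

Because only the classification step differs between the two runs, any difference in output can be attributed to that step alone, which is the kind of comparison the modular design is meant to support.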