Project author: twobob

Project description:
Elasticlunr implementation to wrap Clarifai output
Primary language: JavaScript
Project URL: git://github.com/twobob/Ki.git
Created: 2017-05-03T15:47:40Z
Project community: https://github.com/twobob/Ki


Ki

SCREENSHOT

Create a searchable image tag website using Python and Elasticlunr.js.

This project uses a local BLIP-2 captioning model to automatically tag your images, generates thumbnails, and provides a web interface to search them.

Requirements

  • Python 3.8 or newer
  • Pillow for image processing
  • Transformers and PyTorch for the BLIP‑2 model
  • spaCy with the en_core_web_sm language model
  • scikit-image for JPEG recompression metrics
  • tqdm for progress bars
  • jpeglib (optional) for JPEG compression with -J/--jpegli

Thumbnails are generated at 256×256 pixels by default, so ensure you have enough disk space for the resized copies.

How to Use

  1. Install Dependencies:
    Ensure you have Python installed. Then, install the necessary libraries. While specific versions may vary, you’ll typically need:

    pip install Pillow scikit-image transformers torch torchvision torchaudio spacy tqdm jpeglib
    python -m spacy download en_core_web_sm

    (Note: torch installation can vary based on your system and CUDA availability. Refer to the official PyTorch website for specific instructions if needed.)

  2. Process Your Images:
    Navigate to the repository directory and run the main pipeline script, passing the path to your image folder (typically %USERPROFILE%\Pictures on Windows or ~/Pictures on Linux/macOS):

    python run_pipeline.py [PATH_TO_YOUR_IMAGES] [-I PATH_TO_YOUR_IMAGES] [-O OUTPUT_DIR] [-R | --recurse] [-C | --clear] [-Z | --compress] [-J | --jpegli] [-A | --add] [-D | --delete] [-V | --verbose] [-S [PORT]]

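    Windows users: Avoid quoting a path that ends with a single backslash (for example "C:\Photos\"), because the backslash escapes the closing quote and the flags that follow are absorbed into the path. Either remove the trailing backslash or escape it as \\ so additional flags are parsed correctly.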
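
The option syntax above could be declared with argparse roughly as follows. This is a minimal sketch, not the actual contents of run_pipeline.py: the flags come from the usage line, while the function name build_parser, the destination names, the help strings, and the use of 0 as a "no port given" sentinel are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented run_pipeline.py flags."""
    parser = argparse.ArgumentParser(prog="run_pipeline.py")
    # The image folder may be given positionally or via -I/--input.
    parser.add_argument("path", nargs="?", help="folder containing images")
    parser.add_argument("-I", "--input", help="folder containing images")
    parser.add_argument("-O", dest="output", help="thumbnail output directory")
    parser.add_argument("-R", "--recurse", action="store_true")
    parser.add_argument("-C", "--clear", action="store_true")
    parser.add_argument("-A", "--add", action="store_true")
    parser.add_argument("-D", "--delete", action="store_true")
    parser.add_argument("-V", "--verbose", action="store_true")
    # -Z and -J are documented as mutually exclusive.
    jpeg = parser.add_mutually_exclusive_group()
    jpeg.add_argument("-Z", "--compress", action="store_true")
    jpeg.add_argument("-J", "--jpegli", action="store_true")
    # -S takes an optional port. None means "flag absent"; the assumed
    # sentinel 0 means "flag present, no port given".
    parser.add_argument("-S", dest="serve", nargs="?", type=int, const=0, default=None)
    return parser
```

With this layout, `build_parser().parse_args(["photos", "-R", "-S"])` sets `serve` to the sentinel 0, which a caller could map to the default port of serve.py, while passing `-Z` and `-J` together makes argparse exit with a usage error.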

    This script will:

    • Scan the PATH_TO_YOUR_IMAGES directory (positional or via -I/--input) for JPG, JPEG, and PNG files. Use -R/--recurse to include subfolders.
    • Generate descriptive tags for each image using a local BLIP-2 model.
    • Create 256×256 thumbnails for each image and store them in the output directory (default img/thumbs/). An optional watermark from img/overlay/watermark.png may be applied if make_thumbs.py (called by the pipeline) is configured for it. Thumbnail file names include a short hash of the original path, so files that share a name across folders or differ only by extension do not overwrite each other.
    • Optionally clear the contents of the output folder first when using -C/--clear.
    • Enable additional JPEG compression with -Z/--compress or use the jpeglib library with -J/--jpegli. These options are mutually exclusive.
    • Compile all tag information into data.json, which is used by the search interface.
    • Show per-image progress bars so you know exactly how many files remain.
    • Print per-image details instead of progress bars when run with -V/--verbose.
    • Append new images without rebuilding existing entries when run with -A/--add, or remove records and thumbnails for images in the folder when run with -D/--delete.
    • Launch the local server automatically after processing when run with -S [PORT]. Omit PORT to use serve.py’s default.
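
Two of the steps listed above, scanning for image files and deriving collision-resistant thumbnail names, can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the project's implementation: the function names find_images and thumb_name, the choice of SHA-1, the 8-character digest, and the .jpg thumbnail extension are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(root: Path, recurse: bool = False) -> List[Path]:
    """Return JPG, JPEG, and PNG files under root, optionally including subfolders."""
    candidates = root.rglob("*") if recurse else root.iterdir()
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def thumb_name(original: Path, digest_len: int = 8) -> str:
    """Build a thumbnail name that embeds a short hash of the original path.

    Hashing the full path (folder and extension included) keeps a.jpg in two
    folders, or a.jpg next to a.png, from mapping to the same thumbnail.
    The hash function, digest length, and .jpg suffix are assumptions.
    """
    digest = hashlib.sha1(original.as_posix().encode("utf-8")).hexdigest()
    return f"{original.stem}_{digest[:digest_len]}.jpg"
```

A short digest makes collisions unlikely rather than impossible, so a longer digest_len is the safer choice for very large collections.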
  3. Run the Web Server:
    If you didn’t use -S during the pipeline step, start the local web server manually:

    python serve.py

    (On Linux/macOS, you might need to use python3 serve.py)

    Then, open your web browser and go to http://localhost:8000 (or the port specified by serve.py) to view and search your images.
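
serve.py is described here only as a simple Python HTTP server, and its code is not shown. For orientation, a static file server of that kind can be written with the standard library as sketched below. The helper name make_server, the binding to 127.0.0.1, and the default port of 8000 are assumptions for illustration and may differ from what serve.py actually does.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str = ".", port: int = 8000) -> ThreadingHTTPServer:
    """Create a static file server for `directory`, reachable from this machine only."""
    # SimpleHTTPRequestHandler serves files relative to `directory`,
    # so index.html, app.js, data.json, and img/thumbs/ are exposed as-is.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server().serve_forever()` from the repository directory would then serve the site until interrupted with Ctrl+C.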

Project Structure Highlights

  • index.html: The main page for the image search.
  • app.js: Handles the client-side logic, including Elasticlunr.js setup and search functionality.
  • data.json: Contains the image tags and metadata for the search index (generated by run_pipeline.py).
  • img/thumbs/: Default directory where thumbnails are stored.
  • run_pipeline.py: The main script to process your images (tagging and thumbnail generation).
  • make_thumbs.py: Script for generating thumbnails, typically called by run_pipeline.py.
  • serve.py: A simple Python HTTP server to run the website locally.

TODO/MAYBES:

  • Make the partial rendering loop stop when you click a result before it is finished.
  • Add transactional folders (e.g., IN, PROCESSED, ERROR) for more efficient content addition.
  • Check EXIF/File attributes for “Time Created” to compare against a “last sync” date for incremental updates.
  • Add a script to search network drives for images.
  • Conduct thorough testing, including corner cases.
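
As a starting point for the incremental-update idea above, the following sketch filters files by timestamp against a stored "last sync" time. It is not part of the project: the function name changed_since is hypothetical, and it uses the file system's modification time rather than the EXIF or "Time Created" attribute mentioned in the TODO, because modification time is available from the standard library on every platform.

```python
from pathlib import Path
from typing import Iterable, List


def changed_since(paths: Iterable[Path], last_sync: float) -> List[Path]:
    """Return the paths modified after last_sync.

    last_sync is a POSIX timestamp, for example the value of time.time()
    saved at the end of the previous pipeline run (an assumed convention).
    """
    return [p for p in paths if p.stat().st_mtime > last_sync]
```

Only the returned files would then need to be tagged and thumbnailed again, for example in combination with -A/--add.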