Elasticlunr implementation to wrap Clarifai output
Create a searchable image tag website using Python and Elasticlunr.js.
This project uses a local BLIP-2 captioning model to automatically tag your images, generates thumbnails, and provides a web interface to search them.
- en_core_web_sm language model (spaCy)
- tqdm for progress bars
- The -J/--jpegli option uses the jpeglib library
Thumbnails are generated at 256×256 pixels by default, so ensure you have enough disk space for the resized copies.
Install Dependencies:
Ensure you have Python installed. Then, install the necessary libraries. While specific versions may vary, you’ll typically need:
pip install Pillow scikit-image transformers torch torchvision torchaudio spacy tqdm jpeglib
python -m spacy download en_core_web_sm
(Note: torch installation can vary based on your system and CUDA availability. Refer to the official PyTorch website for specific instructions if needed.)
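To confirm that the install step worked before processing a large folder, a small check along these lines may help. This is a minimal sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and the import names assume the usual mapping (Pillow imports as `PIL`, scikit-image as `skimage`).

```python
import importlib.util

# Import names for the packages installed above. Pillow is imported as
# "PIL" and scikit-image as "skimage"; the others match their pip names.
REQUIRED = ["PIL", "skimage", "transformers", "torch", "spacy", "tqdm", "jpeglib"]


def missing_modules(names):
    """Return the names that cannot be found on the import path.

    find_spec only locates a top-level module; it does not import it,
    so heavy packages such as torch are not loaded by this check.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only verifies that each package can be located; it does not confirm that the en_core_web_sm model was downloaded or that torch can see a GPU.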
Process Your Images:
Navigate to the repository directory and run the main pipeline script, providing the path to your image folder (typically %USERPROFILE%\Pictures on Windows or ~/Pictures on Linux/macOS):
python run_pipeline.py [PATH_TO_YOUR_IMAGES] [-I PATH_TO_YOUR_IMAGES] [-O OUTPUT_DIR] [-R | --recurse] [-C | --clear] [-Z | --compress] [-J | --jpegli] [-A | --add] [-D | --delete] [-V | --verbose] [-S [PORT]]
Windows users: Avoid quoting a path that ends with a single backslash, because the backslash escapes the closing quote. Either remove the trailing backslash or escape it as \\ so additional flags are parsed correctly.
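The usage line above can be modelled with argparse. The following is a sketch under stated assumptions, not the actual contents of run_pipeline.py: the helper names `build_parser` and `resolve_input` and the `USE_DEFAULT_PORT` sentinel are hypothetical, and -O and -S are declared with short forms only because the usage line gives no long names for them.

```python
import argparse

# Hypothetical sentinel: "-S was given without a PORT, so defer to
# serve.py's default". It is not a string, so argparse does not run it
# through type=int.
USE_DEFAULT_PORT = object()


def build_parser():
    """Build a parser that mirrors the documented usage line."""
    parser = argparse.ArgumentParser(prog="run_pipeline.py")
    parser.add_argument("path", nargs="?", help="image folder (positional form)")
    parser.add_argument("-I", "--input", help="image folder (flag form)")
    parser.add_argument("-O", dest="output", metavar="OUTPUT_DIR")
    parser.add_argument("-R", "--recurse", action="store_true")
    parser.add_argument("-C", "--clear", action="store_true")
    parser.add_argument("-A", "--add", action="store_true")
    parser.add_argument("-D", "--delete", action="store_true")
    parser.add_argument("-V", "--verbose", action="store_true")

    # -Z and -J are documented as mutually exclusive.
    codec = parser.add_mutually_exclusive_group()
    codec.add_argument("-Z", "--compress", action="store_true")
    codec.add_argument("-J", "--jpegli", action="store_true")

    # -S takes an optional PORT: None means "do not serve".
    parser.add_argument("-S", dest="serve", nargs="?", type=int,
                        const=USE_DEFAULT_PORT, default=None, metavar="PORT")
    return parser


def resolve_input(args):
    """Prefer -I/--input, falling back to the positional path."""
    return args.input or args.path
```

For example, `build_parser().parse_args(["photos", "-R", "-S"])` sets `recurse` to true and `serve` to the sentinel. Note that in this sketch a folder name placed directly after a bare -S would be read as the port, so -S is best given last or followed by another flag.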
This script will:
- Scan the PATH_TO_YOUR_IMAGES directory (positional or via -I/--input) for JPG, JPEG, and PNG files. Use -R/--recurse to include subfolders.
- Generate thumbnails (stored in img/thumbs/). An optional watermark from img/overlay/watermark.png may be applied if make_thumbs.py (called by the pipeline) is configured for it. Thumbnail file names include a short hash of the original path, so duplicates across folders or extensions do not collide.
- Clear previously generated output when run with -C/--clear.
- Apply compression with -Z/--compress, or use the jpeglib library with -J/--jpegli. These options are mutually exclusive.
- Generate data.json, which is used by the search interface.
- Print per-image details instead of progress bars when run with -V/--verbose.
- Append new images without rebuilding existing entries when run with -A/--add, or remove records and thumbnails for images in the folder when run with -D/--delete.
- Launch the local server automatically after processing when run with -S [PORT]. Omit PORT to use serve.py’s default.
Run the Web Server:
If you didn’t use -S during the pipeline step, start the local web server manually:
python serve.py
(On Linux/macOS, you might need to use python3 serve.py)
Then, open your web browser and go to http://localhost:8000 (or the port specified by serve.py) to view and search your images.
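For readers curious what a local server of this kind involves, the standard library is enough to serve a folder of static files. The following is a sketch only, assuming serve.py does something similar; the function name `make_server` is hypothetical and the real script may differ.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory=".", port=8000):
    """Create (but do not start) a static file server for `directory`.

    The server binds to the loopback address only, so the site is
    reachable from this machine and not from the network.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Example (blocks until interrupted with Ctrl+C):
#     make_server(".", 8000).serve_forever()
```

Serving over HTTP rather than opening index.html directly matters because browsers commonly block scripts from fetching local files such as data.json over file:// URLs.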
- index.html: The main page for the image search.
- app.js: Handles the client-side logic, including Elasticlunr.js setup and search functionality.
- data.json: Contains the image tags and metadata for the search index (generated by run_pipeline.py).
- img/thumbs/: Default directory where thumbnails are stored.
- run_pipeline.py: The main script to process your images (tagging and thumbnail generation).
- make_thumbs.py: Script for generating thumbnails, typically called by run_pipeline.py.
- serve.py: A simple Python HTTP server to run the website locally.
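
As a rough illustration of two things run_pipeline.py and make_thumbs.py are described as doing, namely finding JPG, JPEG, and PNG files and naming thumbnails with a short hash of the original path, here is a minimal sketch. Everything specific in it is an assumption: the function names, case-insensitive extension matching, SHA-256 as the hash, the 8-character length, the underscore separator, and the .jpg output extension are all hypothetical choices, not the repository's actual scheme.

```python
import hashlib
from pathlib import Path

# Extensions the pipeline is documented to scan for.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(folder, recurse=False):
    """List image files in `folder`, descending into subfolders if asked."""
    folder = Path(folder)
    candidates = folder.rglob("*") if recurse else folder.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def thumb_name(original, hash_len=8):
    """Build a thumbnail file name that embeds a short hash of the path.

    Hashing the full path (folder and extension included) keeps
    a/cat.jpg, b/cat.jpg, and a/cat.png from mapping to the same name.
    """
    original = Path(original)
    digest = hashlib.sha256(str(original).encode("utf-8")).hexdigest()
    return f"{original.stem}_{digest[:hash_len]}.jpg"
```

With this scheme, `thumb_name("a/cat.jpg")` and `thumb_name("b/cat.jpg")` both start with `cat_` but end in different hash fragments. A short hash makes collisions very unlikely rather than impossible; a longer `hash_len` reduces the odds further.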