
🧿 Pupyl is a really fast image search library with which you can index your own (millions of) images and find similar images in milliseconds.


policratus/pupyl


pupyl - A Python Image Search Library



🧿 pupyl what?

The pupyl project (pronounced pyoo·piel) is a Pythonic library for image search tasks (even over animated GIFs). It is intended to make it easy to read, index, retrieve and maintain a complete reverse image search engine. You can use it in your own data pipelines, web projects and wherever you see fit!

🎉 Getting started

📦 Installation

Installing pupyl in your environment is straightforward:

# pypi
pip install pupyl

or

# anaconda
conda install -c policratus pupyl

For installation troubleshooting, visit troubleshooting.

🚸 Usage

You can call pupyl's objects directly from your application code. For this example, a sample database will be indexed and after that, the following image will be used as a query image (credits: @dlanor_s):

@dlanor_s

pupyl also supports animated GIFs as query images and can store and retrieve them too.

from pupyl.search import PupylImageSearch
from pupyl.web import interface

SEARCH = PupylImageSearch()

SEARCH.index(
    'https://github.com/policratus/pupyl'
    '/raw/main/samples/images.tar.xz'
)

# Using, for instance, a remote image. Local images return results considerably faster.
QUERY_IMAGE = 'https://images.unsplash.com/photo-1520763185298-1b434c919102?w=224&q=70'

[*SEARCH.search(QUERY_IMAGE)]

Note: the example above creates pupyl assets in your temporary directory. To create a persistent database, set the data_dir parameter.
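As a minimal sketch of the note above, the snippet below creates a directory outside the temporary area and hands it to pupyl. The helper names persistent_data_dir and open_search, and the default folder name pupyl_db, are hypothetical. Passing data_dir as a keyword argument to PupylImageSearch is an assumption based on the note above, so check the API reference before relying on it.

```python
from pathlib import Path


def persistent_data_dir(base=None, name='pupyl_db'):
    """Create (if needed) and return a directory outside the temp area.

    Both `base` and `name` are hypothetical defaults for illustration.
    """
    root = Path(base) if base is not None else Path.home()
    data_dir = root / name
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def open_search(data_dir):
    """Build a search object backed by `data_dir`.

    Assumption: PupylImageSearch accepts a `data_dir` keyword argument,
    as the note about the data_dir parameter suggests.
    """
    # Imported lazily so the helper above also works without pupyl installed.
    from pupyl.search import PupylImageSearch

    return PupylImageSearch(data_dir=str(data_dir))
```

With this in place, open_search(persistent_data_dir()) would keep the index in your home directory instead of the temporary directory.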

This will return:

# Here's the simplest possible result
> [486, 12, 203, 176]
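If only the best few matches are needed, the results can be truncated without materializing everything first. The sketch below works on any iterable of results. The stand_in_search generator is a hypothetical placeholder that yields the IDs shown above, so the snippet runs without pupyl or an index.

```python
from itertools import islice


def top_k(results, k):
    """Return at most `k` items from any iterable of results."""
    return list(islice(results, k))


def stand_in_search():
    """Hypothetical stand-in yielding the IDs shown above, in order."""
    yield from (486, 12, 203, 176)


print(top_k(stand_in_search(), 2))
```

In real code, the call would presumably look like top_k(SEARCH.search(QUERY_IMAGE), 2).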

With more information and returning image metadata from the results:

# The results with image metadata
[*SEARCH.search(QUERY_IMAGE, return_metadata=True)]

Now an excerpt of the (possible) return is:

[
    {
        "id": 486,
        "internal_path": "/tmp/pupyl/0/486.gif",
        "original_access_time": "2021-12-03T13:23:47",
        "original_file_name": "icegif-5690.gif",
        "original_file_size": "261K",
        "original_path": "/tmp/tmp3gdxlwr6"
    },
    {
        "id": 12,
        "internal_path": "/tmp/pupyl/0/12.gif",
        "original_access_time": "2021-12-03T13:23:46",
        "original_file_name": "roses.gif",
        "original_file_size": "1597K",
        "original_path": "/tmp/tmp3gdxlwr6"
    },
    ...
]
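The metadata entries are plain dictionaries, so they can be post-processed with ordinary Python. The sketch below uses two entries copied from the excerpt above. The helper names are hypothetical, and it assumes original_file_size always ends in "K", which is the only form the excerpt shows.

```python
SAMPLE_RESULTS = [
    {
        "id": 486,
        "internal_path": "/tmp/pupyl/0/486.gif",
        "original_file_name": "icegif-5690.gif",
        "original_file_size": "261K",
    },
    {
        "id": 12,
        "internal_path": "/tmp/pupyl/0/12.gif",
        "original_file_name": "roses.gif",
        "original_file_size": "1597K",
    },
]


def size_in_k(size_text):
    """Convert a size string such as '261K' to the integer 261."""
    if not size_text.endswith('K'):
        # Other suffixes are not shown in the excerpt, so fail loudly.
        raise ValueError(f'unexpected size format: {size_text!r}')
    return int(size_text[:-1])


def summarize(results):
    """Return (id, file name, size in K) tuples, keeping the result order."""
    return [
        (
            item['id'],
            item['original_file_name'],
            size_in_k(item['original_file_size']),
        )
        for item in results
    ]


def largest(results):
    """Return the metadata entry with the biggest original file."""
    return max(
        results, key=lambda item: size_in_k(item['original_file_size'])
    )
```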

To interact visually, use the web interface:

# Opening the web interface
interface.serve()

A glimpse of the web interface, visualizing the results shown above:

web

Alternatively, you can interact with pupyl via the command line. The same example as above, in CLI terms:

🐚 Command line interface

# Indexing images
pupyl --data_dir /path/to/your/data/dir index /path/to/images/

# Opening web interface
pupyl --data_dir /path/to/your/data/dir serve

# Searching using command line interface
pupyl --data_dir /path/to/your/data/dir search /path/to/query/image.ext

💡 Type pupyl --help to discover all the CLI's capabilities.
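The commands above can also be scripted from Python. The sketch below only uses the --data_dir option and the index, serve and search subcommands shown above. The helper names pupyl_command and run_pupyl are hypothetical and not part of pupyl.

```python
import shutil
import subprocess


def pupyl_command(data_dir, action, target=None):
    """Build the argument list for one of the CLI calls shown above."""
    if action not in ('index', 'serve', 'search'):
        raise ValueError(f'unsupported action: {action}')
    command = ['pupyl', '--data_dir', str(data_dir), action]
    if target is not None:
        # Path to the images (index) or to the query image (search).
        command.append(str(target))
    return command


def run_pupyl(data_dir, action, target=None):
    """Run the command, failing early if the pupyl executable is missing."""
    if shutil.which('pupyl') is None:
        raise FileNotFoundError('pupyl executable not found on PATH')
    return subprocess.run(
        pupyl_command(data_dir, action, target), check=True
    )
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.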

📌 Dependencies

See all dependencies here: dependencies.

πŸ“ Documentation

See the getting started guide and the API reference at https://pupyl.readthedocs.io/.

πŸ–ŠοΈ Citation

If you use pupyl in your publications or projects, please cite:

@misc{pupyl,
    author = {Nelson Forte de Souza Junior},
    title = {pupyl - A Python Image Search Library},
    howpublished = {\url{https://github.com/policratus/pupyl}},
    year = {2021}
}