Add docs for image captioning
derneuere committed Dec 2, 2023
1 parent ae7ad83 commit 4ef4d81
Showing 3 changed files with 35 additions and 2 deletions.
2 changes: 1 addition & 1 deletion docs/user-guide/exif-data.md
@@ -1,5 +1,5 @@
---
title: " Exif Data"
title: "📇 Exif Data"
excerpt: "What exif data can we read, write and filter for"
sidebar_position: 2
---
33 changes: 33 additions & 0 deletions docs/user-guide/image-captioning.md
@@ -0,0 +1,33 @@
---
title: "📝 Image Captioning"
excerpt: "What is image captioning and how do I use it?"
sidebar_position: 6
---

## What is image captioning?

The goal of automatic image captioning is to understand the content of an image and then produce a coherent, contextually relevant sentence or phrase that describes what is happening in it.

To use the feature in LibrePhotos, open an image, click the information icon, and then click the generate button below the caption section. This generates a phrase that should describe your image.

## How do I change the model?

Click on your avatar in the top right and go to `Admin Area`. There is a setting for `Captioning Model` where you can choose between the available models. After you select a model, it is downloaded and added to your data_models folder.

## What is the difference between the models?

Currently, there are three available models: "im2txt PyTorch", "im2txt ONNX", and "Blip".

### im2txt PyTorch

This model is the default choice. It delivers fast results and is the original implementation of the image captioning task. Built on the PyTorch deep learning framework, it has been a reliable option for users seeking both speed and baseline performance.
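At a high level, im2txt-style models pair an image encoder with a language decoder that emits a caption one word at a time. The following framework-free sketch shows only the greedy decoding loop; the scoring table and all names are made up for illustration and stand in for a real neural decoder conditioned on image features.

```python
# Toy greedy caption decoder: at each step, pick the highest-scoring
# next word given the words emitted so far. The SCORES table below is
# purely illustrative; a real im2txt model computes these scores with
# a neural network conditioned on the image.

SCORES = {
    (): {"a": 0.9, "the": 0.1},
    ("a",): {"dog": 0.8, "cat": 0.2},
    ("a", "dog"): {"on": 0.7, "<end>": 0.3},
    ("a", "dog", "on"): {"grass": 0.9, "<end>": 0.1},
}

def next_word_scores(prefix):
    """Stand-in for the decoder network: scores for the next word."""
    return SCORES.get(tuple(prefix), {"<end>": 1.0})

def greedy_caption(max_len=10):
    words = []
    for _ in range(max_len):
        scores = next_word_scores(words)
        word = max(scores, key=scores.get)
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words)

print(greedy_caption())  # a dog on grass
```

Real implementations refine this loop (beam search instead of greedy choice, subword tokens instead of words), but the shape of the decoding step is the same.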

### im2txt ONNX

Utilizing the Open Neural Network Exchange (ONNX) as its engine, "im2txt ONNX" is designed for more efficient inference. It is slightly faster than the original PyTorch model, and ONNX makes deployment across different platforms easier.

### Blip Base Capfilt Large

The next-generation model "Blip" excels at providing highly accurate image descriptions. However, this comes with a trade-off: it runs roughly 20 times slower than "im2txt PyTorch". This deliberate sacrifice in speed buys superior descriptive accuracy, making "Blip" the right choice when precision matters more than real-time processing.

Choose a model based on your specific requirements, balancing speed and accuracy against the trade-offs of each implementation. Consider your system's capabilities and the performance characteristics you need when selecting the most suitable model.
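The trade-off above can be summed up in a small sketch. The relative numbers are rough readings of this page (Blip about 20 times slower than "im2txt PyTorch", ONNX slightly faster), not measured benchmarks, and the helper name is hypothetical.

```python
# Rough relative characteristics as described on this page; the exact
# figures are illustrative, not benchmarks.
MODELS = {
    "im2txt PyTorch": {"relative_speed": 1.0, "high_accuracy": False},
    "im2txt ONNX":    {"relative_speed": 1.2, "high_accuracy": False},
    "Blip":           {"relative_speed": 1.0 / 20, "high_accuracy": True},
}

def pick_model(prioritize="speed"):
    """Hypothetical helper: choose a captioning model by priority."""
    if prioritize == "accuracy":
        # Blip trades roughly 20x speed for better descriptions.
        return max(MODELS, key=lambda m: MODELS[m]["high_accuracy"])
    # Fastest first: the ONNX build edges out the PyTorch original.
    return max(MODELS, key=lambda m: MODELS[m]["relative_speed"])

print(pick_model("speed"))     # im2txt ONNX
print(pick_model("accuracy"))  # Blip
```

In practice the decision is made once in the `Admin Area` rather than in code; the sketch only makes the selection criteria explicit.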
2 changes: 1 addition & 1 deletion docs/user-guide/trash.md
@@ -1,5 +1,5 @@
---
title: "Trash"
title: "🗑️ Trash"
excerpt: "An overview of how Trash works"
sidebar_position: 5
---
