This is a Python script that generates captions for images using the Salesforce BLIP Image Captioning model and the OpenAI API. It provides a user-friendly interface using the Streamlit library.
Make sure you have the following dependencies installed:
- streamlit
- transformers
- openai
- tqdm
- PIL (Python Imaging Library, installed via the Pillow package)
- torch
You can install the dependencies by running the following command:
pip install -r requirements.txt
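To confirm the installation worked, a small check like the following may help. It is only a sketch: the helper name `missing_modules` is hypothetical, and the list assumes the import names that correspond to the packages above (Pillow is imported as `PIL`).

```python
from importlib.util import find_spec

# Import names for the dependencies listed above (assumption: Pillow provides "PIL").
REQUIRED_MODULES = ["streamlit", "transformers", "openai", "tqdm", "PIL", "torch"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, rerunning the install command inside the same Python environment that runs Streamlit is usually the first thing to try.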
To use the image caption generator, follow these steps:
- Clone the repository and navigate to the project directory.
- Run the script by executing the following command: streamlit run CapGen.py
- The script will start a Streamlit server and open a web interface in your browser.
- Click on the "Upload any Image" tab to upload your images. Select one or more images (in JPG, PNG, or JPEG format) using the file uploader.
- Click the "Generate" button to generate captions for the uploaded images. The script will display the generated descriptions and captions for each image.
- The script uses the Salesforce BLIP Image Captioning model to generate descriptions for the uploaded images. It then uses the OpenAI API to generate creative captions based on the descriptions.
- The uploaded images are processed and displayed using the prediction function. This function opens each image, checks whether it is in RGB format, and appends it to a list.
- The pixel values of the images are extracted and converted to tensors. The model object generates captions from the pixel values using the generate method from the transformers library.
- The generated captions are converted to text using the tokenizer object.
- The caption_generator function takes the generated descriptions and uses the OpenAI API to generate creative captions for each description. It constructs a prompt for each description and sends it to the OpenAI API for completion.
- The script displays the generated descriptions and captions using the Streamlit interface.
Configuration
You need to provide your OpenAI API key and the model name for generating captions in the script. Modify the following lines in the code to add your credentials:
- openai.api_key = "YOUR_OPENAI_API_KEY"
- openai_model = "YOUR_OPENAI_MODEL_NAME"
- I have used openai_model = "text-davinci-002".
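With the two settings in place, the description and caption steps outlined earlier could be wired together roughly as follows. This is a sketch under stated assumptions, not the code in CapGen.py: the checkpoint name `Salesforce/blip-image-captioning-base`, the prompt wording, the token limits, and the helpers `to_rgb`, `describe_images`, `build_prompt`, `load_blip`, and `legacy_openai_complete` are all hypothetical, and the signature given to `caption_generator` here is a guess. It also assumes a version of the `openai` package older than 1.0, which is where the `openai.api_key` and `openai.Completion` style is available.

```python
def to_rgb(images):
    """Return the images, converting any that are not already in RGB mode."""
    return [img if img.mode == "RGB" else img.convert("RGB") for img in images]


def describe_images(images, processor, model, max_new_tokens=30):
    """Generate one plain description per image with a BLIP processor/model pair."""
    # The processor extracts pixel values and returns them as PyTorch tensors.
    inputs = processor(images=to_rgb(images), return_tensors="pt")
    output_ids = model.generate(
        pixel_values=inputs.pixel_values, max_new_tokens=max_new_tokens
    )
    # Decode the generated token ids back into text.
    texts = processor.batch_decode(output_ids, skip_special_tokens=True)
    return [text.strip() for text in texts]


def build_prompt(description):
    """Build the completion prompt for one description (wording is an assumption)."""
    return f"Write a short, creative caption for an image showing: {description}"


def caption_generator(descriptions, complete):
    """Turn each description into a caption using `complete`, a prompt -> text callable."""
    return [complete(build_prompt(d)).strip() for d in descriptions]


def load_blip(checkpoint="Salesforce/blip-image-captioning-base"):
    """Load a BLIP captioning checkpoint (the checkpoint name is an assumption)."""
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForConditionalGeneration.from_pretrained(checkpoint)
    return processor, model


def legacy_openai_complete(api_key, openai_model, max_tokens=60):
    """Return a prompt -> text callable backed by the pre-1.0 openai completions API."""
    import openai  # assumes openai<1.0, where openai.Completion exists

    openai.api_key = api_key

    def complete(prompt):
        response = openai.Completion.create(
            model=openai_model, prompt=prompt, max_tokens=max_tokens
        )
        return response["choices"][0]["text"]

    return complete
```

A typical call sequence would be `processor, model = load_blip()`, then `descriptions = describe_images(images, processor, model)` with `images` being a list of opened PIL images, and finally `caption_generator(descriptions, legacy_openai_complete("YOUR_OPENAI_API_KEY", "text-davinci-002"))`. Passing the completion call in as a function keeps the prompt logic independent of the OpenAI client version.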
This script was created by Aditya raj Pateriya, an Information Science engineer currently in 3rd year who is passionate about artificial intelligence and machine learning, with good skills in NLP, regression tasks, data analysis, and similar fields. If you have any questions or suggestions, feel free to reach out. Enjoy generating cool captions for your images!