This repository contains a Streamlit application that uses the Gemini Pro Vision API to describe the content of an uploaded image, guided by a prompt you enter in a text box.
- Upload an image and get a description based on a prompt.
- Customize the prompt to get specific descriptions.
- Simple and user-friendly web interface.
- Clone the repository:
      git clone https://github.com/your-username/gemini-pro-vision-streamlit.git
      cd gemini-pro-vision-streamlit
- Create a virtual environment:
      python -m venv venv
- Activate the virtual environment:
  - On Windows:
        venv\Scripts\activate
  - On macOS/Linux:
        source venv/bin/activate
- Install the dependencies:
      pip install -r requirements.txt
- Create a .env file in the root directory of the project and add your Gemini Pro Vision API key:
      GEMINI_API_KEY=your_gemini_api_key_here
- Run the Streamlit application:
      streamlit run app.py
- Open your web browser and go to http://localhost:8501 to view the application.
- Upload an image and enter a prompt to get a description based on the prompt.
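The .env file from the setup steps stores the API key as a `KEY=value` line. How app.py actually reads it is not shown in this README; a package such as python-dotenv is a common choice, but that is an assumption. As a rough, standard-library-only sketch of what such loading amounts to (the helper name `load_env_file` is hypothetical):

```python
import os


def load_env_file(path=".env"):
    """Copy KEY=value pairs from a .env-style file into os.environ.

    Hypothetical helper for illustration only. Variables that are
    already set in the environment are left untouched.
    """
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            # Drop surrounding whitespace and optional quotes from the value.
            value = value.strip().strip("'\"")
            loaded[key] = value
            os.environ.setdefault(key, value)
    return loaded
```

After `load_env_file()` has run, `os.environ.get("GEMINI_API_KEY")` returns the key, or `None` if the line is missing from the file, which is a convenient place to show a clear error before calling the API.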
The main components of the application are:
- app.py: The main application file that contains the Streamlit code.
- requirements.txt: The file that lists the Python dependencies for the project.
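The contents of app.py are not reproduced in this README. As a minimal sketch of how such a file might be organized, the code below assumes the `streamlit`, `google-generativeai`, `python-dotenv`, and `Pillow` packages are listed in requirements.txt. The names `describe_image`, `main`, and `DEFAULT_PROMPT` are hypothetical, and the model identifier is an assumption based on the project name that may need updating.

```python
DEFAULT_PROMPT = "Describe this image."  # placeholder default, not from the project


def describe_image(model, prompt, image):
    """Send the prompt and image to the model and return the reply text."""
    # Fall back to a generic prompt when the text box is left empty.
    prompt = prompt.strip() or DEFAULT_PROMPT
    response = model.generate_content([prompt, image])
    return response.text


def main():
    # Third-party imports live here so describe_image can be exercised
    # without these packages installed.
    import os

    import google.generativeai as genai
    import streamlit as st
    from dotenv import load_dotenv
    from PIL import Image

    load_dotenv()  # reads GEMINI_API_KEY from the .env file, if present
    api_key = os.getenv("GEMINI_API_KEY")

    st.title("Gemini Pro Vision image description")
    if not api_key:
        st.error("GEMINI_API_KEY is not set; add it to your .env file.")
        st.stop()

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-pro-vision")  # assumed model name

    prompt = st.text_input("Prompt", value=DEFAULT_PROMPT)
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])

    if uploaded is not None:
        image = Image.open(uploaded)
        st.image(image, caption="Uploaded image")
        if st.button("Describe"):
            st.write(describe_image(model, prompt, image))


# In app.py, end the file with a call to main() so that
# `streamlit run app.py` executes it.
```

Keeping the API call in a small function separate from the Streamlit widgets makes it possible to check the prompt handling with a stand-in model object, without a browser or an API key.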
Contributions are welcome! Please open an issue or submit a pull request for any changes.