kwishna/openai-smart-vision

AI Apps Using the GPT-4o Model

This repository contains a collection of AI applications built on OpenAI's GPT-4o model, which can read images, identify the text and objects in them, and carry out instructions given in a prompt.

Features

  • Image Recognition: The GPT-4o model can analyze images and identify various objects, text, and other elements present within them.
  • Text Extraction: The model can extract and recognize text from images, making it useful for tasks like OCR (Optical Character Recognition).
  • Prompt-based Actions: Users can provide prompts or instructions, and the model will perform the requested actions based on the image and text analysis.
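
The features above all come down to sending an image and a text prompt in the same request. The following is a minimal sketch of that pattern, assuming Node.js 18 or later (for the built-in fetch) and the OpenAI Chat Completions endpoint. The helper names buildVisionRequest and askAboutImage are hypothetical and are not taken from this repository's files.

```javascript
// Sketch only: buildVisionRequest and askAboutImage are hypothetical helpers,
// not functions defined in this repository.

// Build a Chat Completions request body that pairs a text prompt with an image.
function buildVisionRequest(prompt, imageBytes, mimeType = "image/png") {
  // Encode the image as a data URL so it can travel inside the JSON body.
  const base64 = Buffer.from(imageBytes).toString("base64");
  const dataUrl = `data:${mimeType};base64,${base64}`;
  return {
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Send the request and return the model's text reply.
// Assumes OPENAI_API_KEY is set in the environment.
async function askAboutImage(prompt, imageBytes, apiKey = process.env.OPENAI_API_KEY) {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildVisionRequest(prompt, imageBytes)),
  });
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping the payload builder separate from the network call makes it possible to inspect or log the request without spending API credits.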

Getting Started

  1. Clone the repository: git clone https://github.com/kwishna/openai-smart-vision.git
  2. Install the required dependencies: npm install
  3. Update the .env file with your OpenAI API key: OPENAI_API_KEY=<YOUR API KEY>
  4. Run the desired application: node <file_name>
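
A common failure at step 4 is a missing key or one still set to the placeholder value. As an illustration, the check below reads KEY=VALUE text in the .env format and fails early with a clear message. It is a dependency-free sketch: the repository may well load its configuration with a package such as dotenv instead, and parseEnv and requireApiKey are hypothetical names.

```javascript
// Sketch only: parseEnv and requireApiKey are hypothetical helpers,
// not functions defined in this repository.

// Parse the text of a .env file into a plain object of KEY -> value.
function parseEnv(text) {
  const values = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    values[key] = value;
  }
  return values;
}

// Return the API key, or throw if it is absent or still the template placeholder.
function requireApiKey(env) {
  const key = env.OPENAI_API_KEY;
  if (!key || key.startsWith("<")) {
    throw new Error("OPENAI_API_KEY is missing or still a placeholder; update your .env file.");
  }
  return key;
}
```

In a script, this could be called as requireApiKey(parseEnv(fs.readFileSync(".env", "utf8"))) before any request is made.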

Applications

This repository includes the following AI applications:

  • Image Captioning: Generates descriptive captions for input images.
  • Text Recognition: Extracts and recognizes text from images.
  • Object Detection: Identifies and locates objects within images.
  • Document Processing: Processes and extracts information from documents (e.g., invoices, receipts).
  • Visual Question Answering: Answers questions based on the content of an image.
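
Because every application listed here sends an image to the same model, the main difference between them can be expressed as the prompt. The sketch below shows one way to organize that, assuming a single shared request path. The task keys and prompt wording are illustrative assumptions and do not reproduce the prompts used in this repository.

```javascript
// Sketch only: the task names and prompt texts are illustrative assumptions.
const TASK_PROMPTS = {
  caption: "Write a one-sentence descriptive caption for this image.",
  ocr: "Transcribe all text visible in this image exactly as written.",
  objects: "List the objects in this image and describe where each one appears.",
  document: "Extract the key fields from this document (for example vendor, date, total) as JSON.",
};

// Return the prompt for a task. Visual question answering ("vqa")
// additionally needs the user's question.
function promptFor(task, question) {
  if (task === "vqa") {
    if (!question) {
      throw new Error("Visual question answering needs a question.");
    }
    return `Answer this question using only the image: ${question}`;
  }
  // Guard against inherited properties such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(TASK_PROMPTS, task)) {
    throw new Error(`Unknown task: ${task}`);
  }
  return TASK_PROMPTS[task];
}
```

For example, promptFor("vqa", "How many people are in the photo?") yields a prompt that embeds the question, while promptFor("ocr") returns the fixed transcription prompt.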

Contributing

Contributions are welcome! If you have any ideas, improvements, or bug fixes, please open an issue or submit a pull request.

License

This project is free for everyone.
