Skip to content

πŸ€– Serverless Discord bot using OpenAI's GPT and Whisper to analyze and summarize podcasts, with interactive Q&A and session handling via API Gateway, Lambda, and DynamoDB. Data can be saved in Notion.

License

Notifications You must be signed in to change notification settings

matiasvallejosdev/serverless-podcast-ai-discord-bot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

50 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ€– Podcast Agent Bot

GitHub top language License Forks Stars Watchers

πŸš€ Experience Podcast Agent Bot in action: View Demo

πŸ“˜ Introduction

The Podcast Agent Bot is an innovative Discord bot designed to analyze and summarize podcasts. Utilizing OpenAI's powerful GPT and Whisper models, it offers users a seamless way to interact with audio content, providing insights, summaries, and responses to queries based on the podcast's content.

🎯 Purpose

The creation of the Podcast Agent Bot was inspired by the challenge of consuming and retaining the wealth of information available in podcasts. Often, while listening to podcasts, taking notes and revisiting key points later can be cumbersome. This bot simplifies the process by providing tools to summarize, highlight important information, translate content, and offer an interactive Q&A feature based on the podcast, making the wealth of knowledge in podcasts more accessible and engaging.

✨ Features

  • Audio Analysis: Upload your podcast audio files and get a detailed analysis of the content.
  • Transcription: Convert audio content into text using Whisper for further analysis.
  • Summarization: Get concise summaries of your podcast episodes, highlighting the key points.
  • Interactive Q&A: Ask questions about the podcast content and receive accurate answers.
  • Support for Multiple Audio Formats: Supports mp3, wav, and ogg audio formats.

πŸ”§ Interaction Design & Architecture

Podcast Agent Bot integrates real-time WebSocket communication with Discord's API, allowing users to start conversations, upload audio files, and interact with podcast content seamlessly. It harnesses OpenAI's GPT-4 for intelligent chat completion and uses Whisper for accurate audio transcriptions. The bot's memory system ensures continuity in conversations. Commands like /upload_audio, /clear, and /summarize empower users to fully engage with and analyze their favorite podcasts within Discord. This diagram gives a visual overview of the bot's structure and interaction flow.

Podcast Agent Bot Design System

πŸ“₯ Installation

To set up the Podcast Agent Bot for development and testing, follow these steps:

  1. Clone the Repository Clone the project to your local machine:
   git clone https://github.com/your-username/podcast-agent-bot.git
   cd podcast-agent-bot
  1. Create a New Discord Bot Application

  2. Generate Your Discord Token

    • In the Discord Developer Portal, select your application, go to the "Bot" tab, and click "Add Bot".
    • Under the "Token" section, click "Copy" to get your Discord bot token.
    • Set the necessary permissions for your bot to function correctly.
  3. Add Your Environment Variables Configure the environment variables in your system or a .env file:

    DISCORD_TOKEN="your_discord_token"
    DISCORD_GUILD_ID=your_guild_id
    DISCORD_GUILD_CHANNEL=your_channel_id
    OPENAI_API_KEY="your_openai_api_key"
    OPENAI_GPTMODEL="gpt-4-turbo-preview"
    OPENAI_TEMPERATURE=0
    OPENAI_TOKENS=4096
  4. Install Dependencies Run the following command to install the necessary dependencies:

    pip install -r requirements.txt
  5. Run Your Project Launch the bot with the following command:

    python3 main.py

Note: To run tests, use the pytest command in your terminal.

❓ How to use it?

πŸ“– User Guide

To interact with the Podcast Agent Bot effectively, follow these steps:

  1. Start a New Conversation: Initiate your interaction with the bot to set the context.
  2. Clear Chat History (if necessary): Use the /clear command to remove old interactions and start with a clean slate.
  3. Upload Podcast Audio: Execute the /upload_audio [file] [language] command to upload your podcast file in a supported format (mp3, wav, or ogg) for analysis.
  4. Analyze and Interact: Post-upload, ask the bot specific questions about the podcast's content with the /ask [question] command or request a summary using /summarize.

⌨️ Commands

Utilize the following commands to interact with the bot:

  • /ask [question]: Inquire about specific podcast content.
  • /help: Display a list of available commands and their functions.
  • /clear: Clear the chat history to clean up the conversation space.
  • /upload_audio [file] [language]: Upload an audio file for detailed transcription and analysis.
  • /purge: Remove all messages in the current channel for a fresh start.
  • /summarize: Summarize the main points extracted from the uploaded audio file.

πŸ› οΈ Pre-Prompted Model Configuration

To enhance the Podcast Agent Bot's capabilities, we've crafted a specialized assistant profile using a system.json configuration. This profile informs the bot's behavior and sets the stage for its advanced analytical tasks.

πŸ’» Technologies Used

  • πŸ“Š Utilizes OpenAI's GPT-4 via the API for advanced content analysis and engagement.
  • πŸŽ™οΈ Employs OpenAI's Whisper for high-accuracy podcast transcription.
  • πŸ€– Integrates with Discord using the Python API for interactive bot capabilities.

🀝 Contributing

The Podcast Agent Bot is an open-source project, and contributions are welcome. Feel free to fork the repository, make your changes, and submit a pull request.

πŸ“ž Contact

If you have any questions or need further assistance, you can contact the project maintainer:

Feel free to reach out if you have any inquiries or need any additional information about the project.

πŸ“„ License

This project is open source and available under the GNU Affero General Public License v3.0.

About

πŸ€– Serverless Discord bot using OpenAI's GPT and Whisper to analyze and summarize podcasts, with interactive Q&A and session handling via API Gateway, Lambda, and DynamoDB. Data can be saved in Notion.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages