diff --git a/docs/CONTRIBUTING.md b/CONTRIBUTING.md
similarity index 100%
rename from docs/CONTRIBUTING.md
rename to CONTRIBUTING.md
diff --git a/README.md b/README.md
index e478fdb92..7cd5a813f 100644
--- a/README.md
+++ b/README.md
@@ -2,13 +2,19 @@

-# Pipecat
-
 [![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)

-`pipecat` is a framework for building voice (and multimodal) conversational agents. Things like personal coaches, meeting assistants, [story-telling toys for kids](https://storytelling-chatbot.fly.dev/), customer support bots, [intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0), and snarky social companions.
+Pipecat is an open source Python framework for building voice and multimodal conversational agents. It handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences.
+
+## What you can build
+
+- **Voice Assistants**: [Natural, real-time conversations with AI](https://demo.dailybots.ai/)
+- **Interactive Agents**: Personal coaches and meeting assistants
+- **Multimodal Apps**: Combine voice, video, images, and text
+- **Creative Tools**: [Story-telling experiences](https://storytelling-chatbot.fly.dev/) and social companions
+- **Business Solutions**: [Customer intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0) and support bots

-Take a look at some example apps:
+## See it in action
@@ -18,33 +24,52 @@ Take a look at some example apps:
-## Getting started with voice agents
+## Key features
+
+- **Voice-first Design**: Built-in speech recognition, TTS, and conversation handling
+- **Flexible Integration**: Works with popular AI services (OpenAI, ElevenLabs, etc.)
+- **Pipeline Architecture**: Build complex apps from simple, reusable components
+- **Real-time Processing**: Frame-based pipeline architecture for fluid interactions
+- **Production Ready**: Enterprise-grade WebRTC and Websocket support
+
+## Getting started

 You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you’re ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, use different LLMs, and more.

 ```shell
-# install the module
+# Install the module
 pip install pipecat-ai

-# set up an .env file with API keys
+# Set up your environment
 cp dot-env.template .env
 ```

-By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional dependencies that you can install with:
+To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:

 ```shell
 pip install "pipecat-ai[option,...]"
 ```

-Your project may or may not need these, so they're made available as optional requirements. Here is a list:
+Available options include:
+
+| Category | Services | Install Command Example |
+| ------------------- | -------- | ------------------------------------- |
+| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/api-reference/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/api-reference/services/stt/azure), [Deepgram](https://docs.pipecat.ai/api-reference/services/stt/deepgram), [Gladia](https://docs.pipecat.ai/api-reference/services/stt/gladia), [Whisper](https://docs.pipecat.ai/api-reference/services/stt/whisper) | `pip install "pipecat-ai[deepgram]"` |
+| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/services/llm/anthropic), [Azure](https://docs.pipecat.ai/api-reference/services/llm/azure), [Fireworks AI](https://docs.pipecat.ai/api-reference/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/services/llm/gemini), [Ollama](https://docs.pipecat.ai/api-reference/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/services/llm/openai), [Together AI](https://docs.pipecat.ai/api-reference/services/llm/together) | `pip install "pipecat-ai[openai]"` |
+| Text-to-Speech | [AWS](https://docs.pipecat.ai/api-reference/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/services/tts/azure), [Cartesia](https://docs.pipecat.ai/api-reference/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/services/tts/elevenlabs), [Google](https://docs.pipecat.ai/api-reference/services/tts/google), [LMNT](https://docs.pipecat.ai/api-reference/services/tts/lmnt), [OpenAI](https://docs.pipecat.ai/api-reference/services/tts/openai), [PlayHT](https://docs.pipecat.ai/api-reference/services/tts/playht), [Rime](https://docs.pipecat.ai/api-reference/services/tts/rime), [XTTS](https://docs.pipecat.ai/api-reference/services/tts/xtts) | `pip install "pipecat-ai[cartesia]"` |
+| Speech-to-Speech | [OpenAI Realtime](https://docs.pipecat.ai/api-reference/services/s2s/openai) | `pip install "pipecat-ai[openai]"` |
+| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/services/transport/daily), WebSocket, Local | `pip install "pipecat-ai[daily]"` |
+| Video | [Tavus](https://docs.pipecat.ai/api-reference/services/video/tavus) | `pip install "pipecat-ai[tavus]"` |
+| Vision & Image | [Moondream](https://docs.pipecat.ai/api-reference/services/vision/moondream), [fal](https://docs.pipecat.ai/api-reference/services/image-generation/fal) | `pip install "pipecat-ai[moondream]"` |
+| Audio Processing | [Silero VAD](https://docs.pipecat.ai/api-reference/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/api-reference/utilities/audio/krisp-filter), [Noisereduce](https://docs.pipecat.ai/api-reference/utilities/audio/noisereduce-filter) | `pip install "pipecat-ai[silero]"` |
+| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/api-reference/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/api-reference/services/analytics/sentry) | `pip install "pipecat-ai[canonical]"` |

-- **AI services**: `anthropic`, `assemblyai`, `aws`, `azure`, `deepgram`, `gladia`, `google`, `fal`, `lmnt`, `moondream`, `openai`, `openpipe`, `playht`, `silero`, `whisper`, `xtts`
-- **Transports**: `local`, `websocket`, `daily`
+📚 [View full services documentation →](https://docs.pipecat.ai/api-reference/services/supported-services)

 ## Code examples

-- [foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
-- [example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
+- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
+- [Example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development

 ## A simple voice agent running locally

@@ -109,7 +134,7 @@ Run it with:
 python app.py
 ```

-Daily provides a prebuilt WebRTC user interface. Whilst the app is running, you can visit at `https://