From 57ef525a8ed74f34e97d3e1ec4063e7e34cc2b9c Mon Sep 17 00:00:00 2001
From: Mark Backman
Date: Wed, 13 Nov 2024 22:44:39 -0500
Subject: [PATCH 1/2] Update README

---
 README.md | 90 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 54 insertions(+), 36 deletions(-)

diff --git a/README.md b/README.md
index e478fdb9..7cd5a813 100644
--- a/README.md
+++ b/README.md
@@ -2,13 +2,19 @@
  pipecat
 
-# Pipecat
-
 [![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai)
 [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)
 
-`pipecat` is a framework for building voice (and multimodal) conversational agents. Things like personal coaches, meeting assistants, [story-telling toys for kids](https://storytelling-chatbot.fly.dev/), customer support bots, [intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0), and snarky social companions.
+Pipecat is an open source Python framework for building voice and multimodal conversational agents. It handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences.
+
+## What you can build
+
+- **Voice Assistants**: [Natural, real-time conversations with AI](https://demo.dailybots.ai/)
+- **Interactive Agents**: Personal coaches and meeting assistants
+- **Multimodal Apps**: Combine voice, video, images, and text
+- **Creative Tools**: [Story-telling experiences](https://storytelling-chatbot.fly.dev/) and social companions
+- **Business Solutions**: [Customer intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0) and support bots
 
-Take a look at some example apps:
+## See it in action

@@ -18,33 +24,52 @@ Take a look at some example apps:

-## Getting started with voice agents
+## Key features
+
+- **Voice-first Design**: Built-in speech recognition, TTS, and conversation handling
+- **Flexible Integration**: Works with popular AI services (OpenAI, ElevenLabs, etc.)
+- **Pipeline Architecture**: Build complex apps from simple, reusable components
+- **Real-time Processing**: Frame-based pipeline architecture for fluid interactions
+- **Production Ready**: Enterprise-grade WebRTC and WebSocket support
+
+## Getting started
 
 You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you're ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, use different LLMs, and more.
 
 ```shell
-# install the module
+# Install the module
 pip install pipecat-ai
 
-# set up an .env file with API keys
+# Set up your environment
 cp dot-env.template .env
 ```
 
-By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional dependencies that you can install with:
+To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
 
 ```shell
 pip install "pipecat-ai[option,...]"
 ```
 
-Your project may or may not need these, so they're made available as optional requirements. Here is a list:
+Available options include:
+
+| Category | Services | Install Command Example |
+| --- | --- | --- |
+| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/api-reference/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/api-reference/services/stt/azure), [Deepgram](https://docs.pipecat.ai/api-reference/services/stt/deepgram), [Gladia](https://docs.pipecat.ai/api-reference/services/stt/gladia), [Whisper](https://docs.pipecat.ai/api-reference/services/stt/whisper) | `pip install "pipecat-ai[deepgram]"` |
+| LLMs | [Anthropic](https://docs.pipecat.ai/api-reference/services/llm/anthropic), [Azure](https://docs.pipecat.ai/api-reference/services/llm/azure), [Fireworks AI](https://docs.pipecat.ai/api-reference/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/api-reference/services/llm/gemini), [Ollama](https://docs.pipecat.ai/api-reference/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/api-reference/services/llm/openai), [Together AI](https://docs.pipecat.ai/api-reference/services/llm/together) | `pip install "pipecat-ai[openai]"` |
+| Text-to-Speech | [AWS](https://docs.pipecat.ai/api-reference/services/tts/aws), [Azure](https://docs.pipecat.ai/api-reference/services/tts/azure), [Cartesia](https://docs.pipecat.ai/api-reference/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/api-reference/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/api-reference/services/tts/elevenlabs), [Google](https://docs.pipecat.ai/api-reference/services/tts/google), [LMNT](https://docs.pipecat.ai/api-reference/services/tts/lmnt), [OpenAI](https://docs.pipecat.ai/api-reference/services/tts/openai), [PlayHT](https://docs.pipecat.ai/api-reference/services/tts/playht), [Rime](https://docs.pipecat.ai/api-reference/services/tts/rime), [XTTS](https://docs.pipecat.ai/api-reference/services/tts/xtts) | `pip install "pipecat-ai[cartesia]"` |
+| Speech-to-Speech | [OpenAI Realtime](https://docs.pipecat.ai/api-reference/services/s2s/openai) | `pip install "pipecat-ai[openai]"` |
+| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/api-reference/services/transport/daily), WebSocket, Local | `pip install "pipecat-ai[daily]"` |
+| Video | [Tavus](https://docs.pipecat.ai/api-reference/services/video/tavus) | `pip install "pipecat-ai[tavus]"` |
+| Vision & Image | [Moondream](https://docs.pipecat.ai/api-reference/services/vision/moondream), [fal](https://docs.pipecat.ai/api-reference/services/image-generation/fal) | `pip install "pipecat-ai[moondream]"` |
+| Audio Processing | [Silero VAD](https://docs.pipecat.ai/api-reference/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/api-reference/utilities/audio/krisp-filter), [Noisereduce](https://docs.pipecat.ai/api-reference/utilities/audio/noisereduce-filter) | `pip install "pipecat-ai[silero]"` |
+| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/api-reference/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/api-reference/services/analytics/sentry) | `pip install "pipecat-ai[canonical]"` |
 
-- **AI services**: `anthropic`, `assemblyai`, `aws`, `azure`, `deepgram`, `gladia`, `google`, `fal`, `lmnt`, `moondream`, `openai`, `openpipe`, `playht`, `silero`, `whisper`, `xtts`
-- **Transports**: `local`, `websocket`, `daily`
+📚 [View full services documentation →](https://docs.pipecat.ai/api-reference/services/supported-services)
 
 ## Code examples
 
-- [foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
-- [example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
+- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
+- [Example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
 
 ## A simple voice agent running locally
 
@@ -109,7 +134,7 @@ Run it with:
 python app.py
 ```
 
-Daily provides a prebuilt WebRTC user interface. Whilst the app is running, you can visit at `https://.daily.co/` and listen to the bot say hello!
+Daily provides a prebuilt WebRTC user interface. While the app is running, you can visit `https://.daily.co/` and listen to the bot say hello!
 
 ## WebRTC for production use
 
@@ -119,28 +144,6 @@ One way to get up and running quickly with WebRTC is to sign up for a Daily deve
 
 Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://docs.daily.co/reference/rest-api/rooms) in the developer Dashboard.
 
-## What is VAD?
-
-Voice Activity Detection — very important for knowing when a user has finished speaking to your bot. If you are not using press-to-talk, and want Pipecat to detect when the user has finished talking, VAD is an essential component for a natural feeling conversation.
-
-Pipecat makes use of WebRTC VAD by default when using a WebRTC transport layer. Optionally, you can use Silero VAD for improved accuracy at the cost of higher CPU usage.
-
-```shell
-pip install pipecat-ai[silero]
-```
-
-## Krisp Audio Filter
-
-This project includes support for Krisp's noise cancellation SDK. To get started, you'll need a Krisp developer account and SDK access.
-
-For complete setup and usage instructions, see our [Krisp Integration Guide](https://docs.pipecat.ai/guides/krisp).
-
-Quick install (after SDK setup):
-
-```bash
-pip install pipecat-ai[krisp]
-```
-
 ## Hacking on the framework itself
 
 _Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:_
 
@@ -218,8 +221,23 @@ Install the
 }
 ```
 
+## Contributing
+
+We welcome contributions from the community! Whether you're fixing bugs, improving documentation, or adding new features, here's how you can help:
+
+- **Found a bug?** Open an [issue](https://github.com/pipecat-ai/pipecat/issues)
+- **Have a feature idea?** Start a [discussion](https://discord.gg/pipecat)
+- **Want to contribute code?** Check our [CONTRIBUTING.md](CONTRIBUTING.md) guide
+- **Documentation improvements?** [Docs](https://github.com/pipecat-ai/docs) PRs are always welcome
+
+Before submitting a pull request, please check existing issues and PRs to avoid duplicates.
+
+We aim to review all contributions promptly and provide constructive feedback to help get your changes merged.
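The patch's "Pipeline Architecture" and "Real-time Processing" features come down to one idea: processors passing frames downstream through a pipeline. The toy sketch below illustrates only that idea; it is not the real Pipecat API (Pipecat's actual class names, frame types, and async machinery differ):

```python
# Toy sketch of a frame-based pipeline. NOT the real Pipecat API; the names
# Frame, Processor, and Pipeline here are illustrative stand-ins only.
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str      # e.g. "audio" or "text"
    payload: str


class Processor:
    def process(self, frame):
        # Default behavior: pass the frame through unchanged.
        return frame


class UppercaseText(Processor):
    def process(self, frame):
        # Transform only the frames this processor cares about.
        if frame.kind == "text":
            return Frame("text", frame.payload.upper())
        return frame


class Pipeline:
    def __init__(self, processors):
        self.processors = processors

    def run(self, frame):
        # Each processor sees the previous processor's output.
        for p in self.processors:
            frame = p.process(frame)
            if frame is None:  # a processor may drop a frame entirely
                break
        return frame


pipeline = Pipeline([Processor(), UppercaseText()])
result = pipeline.run(Frame("text", "hello"))
print(result.payload)  # HELLO
```

In real Pipecat pipelines the same shape holds, but frames flow asynchronously and carry audio, images, and transcription events rather than plain strings.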
+
 ## Getting help
 
 ➡️ [Join our Discord](https://discord.gg/pipecat)
 
+➡️ [Read the docs](https://docs.pipecat.ai)
+
 ➡️ [Reach us on X](https://x.com/pipecat_ai)

From 27ff868e5a57b455ab67163cfc27b754e50affdc Mon Sep 17 00:00:00 2001
From: Mark Backman
Date: Wed, 13 Nov 2024 22:45:31 -0500
Subject: [PATCH 2/2] Move CONTRIBUTING to top directory

---
 docs/CONTRIBUTING.md => CONTRIBUTING.md | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename docs/CONTRIBUTING.md => CONTRIBUTING.md (100%)

diff --git a/docs/CONTRIBUTING.md b/CONTRIBUTING.md
similarity index 100%
rename from docs/CONTRIBUTING.md
rename to CONTRIBUTING.md
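A note on the virtual-environment step mentioned under "Hacking on the framework itself" in patch 1: the repo's exact commands are elided by the diff context, but a conventional setup looks like this (a generic sketch, not the project's canonical instructions):

```shell
# Create and activate an isolated environment before installing the framework
# in editable mode (generic sketch; the repo's docs give the canonical steps):
python3 -m venv venv
. venv/bin/activate
python -m pip --version  # sanity check: pip now resolves inside the venv
```

From there, an editable install of the checked-out source (for example `pip install -e .`, possibly with extras) is the usual next step.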