-
Notifications
You must be signed in to change notification settings - Fork 322
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #551 from Allenmylath/patch-3
Frame types and short descriptionCreate Frames.md
- Loading branch information
Showing
1 changed file
with
113 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
# Understanding Different Frame Types in the Pipecat System | ||
|
||
In the Pipecat system, frames are used to represent different types of data and control signals that flow through the pipeline. Understanding these frame types is crucial for working with the system effectively. This tutorial will cover the main categories of frames and their specific uses. | ||
|
||
## 1. Base Frame Classes | ||
|
||
### Frame | ||
The `Frame` class is the base class for all frames. It includes: | ||
- `id`: A unique identifier | ||
- `name`: A descriptive name | ||
- `pts`: Presentation timestamp (optional) | ||
|
||
### DataFrame | ||
`DataFrame` is a subclass of `Frame` and serves as a base for most data-carrying frames. | ||
|
||
## 2. Audio Frames | ||
|
||
### AudioRawFrame | ||
Represents a chunk of audio with properties: | ||
- `audio`: Raw audio data | ||
- `sample_rate`: Audio sample rate | ||
- `num_channels`: Number of audio channels | ||
|
||
Subclasses include: | ||
- `InputAudioRawFrame`: For audio from input sources | ||
- `OutputAudioRawFrame`: For audio to be played by output devices | ||
- `TTSAudioRawFrame`: For audio generated by Text-to-Speech services | ||
|
||
## 3. Image Frames | ||
|
||
### ImageRawFrame | ||
Represents an image with properties: | ||
- `image`: Raw image data | ||
- `size`: Image dimensions | ||
- `format`: Image format (e.g., JPEG, PNG) | ||
|
||
Subclasses include: | ||
- `InputImageRawFrame`: For images from input sources | ||
- `OutputImageRawFrame`: For images to be displayed | ||
- `UserImageRawFrame`: For images associated with a specific user | ||
- `VisionImageRawFrame`: For images with associated text for description | ||
- `URLImageRawFrame`: For images with an associated URL | ||
|
||
### SpriteFrame | ||
Represents an animated sprite, containing a list of `ImageRawFrame` objects. | ||
|
||
## 4. Text and Transcription Frames | ||
|
||
### TextFrame | ||
Represents a chunk of text, used for various purposes in the pipeline. | ||
|
||
### TranscriptionFrame | ||
A specialized `TextFrame` for speech transcriptions, including: | ||
- `user_id`: ID of the speaking user | ||
- `timestamp`: When the transcription was generated | ||
- `language`: Detected language of the speech | ||
|
||
### InterimTranscriptionFrame | ||
Similar to `TranscriptionFrame`, but for interim (not final) transcriptions. | ||
|
||
## 5. LLM (Language Model) Frames | ||
|
||
### LLMMessagesFrame | ||
Contains a list of messages for an LLM service to process. | ||
|
||
### LLMMessagesAppendFrame and LLMMessagesUpdateFrame | ||
Used to modify the current context of LLM messages. | ||
|
||
### LLMSetToolsFrame | ||
Specifies tools (functions) available for the LLM to use. | ||
|
||
### LLMEnablePromptCachingFrame | ||
Controls prompt caching in certain LLMs. | ||
|
||
## 6. System and Control Frames | ||
|
||
### SystemFrame | ||
Base class for system-level frames. | ||
|
||
Important system frames include: | ||
- `StartFrame`: Initiates a pipeline | ||
- `CancelFrame`: Stops a pipeline immediately | ||
- `ErrorFrame`: Notifies of errors (with `FatalErrorFrame` for unrecoverable errors) | ||
- `EndTaskFrame` and `CancelTaskFrame`: Control pipeline tasks | ||
- `StartInterruptionFrame` and `StopInterruptionFrame`: Indicate user speech for interruptions | ||
|
||
### ControlFrame | ||
Base class for control-flow frames. | ||
|
||
Notable control frames: | ||
- `EndFrame`: Signals the end of a pipeline | ||
- `LLMFullResponseStartFrame` and `LLMFullResponseEndFrame`: Bracket LLM responses | ||
- `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame`: Indicate user speech activity | ||
- `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame`: Indicate bot speech activity | ||
- `TTSStartedFrame` and `TTSStoppedFrame`: Bracket Text-to-Speech responses | ||
|
||
## 7. Special Purpose Frames | ||
|
||
### AppFrame | ||
Base class for application-specific custom frames. | ||
|
||
### MetricsFrame | ||
Contains performance metrics data. | ||
|
||
### FunctionCallInProgressFrame and FunctionCallResultFrame | ||
Used for handling LLM function (tool) calls. | ||
|
||
### ServiceUpdateSettingsFrame | ||
Base class for updating service settings, with specific subclasses for LLM, TTS, and STT services. | ||
|
||
## Conclusion | ||
|
||
Understanding these frame types is essential for working with the Pipecat system. Each frame type serves a specific purpose in the pipeline, whether it's carrying data (like audio or images), controlling the flow of the pipeline, or managing system-level operations. By using the appropriate frame types, you can effectively process and transmit various kinds of information through your pipeline. |