Skip to content

Commit

Permalink
Merge pull request #551 from Allenmylath/patch-3
Browse files Browse the repository at this point in the history
Frame types and short descriptionCreate Frames.md
  • Loading branch information
vipyne authored Nov 7, 2024
2 parents b959ac6 + 0e69625 commit 29925a8
Showing 1 changed file with 113 additions and 0 deletions.
113 changes: 113 additions & 0 deletions docs/frame.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
# Understanding Different Frame Types in the Pipecat System

In the Pipecat system, frames are used to represent different types of data and control signals that flow through the pipeline. Understanding these frame types is crucial for working with the system effectively. This tutorial will cover the main categories of frames and their specific uses.

## 1. Base Frame Classes

### Frame
The `Frame` class is the base class for all frames. It includes:
- `id`: A unique identifier
- `name`: A descriptive name
- `pts`: Presentation timestamp (optional)

### DataFrame
`DataFrame` is a subclass of `Frame` and serves as a base for most data-carrying frames.

## 2. Audio Frames

### AudioRawFrame
Represents a chunk of audio with properties:
- `audio`: Raw audio data
- `sample_rate`: Audio sample rate
- `num_channels`: Number of audio channels

Subclasses include:
- `InputAudioRawFrame`: For audio from input sources
- `OutputAudioRawFrame`: For audio to be played by output devices
- `TTSAudioRawFrame`: For audio generated by Text-to-Speech services

## 3. Image Frames

### ImageRawFrame
Represents an image with properties:
- `image`: Raw image data
- `size`: Image dimensions
- `format`: Image format (e.g., JPEG, PNG)

Subclasses include:
- `InputImageRawFrame`: For images from input sources
- `OutputImageRawFrame`: For images to be displayed
- `UserImageRawFrame`: For images associated with a specific user
- `VisionImageRawFrame`: For images with associated text for description
- `URLImageRawFrame`: For images with an associated URL

### SpriteFrame
Represents an animated sprite, containing a list of `ImageRawFrame` objects.

## 4. Text and Transcription Frames

### TextFrame
Represents a chunk of text, used for various purposes in the pipeline.

### TranscriptionFrame
A specialized `TextFrame` for speech transcriptions, including:
- `user_id`: ID of the speaking user
- `timestamp`: When the transcription was generated
- `language`: Detected language of the speech

### InterimTranscriptionFrame
Similar to `TranscriptionFrame`, but for interim (not final) transcriptions.

## 5. LLM (Language Model) Frames

### LLMMessagesFrame
Contains a list of messages for an LLM service to process.

### LLMMessagesAppendFrame and LLMMessagesUpdateFrame
Used to modify the current context of LLM messages.

### LLMSetToolsFrame
Specifies tools (functions) available for the LLM to use.

### LLMEnablePromptCachingFrame
Controls prompt caching in certain LLMs.

## 6. System and Control Frames

### SystemFrame
Base class for system-level frames.

Important system frames include:
- `StartFrame`: Initiates a pipeline
- `CancelFrame`: Stops a pipeline immediately
- `ErrorFrame`: Notifies of errors (with `FatalErrorFrame` for unrecoverable errors)
- `EndTaskFrame` and `CancelTaskFrame`: Control pipeline tasks
- `StartInterruptionFrame` and `StopInterruptionFrame`: Indicate user speech for interruptions

### ControlFrame
Base class for control-flow frames.

Notable control frames:
- `EndFrame`: Signals the end of a pipeline
- `LLMFullResponseStartFrame` and `LLMFullResponseEndFrame`: Bracket LLM responses
- `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame`: Indicate user speech activity
- `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame`: Indicate bot speech activity
- `TTSStartedFrame` and `TTSStoppedFrame`: Bracket Text-to-Speech responses

## 7. Special Purpose Frames

### AppFrame
Base class for application-specific custom frames.

### MetricsFrame
Contains performance metrics data.

### FunctionCallInProgressFrame and FunctionCallResultFrame
Used for handling LLM function (tool) calls.

### ServiceUpdateSettingsFrame
Base class for updating service settings, with specific subclasses for LLM, TTS, and STT services.

## Conclusion

Understanding these frame types is essential for working with the Pipecat system. Each frame type serves a specific purpose in the pipeline, whether it's carrying data (like audio or images), controlling the flow of the pipeline, or managing system-level operations. By using the appropriate frame types, you can effectively process and transmit various kinds of information through your pipeline.

0 comments on commit 29925a8

Please sign in to comment.