OpenAI Realtime API support #869

Open
bachittle opened this issue Oct 7, 2024 · 8 comments
Labels
enhancement: New feature or request

Comments

@bachittle

https://platform.openai.com/docs/guides/realtime

It uses WebSockets to send audio to, and receive audio from, gpt-4o-realtime.
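
For illustration, here is a rough sketch of what a client would send over that WebSocket, using only the Go standard library. The URL, the `OpenAI-Beta` header, the model name, and the `input_audio_buffer.append` event shape are assumptions based on the Realtime guide linked above and may have changed; the type and function names are hypothetical and not part of go-openai. Opening the socket itself is left out, since that needs a WebSocket library.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
)

// realtimeURL is an assumption taken from the Realtime guide; check the
// current docs for the exact endpoint and model name.
const realtimeURL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

// handshakeHeaders returns the HTTP headers sent with the WebSocket upgrade
// request. The OpenAI-Beta value is likewise an assumption from the guide.
func handshakeHeaders(apiKey string) http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer "+apiKey)
	h.Set("OpenAI-Beta", "realtime=v1")
	return h
}

// appendAudioEvent is a hypothetical Go type for the client event that
// streams one chunk of input audio to the model.
type appendAudioEvent struct {
	Type  string `json:"type"`
	Audio string `json:"audio"` // base64-encoded audio bytes
}

// newAppendAudioEvent wraps raw audio bytes in an input_audio_buffer.append
// event and returns the JSON text frame to write to the socket.
func newAppendAudioEvent(pcm []byte) ([]byte, error) {
	return json.Marshal(appendAudioEvent{
		Type:  "input_audio_buffer.append",
		Audio: base64.StdEncoding.EncodeToString(pcm),
	})
}

func main() {
	msg, err := newAppendAudioEvent([]byte{0x01, 0x02, 0x03})
	if err != nil {
		panic(err)
	}
	fmt.Println(realtimeURL)
	fmt.Println(handshakeHeaders("sk-example").Get("OpenAI-Beta"))
	fmt.Println(string(msg))
}
```

Audio coming back from the model would arrive as JSON events on the same connection and be base64-decoded in the same way.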

bachittle added the enhancement label Oct 7, 2024
@anhao

anhao commented Oct 8, 2024

+1

@WqyJh
Contributor

WqyJh commented Oct 10, 2024

+1

@WqyJh
Contributor

WqyJh commented Oct 17, 2024

I want to add support for the GPT-4o-realtime model, which relies on WebSocket technology. This necessitates the use of a WebSocket library, but introducing such a dependency conflicts with the zero-dependency philosophy of the existing library, as I previously discussed with @sashabaranov.

image
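
For context on why a dependency comes up at all: Go's standard library has no WebSocket client, so the opening handshake and message framing from RFC 6455 have to come from a library or be written by hand. The sketch below covers only one small piece, the `Sec-WebSocket-Accept` check a client performs on the server's handshake response. It is an illustration of the protocol, not code from go-openai or go-openai-realtime, and the function name is hypothetical.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// websocketGUID is the fixed GUID that RFC 6455 defines for the handshake.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey computes the Sec-WebSocket-Accept value the server must return
// for a given Sec-WebSocket-Key: base64(SHA-1(key + GUID)).
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Sample key from RFC 6455, section 1.3.
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ==")) // prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
}
```

Frame masking, fragmentation, ping/pong, and close handling would all still be needed on top of this, which is the part a WebSocket library normally provides.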

As a result, I've decided to create a new library dedicated exclusively to GPT-4o-realtime. This library will serve as a complement to go-openai, focusing solely on supporting GPT-4o-realtime functionality.

The new library is called go-openai-realtime. Feel free to check it out!

https://github.com/WqyJh/go-openai-realtime

@sashabaranov
Owner

@WqyJh, thank you for your effort on this! I think WebSockets are a fair case for introducing a dependency to this library, and I would love to merge your changes if you decide to contribute 🙌🏻

@bachittle
Author

They just released support for audio input and output in the chat completions endpoint, using the gpt-4o-audio-preview model. This could be supported first in the meantime: https://platform.openai.com/docs/guides/audio/quickstart
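
As a rough idea of what that request could look like, here is a sketch that builds the JSON body with the standard library only. The `modalities` and `audio` fields and the `alloy`/`wav` values are assumptions based on the audio quickstart linked above; the Go types and function names are hypothetical and are not go-openai's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// audioOptions selects the voice and container format for audio output.
type audioOptions struct {
	Voice  string `json:"voice"`
	Format string `json:"format"`
}

// chatMessage is a plain text chat message.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// audioChatRequest is a hypothetical request type for a chat completion
// that asks for both text and audio output.
type audioChatRequest struct {
	Model      string        `json:"model"`
	Modalities []string      `json:"modalities"`
	Audio      audioOptions  `json:"audio"`
	Messages   []chatMessage `json:"messages"`
}

// newAudioChatRequest fills in the assumed defaults for gpt-4o-audio-preview.
func newAudioChatRequest(prompt string) audioChatRequest {
	return audioChatRequest{
		Model:      "gpt-4o-audio-preview",
		Modalities: []string{"text", "audio"},
		Audio:      audioOptions{Voice: "alloy", Format: "wav"},
		Messages:   []chatMessage{{Role: "user", Content: prompt}},
	}
}

// encodeRequest serializes the request as the JSON body of a POST to the
// chat completions endpoint.
func encodeRequest(req audioChatRequest) ([]byte, error) {
	return json.MarshalIndent(req, "", "  ")
}

func main() {
	body, err := encodeRequest(newAudioChatRequest("Say hello."))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Because this is an ordinary HTTPS request, it fits the existing REST client without any new dependency.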

@WqyJh
Contributor

WqyJh commented Nov 11, 2024

> They just released support for audio input and output in the chat completions endpoint, using the gpt-4o-audio-preview model. This could be supported first in the meantime: https://platform.openai.com/docs/guides/audio/quickstart

I just added support for gpt-4o-audio-preview. See #895

@WqyJh
Contributor

WqyJh commented Nov 11, 2024

> @WqyJh, thank you for your effort on this! I think WebSockets are a fair case for introducing a dependency to this library, and I would love to merge your changes if you decide to contribute 🙌🏻

I'd like to contribute to the project. Since go-openai-realtime contains a lot of code and examples, porting it will take some time. Mixing it all in with the existing code would create a mess, so I suggest organizing all the realtime code in a folder named realtime.

@sabuhigr
Contributor

I don't think there is any need to implement the Realtime API with WebSockets/WebRTC.

The Chat Completions API supports audio input now.
Ref: https://platform.openai.com/docs/guides/audio

This project concentrates on API-level functionality over REST, not on other protocols such as WebRTC or WebSocket.

5 participants