
Add direct client implementation #15

Merged
merged 3 commits into main on Nov 7, 2024
Conversation

@dltn dltn (Contributor) commented Nov 7, 2024

The Llama Stack primarily operates as a client/server model. However, there are scenarios where hosting a distribution can be cumbersome (e.g., testing, Jupyter), making it more desirable to utilize the Llama Stack as a library.

This introduces a clever hack that extends the Stainless Python client. It intercepts GET/POST requests intended for HTTP transmission and uses reflection to deserialize and route them directly to their implementations.

Is this roundabout serialization the most efficient method? Certainly not. However, the convenience of having this as a drop-in solution is significant, and the overhead is negligible compared to GPU latency.
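The pattern described above can be sketched in miniature. This is not the actual llama-stack code: `LocalInference`, `DirectClient`, and the route table are illustrative stand-ins, showing how intercepted POST calls can be routed to in-process implementations via reflection instead of HTTP.

```python
import inspect

# Hypothetical local implementation; names are illustrative,
# not the actual llama-stack API.
class LocalInference:
    def completion(self, prompt: str, temperature: float = 0.0) -> str:
        return f"echo: {prompt}"

class DirectClient:
    """Routes would-be HTTP calls to in-process implementations.

    Instead of serializing a request and sending it over the wire,
    look up the implementation registered for the route and invoke
    it directly, matching arguments via reflection on its signature.
    """

    def __init__(self, impls):
        # Map of route path -> bound method, e.g. "/inference/completion"
        self._impls = impls

    def post(self, path: str, body: dict):
        fn = self._impls[path]
        sig = inspect.signature(fn)
        # Keep only parameters the implementation actually accepts;
        # a real version would also deserialize nested request models.
        kwargs = {k: v for k, v in body.items() if k in sig.parameters}
        return fn(**kwargs)

inference = LocalInference()
client = DirectClient({"/inference/completion": inference.completion})
result = client.post("/inference/completion", {"prompt": "hi", "extra": 1})
print(result)  # echo: hi
```

In the actual PR this interception happens inside a subclass of the Stainless-generated client, so existing calling code works unchanged; the sketch above only shows the routing step.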

@ashwinb ashwinb (Contributor) left a comment:

oh man this is beautiful <3

@dltn dltn merged commit 0901251 into main Nov 7, 2024
3 checks passed
@dltn dltn deleted the add-direct-client branch November 7, 2024 21:27

from llama_stack.distribution.datatypes import StackRunConfig
from llama_stack.distribution.distribution import get_provider_registry
from llama_stack.distribution.resolver import resolve_impls
Contributor commented:

Should we add llama-stack as a dependency for the llama-stack-client package?

Contributor replied:

nope it should be the reverse as we talked about. this code should always be exercised when the person already has llama-stack in their environment (as a library or as pip)

@yanxi0830 yanxi0830 (Contributor) commented Nov 8, 2024:

Hmm, should this class LlamaStackDirectClient be inside the llama-stack repo instead of the llama-stack-client-python repo?

  1. Users who want to use llama-stack as a library install the llama-stack package (which depends on the llama-stack-client package) and are able to use LlamaStackDirectClient.

  2. Users who only install the llama-stack-client package cannot use LlamaStackDirectClient without also installing llama-stack.

Contributor replied:

@yanxi0830 yeah I think that makes sense to me actually.
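The dependency direction agreed on above (llama-stack-client never requires llama-stack; the direct client only works when llama-stack is already in the environment) is often enforced with a guarded import. This is a sketch of that general pattern, not the mechanism the PR uses; `require_llama_stack` is a hypothetical helper:

```python
import importlib

def require_llama_stack() -> bool:
    """Return True if llama-stack is importable in this environment.

    A client-side direct-client entry point could check this and raise
    a clear error telling the user to install llama-stack, rather than
    declaring it as a hard dependency of llama-stack-client.
    """
    try:
        importlib.import_module("llama_stack")
        return True
    except ImportError:
        return False
```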

4 participants