Are there any plans to implement concurrent LoRA inference with multiple adapters (such as S-LoRA)? #1237
SamGalanakis asked this question in Q&A (unanswered)
-
Do you mean as suggested in #903? If yes, there are plans, and hopefully we can tackle it soon. But note that S-LoRA has a bunch of specialized optimizations that we cannot do in PEFT, since we want to support a very broad range of models and adapter types.
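For context, here is a minimal sketch in plain PyTorch of the core idea behind mixed-adapter batching. This is not PEFT's actual API; the class and names are hypothetical. The point is that the base weights are shared across the whole batch, and each sample is routed through its own LoRA adapter via a small low-rank correction:

```python
import torch
import torch.nn as nn


class MultiAdapterLinear(nn.Module):
    """Hypothetical layer: shared frozen base weights plus several named
    LoRA adapters, with per-sample adapter routing at inference time."""

    def __init__(self, base: nn.Linear, rank: int, adapter_names: list[str]):
        super().__init__()
        self.base = base  # shared, frozen base weights
        self.lora_A = nn.ParameterDict({
            name: nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            for name in adapter_names
        })
        self.lora_B = nn.ParameterDict({
            name: nn.Parameter(torch.zeros(base.out_features, rank))
            for name in adapter_names
        })

    def forward(self, x: torch.Tensor, adapter_names: list[str]) -> torch.Tensor:
        # One dense base matmul for the whole batch ...
        out = self.base(x)
        # ... then a low-rank correction per adapter, grouping samples
        # so each adapter's weights are applied in a single small matmul.
        for name in set(adapter_names):
            idx = [i for i, n in enumerate(adapter_names) if n == name]
            sub = x[idx]
            out[idx] += sub @ self.lora_A[name].T @ self.lora_B[name].T
        return out


layer = MultiAdapterLinear(nn.Linear(16, 16), rank=4, adapter_names=["a", "b"])
x = torch.randn(3, 16)
print(layer(x, adapter_names=["a", "b", "a"]).shape)  # torch.Size([3, 16])
```

The grouping keeps one dense base matmul per batch plus one small low-rank matmul per distinct adapter in the batch; this routing step is roughly the part S-LoRA optimizes much more aggressively, with custom kernels and unified paging of adapter weights.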
-
Ah, my bad, I missed that one. So that will allow parallel inference. Any rough idea how performant it will be, throughput- and memory-wise?
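On the memory side, a rough back-of-envelope (hypothetical model sizes, not a benchmark) suggests the adapter weights themselves are tiny next to the base model, so holding several adapters resident is cheap; the throughput cost comes mainly from the extra per-adapter matmuls:

```python
# Each LoRA adapter on a linear layer of shape (d_in, d_out) adds
# r * (d_in + d_out) parameters.
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    return rank * (d_in + d_out)

# Hypothetical 7B-like config: 32 layers, d_model = 4096, LoRA rank 8
# applied to the four attention projections of each layer.
per_layer = 4 * lora_params(4096, 4096, rank=8)
total = 32 * per_layer
print(f"{total / 1e6:.1f}M params per adapter")  # ~8.4M
print(f"{total * 2 / 1e6:.1f} MB in fp16")       # ~16.8 MB
```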
-
This would be very useful, and there doesn't seem to be a flexible implementation of it yet.