Merge pull request #98 from premAI-io/mudler-patch-1
mlops-engines: add LocalAI
casperdcl authored Nov 5, 2023
2 parents 469e3b1 + 9295d86 commit 3fe85db
Showing 4 changed files with 18 additions and 4 deletions.
2 changes: 1 addition & 1 deletion desktop-apps.md
@@ -208,7 +208,7 @@ koboldcpp Julius Model Configuration

[local.ai]: https://www.localai.app

The [local.ai] App from https://github.com/louisgv/local.ai ([not to be confused](https://github.com/louisgv/local.ai/discussions/71) with [LocalAI](https://localai.io) from https://github.com/mudler/LocalAI) is a simple application for loading LLMs after you manually download a `ggml` model from online.
The [local.ai] App from https://github.com/louisgv/local.ai ([not to be confused](https://github.com/louisgv/local.ai/discussions/71) with [](mlops-engines.md#localai) from https://github.com/mudler/LocalAI) is a simple application for loading LLMs after you manually download a `ggml` model from online.

### UI and Chat

15 changes: 15 additions & 0 deletions mlops-engines.md
@@ -29,6 +29,7 @@ Inference Engine | Open-Source | GPU optimisations | Ease of use
[](#vllm) | 🟢 Yes | Continuous Batching, Tensor Parallelism, Paged Attention | 🟢 Easy
[](#bentoml) | 🟢 Yes | None | 🟢 Easy
[](#modular) | 🔴 No | N/A | 🟡 Moderate
[](#localai) | 🟢 Yes | 🟢 Yes | 🟢 Easy
```

{{ table_feedback }}
@@ -127,6 +128,20 @@ Cons:

This is not an exhaustive list of MLOps engines by any means. There are many other tools and frameworks developers use to deploy their ML models. There is ongoing development in both the open-source and private sectors to improve the performance of LLMs. It's up to the community to test different services to see which works best for their use case.

## LocalAI

[LocalAI](https://localai.io) from https://github.com/mudler/LocalAI ([not to be confused](https://github.com/louisgv/local.ai/discussions/71) with [](desktop-apps.md#localai) from https://github.com/louisgv/local.ai) is a free, open-source alternative to OpenAI. LocalAI acts as a drop-in replacement REST API compatible with the OpenAI API specification for local inferencing. It can run LLMs (with various backends such as https://github.com/ggerganov/llama.cpp or [](#vllm)), generate images, generate audio, transcribe audio, and can be self-hosted (on-prem) on consumer-grade hardware.
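
As a minimal sketch of that drop-in compatibility (assuming a LocalAI server running on its default port `8080`, the pre-1.0 `openai` Python client, and a hypothetical model name), pointing the client's base URL at LocalAI is the only change compared to calling OpenAI:

```python
import openai

# Point the standard OpenAI client at a local LocalAI server
# instead of https://api.openai.com (assumption: default port 8080)
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sk-placeholder"  # placeholder; a local server typically doesn't check it

reply = openai.ChatCompletion.create(
    model="ggml-gpt4all-j",  # hypothetical: whichever model is configured in LocalAI
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(reply["choices"][0]["message"]["content"])
```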

Pros:

- [wide range of models supported](https://localai.io/model-compatibility)
- support for [functions](https://localai.io/features/openai-functions) (self-hosted [OpenAI functions](https://platform.openai.com/docs/guides/gpt/function-calling)); see the sketch after this list
- [easy to integrate](https://localai.io/integrations)
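
As a hedged sketch of the self-hosted functions support (the endpoint, model name, and function schema here are all illustrative; the request body follows the OpenAI function-calling format):

```python
import json
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumption: local LocalAI server
    json={
        "model": "ggml-gpt4all-j",  # hypothetical model name
        "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
        "functions": [{
            "name": "get_weather",  # hypothetical function exposed to the model
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }],
    },
)
# If the model chose to call the function, its name & JSON arguments come back here
call = resp.json()["choices"][0]["message"].get("function_call")
if call:
    print(call["name"], json.loads(call["arguments"]))
```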

Cons:

- the binary version is harder to run and compile locally (see https://github.com/mudler/LocalAI/issues/1196)
- steep learning curve due to the high degree of customisation

## Challenges in Open Source

2 changes: 1 addition & 1 deletion model-formats.md
@@ -280,7 +280,7 @@ Some [clients & libraries supporting `GGUF`](https://huggingface.co/TheBloke/Lla
- [LM Studio](https://lmstudio.ai) -- an easy-to-use and powerful local GUI with GPU acceleration on both Windows (NVIDIA and AMD) and macOS

```{seealso}
For more info on `GGUF`, see https://github.com/ggerganov/llama.cpp/pull/2398 and its [spec](https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md).
For more info on `GGUF`, see https://github.com/ggerganov/llama.cpp/pull/2398 and its [spec](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md).
```
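
As a small illustration of the format (a sketch based on the spec linked above, assuming a version-2 `GGUF` file at the hypothetical path `model.gguf`), the file begins with a fixed header of magic bytes, version, and counts:

```python
import struct

# Read the fixed GGUF header fields (per the spec linked above; v2 layout assumed)
with open("model.gguf", "rb") as f:  # hypothetical path
    magic = f.read(4)
    assert magic == b"GGUF", "not a GGUF file"
    version, = struct.unpack("<I", f.read(4))    # format version, uint32 little-endian
    n_tensors, = struct.unpack("<Q", f.read(8))  # tensor count, uint64 (v2+)
    n_kv, = struct.unpack("<Q", f.read(8))       # metadata key/value count, uint64 (v2+)

print(f"GGUF v{version}: {n_tensors} tensors, {n_kv} metadata entries")
```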

### Limitations
3 changes: 1 addition & 2 deletions sdk.md
@@ -46,11 +46,10 @@ The list of vector stores that LangChain supports can be found [here](https://ap

### Models

This is the heart of most LLM models where the core functionality resides. There are broadly 3 different [models](https://docs.langchain.com/docs/components/models) that LLMs provide. They are Language, Chat, and Embedding model.
This is the heart of most LLMs, where the core functionality resides. There are broadly [2 different types of models](https://python.langchain.com/docs/modules/model_io/models) which LangChain integrates with, as sketched after this list:

- **Language**: Inputs & outputs are `string`s
- **Chat**: Run on top of a Language model. Inputs are a list of chat messages, and output is a chat message
- **Embedding**: Inputs is a `string` and outputs are a list of `float`s (vector)
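
A minimal sketch contrasting the two types (assuming `langchain` as of this commit with an `OPENAI_API_KEY` set in the environment; model choices are illustrative):

```python
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

# Language model: plain string in, plain string out
llm = OpenAI()  # illustrative: any completion-style model works
print(llm("Translate 'bonjour' to English."))

# Chat model: a list of chat messages in, a single chat message out
chat = ChatOpenAI()
reply = chat([HumanMessage(content="Translate 'bonjour' to English.")])
print(reply.content)
```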

### Tools

