[Example] Use new ggml backend with llama options support #52
Conversation
dm4 commented on Oct 19, 2023 (edited):
Should be merged after:

- [WASI-NN] Add metadata support (WasmEdge/WasmEdge#2957)
- [WASI-NN] Support more llama options (WasmEdge/WasmEdge#2967)
- [WASI-NN] ggml backend: enable cuBLAS and bump from b1309 to b1383 (WasmEdge/WasmEdge#2952)
Signed-off-by: dm4 <[email protected]>
Hello, I am a code review bot on flows.network. Here are my reviews of code commits in this PR.

Overall, the pull request titled "[Example] Use new ggml backend with llama options support" contains a collection of code changes across multiple files. Considering all the individual summaries, the pull request includes some potential issues and errors, such as hardcoded options, lack of error handling, unclear changes, missing information, and compatibility concerns. Further clarification, documentation, and addressing the mentioned problems are recommended before merging the pull request.

Details (per-commit summaries, abridged):

- Commit e60315b1dde62f7c7a5c9d43d9ba6dfc2a960ed8
- Commit 450a92c1a10222d944bf4ab874d9ff1ba4be91b7: more information and context about the changes would be beneficial to fully assess the impact and potential problems of this patch.
- Commit ad4af727f4a1cc808a672ddfad1ea020c5571c48
- Commit 274d494bb669a871e97de64741c763b6f281400f
- Commit b839ddedcba0a149c0864f06e1e98fc0cb9ffdbb
- Commit 3c14961cc148a07d643d5f85160115e939ef41fa
- Commit e6b9477dc186e560a482b9ddb75d504541986673: the key change is a fix to the llama interactive workflow, which should resolve an issue by correctly compiling and using the updated wasm file. One potential problem is that the patch gives no context for why the change is necessary; a brief reference to the specific issue it addresses, along with more detailed testing steps, would help verify the fix. Overall, the changes appear straightforward and should effectively fix the llama interactive workflow.
- Commit 559f195de562ecac884ac9b6dce37430d3f8a167
- Commit fac6415f63803a11e0f6430fb3b4a7471ba8f449: the changes focus on adding default values to the README and synchronizing it with the implementation. There are some potential problems in parsing the environment variables, which should be addressed for proper functionality.
- Commit bf8317150ad77639c90b99349543ed35fdb2c4ba
- Commit 27073e8271a49ba1906cc91d1e6c9401d3cdf1b0: the changes are relatively minor and focused on updating documentation. However, the lack of explanation for the macOS Metal support recommendation and the uncertainty about missing changes should be addressed before merging this pull request.
The upcoming commits are expected to include:
Index 0 is always the initial prompt and index 1 is always the options, right? The initial prompt is used to load the model. When we are in a chat session, subsequent calls to compute will not include the initial prompt? Thanks.
Yes, index 1 always represents the options, while index 0 is used for prompts. During a chat session, we always update the prompt by setting it at index 0, replacing the old prompt, and then call compute again.
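To make the convention concrete, here is a minimal sketch of that input layout, assuming the wasi-nn Rust crate's `GraphBuilder` API used by these examples; the preload name `"default"`, the option values, and the buffer size are illustrative, not prescriptive:

```rust
use wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Build the graph from a model registered on the command line via
    // `--nn-preload default:GGML:...` (the name "default" is an assumption).
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .expect("failed to load the preloaded model");
    let mut context = graph
        .init_execution_context()
        .expect("failed to init the execution context");

    // Index 1 carries the llama options as a JSON string
    // (illustrative keys; see WasmEdge/WasmEdge#2967 for the supported set).
    let options = r#"{"enable-log": false, "ctx-size": 512, "n-predict": 512}"#;
    context
        .set_input(1, TensorType::U8, &[1], options.as_bytes())
        .expect("failed to set the options");

    // Index 0 carries the prompt. In a chat session the example overwrites
    // this input with the updated prompt before every compute() call.
    let prompt = "Once upon a time, ";
    context
        .set_input(0, TensorType::U8, &[1], prompt.as_bytes())
        .expect("failed to set the prompt");

    context.compute().expect("compute failed");

    // Read the generated text back from output index 0.
    let mut output = vec![0u8; 4096];
    let size = context
        .get_output(0, &mut output)
        .expect("failed to get the output");
    println!("{}", String::from_utf8_lossy(&output[..size]));
}
```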
Commit: "…e to ggml" (Signed-off-by: dm4 <[email protected]>)
Perhaps we should make the "initial prompt" a field in the options JSON object? Also, during a conversation, I believe the initial prompt should only be passed once at the beginning to load the model -- the same behavior as the current llama interactive example.
This seems wrong. The JSON is for setting the context for …
We should merge this after WasmEdge/WasmEdge#2952 gets merged.
Even so -- I think the options should be at index 0 and the initial prompt at index 1, as the latter is probably much less used.
I just want to confirm that the current … If that's the case, I think we can modify it to handle the first model loading in …
Good idea.
(force-pushed from e2baeb6 to 274d494)
I just tested the response time of the ggml plugin. Previously, we didn't support loading the model file directly, so we had to write a copy of the model first. Logs:
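For contrast, a rough sketch of the two flows being compared; the exact mechanics of the old flow are an assumption based on the comment above, the file name and preload name are placeholders, and the API shown is the wasi-nn crate's `GraphBuilder`:

```rust
use std::fs;
use wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding};

fn main() {
    // Old flow: read the whole .gguf file into wasm memory and pass the
    // bytes across the boundary, which then had to be written out again
    // on the host side before llama.cpp could open the model.
    let model_bytes = fs::read("llama-2-7b-chat.Q5_K_M.gguf").expect("failed to read the model");
    let _old = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_bytes([model_bytes.as_slice()])
        .expect("failed to build the graph from bytes");

    // New flow: refer to the model preloaded with
    // `--nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf`,
    // so the file never has to be copied through wasm memory.
    let _new = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .expect("failed to build the graph from the cache");
}
```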
Commit: "…vars way" (Signed-off-by: hydai <[email protected]>)
Commit: Set env variable `is_interactive` to `true` to switch into non-interactive mode. (Signed-off-by: dm4 <[email protected]>)
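A sketch of reading such a switch, hypothetical and based only on the variable name in the commit message above:

```rust
use std::env;

fn main() {
    // `is_interactive` comes from the commit message above; its exact
    // semantics are defined by the example's implementation.
    let is_interactive = env::var("is_interactive")
        .map(|v| v == "true")
        .unwrap_or(false);
    println!("interactive mode: {is_interactive}");
}
```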
(force-pushed from 74f6c75 to be2c191)
remove aot.wasm
Commit (Signed-off-by: dm4 <[email protected]>):

- If only the preload model name is given, then it defaults to interactive mode.
- If another argument is provided after the preload model name, it will be treated as a prompt and the example enters non-interactive mode (just like the previous wasmedge-ggml-llama example). A sketch of this argument handling follows below.
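The argument handling described by this commit could look roughly like the following; this is a hypothetical sketch, and `run_once`/`run_interactive` are placeholder names, not functions from the example:

```rust
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    // args[1] is the preload model name (e.g. "default").
    let model_name = args.get(1).expect("usage: <model-name> [prompt]");

    match args.get(2) {
        // A second argument is treated as a one-shot prompt:
        // run a single inference and exit (non-interactive mode).
        Some(prompt) => run_once(model_name, prompt),
        // Only the model name was given: start the interactive chat loop.
        None => run_interactive(model_name),
    }
}

fn run_once(model_name: &str, prompt: &str) {
    // ... set_input / compute / get_output as sketched earlier ...
    println!("one-shot inference with '{model_name}': {prompt}");
}

fn run_interactive(model_name: &str) {
    // ... loop: read a prompt from stdin, update input index 0, compute ...
    println!("entering interactive mode with '{model_name}'");
}
```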
(force-pushed from deabd73 to 559f195)
Commit: "…ize it with the implementation of the WasmEdge WASI-NN plugin" (Signed-off-by: dm4 <[email protected]>)
Please merge this after 0.13.5 is out.