Additionally, play.sh has an issue with JSON that contains spaces: it passes arguments unquoted, so they are re-split on IFS (whitespace), and spaces between key/value pairs are exactly how the help model parameter tells you to format your JSON.
Steps to reproduce:
Start the server with a command like the following, which sets model_parameters: ./play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters "{'0_Layers':35,'1_Layers':45,'model_ctx':4096}"
Actual Behavior:
Output:
Colab Check: False, TPU: False
INIT | OK | KAI Horde Models
## Warning: this project requires Python 3.9 or higher.
INFO | __main__:<module>:680 - We loaded the following model backends:
Huggingface GPTQ
KoboldAI Old Colab Method
KoboldAI API
Huggingface
Horde
Read Only
OpenAI
ExLlama
Basic Huggingface
GooseAI
INFO | __main__:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE | Welcome to KoboldAI!
MESSAGE | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR | __main__:<module>:10948 - An error has been caught in function '<module>', process 'MainProcess' (12311), thread 'MainThread' (140613286643520):
Traceback (most recent call last):
> File "aiserver.py", line 10948, in <module>
run()
└ <function run at 0x7fe259d69ca0>
File "aiserver.py", line 10849, in run
command_line_backend = general_startup()
└ <function general_startup at 0x7fe25a321dc0>
File "aiserver.py", line 1634, in general_startup
model_backends[args.model_backend].set_input_parameters(arg_parameters)
│ │ │ └ {'0_Layers': 35, '1_Layers': 45, 'model_ctx': 4096, 'max_ctx': 2048, 'compress_emb': 1, 'ntk_alpha': 1, 'id': 'airoboros-l2-7...
│ │ └ 'ExLlama'
│ └ Namespace(apikey=None, aria2_port=None, cacheonly=False, colab=False, configname=None, cpu=False, customsettings=None, f=None...
└ {'Huggingface GPTQ': <modeling.inference_models.gptq_hf_torch.class.model_backend object at 0x7fe22525b2b0>, 'KoboldAI Old Co...
File "/home/.../KoboldAI-Client-llama/modeling/inference_models/exllama/class.py", line 423, in set_input_parameters
self.model_config.device_map.layers = []
│ └ None
└ <modeling.inference_models.exllama.class.model_backend object at 0x7fe22b4f31c0>
AttributeError: 'NoneType' object has no attribute 'device_map'
$ git status
On branch exllama
Your branch is up to date with 'origin/exllama'
commit 973aea12ea079e9c5de1e418b848a0407da7eab7 (HEAD -> exllama, origin/exllama)
Author: 0cc4m <[email protected]>
Date: Sun Jul 23 22:07:34 2023 +0200
Only import big python modules for GPTQ once they get used
Additionally, the following change should be made in play.sh:
$ git diff play.sh
diff --git a/play.sh b/play.sh
index 8ce7b781..3e88ae28 100755
--- a/play.sh
+++ b/play.sh
@@ -3,4 +3,4 @@ export PYTHONNOUSERSITE=1
if [ ! -f "runtime/envs/koboldai/bin/python" ]; then
./install_requirements.sh cuda
fi
-bin/micromamba run -r runtime -n koboldai python aiserver.py $*
+bin/micromamba run -r runtime -n koboldai python aiserver.py "$@"
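For context on why the one-character diff matters: an unquoted $* re-splits every argument on IFS, while "$@" preserves each argument exactly as passed. A minimal demonstration (function names are illustrative):

```shell
#!/bin/sh
# Print each received argument on its own line, bracketed,
# so word splitting is visible.
show_args() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

pass_star() { show_args $*; }    # unquoted $*: re-splits on IFS
pass_at()   { show_args "$@"; }  # quoted "$@": keeps argument boundaries

# A JSON string with spaces arrives as four words via $*...
pass_star "{'0_Layers': 35, '1_Layers': 45}"
# ...but as a single intact argument via "$@".
pass_at "{'0_Layers': 35, '1_Layers': 45}"
```

With $*, the JSON above reaches aiserver.py as four separate arguments, which is why the unquoted form in play.sh breaks --model_parameters.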
This lets you pass JSON model parameters with spaces between the key/value pairs, as the help parameter instructs:
$ ./play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters help
...
INFO | __main__:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE | Welcome to KoboldAI!
MESSAGE | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR | __main__:general_startup:1627 - Please pass through the parameters as a json like "{'[ID]': '[Value]', '[ID2]': '[Value]'}" using --model_parameters (required parameters shown below)
ERROR | __main__:general_startup:1628 - Parameters (ID: Default Value (Help Text)): 0_Layers: [None] (The number of layers to put on NVIDIA GeForce RTX 3090.)
1_Layers: [0] (The number of layers to put on NVIDIA GeForce RTX 3090.)
max_ctx: 2048 (The maximum context size the model supports)
compress_emb: 1 (If the model requires compressed embeddings, set them here)
ntk_alpha: 1 (NTK alpha value)
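As an aside, the format the help text asks for (single-quoted keys) is not strict JSON, which requires double quotes. A parsing sketch showing the distinction (whether aiserver.py parses it exactly this way is an assumption):

```python
import ast
import json

params = "{'0_Layers': 35, '1_Layers': 45, 'model_ctx': 4096}"

# Strict JSON requires double-quoted property names, so json.loads rejects this.
try:
    json.loads(params)
    strict_json_ok = True
except json.JSONDecodeError:
    strict_json_ok = False

# It is, however, a valid Python dict literal, which ast.literal_eval accepts.
parsed = ast.literal_eval(params)
print(strict_json_ok, parsed["0_Layers"])
```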
I've faced the same situation, but I was able to get it working without the model_backend and model path parameters. Instead, I loaded the model manually through the web interface menu and selected the ExLlama backend there.
Nvm.. I figured out a way. The issue with self.model_config.device_map.layers is that model_config is None because it was never initialized. The only place it gets initialized is the is_valid() function in the KoboldAI\modeling\inference_models\exllama\class.py file, and is_valid() is only called when a user opens the model through the web interface menu.
To fix this, I made a small change to the get_requested_parameters() function in the class.py file. I added the following line at the very beginning:
if not self.model_config:
    self.model_config = ExLlamaConfig(os.path.join(model_path, "config.json"))
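The guard above is the standard lazy-initialization pattern: initialize on first use so every entry point (CLI flags or the web UI) sees a real config instead of None. A minimal, self-contained sketch (the class and file layout are illustrative, not KoboldAI's actual code):

```python
import json
import os
import tempfile

class ModelBackend:
    """Sketch of a backend whose config is loaded lazily."""

    def __init__(self):
        self.model_config = None  # stays None until some code path loads it

    def _ensure_config(self, model_path):
        # Initialize exactly once, on whichever entry point runs first,
        # so later attribute access never hits None.
        if self.model_config is None:
            with open(os.path.join(model_path, "config.json")) as f:
                self.model_config = json.load(f)
        return self.model_config

# Usage: the config is available regardless of which call happens first.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "config.json"), "w") as f:
        json.dump({"max_ctx": 4096}, f)
    backend = ModelBackend()
    print(backend._ensure_config(d)["max_ctx"])
```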
However, it turned out one more change was needed within the same function. I removed the square brackets from
"default": [layer_count if i == 0 else 0]
and changed it to
"default": layer_count if i == 0 else 0
This is needed because set_input_parameters(), also in the class.py file, iterates over the layers:
for i, l in enumerate(layers):
    if l > 0:
and the comparison would otherwise treat each entry as a list instead of an integer.
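Concretely, in Python 3 comparing a list to an int raises TypeError, so a list-wrapped default cannot survive that loop. A small sketch mirroring the loop's shape (the function name is illustrative):

```python
def count_gpu_layers(layers):
    """Sum entries greater than zero, mirroring the loop in set_input_parameters."""
    total = 0
    for i, l in enumerate(layers):
        if l > 0:  # requires l to be an int; a list here raises TypeError
            total += l
    return total

print(count_gpu_layers([35, 45]))      # plain ints: works

try:
    count_gpu_layers([[35], [45]])     # list-wrapped defaults: comparison fails
except TypeError as e:
    print(e)
```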
Summary
It appears that self.model_config is None in ExLlama's class.py (https://github.com/0cc4m/KoboldAI/blob/exllama/modeling/inference_models/exllama/class.py#L423), and is assumed to exist when you get to that code via passing in --model_parameters.
Expected Behavior:
The model parameters can be set at startup.
Environment: