
Attempting to pass model params to ExLlama on startup causes an AttributeError #59

Open
InconsolableCellist opened this issue Aug 9, 2023 · 2 comments


@InconsolableCellist

Summary

It appears that self.model_config is None in ExLlama's class.py (https://github.com/0cc4m/KoboldAI/blob/exllama/modeling/inference_models/exllama/class.py#L423), but that code assumes it exists when you reach it by passing in --model_parameters.

Additionally, play.sh mangles JSON arguments that contain spaces, because the unquoted $* lets the shell re-split them on IFS; spaces between the key/value pairs are exactly how the help text for --model_parameters tells you to format the JSON.

Steps to reproduce:

  1. Start KoboldAI with something like the following, which passes model_parameters on the command line: ./play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters "{'0_Layers':35,'1_Layers':45,'model_ctx':4096}"

Actual Behavior:

Output:

Colab Check: False, TPU: False
INIT       | OK         | KAI Horde Models

 ## Warning: this project requires Python 3.9 or higher.

INFO       | __main__:<module>:680 - We loaded the following model backends: 
Huggingface GPTQ
KoboldAI Old Colab Method
KoboldAI API
Huggingface
Horde
Read Only
OpenAI
ExLlama
Basic Huggingface
GooseAI
INFO       | __main__:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE    | Welcome to KoboldAI!
MESSAGE    | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR      | __main__:<module>:10948 - An error has been caught in function '<module>', process 'MainProcess' (12311), thread 'MainThread' (140613286643520):
Traceback (most recent call last):

> File "aiserver.py", line 10948, in <module>
    run()
    └ <function run at 0x7fe259d69ca0>

  File "aiserver.py", line 10849, in run
    command_line_backend = general_startup()
                           └ <function general_startup at 0x7fe25a321dc0>

  File "aiserver.py", line 1634, in general_startup
    model_backends[args.model_backend].set_input_parameters(arg_parameters)
    │              │    │                                   └ {'0_Layers': 35, '1_Layers': 45, 'model_ctx': 4096, 'max_ctx': 2048, 'compress_emb': 1, 'ntk_alpha': 1, 'id': 'airoboros-l2-7...
    │              │    └ 'ExLlama'
    │              └ Namespace(apikey=None, aria2_port=None, cacheonly=False, colab=False, configname=None, cpu=False, customsettings=None, f=None...
    └ {'Huggingface GPTQ': <modeling.inference_models.gptq_hf_torch.class.model_backend object at 0x7fe22525b2b0>, 'KoboldAI Old Co...

  File "/home/.../KoboldAI-Client-llama/modeling/inference_models/exllama/class.py", line 423, in set_input_parameters
    self.model_config.device_map.layers = []
    │    └ None
    └ <modeling.inference_models.exllama.class.model_backend object at 0x7fe22b4f31c0>

AttributeError: 'NoneType' object has no attribute 'device_map'

Expected Behavior:

The model parameters can be set at startup

Environment:

$ git remote -v
origin  https://github.com/0cc4m/KoboldAI.git (fetch)
origin  https://github.com/0cc4m/KoboldAI.git (push)
$ git status
On branch exllama
Your branch is up to date with 'origin/exllama'

  commit 973aea12ea079e9c5de1e418b848a0407da7eab7 (HEAD -> exllama, origin/exllama)
  Author: 0cc4m <[email protected]>
  Date:   Sun Jul 23 22:07:34 2023 +0200
  
      Only import big python modules for GPTQ once they get used

Additionally, the following change should be made in play.sh:

$ git diff play.sh
  diff --git a/play.sh b/play.sh
  index 8ce7b781..3e88ae28 100755
  --- a/play.sh
  +++ b/play.sh
  @@ -3,4 +3,4 @@ export PYTHONNOUSERSITE=1
   if [ ! -f "runtime/envs/koboldai/bin/python" ]; then
   ./install_requirements.sh cuda
   fi
  -bin/micromamba run -r runtime -n koboldai python aiserver.py $*
  +bin/micromamba run -r runtime -n koboldai python aiserver.py "$@"

This lets you pass in JSON as the model params with spaces between the KV pairs, which is exactly how the help output instructs you to format it:

$ ./play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters help
...

INFO | __main__:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE | Welcome to KoboldAI!
MESSAGE | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR | __main__:general_startup:1627 - Please pass through the parameters as a json like "{'[ID]': '[Value]', '[ID2]': '[Value]'}" using --model_parameters (required parameters shown below)
ERROR | __main__:general_startup:1628 - Parameters (ID: Default Value (Help Text)): 0_Layers: [None] (The number of layers to put on NVIDIA GeForce RTX 3090.)
1_Layers: [0] (The number of layers to put on NVIDIA GeForce RTX 3090.)
max_ctx: 2048 (The maximum context size the model supports)
compress_emb: 1 (If the model requires compressed embeddings, set them here)
ntk_alpha: 1 (NTK alpha value)
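
To illustrate why the quoting matters: with the unquoted $*, the shell re-splits the expanded arguments on IFS, so the JSON never reaches aiserver.py as a single string. Here is a minimal Python sketch (mine, not from the repo) of what argparse ends up seeing in each case:

# Sketch (not repo code): the --model_parameters value as aiserver.py sees it.
json_arg = "{'0_Layers': 35, '1_Layers': 45, 'model_ctx': 4096}"

# With the unquoted $*, bash word-splits the expansion on IFS (spaces),
# so the single JSON string arrives as several argv fragments:
print(json_arg.split())  # ["{'0_Layers':", '35,', "'1_Layers':", '45,', ...]

# With "$@", each original argument is forwarded verbatim:
print([json_arg])        # ["{'0_Layers': 35, '1_Layers': 45, 'model_ctx': 4096}"]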


@GrennKren

I've faced the same situation, but I was able to get it working without using the model_backend and model path parameters. Instead, I opened the model manually through the web interface menu and, of course, selected the ExLlama backend there.

@GrennKren

GrennKren commented Sep 7, 2023

Nvm, I figured out a way. The issue with self.model_config.device_map.layers is that model_config is None because it is never initialized in the first place. The only place it gets initialized is the is_valid() function in the KoboldAI\modeling\inference_models\exllama\class.py file, and is_valid() is only called when a user opens the model through the web interface menu.

To fix this, I made a small change to the get_requested_parameters() function in the class.py file. I added the following lines at the very beginning:

# model_config is never initialized when the web UI's is_valid() path is skipped
if not self.model_config:
    self.model_config = ExLlamaConfig(os.path.join(model_path, "config.json"))
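
With that guard in place, reaching set_input_parameters() via --model_parameters no longer hits a None model_config.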

However, it turns out there was one more thing that needed to be changed within the same function. I removed the square brackets from

"default": [layer_count if i == 0 else 0]

and changed it to

"default": layer_count if i == 0 else 0

This is needed because the set_input_parameters() function, still in the same class.py file, iterates over the layer values:

for i, l in enumerate(layers):
    if l > 0:

With the brackets, each value is a one-element list instead of an integer, so the l > 0 comparison fails.
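
To see the failure in isolation, here is a quick snippet (mine, not from the repo) that reproduces the comparison error the bracketed default causes:

# Each entry from the old "default": [layer_count if i == 0 else 0] is a
# one-element list, and Python 3 refuses to compare a list to an int:
layers = [[35], [45]]
for i, l in enumerate(layers):
    if l > 0:  # TypeError: '>' not supported between instances of 'list' and 'int'
        pass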
