
[E892] Unknown function registry: 'llm_backends' #12987

Closed
rkatriel opened this issue Sep 18, 2023 · 18 comments
Labels: docs (Documentation and website) · feat/llm (Feature: LLMs, incl. spacy-llm) · usage (General spaCy usage)

Comments

rkatriel commented Sep 18, 2023

How to reproduce the behaviour

I'm getting an "Unknown function registry: 'llm_backends'" error (see the traceback below) when running the example provided in Matthew Honnibal's blog post "Against LLM maximalism" (https://explosion.ai/blog/against-llm-maximalism).

import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "backend": {
            "@llm_backends": "spacy.REST.v1",
            "api": "OpenAI",
            "config": {"model": "text-davinci-003"},
        },
    },
)

doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent.sent)

Here is the full traceback:

File "/Users/ron.katriel/PycharmProjects/NLP/spacy-llm-example.py", line 5, in
nlp.add_pipe(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 786, in add_pipe
pipe_component = self.create_pipe(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 679, in create_pipe
resolved = registry.resolve(cfg, validate=validate)
File "/Users/ron.katriel/PycharmProjects/Labs-Gen-AI/venv/lib/python3.10/site-packages/confection/init.py", line 756, in resolve
resolved, _ = cls._make(
File "/Users/ron.katriel/PycharmProjects/Labs-Gen-AI/venv/lib/python3.10/site-packages/confection/init.py", line 805, in _make
filled, _, resolved = cls._fill(
File "/Users/ron.katriel/PycharmProjects/Labs-Gen-AI/venv/lib/python3.10/site-packages/confection/init.py", line 860, in _fill
filled[key], validation[v_key], final[key] = cls._fill(
File "/Users/ron.katriel/PycharmProjects/Labs-Gen-AI/venv/lib/python3.10/site-packages/confection/init.py", line 859, in _fill
promise_schema = cls.make_promise_schema(value, resolve=resolve)
File "/Users/ron.katriel/PycharmProjects/Labs-Gen-AI/venv/lib/python3.10/site-packages/confection/init.py", line 1051, in make_promise_schema
func = cls.get(reg_name, func_name)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/util.py", line 128, in get
raise RegistryError(Errors.E892.format(name=registry_name, available=names))
catalogue.RegistryError: [E892] Unknown function registry: 'llm_backends'.

Available names: architectures, augmenters, batchers, callbacks, cli, datasets, displacy_colors, factories, initializers, languages, layers, lemmatizers, llm_misc, llm_models, llm_queries, llm_tasks, loggers, lookups, losses, misc, models, ops, optimizers, readers, schedules, scorers, tokenizers
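For reference, the registries and their contents can be inspected programmatically. A minimal sketch, assuming spacy-llm is installed (importing it is what registers the llm_* registries) and using spaCy's registry helpers:

import spacy
import spacy_llm  # noqa: F401 - importing registers the llm_* registries

# List every function registry spaCy knows about
print(sorted(spacy.registry.get_registry_names()))

# List the functions registered under one of them, e.g. llm_tasks
print(sorted(spacy.registry.llm_tasks.get_all().keys()))

Neither listing contains 'llm_backends', which is the registry the blog example refers to.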

Your Environment

  • spaCy version: 3.5.1
  • Platform: macOS-13.5.2-x86_64-i386-64bit
  • Python version: 3.10.4
rmitsch (Contributor) commented Oct 16, 2023

Sorry for not getting back to you earlier; this one fell through the cracks! The example in the blog is outdated, and the API looks a bit different now. We'll update the blog soon. The correct way to initialize this with spacy-llm >= 0.4.0 looks like this:

nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {"@llm_models": "spacy.Davinci.v2"},
    },
)
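Also make sure your OpenAI credentials are available; the OpenAI-backed models read them from the environment (OPENAI_API_KEY, plus OPENAI_API_ORG if you use one). For example, with a hypothetical placeholder key:

import os

# Placeholder key for illustration - substitute your own, or export it in the shell
os.environ["OPENAI_API_KEY"] = "sk-..."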

ines closed this as completed Oct 17, 2023
rkatriel (Author) commented Oct 17, 2023

@rmitsch Hi Raphael,

I tried your suggestion (after upgrading spaCy and spacy-llm to the latest versions, 3.7.2 and 0.6.2, respectively), but now I'm getting a config validation error. See the console trace below.

Thanks,
Ron

    nlp.add_pipe(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 821, in add_pipe
    pipe_component = self.create_pipe(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 709, in create_pipe
    resolved = registry.resolve(cfg, validate=validate)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/confection/__init__.py", line 756, in resolve
    resolved, _ = cls._make(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/confection/__init__.py", line 805, in _make
    filled, _, resolved = cls._fill(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/confection/__init__.py", line 860, in _fill
    filled[key], validation[v_key], final[key] = cls._fill(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/confection/__init__.py", line 860, in _fill
    filled[key], validation[v_key], final[key] = cls._fill(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/confection/__init__.py", line 926, in _fill
    raise ConfigValidationError(
confection.ConfigValidationError:

Config validation error
llm.model -> llm_models extra fields not permitted
{'llm_models': 'spacy.Davinci.v2', '@llm_models': 'spacy.GPT-3-5.v2', 'strict': True}

rmitsch (Contributor) commented Oct 17, 2023

Can you share the config you're using?

rkatriel (Author) commented
I don't have a config file. Below is the code I'm running; the parameters are passed in code, as recommended.

import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {"llm_models": "spacy.Davinci.v2"},
    },
)

doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent.sent)

rmitsch (Contributor) commented Oct 17, 2023

Ah, I forgot an "@" in the example I gave above. Try again with this:

import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {"@llm_models": "spacy.Davinci.v2"},
    },
)

doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent.sent)

rkatriel (Author) commented Oct 17, 2023

Thanks, that did the trick! But now I'm getting a connection error:

ConnectionError: API could not be reached after 34.596 seconds in total and attempting to connect 5 times. Check your network connection and the API's availability.
429 Too Many Requests

This is likely from OpenAI because my account is not a paid one.

Is there an open-source (e.g., Hugging Face) model that works with this setup? I tried running the script with 'spacy.OpenLLaMA.v1' and got the following error:

Config validation error
llm.model -> name field required
{'@llm_models': 'spacy.OpenLLaMA.v1'}

rmitsch (Contributor) commented Oct 18, 2023

The ConnectionError is usually from OpenAI rate-limiting you, yes. You could also increase the time between tries, but that's unsatisfying too.
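For reference, the REST-backed models expose the retry behaviour in the same config. A sketch along these lines (the parameter names max_tries, interval, and max_request_time follow the spacy-llm 0.6.x model signatures; double-check them against the docs for your installed version):

nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {
            "@llm_models": "spacy.GPT-3-5.v2",
            "max_tries": 10,          # attempts before giving up (your error shows 5)
            "interval": 5.0,          # seconds to wait between attempts
            "max_request_time": 120,  # overall time budget per request, in seconds
        },
    },
)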

Open-source models work the same way. Hugging Face models also come in several variants, and we don't select one by default (maybe we should). Anyway, have a look at the documentation to see which ones are available. You could go with the 3B one, e.g.:

import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {
            "@llm_models": "spacy.OpenLLaMa.v2",
            "name": "open_llama_3b"
        },
    },
)

doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent.sent)

Note: OpenLLaMa is an older model, and the 3B variant is small, so you probably won't get amazing results out of it.

rkatriel (Author) commented Oct 18, 2023

Thanks Raphael, but that doesn't work. I get the following catalogue/registry error:

catalogue.RegistryError: [E893] Could not find function 'spacy.OpenLLaMa.v2' in function registry 'llm_models'. If you're using a custom function, make sure the code is available. If the function is provided by a third-party package, e.g. spacy-transformers, make sure the package is installed in your environment.

Changing to 'spacy.OpenLLaMa.v1', as implied in the rest of the error message below, does not help.

Available names: langchain.AI21.v1, langchain.AlephAlpha.v1, langchain.Anthropic.v1, langchain.Anyscale.v1, langchain.Aviary.v1, langchain.AzureOpenAI.v1, langchain.Banana.v1, langchain.Beam.v1, langchain.CTransformers.v1, langchain.CerebriumAI.v1, langchain.Cohere.v1, langchain.Databricks.v1, langchain.DeepInfra.v1, langchain.FakeListLLM.v1, langchain.ForefrontAI.v1, langchain.GPT4All.v1, langchain.GooglePalm.v1, langchain.GooseAI.v1, langchain.HuggingFaceEndpoint.v1, langchain.HuggingFaceHub.v1, langchain.HuggingFacePipeline.v1, langchain.HuggingFaceTextGenInference.v1, langchain.HumanInputLLM.v1, langchain.LlamaCpp.v1, langchain.Modal.v1, langchain.MosaicML.v1, langchain.NLPCloud.v1, langchain.OpenAI.v1, langchain.OpenLM.v1, langchain.Petals.v1, langchain.PipelineAI.v1, langchain.RWKV.v1, langchain.Replicate.v1, langchain.SagemakerEndpoint.v1, langchain.SelfHostedHuggingFaceLLM.v1, langchain.SelfHostedPipeline.v1, langchain.StochasticAI.v1, langchain.VertexAI.v1, langchain.Writer.v1, spacy.Ada.v1, spacy.Ada.v2, spacy.Azure.v1, spacy.Babbage.v1, spacy.Babbage.v2, spacy.Claude-1-0.v1, spacy.Claude-1-2.v1, spacy.Claude-1-3.v1, spacy.Claude-1.v1, spacy.Claude-2.v1, spacy.Claude-instant-1-1.v1, spacy.Claude-instant-1.v1, spacy.Code-Davinci.v1, spacy.Code-Davinci.v2, spacy.Command.v1, spacy.Curie.v1, spacy.Curie.v2, spacy.Davinci.v1, spacy.Davinci.v2, spacy.Dolly.v1, spacy.Falcon.v1, spacy.GPT-3-5.v1, spacy.GPT-3-5.v2, spacy.GPT-4.v1, spacy.GPT-4.v2, spacy.Llama2.v1, spacy.Mistral.v1, spacy.NoOp.v1, spacy.OpenLLaMA.v1, spacy.PaLM.v1, spacy.StableLM.v1, spacy.Text-Ada.v1, spacy.Text-Ada.v2, spacy.Text-Babbage.v1, spacy.Text-Babbage.v2, spacy.Text-Curie.v1, spacy.Text-Curie.v2, spacy.Text-Davinci.v1, spacy.Text-Davinci.v2

rmitsch (Contributor) commented Oct 18, 2023

A typo on my end, use spacy.OpenLLaMa.v1 instead of spacy.OpenLLaMa.v2.

rkatriel (Author) commented
Already tried that, as mentioned above. Same type of error:

catalogue.RegistryError: [E893] Could not find function 'spacy.OpenLLaMa.v1' in function registry 'llm_models'. If you're using a custom function, make sure the code is available. If the function is provided by a third-party package, e.g. spacy-transformers, make sure the package is installed in your environment.

rmitsch (Contributor) commented Oct 19, 2023

Argh, these different Llama casings always get me. So the correct spelling is spacy.OpenLLaMA.v1, not spacy.OpenLLaMa.v1 (notice that the last "a" is uppercase). Apologies for not double-checking.
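To avoid eyeballing casings in the future, you can also search the registry programmatically; a small sketch, assuming spacy-llm is importable:

import spacy
import spacy_llm  # noqa: F401 - importing registers the llm_models registry

# Case-insensitive search for LLaMA-flavoured model names
print([name for name in spacy.registry.llm_models.get_all() if "llama" in name.lower()])
# Should include 'spacy.OpenLLaMA.v1' (alongside e.g. 'spacy.Llama2.v1', 'langchain.LlamaCpp.v1')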

rkatriel (Author) commented Oct 19, 2023

Thanks, Raphael! That did the trick, though after fixing it I got a new error:

Tokenizer class LlamaTokenizer does not exist or is not currently imported.

It turns out this is a known issue and is solved by uninstalling/reinstalling the transformers library.

So now we're past loading the model, but not out of the woods. I'm getting the following error when calling spaCy's nlp() function with the query shown in the code above (see the full traceback below):

RuntimeError: Placeholder storage has not been allocated on MPS device!

(I thought this could be an issue with Intel vs. Apple silicon, but I'm getting the same error on a MacBook with an M2 chip.)

Any thoughts on how to resolve this?

Ron

Traceback (most recent call last):
  File "/Users/ron.katriel/PycharmProjects/Transformer/test-spacy-llm.py", line 19, in <module>
    doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 1054, in __call__
    error_handler(name, proc, [doc], e)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/util.py", line 1704, in raise_error
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 1049, in __call__
    doc = proc(doc, **component_cfg.get(name, {}))  # type: ignore[call-arg]
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 156, in __call__
    docs = self._process_docs([doc])
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 210, in _process_docs
    responses_iters = tee(self._model(prompts_iters[0]), n_iters)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/models/hf/openllama.py", line 55, in __call__
    return [
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/models/hf/openllama.py", line 57, in <listcomp>
    self._model.generate(input_ids=tii, **self._config_run)[
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/generation/utils.py", line 1606, in generate
    return self.greedy_search(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/generation/utils.py", line 2454, in greedy_search
    outputs = self(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1038, in forward
    outputs = self.model(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 875, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Placeholder storage has not been allocated on MPS device!

rmitsch (Contributor) commented Oct 30, 2023

Huh, that's odd. You're getting this error when running exactly this snippet?
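One thing that might be worth trying in the meantime: the Hugging Face models pass config_init through to Transformers' from_pretrained(), so pinning the model to the CPU could sidestep the MPS placement issue. An untested sketch (device_map handling depends on your transformers/accelerate versions):

nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {
            "@llm_models": "spacy.OpenLLaMA.v1",
            "name": "open_llama_3b",
            # config_init is forwarded to transformers' from_pretrained();
            # forcing CPU here is a guess at avoiding the MPS allocation error
            "config_init": {"device_map": "cpu"},
        },
    },
)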

rkatriel (Author) commented
Correct, except with spacy.OpenLLaMA.v1 instead of spacy.OpenLLaMa.v1, as you suggested above.

rmitsch (Contributor) commented Oct 31, 2023

Which machine are you running this on? We'd like to try to replicate it.

rmitsch (Contributor) commented Oct 31, 2023

Also, I'd appreciate it if you opened a new issue for this problem. It might be useful for other users 🙏

rkatriel (Author) commented Oct 31, 2023

Done! The new issue is "Spacy-LLM fails with storage not allocated on MPS device" #13096.

github-actions bot commented Dec 1, 2023

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions bot locked as resolved and limited conversation to collaborators Dec 1, 2023