[DOCS] "Bulk Labeling Multimodal Data" Notebook outdated #5557

Open
trojblue opened this issue Oct 2, 2024 · 1 comment

trojblue commented Oct 2, 2024

Which page or section is this issue related to?

https://github.com/argilla-io/argilla/blob/develop/docs/_source/tutorials/notebooks/labelling-textclassification-sentencetransformers-semantic.ipynb

In the notebook I found several issues that make it incompatible with the current version of argilla:

1. the dependency:

%pip install argilla "setfit~=0.2.0" "datasets~=2.3.0" transformers sentence-transformers -qqq
  • With "setfit~=0.2.0" and "datasets~=2.3.0" installed, the line import argilla as rg fails: argilla tries to import an error class from datasets that does not exist in that version (I forgot the exact name).
  • The fix is to remove the version pins. After doing so I have datasets 3.0.1 and setfit 1.1.0.
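
To confirm which versions actually ended up in the environment, something like the following could be run before and after removing the pins. It is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for this comment and is not part of argilla.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is absent from the current environment
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["argilla", "datasets", "setfit"]).items():
        print(f"{name}: {ver or 'not installed'}")
```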

2. the init:

rg.init(
    api_url="https://localhost:6900",
    api_key="admin.apikey"
)

raises AttributeError: module 'argilla' has no attribute 'init'. The correct way to initialize the client seems to be:

client = rg.Argilla(
    api_url="some_url",
    api_key="argilla.apikey"
)
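
If the notebook has to keep working against both the legacy and the current SDK, a small shim along these lines might help. This is only a sketch: `connect` is a hypothetical helper, and it assumes the legacy SDK exposes `rg.init(api_url=..., api_key=...)` as in the notebook while the 2.x SDK exposes `rg.Argilla(api_url=..., api_key=...)` as shown above.

```python
def connect(rg, api_url, api_key):
    """Connect using whichever entry point the given argilla module offers.

    `rg` is the imported argilla module, passed in so the helper itself
    does not depend on a particular installed version.
    """
    if hasattr(rg, "init"):
        # Legacy SDK: init() configures a module-level client
        rg.init(api_url=api_url, api_key=api_key)
        return None
    if hasattr(rg, "Argilla"):
        # argilla 2.x: an explicit client object is returned
        return rg.Argilla(api_url=api_url, api_key=api_key)
    raise RuntimeError("unrecognized argilla module: no init or Argilla")
```

It would be called as `client = connect(rg, api_url="some_url", api_key="argilla.apikey")`, with `client` being None on the legacy SDK.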

3. the dataset:

The dataset defined in the notebook (burtenshaw/electronics) is no longer available on the Hugging Face Hub:

ELECTRONICS_DATASET = "burtenshaw/electronics"
dataset = load_dataset(ELECTRONICS_DATASET)
labels = dataset["labelled"].features["label"].names
int2str = dataset["labelled"].features["label"].int2str
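
Until a replacement dataset is chosen, the notebook could at least fail with a readable message. The sketch below is an assumption-laden illustration: `load_or_explain` is a hypothetical helper, the loader is passed in as an argument, and it assumes a missing repository surfaces as `FileNotFoundError` or a subclass of it, which I have not verified against datasets 3.0.1.

```python
def load_or_explain(load_fn, repo_id):
    """Call load_fn(repo_id) and turn a missing repo into a readable error."""
    try:
        return load_fn(repo_id)
    except FileNotFoundError as err:
        # Assumption: the loader signals a missing repo this way
        raise RuntimeError(
            f"Dataset '{repo_id}' could not be loaded; it may have been "
            "removed or renamed on the Hub."
        ) from err
```

In the notebook this would be used as `dataset = load_or_explain(load_dataset, ELECTRONICS_DATASET)`.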

I haven't gone any further into the notebook, so there may be more issues after this point. For reference, I'm currently on argilla 2.2.2:

Name: argilla
Version: 2.2.2
Summary: The Argilla python server SDK
Home-page: 
Author: 
Author-email: Argilla <[email protected]>
License: Apache 2.0
Location: /root/miniconda3/lib/python3.10/site-packages
Requires: datasets, httpx, huggingface_hub, pillow, pydantic, rich, tqdm
Required-by:

sdiazlor commented Oct 8, 2024

Hi @trojblue, that's an old tutorial using legacy code. You can check this one for image classification: https://docs.argilla.io/latest/tutorials/image_classification/. Feel free to contribute if you’re interested in working on this :): https://docs.argilla.io/latest/community/contributor/
