Fix: Resolve issues with datasets without splits in Experiments and Explainers #239

Open: wants to merge 4 commits into base: develop


Felipedino (Collaborator)

Summary

This Pull Request fixes issues that affect datasets created without splits. The changes made include:

Fix: Allow the Experiment and Explainer modules to handle datasets created without the "splits in folders" option (without splits).
Fix: Resolve the error that prevented the creation of datasets without splits.
Fix: Improve compatibility for datasets used in the Predictions functionality.
No new dependencies are introduced.

Type of change

  • Bug fix.

Changes

  1. Fixes in dataset handling:

    • Modified upload_dataset logic to properly handle datasets with the "splits in folders" option unchecked.
    • Corrected the dataset path handling to ensure proper saving and loading when no splits are provided.
  2. Add dynamic attribute checks in "get_dataset_info":

    • Replaced hardcoded keys ("train", "test", "validation") with dynamic attribute checks (hasattr) to avoid errors when datasets lack specific splits.
    • Centralized logic for calculating train_size, test_size, and val_size in get_dataset_info.
  3. Improved compatibility in Explainer and Model Jobs:

    • Added logic to check and handle missing or empty dataset splits (dataset_splits_path) in explainer_job.py and model_job.py.
    • Dynamically create train, test, and validation splits if they are missing.
  4. Bug fixes:

    • Resolved an issue where datasets could not be created when "splits in folders" was unchecked.
    • Fixed compatibility issues in Experiment and Explainer modules when processing datasets without splits.
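
As an illustration of change 2, the hasattr-based size calculation could look roughly like the sketch below. This is not the code from the PR: the dataset is stood in for by a plain object with optional train, test, and validation attributes, and the returned dictionary shape is an assumption made for the example.

```python
from types import SimpleNamespace


def get_dataset_info(dataset):
    """Return split sizes, reporting 0 for any split the dataset lacks.

    Hypothetical sketch: the real get_dataset_info may return more fields.
    """
    # Map each optional split attribute to the size key named in the PR.
    size_keys = (
        ("train", "train_size"),
        ("test", "test_size"),
        ("validation", "val_size"),
    )
    info = {}
    for attr, key in size_keys:
        # hasattr avoids an error when the dataset was created without splits.
        if hasattr(dataset, attr):
            info[key] = len(getattr(dataset, attr))
        else:
            info[key] = 0
    return info


# Stand-in objects (assumptions): one dataset with splits, one without.
with_splits = SimpleNamespace(train=[1, 2, 3], test=[4], validation=[5])
without_splits = SimpleNamespace(rows=[1, 2, 3, 4, 5])

print(get_dataset_info(with_splits))
print(get_dataset_info(without_splits))
```

Keeping the three size calculations in one loop matches the "centralized logic" point: callers read train_size, test_size, and val_size from one place instead of indexing hardcoded keys themselves.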

How to Test

  1. Create a dataset with the "splits in folders" checkbox unchecked.
  2. Use the created dataset in the Experiments module.
  3. Verify that the dataset is processed correctly.
  4. Repeat the process in the Explainers module and confirm no errors are generated.
  5. Test the Predictions functionality with the same dataset to ensure compatibility.

Notes

  • The behavior of the "splits in folders" checkbox has been updated:
    • Previously, leaving the checkbox unchecked would still create dataset splits by default.
    • Now, leaving it unchecked creates a dataset without splits, while checking it enables splits.
  • This change may cause some confusion for users familiar with the previous behavior.
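
Because an unchecked box now produces a dataset with no splits, the jobs have to derive train, test, and validation splits at run time. The sketch below shows one way such a fallback might work. The function name ensure_splits, the 70/20/10 ratios, and the fixed seed are all assumptions for illustration; the PR does not state how explainer_job.py and model_job.py divide the data.

```python
import random


def ensure_splits(rows, splits=None, train_frac=0.7, test_frac=0.2, seed=42):
    """Return existing splits if present, otherwise create them from rows.

    Hypothetical helper: the ratios and seed are illustrative defaults.
    """
    # A non-empty mapping means the dataset was saved with splits: keep it.
    if splits:
        return splits

    # Shuffle a copy so the caller's data is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        # Whatever remains after train and test becomes validation.
        "validation": shuffled[n_train + n_test:],
    }


rows = list(range(10))
created = ensure_splits(rows)                       # no splits saved: build them
kept = ensure_splits(rows, splits={"train": rows})  # splits saved: returned as-is
print({name: len(part) for name, part in created.items()})
```

Treating both a missing and an empty splits value as "no splits" mirrors the "missing or empty dataset splits" check described in the Changes section.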
