Fix: Resolve issues with datasets without splits in Experiments and Explainers #239

Felipedino · 2025-01-11T00:11:26Z

Summary

This Pull Request fixes issues affecting datasets created without splits. The changes made include:

Fix: Allow the Experiment and Explainer modules to handle datasets created without the "splits in folders" option (without splits).
Fix: Resolve the error that prevented the creation of datasets without splits.
Fix: Improve compatibility for datasets used in the Predictions functionality.
No new dependencies are introduced.

Type of change

Bug fix.

Changes

Fixes in dataset handling:
- Modified upload_dataset logic to properly handle datasets with the "splits in folders" option unchecked.
- Corrected the dataset path handling to ensure proper saving and loading when no splits are provided.
Added dynamic attribute checks in "get_dataset_info":
- Replaced hardcoded keys ("train", "test", "validation") with dynamic attribute checks (hasattr) to avoid errors when datasets lack specific splits.
- Centralized logic for calculating train_size, test_size, and val_size in get_dataset_info.
Improved compatibility in Explainer and Model Jobs:
- Added logic to check and handle missing or empty dataset splits (dataset_splits_path) in explainer_job.py and model_job.py.
- Dynamically create train, test, and validation splits if they are missing.
Bug fixes:
- Resolved an issue where datasets could not be created when "splits in folders" was enabled.
- Fixed compatibility issues in Experiment and Explainer modules when processing datasets without splits.
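
A minimal sketch of the two ideas described above (hasattr-based size checks and creating splits on demand), not the code from this PR. Everything here is an assumption made for illustration: the dataset is modelled as a plain object with optional train, test, and validation attributes, an unsplit dataset is assumed to expose a hypothetical rows attribute, and the helper names, split fractions, and seed are invented.

```python
import random
from types import SimpleNamespace

# Hypothetical mapping from the reported size key to the split attribute name.
SPLIT_ATTRS = {"train_size": "train", "test_size": "test", "val_size": "validation"}


def get_split_sizes(dataset):
    """Report the size of each split, using 0 for splits the dataset lacks."""
    sizes = {}
    for size_key, attr in SPLIT_ATTRS.items():
        # hasattr stands in for a hardcoded lookup that fails on a missing split.
        sizes[size_key] = len(getattr(dataset, attr)) if hasattr(dataset, attr) else 0
    return sizes


def ensure_splits(dataset, test_frac=0.2, val_frac=0.1, seed=0):
    """Return a dataset with train/test/validation, creating them if absent.

    Assumes the dataset is either fully split or unsplit with a `rows` attribute.
    """
    if all(hasattr(dataset, attr) for attr in SPLIT_ATTRS.values()):
        return dataset
    rows = list(dataset.rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    return SimpleNamespace(
        test=rows[:n_test],
        validation=rows[n_test:n_test + n_val],
        train=rows[n_test + n_val:],
    )


# Usage: an unsplit dataset reports zero sizes until splits are created.
unsplit = SimpleNamespace(rows=list(range(10)))
print(get_split_sizes(unsplit))
print(get_split_sizes(ensure_splits(unsplit)))
```

Keeping the size calculation in one helper mirrors the centralization mentioned for get_dataset_info: callers never need to know which splits exist.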

How to Test

Create a dataset with the "splits in folders" checkbox unchecked.
Use the created dataset in the Experiments module.
Verify that the dataset is processed correctly.
Repeat the process in the Explainers module and confirm no errors are generated.
Test the Predictions functionality with the same dataset to ensure compatibility.

Notes

The behavior of the "splits in folders" checkbox has been updated:
- Previously, leaving the checkbox unchecked would still create dataset splits by default.
- Now, leaving it unchecked creates a dataset without splits, while checking it enables splits.
This change may cause some confusion for users familiar with the previous behavior.

Felipedino added 4 commits January 10, 2025 18:42

🐛 Add code to handle a dataset without splits

1227d07

🐛 Fixed an issue when the splits_in_folders field was true. Now, creating a dataset will default to no splits.

870c8c3

🐛 Added conditions to handle datasets without splits.

87118b4

🐛 Fix explainer_job: Now can handle datasets without splits.

11cb87b
