Releases: webis-de/small-text
v1.0.1
Minor bug fix release.
Fixed
Links to notebooks and code examples will now always point to the latest release instead of the latest main branch.
v1.0.0
This is the first stable release 🎉! The release mainly consists of code cleanup, documentation, and repository organization.
- Datasets:
- SklearnDataset now checks if the dimensions of features and labels match.
- Query Strategies:
- ExpectedGradientLengthMaxWord: Cleaned up code and added checks to detect invalid configurations.
- Documentation:
- The HTML documentation now uses the full screen width.
- Repository:
- This repository can now be referenced using the respective Zenodo DOI.
v1.0.0b4
This release adds two new query strategies, improves the Dataset interface, and introduces optional dependencies.
Added
- General:
- We now have a concept for optional dependencies, which allows components to rely on soft dependencies, i.e. Python dependencies that can be installed on demand (and only when certain functionality is needed).
- Datasets:
- The Dataset interface now has a clone() method that creates an identical copy of the respective dataset.
- Query Strategies:
- New strategies: DiscriminativeActiveLearning and SEALS.
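To illustrate the idea of soft dependencies mentioned above, here is a minimal sketch of the general pattern in plain Python. It is not the mechanism this library uses internally; the helper names is_available and require_optional are hypothetical and chosen only for this example.

```python
import importlib
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, feature: str):
    """Import an optional module, or fail with an actionable message.

    Hypothetical helper: both names are assumptions made for this sketch.
    """
    if not is_available(module_name):
        raise ImportError(
            f"'{feature}' needs the optional dependency '{module_name}'. "
            f"Install it with: pip install {module_name}"
        )
    return importlib.import_module(module_name)
```

A component would call require_optional() only at the point where the extra functionality is actually used, so that a base installation keeps working without the optional package.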
Changed
- Datasets:
- Separated the previous DatasetView implementation into interface (DatasetView) and implementation (SklearnDatasetView).
- Added clone() method which creates an identical copy of the dataset.
- Query Strategies:
- EmbeddingBasedQueryStrategy now only embeds instances that are either in the labeled or in the unlabeled pool (and no longer the entire dataset).
- Code examples:
- Code structure was unified.
- Number of iterations can now be passed via a CLI argument.
- small_text.integrations.pytorch.utils.data:
- Method get_class_weights() now scales the resulting multi-class weights so that the smallest class weight is equal to 1.0.
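The rescaling described for get_class_weights() can be pictured with a small standalone sketch. It assumes a simple inverse-frequency weighting, which may differ from the formula the library actually uses; only the final step (dividing by the smallest weight so that it becomes 1.0) reflects the change noted above. The function name scaled_class_weights is hypothetical.

```python
from collections import Counter


def scaled_class_weights(labels, num_classes):
    """Sketch: inverse-frequency class weights, rescaled so the minimum is 1.0."""
    counts = Counter(labels)
    total = len(labels)
    # Assumed weighting scheme (not taken from the library): total / class count.
    # Classes without any examples are treated as if they had one example.
    raw = [total / max(counts.get(c, 0), 1) for c in range(num_classes)]
    # Rescale so that the smallest class weight is exactly 1.0.
    smallest = min(raw)
    return [w / smallest for w in raw]
```

For the labels [0, 0, 0, 0, 1, 1, 2] and three classes, this sketch returns [1.0, 2.0, 4.0]: the most frequent class gets weight 1.0 and rarer classes get proportionally larger weights.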
v1.0.0b3
This release adds a new query strategy, improves the docs, and cleans up the interfaces in preparation for v1.0.0.
Added
- Added new query strategy: ContrastiveActiveLearning.
- Added Reproducibility Notes.
Changed
- Cleaned up and unified argument naming: The naming of variables related to datasets and indices has been improved and unified. The naming of datasets had been inconsistent, and the previous x_ notation for indices was a relic of earlier versions of this library and did not reflect the underlying object anymore.
  - PoolBasedActiveLearner:
    - attribute x_indices_labeled was renamed to indices_labeled
    - attribute x_indices_ignored was unified to indices_ignored
    - attribute queried_indices was unified to indices_queried
    - attribute _x_index_to_position was renamed to _index_to_position
    - arguments x_indices_initial, x_indices_ignored, and x_indices_validation were renamed to indices_initial, indices_ignored, and indices_validation. This affects most methods of the PoolBasedActiveLearner.
  - QueryStrategy:
    - old: query(self, clf, x, x_indices_unlabeled, x_indices_labeled, y, n=10)
    - new: query(self, clf, dataset, indices_unlabeled, indices_labeled, y, n=10)
  - StoppingCriterion:
    - old: stop(self, active_learner=None, predictions=None, proba=None, x_indices_stopping=None)
    - new: stop(self, active_learner=None, predictions=None, proba=None, indices_stopping=None)
- Renamed environment variable which sets the small-text temp folder from ALL_TMP to SMALL_TEXT_TEMP.
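For custom query strategies written against the old signature, the rename mostly means updating parameter names. The following standalone sketch shows a random-sampling strategy using the new query() signature from the list above. The class name RandomQuerySketch is hypothetical, and the class deliberately does not import or subclass anything from the library, so it only illustrates the argument names, not the full interface contract.

```python
import random


class RandomQuerySketch:
    """Hypothetical strategy that follows the renamed query() signature."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def query(self, clf, dataset, indices_unlabeled, indices_labeled, y, n=10):
        # clf, dataset, indices_labeled, and y are unused in this sketch;
        # they are kept only to mirror the new signature.
        if n > len(indices_unlabeled):
            raise ValueError("n must not exceed the number of unlabeled instances")
        # Draw n distinct indices from the unlabeled pool.
        return self._rng.sample(list(indices_unlabeled), n)
```

Code that previously passed x, x_indices_unlabeled, and x_indices_labeled as keyword arguments would need the new names dataset, indices_unlabeled, and indices_labeled; positional calls are unaffected by the rename itself.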
v1.0.0b2
This release fixes some broken links caused by the recent change in the naming of git tags (1.0.0a8 -> v1.0.0b1).
Fixed
- Fix links to the documentation in README.md and notebooks.
v1.0.0b1
First beta release with multi-label functionality and stopping criteria. Added/revised large parts of the documentation.
Added
- Added a changelog.
- All provided classifiers are now capable of multi-label classification.
Changed
- Documentation has been overhauled considerably.
- PoolBasedActiveLearner: Renamed incremental_training kwarg to reuse_model.
- SklearnClassifier: Changed __init__(clf) to __init__(model, num_classes, multi_Label=False).
- SklearnClassifierFactory: Changed __init__(clf_template, kwargs={}) to __init__(base_estimator, num_classes, kwargs={}).
- Refactored KimCNNClassifier and TransformerBasedClassification.
Removed
- Removed device kwarg from PytorchDataset.__init__(), PytorchTextClassificationDataset.__init__() and TransformersDataset.__init__().