Releases: webis-de/small-text

v1.0.1

12 Sep 22:17

Minor bug fix release.

Fixed

Links to notebooks and code examples will now always point to the latest release instead of the latest main branch.

v1.0.0

14 Jun 13:07

This is the first stable release 🎉! The release mainly consists of code cleanup, documentation, and repository organization.

  • Datasets:
    • SklearnDataset now checks if the dimensions of features and labels match.
  • Query Strategies:
  • Documentation:
    • The html documentation uses the full screen width.
  • Repository:
    • This repository can now be referenced using the respective Zenodo DOI.
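
The dimension check mentioned under "Datasets" can be illustrated with a minimal sketch. The function name check_dimensions and the error message are hypothetical and not taken from the library; the sketch only assumes that features and labels are sequences with one entry per instance.

```python
def check_dimensions(x, y):
    """Raise ValueError if the number of instances in x and labels in y differ.

    Hypothetical helper for illustration only, not the SklearnDataset code.
    """
    if len(x) != len(y):
        raise ValueError(
            f"Feature and label dimensions do not match: "
            f"{len(x)} instances vs. {len(y)} labels"
        )


features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # three instances
labels = [0, 1, 0]                               # one label per instance
check_dimensions(features, labels)               # passes silently
```

Failing early at construction time surfaces a mismatch where it originates, rather than later during training or querying.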

v1.0.0b4

04 May 18:20

This release adds two new query strategies, improves the Dataset interface, and introduces optional dependencies.

Added

  • General:
    • Introduced a concept for optional dependencies, which allows components to rely on soft dependencies, i.e. Python dependencies that can be installed on demand (and only when certain functionality is needed).
  • Datasets:
    • The Dataset interface now has a clone() method that creates an identical copy of the respective dataset.
  • Query Strategies:
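
The idea of soft dependencies from the "General" item above can be sketched with the standard library alone. The helper names is_available and require_optional are hypothetical and are not the library's API; the sketch only shows one common way to defer a dependency check until the functionality is actually used.

```python
import importlib.util


def is_available(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name):
    """Raise a descriptive ImportError if an optional dependency is missing.

    Hypothetical helper for illustration only.
    """
    if not is_available(module_name):
        raise ImportError(
            f"The optional dependency '{module_name}' is required for this "
            f"functionality but is not installed."
        )


require_optional("json")  # standard library module, always present
```

A component would call such a check at the point where the dependency is needed, so the core package remains installable without it.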

Changed

  • Datasets:
    • Separated the previous DatasetView implementation into interface (DatasetView) and implementation (SklearnDatasetView).
    • Added clone() method which creates an identical copy of the dataset.
  • Query Strategies:
    • EmbeddingBasedQueryStrategy now only embeds instances that are either in the labeled or in the unlabeled pool (and no longer the entire dataset).
  • Code examples:
    • Code structure was unified.
    • The number of iterations can now be passed via a CLI argument.
  • small_text.integrations.pytorch.utils.data:
    • Method get_class_weights() now scales the resulting multi-class weights so that the smallest class weight is equal to 1.0.
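
The scaling described for get_class_weights() can be sketched as follows. This is not the library's implementation: the function name scaled_class_weights is hypothetical, and the use of inverse class frequencies as raw weights is an assumption. Only the final step, dividing by the smallest weight, reflects the change described above.

```python
from collections import Counter


def scaled_class_weights(labels, num_classes):
    """Return class weights scaled so that the smallest weight is 1.0.

    Hypothetical sketch; assumes every class occurs at least once.
    """
    counts = Counter(labels)
    # Assumption: raw weights are inverse class frequencies.
    raw = [len(labels) / counts[c] for c in range(num_classes)]
    smallest = min(raw)
    # Divide by the smallest weight so the most frequent class gets 1.0.
    return [w / smallest for w in raw]


weights = scaled_class_weights([0, 0, 0, 0, 1, 1, 2], num_classes=3)
print(weights)  # [1.0, 2.0, 4.0]
```

With this scaling, the most frequent class keeps a weight of 1.0 and rarer classes are weighted up relative to it.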

v1.0.0b3

06 Mar 16:16

This release adds a new query strategy, improves the docs, and cleans up the interfaces in preparation for v1.0.0.

Added

Changed

  • Cleaned up and unified argument naming: The naming of variables related to datasets and
    indices has been improved and unified. The naming of datasets had been inconsistent,
    and the previous x_ notation for indices was a relic of earlier versions of this library and
    no longer reflected the underlying object.

    • PoolBasedActiveLearner:

      • attribute x_indices_labeled was renamed to indices_labeled
      • attribute x_indices_ignored was renamed to indices_ignored
      • attribute queried_indices was renamed to indices_queried
      • attribute _x_index_to_position was renamed to _index_to_position
      • arguments x_indices_initial, x_indices_ignored, and x_indices_validation were
        renamed to indices_initial, indices_ignored, and indices_validation. This affects most
        methods of the PoolBasedActiveLearner.
    • QueryStrategy

      • old: query(self, clf, x, x_indices_unlabeled, x_indices_labeled, y, n=10)
      • new: query(self, clf, dataset, indices_unlabeled, indices_labeled, y, n=10)
    • StoppingCriterion

      • old: stop(self, active_learner=None, predictions=None, proba=None, x_indices_stopping=None)
      • new: stop(self, active_learner=None, predictions=None, proba=None, indices_stopping=None)
  • Renamed environment variable which sets the small-text temp folder from ALL_TMP to SMALL_TEXT_TEMP
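
To illustrate the renamed query() signature, here is a minimal sketch of a strategy written against the new argument names. The class RandomSamplingSketch is a hypothetical stand-in that does not inherit from any small-text class; only the signature of query() is taken from the list above.

```python
import random


class RandomSamplingSketch:
    """Hypothetical stand-in for a query strategy, not a small-text class."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def query(self, clf, dataset, indices_unlabeled, indices_labeled, y, n=10):
        # Draw up to n distinct indices from the unlabeled pool.
        n = min(n, len(indices_unlabeled))
        return self._rng.sample(list(indices_unlabeled), n)


strategy = RandomSamplingSketch(seed=42)
queried = strategy.query(
    clf=None,
    dataset=None,
    indices_unlabeled=[3, 5, 8, 13, 21],
    indices_labeled=[0, 1],
    y=[0, 1],
    n=3,
)
```

Custom strategies written against the old signature need their x and x_indices_* parameters renamed accordingly, especially if callers pass them as keyword arguments.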

v1.0.0b2

22 Feb 16:48

This release fixes some broken links that were caused by the recent change in naming the git tags (1.0.0a8 -> v1.0.0b1).

Fixed

  • Fixed links to the documentation in README.md and notebooks.

v1.0.0b1

22 Feb 16:22

First beta release with multi-label functionality and stopping criteria. Added/revised large parts of the documentation.

Added

  • Added a changelog.
  • All provided classifiers are now capable of multi-label classification.

Changed

  • Documentation has been overhauled considerably.
  • PoolBasedActiveLearner: Renamed incremental_training kwarg to reuse_model.
  • SklearnClassifier: Changed __init__(clf) to __init__(model, num_classes, multi_label=False).
  • SklearnClassifierFactory: Changed __init__(clf_template, kwargs={}) to __init__(base_estimator, num_classes, kwargs={}).
  • Refactored KimCNNClassifier and TransformerBasedClassification.

Removed

  • Removed device kwarg from PytorchDataset.__init__(),
    PytorchTextClassificationDataset.__init__() and TransformersDataset.__init__().