v1.2.0
This release adds a SetFit classifier, the BALD query strategy, and two new example notebooks.
Added
Active Learning
- PoolBasedActiveLearner now handles keyword arguments passed to the classifier's fit() during the update() step.
- New strategy: BALD.
- SubsamplingQueryStrategy now uses the remaining unlabeled pool when more samples are requested than are available.
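For readers unfamiliar with BALD (Bayesian Active Learning by Disagreement), the following is a minimal, library-independent sketch of the score it ranks by: the entropy of the mean prediction minus the mean entropy of the individual predictions, taken over several stochastic forward passes (for example, Monte Carlo dropout). The function names and the nested-list input layout are illustrative assumptions and are not this library's API.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_scores(mc_probs):
    """Compute one BALD score per sample.

    mc_probs[t][i][c] is the probability of class c for sample i in
    stochastic forward pass t (hypothetical input layout).
    """
    num_passes = len(mc_probs)
    num_samples = len(mc_probs[0])
    num_classes = len(mc_probs[0][0])

    scores = []
    for i in range(num_samples):
        # Mean predictive distribution over all passes.
        mean_probs = [
            sum(mc_probs[t][i][c] for t in range(num_passes)) / num_passes
            for c in range(num_classes)
        ]
        # Average entropy of the individual passes.
        expected_entropy = (
            sum(entropy(mc_probs[t][i]) for t in range(num_passes)) / num_passes
        )
        # High when the passes disagree with each other.
        scores.append(entropy(mean_probs) - expected_entropy)
    return scores


def query(mc_probs, n):
    """Return the indices of the n samples with the highest BALD score."""
    scores = bald_scores(mc_probs)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n]
```

A sample on which every pass predicts the same distribution scores zero, even if that distribution is uncertain, while a sample on which confident passes contradict each other scores high.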
Classification
- Added new classifier: SetFitClassification, which wraps huggingface/setfit.
Examples
- Revised both existing notebook examples.
- Added a notebook example for active learning with SetFit classifiers.
- Added a notebook example for cold start initialization with SetFit classifiers.
Documentation
- A showcase section has been added to the documentation.
Fixed
- Distances in lightweight_coreset were not correctly projected onto the [0, 1] interval (but ranking was unaffected).
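The release notes do not say which projection is used, so the sketch below assumes plain min-max scaling purely for illustration. It shows why a wrong projection can leave the ranking intact: any strictly increasing rescaling of the distances preserves their order. The helper names are hypothetical.

```python
def min_max_scale(distances):
    """Project distances onto [0, 1] via min-max scaling (assumed method)."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # All distances are equal; map everything to 0 to avoid dividing by zero.
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]


def ranking(values):
    """Indices ordered from smallest to largest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


raw = [2.0, 8.0, 5.0]
scaled = min_max_scale(raw)

# The scaled values lie in [0, 1] and the order of the samples is unchanged.
assert all(0.0 <= s <= 1.0 for s in scaled)
assert ranking(raw) == ranking(scaled)
```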
Changed
- Coreset implementations now use the distance-based (as opposed to the similarity-based) formulation.
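As a rough illustration of what a distance-based formulation looks like, here is a generic greedy coreset (k-center) selection written in plain Python. It is a sketch under stated assumptions, not this library's implementation: the Euclidean metric, the function names, and the requirement that at least one point is already selected are all choices made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_coreset(points, selected, n):
    """Greedily pick n indices, each maximizing the distance to its nearest selected point.

    points:   list of vectors
    selected: non-empty list of indices that are already selected (e.g. labeled)
    """
    # Distance from every point to its nearest already-selected point.
    min_dist = [min(euclidean(p, points[j]) for j in selected) for p in points]

    chosen = []
    for _ in range(n):
        # Distance-based rule: take the point farthest from the current selection.
        idx = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(idx)
        # The new pick may now be the nearest selected point for some points.
        min_dist = [
            min(d, euclidean(p, points[idx])) for d, p in zip(min_dist, points)
        ]
    return chosen
```

A similarity-based variant would express the same idea by picking the point whose largest similarity to the selected set is smallest; with a distance, the rule becomes a direct max-min.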