Release v0.4.0 Differentiable heads & various quality of life improvements · huggingface/setfit

Differentiable heads for `SetFitModel`

@blakechi has implemented a differentiable head in PyTorch for `SetFitModel`, which enables the model to be trained end-to-end. The implementation is backwards compatible with the scikit-learn heads and is activated by setting `use_differentiable_head=True` when loading `SetFitModel`. Here's a full example:

from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss

from setfit import SetFitModel, SetFitTrainer


# Load a dataset from the Hugging Face Hub
dataset = load_dataset("sst2")

# Simulate the few-shot regime by sampling 8 examples per class
num_classes = 2
train_dataset = dataset["train"].shuffle(seed=42).select(range(8 * num_classes))
eval_dataset = dataset["validation"]

# Load a SetFit model from Hub
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    use_differentiable_head=True,
    head_params={"out_features": num_classes},
)

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss_class=CosineSimilarityLoss,
    metric="accuracy",
    batch_size=16,
    num_iterations=20, # The number of text pairs to generate for contrastive learning
    num_epochs=1, # The number of epochs to use for contrastive learning
    column_mapping={"sentence": "text", "label": "label"} # Map dataset columns to text/label expected by trainer
)

# Train and evaluate
trainer.freeze() # Freeze the head
trainer.train() # Train only the body

# Unfreeze the head and freeze the body -> head-only training
trainer.unfreeze(keep_body_frozen=True)
# or, alternatively (call only one of the two):
# Unfreeze the head and unfreeze the body -> end-to-end training
trainer.unfreeze(keep_body_frozen=False)

trainer.train(
    num_epochs=25, # The number of epochs to train the head or the whole model (body and head)
    batch_size=16,
    body_learning_rate=1e-5, # The body's learning rate
    learning_rate=1e-2, # The head's learning rate
    l2_weight=0.0, # Weight decay on **both** the body and head. If `None`, will use 0.01.
)
metrics = trainer.evaluate()

# Push model to the Hub
trainer.push_to_hub("my-awesome-setfit-model")

# Download from Hub
model = SetFitModel.from_pretrained("lewtun/my-awesome-setfit-model")
# Run inference
preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
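
The `num_iterations` argument above controls how many text pairs are generated for the contrastive phase. As a rough illustration of how the pair count grows, here is a minimal sketch that assumes one positive and one negative pair are sampled per example per iteration. The `generate_pairs` helper is hypothetical and written only for this note; it is not the library's sampling code.

```python
import random
from collections import defaultdict


def generate_pairs(texts, labels, num_iterations, seed=0):
    """Toy pair sampler (hypothetical helper, not part of setfit).

    Assumption: each example yields one positive and one negative pair
    per iteration.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    pairs = []
    for _ in range(num_iterations):
        for text, label in zip(texts, labels):
            # Positive pair: a text with the same label (possibly the same text),
            # with a target similarity of 1.0
            positive = rng.choice(by_label[label])
            pairs.append((text, positive, 1.0))
            # Negative pair: a text with a different label,
            # with a target similarity of 0.0
            other_labels = [other for other in by_label if other != label]
            negative = rng.choice(by_label[rng.choice(other_labels)])
            pairs.append((text, negative, 0.0))
    return pairs


# 8 examples per class and 2 classes, mirroring the few-shot setup above
texts = [f"example {i}" for i in range(16)]
labels = [i % 2 for i in range(16)]
pairs = generate_pairs(texts, labels, num_iterations=20)
print(len(pairs))
```

Under this assumption, 16 training examples and `num_iterations=20` give 2 * 20 * 16 pairs, so the contrastive phase sees far more training signal than the raw example count suggests.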

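To make the head-only phase (`keep_body_frozen=True`) more concrete, the following is a toy sketch in plain Python rather than the library's PyTorch implementation. It assumes the frozen body has already produced fixed embeddings and fits a small logistic-regression head on top of them by gradient descent. The embeddings, `train_head`, and `predict` are all hypothetical names and values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(embeddings, labels, learning_rate=0.5, num_epochs=200):
    """Fit a logistic-regression head; the embeddings are never modified."""
    dim = len(embeddings[0])
    n = len(embeddings)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(num_epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of binary cross-entropy w.r.t. the logit
            for i in range(dim):
                grad_w[i] += err * x[i] / n
            grad_b += err / n
        # Only the head's parameters receive updates
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def predict(weights, bias, x):
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return int(sigmoid(logit) >= 0.5)


# Hypothetical outputs of a frozen body: 2-D embeddings for two classes
embeddings = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 1.0], [0.3, 0.8]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_head(embeddings, labels)
preds = [predict(weights, bias, x) for x in embeddings]
print(preds)
```

In end-to-end training (`keep_body_frozen=False`), the body's parameters would be updated as well, which is why the example above exposes a separate, smaller `body_learning_rate` next to the head's `learning_rate`.
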
Bug fixes and improvements

- add num_epochs to train_step calculation by @PhilipMay in #139
- Support for the differentiable head by @blakechi in #112
- redirect call to predict by @PhilipMay in #142
- fix: templated examples copy empty vector by @pdhall99 in #148
- Add support to kwargs in compute() method called by trainer.evaluate() by @mpangrazzi in #125
- Small fix on hyperparameter search by @Mouhanedg56 in #150
- Fix typo: temerature => temperature by @tomaarsen in #155
- Add the usage and relevant info. of the differentiable head to README by @blakechi in #149
- Fix non default loss_class issue by @PhilipMay in #154
- Add sampling function & update notebooks by @lewtun in #146
- Fix typos: image(s) -> sentence(s) by @victorjmarin in #160
- Add more loss function options by @PhilipMay in #159

Significant community contributions

The following contributors have made significant changes to the library since the last release:

@pdhall99
- fix: allow load of pretrained model without head
- fix: templated examples copy empty vector (#148)
@PhilipMay
- add num_epochs to train_step calculation (#139)
- redirect call to predict (#142)
- Fix non default loss_class issue (#154)
- Add more loss function options (#159)
@blakechi
- Support for the differentiable head (#112)
- Add the usage and relevant info. of the differentiable head to README (#149)
@mpangrazzi
- Add support to kwargs in compute() method called by trainer.evaluate() (#125)
