Bump transformers from 4.32.1 to 4.41.2 #82

dependabot · 2024-06-03T13:17:16Z

Bumps transformers from 4.32.1 to 4.41.2.

Release notes

Release v4.41.2

Mostly fixes related to trust_remote_code=True and from_pretrained.

Loading with local_files_only ran into trouble when a .safetensors file did not exist. This was not expected: instead of trying to convert the checkpoint, we should just fall back to loading the .bin files.
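The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the transformers implementation: the helper `resolve_weights` and the two file-name constants are hypothetical, chosen only to mirror a common checkpoint layout.

```python
import tempfile
from pathlib import Path
from typing import Tuple

# Hypothetical file names used for illustration only.
SAFETENSORS_NAME = "model.safetensors"
BIN_NAME = "pytorch_model.bin"


def resolve_weights(model_dir: str, local_files_only: bool) -> Tuple[Path, bool]:
    """Pick a weights file and say whether a conversion may be requested.

    Returns (path, request_conversion). With local_files_only=True the
    function never asks for a conversion; it simply falls back to .bin.
    """
    root = Path(model_dir)
    safetensors = root / SAFETENSORS_NAME
    if safetensors.is_file():
        return safetensors, False  # preferred format is already present
    pytorch_bin = root / BIN_NAME
    if pytorch_bin.is_file():
        # No safetensors file: load .bin, and only consider converting
        # when we are allowed to leave the local machine.
        return pytorch_bin, not local_files_only
    raise FileNotFoundError(f"no weights file found in {root}")


# Small demonstration with a directory that only contains a .bin file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / BIN_NAME).write_bytes(b"")
    offline_choice = resolve_weights(tmp, local_files_only=True)
    online_choice = resolve_weights(tmp, local_files_only=False)
```

In the offline case the second element of the result is False, which is the behaviour the release note asks for: load the .bin file and do not attempt a conversion.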

Do not trigger autoconversion if local_files_only #31004 from @Wauplin fixes this!

Paligemma: Fix devices and dtype assignments (#31008) by @molbap

Redirect transformers_agents doc to agents (#31054) @aymeric-roucher

Fix from_pretrained in offline mode when model is preloaded in cache (#31010) by @oOraph

Fix faulty rstrip in module loading (#31108) @Rocketknight1
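The notes do not say which rstrip call was at fault. As general background, and as an assumption about the kind of bug involved, a common pitfall is that str.rstrip treats its argument as a set of characters rather than as a suffix. The helper `strip_py_suffix` below is hypothetical and only illustrates a safer alternative.

```python
# str.rstrip(".py") removes any trailing run of the characters '.', 'p', 'y',
# not the literal suffix ".py".
module_file = "happy.py"
stripped_wrong = module_file.rstrip(".py")


def strip_py_suffix(name: str) -> str:
    """Remove a trailing '.py' only when it is really the suffix."""
    return name[: -len(".py")] if name.endswith(".py") else name


stripped_right = strip_py_suffix(module_file)
```

On Python 3.9 and later, `str.removesuffix(".py")` does the same job as the helper.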

Release v4.41.1: Fix PaliGemma finetuning and some small bugs

Fix PaliGemma finetuning:

The causal mask and label creation were causing label leaks during training. Kudos to @probicheaux for finding and reporting!

huggingface/transformers@a755745 : PaliGemma - fix processor with no input text (huggingface/transformers#30916) @hiyouga

huggingface/transformers@a25f7d3 : Paligemma causal attention mask (huggingface/transformers#30967) @molbap and @probicheaux
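To make the label-leak problem concrete, here is a generic prefix-LM sketch. It is not the code from the PRs above; the function names are hypothetical, and the assumption is that the prompt (prefix) is attended to bidirectionally while the target (suffix) must be strictly causal. If suffix positions could attend to later suffix tokens, each position would see the very token it is trained to predict.

```python
from typing import List

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def prefix_lm_mask(prefix_len: int, suffix_len: int) -> List[List[bool]]:
    """mask[q][k] is True when query position q may attend to key position k."""
    total = prefix_len + suffix_len
    mask = []
    for q in range(total):
        row = []
        for k in range(total):
            if k < prefix_len:
                row.append(True)      # prefix keys are visible to everyone
            else:
                row.append(k <= q)    # suffix keys are visible causally only
        mask.append(row)
    return mask


def build_labels(input_ids: List[int], prefix_len: int) -> List[int]:
    """Ignore the prefix in the loss; train only on suffix tokens."""
    return [IGNORE_INDEX] * prefix_len + list(input_ids[prefix_len:])


mask = prefix_lm_mask(prefix_len=2, suffix_len=3)
labels = build_labels([5, 6, 7, 8, 9], prefix_len=2)
```

With a 2-token prefix and a 3-token suffix, position 2 cannot attend to position 3, so the model cannot read a target token before predicting it.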

Other fixes:

huggingface/transformers@bb48e92: tokenizer_class = "AutoTokenizer" Llava Family (huggingface/transformers#30912)

huggingface/transformers@1d568df : legacy to init the slow tokenizer when converting from slow was wrong (huggingface/transformers#30972)

huggingface/transformers@b1065aa : Generation: get special tokens from model config (huggingface/transformers#30899) @zucchini-nlp

Reverted huggingface/transformers@4ab7a28

v4.41.0: Phi3, JetMoE, PaliGemma, VideoLlava, Falcon2, FalconVLM & GGUF support

New models

Phi3

The Phi-3 model was proposed in Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone by Microsoft.

TL;DR: Phi-3 introduces new RoPE scaling methods, which seem to scale fairly well. Phi-3-mini, a model in the 3B class, is available in two context-length variants: 4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.

Phi-3 by @gugarosa in huggingface/transformers#30423

JetMoE

JetMoe-8B is an 8B Mixture-of-Experts (MoE) language model developed by Yikang Shen and MyShell. The JetMoe project aims to provide LLaMA2-level performance in an efficient language model trained on a limited budget. To achieve this goal, JetMoe uses a sparsely activated architecture inspired by ModuleFormer. Each JetMoe block consists of two MoE layers: Mixture of Attention Heads and Mixture of MLP Experts. Given the input tokens, each layer activates only a subset of its experts to process them. This sparse activation scheme enables JetMoe to achieve much better training throughput than dense models of similar size. The training throughput of JetMoe-8B is around 100B tokens per day on a cluster of 96 H100 GPUs with a straightforward 3-way pipeline parallelism strategy.
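Sparse activation of this kind is often implemented with top-k gating. The sketch below is a generic illustration under that assumption; the notes above do not describe JetMoe's actual router, and `top_k_route` and `moe_forward` are hypothetical names using scalar "experts" to keep the example small.

```python
import math
from typing import Callable, List, Tuple


def top_k_route(router_logits: List[float], k: int) -> List[Tuple[int, float]]:
    """Select the k highest-scoring experts and softmax-normalise their scores."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in order)
    exps = [math.exp(router_logits[i] - peak) for i in order]  # stable softmax
    total = sum(exps)
    return [(i, e / total) for i, e in zip(order, exps)]


def moe_forward(x: float,
                experts: List[Callable[[float], float]],
                router_logits: List[float],
                k: int) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]
routes = top_k_route(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
```

Only two of the four experts are evaluated per input here, which is where the throughput advantage over a dense layer of the same total size comes from.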

... (truncated)

Commits

ab0f050 Release: v4.41.2
57f5553 Fix faulty rstrip in module loading (#31108)
73b180c fix from_pretrained in offline mode when model is preloaded in cache (#31010)
a6325a7 Redirect transformers_agents doc to agents (#31054)
9ccdc84 Paligemma- fix devices and dtype assignments (#31008)
12aa316 Do not trigger autoconversion if local_files_only (#31004)
75f15f3 Release: v4.41.1
8282db5 Paligemma causal attention mask (#30967)
e5b788a Revert "feat: Upgrade Weights & Biases callback (#30135)"
9d05459 Generation: get special tokens from model config (#30899)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.32.1 to 4.41.2.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.32.1...v4.41.2)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

dependabot · 2024-07-01T13:46:35Z

Superseded by #87.

dependabot bot added the dependencies label (Pull requests that update a dependency file) Jun 3, 2024

dependabot bot mentioned this pull request Jun 3, 2024

Bump transformers from 4.32.1 to 4.41.1 #81

Closed

dependabot bot closed this Jul 1, 2024

dependabot bot deleted the dependabot/pip/transformers-4.41.2 branch July 1, 2024 13:46
