Add generation caching in TextEnvironment and fix bugs in TextEnvironment #2556

konrad-gerlach · 2025-01-10T15:07:25Z

This PR mainly affects the TextEnvironment class. It adds caching between generation calls, so that the activations of all previous tokens do not have to be recomputed when generating the next segment. This is mainly intended for use cases where many tool calls are performed sequentially, in which case the activations for the (possibly quite large) system prompt would otherwise have to be recomputed at each step. For stability, caching is optional.
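To illustrate the intended saving, here is a toy sketch in plain Python. It is not the code in this PR: `ToyModel` and `run_segments` are hypothetical names, and the "cache" is just a list standing in for stored activations. It only counts how many tokens the model has to process with and without reusing earlier work.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    """Hypothetical stand-in for a language model; counts processed tokens."""

    tokens_processed: int = 0

    def forward(self, new_tokens, cache=None):
        # Every token passed in costs one unit of work.
        self.tokens_processed += len(new_tokens)
        # The returned "cache" stands in for the stored activations.
        return list(cache or []) + [f"act({t})" for t in new_tokens]


def run_segments(model, system_prompt, tool_responses, use_cache):
    """Feed the prompt, then each tool response; return total tokens processed."""
    history = []
    cache = None
    for chunk in [system_prompt, *tool_responses]:
        history.extend(chunk)
        if use_cache:
            # Only the new tokens are processed; earlier work is reused.
            cache = model.forward(chunk, cache)
        else:
            # The whole history is recomputed at every step.
            model.forward(history)
    return model.tokens_processed


system_prompt = ["s"] * 100
tool_responses = [["t"] * 5] * 3
cached = run_segments(ToyModel(), system_prompt, tool_responses, use_cache=True)
uncached = run_segments(ToyModel(), system_prompt, tool_responses, use_cache=False)
```

With a 100-token prompt and three 5-token tool responses, the uncached run reprocesses the prompt at every step, while the cached run processes each token once.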

Bug fixes:
This PR also fixes two bugs I encountered:

The max_length check in the TextEnvironment class threw an error because it assumed batched input even when the input was not batched. I fixed this and also added a check at generation time to ensure that the padded inputs do not exceed the maximum length either.
StringStoppingCriteria did not take generated EOS tokens into account, which I have now fixed.
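As a rough illustration of both fixes, here is a minimal sketch in plain Python. It does not reproduce the TRL implementation: `as_batch`, `check_max_length`, `is_finished`, `should_stop`, and the toy vocabulary are all hypothetical and operate on lists of token ids rather than tensors.

```python
def as_batch(input_ids):
    """Wrap a single sequence (flat list of ints) into a batch of one."""
    if len(input_ids) > 0 and isinstance(input_ids[0], int):
        return [input_ids]
    return input_ids


def check_max_length(input_ids, max_length, pad_to=None):
    """Raise if any sequence, or the padded batch, would exceed max_length."""
    batch = as_batch(input_ids)
    longest = max((len(seq) for seq in batch), default=0)
    padded = max(longest, pad_to or 0)  # length after padding the batch
    if padded > max_length:
        raise ValueError(f"input length {padded} exceeds max_length {max_length}")
    return padded


def is_finished(token_ids, decode, stop_strings, eos_token_id):
    """A sequence is done if it generated EOS or its text contains a stop string."""
    if eos_token_id in token_ids:
        return True
    text = decode(token_ids)
    return any(stop in text for stop in stop_strings)


def should_stop(batch_token_ids, decode, stop_strings, eos_token_id):
    """Stop generating once every sequence in the batch is finished."""
    return all(
        is_finished(ids, decode, stop_strings, eos_token_id)
        for ids in batch_token_ids
    )


# Toy vocabulary and decoder, assumed purely for illustration.
TOY_VOCAB = {0: "<eos>", 1: "hello", 2: "<call>", 3: "world"}


def toy_decode(token_ids):
    return " ".join(TOY_VOCAB[t] for t in token_ids)
```

Without the EOS branch in `is_finished`, a batch in which one sequence ends with EOS and never emits a stop string would keep generating until the token limit.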

Re testing:
I only made sure that the tests in tests/test_environments.py complete.
With make test, some tests failed and the suite took a long time to run. However, the only tests that call TextEnvironment seem to be in test_environments.py, so the rest should be unaffected as far as I know. Nevertheless, I would be grateful if somebody else could run all the tests before merging. I suspect that my environment may not be ideally configured. Is testing automated via CI?

konrad-gerlach · 2025-01-10T16:02:23Z

I would be very grateful for a review by:
@lvwerra
@vwxyzjn
@younesbelkada
@qgallouedec
or any others who feel up to the task.

konrad-gerlach · 2025-01-10T21:56:55Z

I was unable to execute the pre-commit hook, so I manually ran the linter.

docs/source/text_environments.md

trl/environment/base_environment.py

qgallouedec · 2025-01-12T15:48:37Z

Thanks for the PR!
Let's see what the CI outputs.

HuggingFaceDocBuilderDev · 2025-01-12T15:52:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

konrad-gerlach · 2025-01-12T18:46:43Z

Just to be sure, as I'm unfamiliar with their implementation: The trl Trainers like PPO should not try to back propagate through the generated tokens, right?


konrad-gerlach · 2025-01-12T19:59:57Z

The CI failing for Python 3.9 seems unrelated to this PR.

qgallouedec · 2025-01-12T20:49:07Z

> The trl Trainers like PPO should not try to back propagate through the generated tokens, right?

Yes, that's correct. The backprop is done on the output of a forward pass.

konrad-gerlach · 2025-01-12T21:21:55Z

@qgallouedec Could you run pre-commit to fix the linting issues? I haven't gotten it to work.


konrad-gerlach · 2025-01-15T22:59:24Z

I'm still working on adding some more tests and cleaning up the code a bit.

Konrad Gerlach added 9 commits January 10, 2025 10:02

feat: add caching for TextEnvironment and fix bugs

ab86162

feat: make TextEnvironment caching optional and add documentation

d09ec63

fix: failing TextEnvironment tests

b7885cc

test: add tests for TextEnvironment caching and fix cache combining bug

034c5f7

test: remove unnecessary parametrized class decorator

18eb106

docs: update TextEnvironmentDocs with caching

44fd184

fix: run linter on TextEnvironment and TextEnvironment tests

28601c2

fix: comment

2a7ec4e

fix: Args comment

af06d63

konrad-gerlach marked this pull request as draft January 10, 2025 15:16

fix: TextEnvironment cache combination and batching issue

f6f12b5

konrad-gerlach marked this pull request as ready for review January 10, 2025 15:57

konrad-gerlach force-pushed the text_environment_caching branch from 6a87c8d to 3f57ee9 January 10, 2025 16:27

tests: make caching test more complex

ede7e81

konrad-gerlach force-pushed the text_environment_caching branch from 3f57ee9 to ede7e81 January 10, 2025 16:33

konrad-gerlach marked this pull request as draft January 11, 2025 10:47

fix: combine caches of different sequence lengths

acddaa7

konrad-gerlach marked this pull request as ready for review January 11, 2025 12:58

Konrad Gerlach added 2 commits January 12, 2025 16:36

docs: update caching warning

e38940e

fix: prevent bos tokens in tool response

66d0ce4

qgallouedec reviewed Jan 12, 2025

docs/source/text_environments.md (outdated, resolved)

qgallouedec reviewed Jan 12, 2025

trl/environment/base_environment.py (outdated, resolved)

konrad-gerlach and others added 3 commits January 12, 2025 19:48

docs: Update docs/source/text_environments.md

a051e46

Co-authored-by: Quentin Gallouédec <[email protected]>

Update trl/environment/base_environment.py

9ea9287

Co-authored-by: Quentin Gallouédec <[email protected]>

Merge branch 'main' into text_environment_caching

ae1233a

fix: code cleanup

a2860bc

Konrad Gerlach and others added 2 commits January 14, 2025 22:06

fix: attended to invalid last generated token and off-by-one in StringStoppingCriteria

23014fb

Merge branch 'main' into text_environment_caching

bdaa922
