Speculative sampling #1410
Conversation
fix update past key values for target model
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
@haim-barad please check the CI failures, as it seems some of them are quite easy to fix
Line #31. res = target_model(x_draft_target_input, attention_mask=torch.ones(x_draft_target_input.size(), dtype=torch.long), use_cache=False)

Why use_cache=False? You should use target_past_kv for the target_model inference based on x_draft_target_input:

if target_past_kv is None:
    x_draft_target_input = torch.cat((x, x_draft), dim=1)
else:
    x_draft_target_input = x_draft
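For context, a minimal sketch of what this suggested change could look like end to end. The function wrapper and the Hugging Face-style `past_key_values` API are assumptions of mine, not the notebook's actual code:

```python
import torch

def target_forward(target_model, x, x_draft, target_past_kv=None):
    """One target-model pass over the draft tokens, reusing the KV cache.

    Sketch only: `x` is the accepted prefix, `x_draft` holds the K candidate
    tokens from the draft model; a Hugging Face-style `past_key_values`
    interface is assumed.
    """
    if target_past_kv is None:
        # No cache yet: the target model must see the full prefix + draft.
        x_draft_target_input = torch.cat((x, x_draft), dim=1)
    else:
        # Prefix already cached: feed only the new draft tokens.
        x_draft_target_input = x_draft
    res = target_model(
        x_draft_target_input,
        past_key_values=target_past_kv,
        use_cache=True,  # keep the cache instead of use_cache=False
    )
    return res.logits, res.past_key_values
```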
Line #46. if np.random.random() < min(1, (q_item / p_item)): # accepted

q_item is a raw logit, while p_item is a probability after softmax. You should apply softmax to q first.
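In code, the fix could look like this (a hedged sketch; the helper name is mine, while `q_item`/`p_item` mirror the notebook's names):

```python
import numpy as np
import torch

def accept_draft_token(q_logits, p_probs, token_id):
    """Acceptance test for one draft token, with the missing softmax applied.

    `q_logits` (target model output) are raw logits, while `p_probs`
    (draft model) are already probabilities, so q must be normalized
    before forming the ratio min(1, q/p).
    """
    q_probs = torch.softmax(q_logits, dim=-1)  # the missing softmax
    q_item = q_probs[token_id].item()
    p_item = p_probs[token_id].item()
    return np.random.random() < min(1.0, q_item / p_item)  # accepted?
```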
Line #58. target_past_kv = target_new_past_kv

You should update target_past_kv based on the number of accepted tokens.
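One possible way to do that, assuming the legacy Hugging Face tuple layout for `past_key_values` (layers × (key, value), each of shape [batch, num_heads, seq_len, head_dim]); the helper and the `prefix_len`/`n_accepted` names are illustrative, not from the notebook:

```python
def trim_target_past_kv(past_kv, n_kept):
    """Cut the target model's KV cache back to the accepted prefix.

    After some draft tokens are rejected, the cache must only retain
    entries for tokens that were actually kept.
    """
    return tuple(
        (k[:, :, :n_kept, :], v[:, :, :n_kept, :]) for k, v in past_kv
    )

# Instead of `target_past_kv = target_new_past_kv`, something like:
# target_past_kv = trim_target_past_kv(target_new_past_kv, prefix_len + n_accepted)
```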
The Code_check failure is in another notebook. The spelling errors ignore the tag (as noted in your instructions). Treon (Ubuntu, for 3.8 and 3.9) fails due to a GPU timeout... but I clearly have CPU as the default device.
@haim-barad, for spell check you need to add the unknown words reported by the tool to the vocabulary. You can find more info about that in the contributing guide.
I had added the required entries, and now I see the latest push created some conflicts. I apologize for this, but can someone with permissions do a merge?
The code check is failing for 263-latent-consistency-models-image-generation, not this notebook.
Both code_check and docker_treon are failing on other notebooks, not because of this PR. Please approve. Thanks. |
There's still a small link to fix. I see this was renumbered to 266. There's an "Open in Colab" button in the readme; the link needs to be 266 instead of 265. Can one of you fix it to avoid the review process?
Works with DollyV2 on machines with at least 64GB of local memory. By default, we use GPT2 for smaller machines.
Leveraging speculative sampling with KV caching, this notebook's code generates text with both standard autoregressive sampling and speculative sampling, and compares the time needed to generate N tokens. By default, N is set to 100. Another parameter, K, defaults to 5 and controls the number of candidate tokens generated by the smaller draft model per iteration.
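As a reference for how those pieces fit together, here is a minimal sketch of the speculative sampling loop (no KV caching, batch size 1, Hugging Face-style causal LMs assumed; the notebook's actual implementation adds caching and timing):

```python
import torch

@torch.no_grad()
def speculative_sample(target_model, draft_model, x, N=100, K=5):
    """Generate roughly N new tokens from `x` using K draft candidates per step."""
    n_start = x.shape[1]
    while x.shape[1] < n_start + N:
        # 1. Draft model proposes K candidate tokens autoregressively.
        x_draft = x
        for _ in range(K):
            p = torch.softmax(draft_model(x_draft).logits[:, -1, :], dim=-1)
            x_draft = torch.cat((x_draft, torch.multinomial(p, 1)), dim=1)
        # 2. Score prefix + candidates with both models in a single pass each.
        q_all = torch.softmax(target_model(x_draft).logits, dim=-1)
        p_all = torch.softmax(draft_model(x_draft).logits, dim=-1)
        # 3. Accept candidate i with probability min(1, q_i / p_i).
        L, accepted = x.shape[1], 0
        for i in range(K):
            tok = x_draft[0, L + i]
            q_i = q_all[0, L + i - 1, tok].item()
            p_i = p_all[0, L + i - 1, tok].item()
            if torch.rand(1).item() < min(1.0, q_i / p_i):
                accepted += 1
            else:
                # Rejected: resample this position from max(0, q - p), renormalized
                # (the q == p edge case is ignored in this sketch).
                diff = torch.clamp(q_all[0, L + i - 1] - p_all[0, L + i - 1], min=0)
                tok = torch.multinomial(diff / diff.sum(), 1).view(1, 1)
                break
        else:
            # All K accepted: draw one bonus token from the target distribution.
            tok = torch.multinomial(q_all[0, -1], 1).view(1, 1)
        x = torch.cat((x_draft[:, : L + accepted], tok), dim=1)
    return x
```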
This code uses Gradio to provide an interface that labels the models used and handles the input.
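A rough sketch of how such a Gradio interface could be wired up (the callback and labels here are illustrative placeholders, not the notebook's actual code):

```python
import gradio as gr

def generate_both(prompt):
    # Placeholder callback: in the notebook, this would run both the
    # autoregressive and speculative loops and return their outputs/timings.
    return (f"[autoregressive output for: {prompt}]",
            f"[speculative output for: {prompt}]")

demo = gr.Interface(
    fn=generate_both,
    inputs=gr.Textbox(label="Prompt"),
    outputs=[
        gr.Textbox(label="Autoregressive (target model only)"),
        gr.Textbox(label="Speculative (draft + target)"),
    ],
)
demo.launch()
```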