This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Better generation stats #331

Open
LLukas22 opened this issue Jun 25, 2023 · 4 comments
Labels
meta:help-wanted Extra attention is needed meta:maintenance Changes that will make it easier for us to maintain code topic:api-design API design considerations, including new functionality and changes

Comments

@LLukas22
Contributor

I'm currently facing an issue where generation on a GPU sometimes slows down, and it's very hard to determine why (see #325).

It would be great if we could have an option to get more detailed information from the generation process. Maybe we could divide the per-token times into the following categories:

  • Forward pass: Raw time spent in the evaluate function of the model
  • Sampler: Time spent sampling the tokens
  • Decoding: Time taken by the tokenizer to decode the tokens
  • Printing: Time spent invoking the callback and printing to the CLI
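
As a rough illustration of the four categories above, here is a minimal sketch of a per-token timing record. `TokenTimings`, `timed`, and `generate_one_token` are hypothetical names invented for this example and are not part of the crate's API; the closures stand in for the real forward pass, sampler, tokenizer, and callback.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-token timing breakdown. The fields mirror the four
/// proposed categories and do not correspond to any existing API.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct TokenTimings {
    pub forward_pass: Duration,
    pub sampler: Duration,
    pub decoding: Duration,
    pub printing: Duration,
}

impl TokenTimings {
    /// Sum of the four measured categories for this token.
    pub fn total(&self) -> Duration {
        self.forward_pass + self.sampler + self.decoding + self.printing
    }
}

/// Runs `f`, adds its wall-clock duration to `slot`, and returns its result.
pub fn timed<T>(slot: &mut Duration, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    *slot += start.elapsed();
    out
}

/// Stand-in for one generation step. Every closure is a placeholder
/// (assumption) for the corresponding real stage.
pub fn generate_one_token() -> (String, TokenTimings) {
    let mut t = TokenTimings::default();

    // Placeholder "forward pass": returns fixed logits.
    let logits = timed(&mut t.forward_pass, || vec![0.1_f32, 0.7, 0.2]);

    // Placeholder "sampler": greedy argmax over the logits.
    let token_id = timed(&mut t.sampler, || {
        logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i)
            .unwrap()
    });

    // Placeholder "decoding": turns the token id into text.
    let text = timed(&mut t.decoding, || format!("<tok{token_id}>"));

    // Placeholder "printing": stands in for the user callback.
    timed(&mut t.printing, || print!("{text}"));

    (text, t)
}
```

Collecting one `TokenTimings` per generated token would make it possible to see which category grows when generation slows down.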
@LLukas22 LLukas22 added meta:help-wanted Extra attention is needed meta:maintenance Changes that will make it easier for us to maintain code topic:api-design API design considerations, including new functionality and changes labels Jun 25, 2023
@jafioti
Contributor

jafioti commented Jun 25, 2023

It would also be helpful to see the max and min time of each category, alongside the mean.
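
A minimal sketch of how min, max, and mean could be computed for one category, assuming the per-token durations have already been collected into a slice. `DurationStats` and `summarize` are hypothetical names used only for illustration.

```rust
use std::time::Duration;

/// Hypothetical summary of one timing category across all tokens.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DurationStats {
    pub min: Duration,
    pub max: Duration,
    pub mean: Duration,
    pub count: u32,
}

/// Summarizes a slice of per-token durations.
/// Returns `None` when there are no samples, since min/max/mean are undefined.
pub fn summarize(samples: &[Duration]) -> Option<DurationStats> {
    if samples.is_empty() {
        return None;
    }
    // Assumption: fewer than u32::MAX samples; a larger count would truncate here.
    let count = samples.len() as u32;
    let min = *samples.iter().min()?;
    let max = *samples.iter().max()?;
    let total: Duration = samples.iter().sum();
    Some(DurationStats {
        min,
        max,
        mean: total / count,
        count,
    })
}
```

Calling `summarize` once per category (forward pass, sampler, decoding, printing) would give the requested min/max/mean table at the end of a generation run.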

@philpax
Collaborator

philpax commented Jun 25, 2023

Sounds good to me. Would anyone be interested in doing this?

@LLukas22
Contributor Author

I could give it a try, but I'm still kinda busy with the CUDA/OpenCL stuff and I have no idea how I would implement performance metrics and logging correctly in Rust 😬

@philpax
Collaborator

philpax commented Jun 26, 2023

You can probably just use std::time::Instant - it should be precise enough for this application. Just create some Instants at each measurement point, then call .elapsed() on them to find the amount of time that has passed since that instant.
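
For reference, a minimal sketch of that pattern with two measurement points. `measure_two_stages` is a hypothetical function, and `sleep` merely stands in for real work such as the forward pass or sampling.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Measures two consecutive stages separately and returns both durations.
/// The `sleep` calls are placeholders for the actual work being timed.
pub fn measure_two_stages(first: Duration, second: Duration) -> (Duration, Duration) {
    // Measurement point 1: create an Instant just before the stage starts.
    let stage_one = Instant::now();
    sleep(first);
    // `.elapsed()` returns the time that has passed since `stage_one` was created.
    let stage_one_elapsed = stage_one.elapsed();

    // Measurement point 2: a fresh Instant keeps the stages independent.
    let stage_two = Instant::now();
    sleep(second);
    let stage_two_elapsed = stage_two.elapsed();

    (stage_one_elapsed, stage_two_elapsed)
}
```

`Instant` is a monotonic clock, so the measured durations are not affected by adjustments to the system wall-clock time.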


3 participants