This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

feat(tracing): add tracing to llm and llm-base crates #367

Merged 1 commit into rustformers:main on Jul 16, 2023

Conversation

@radu-matei (Contributor)

Hi, everyone!
First, thanks for the project!

This commit begins adding tracing and some instrumentation to this project.

As folks begin to use and optimise this project, I suspect this will be very useful.
Below is an example of sending the data to an OpenTelemetry collector and visualising it:

[image: trace visualisation in an OpenTelemetry collector UI]

This is a draft PR for now, while we work out whether this is something we'd like to add to the project.
If it is, it can be expanded to include more robust logging and instrumentation.
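
To give a flavour of what the instrumentation looks like, here is a minimal sketch using the tracing crate. The function name and fields are hypothetical stand-ins for the spans this PR adds around llm-base's loader and inference session:

```rust
use tracing::{info, instrument};

// Hypothetical stand-in for an instrumented llm-base function;
// #[instrument] wraps each call in a span that appears in the traces.
#[instrument(skip(prompt))]
fn feed_prompt(prompt: &str) {
    info!(prompt_len = prompt.len(), "Finished feed prompt");
}

fn main() {
    // A plain fmt subscriber is enough to see spans and events locally.
    tracing_subscriber::fmt().init();
    feed_prompt("Tell me about llamas.");
}
```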

@LLukas22 (Contributor)

See #331. I would love to have some better tracing/performance stats ❤️

@philpax (Collaborator) left a comment


No objections from me! Would you like to do more or should I merge this PR as-is?

We may also want to switch over llm-cli and other applications to use tracing, so that we don't have duplicate loggers.
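
For context, unifying on tracing would mean the CLI installs a single global subscriber along these lines (a sketch, assuming tracing-subscriber with its env-filter feature enabled; not the exact code in this PR):

```rust
use tracing_subscriber::EnvFilter;

fn main() {
    // One global subscriber for the whole CLI, configured via RUST_LOG,
    // replacing any separate log/env_logger setup.
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::from_default_env())
        .init();

    // ... rest of the CLI
}
```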

@radu-matei (Contributor, Author)

I can update the CLI to use tracing as part of this PR as well, yeah.

@philpax (Collaborator) commented Jul 15, 2023

Cool, go ahead and switch things over to tracing. Should be a pretty straightforward PR :)

@radu-matei (Contributor, Author)

Besides consuming the tracing data in an OTEL environment, as demonstrated above, you can now also get basic logging with timestamps when using the llm CLI:

```
$ RUST_LOG=llm=trace ./target/release/llm ...
⣾ Loading model...2023-07-16T11:56:04.390067Z TRACE infer: llm_base::loader: Read model file from "/Users/radu/models/llama/open-llama-13b-open-instruct.ggmlv3.q3_K_L.bin"
⣾ Loaded hyperparameters2023-07-16T11:56:04.538068Z TRACE infer: llm_base::loader: Loaded GGML model from reader
2023-07-16T11:56:04.538102Z TRACE infer: llm_base::loader: Determined quantization version of model as 2
2023-07-16T11:56:04.538163Z TRACE infer: llm_base::loader: Context size: 92928
2023-07-16T11:56:04.538175Z DEBUG infer: llm::cli_args: ggml ctx size = 92.9 KB
✓ Loaded 363 tensors (6.9 GB) after 150ms
2023-07-16T11:56:04.555690Z TRACE infer: llm_base::loader: Loaded model
2023-07-16T11:56:04.559279Z TRACE infer:infer:infer: llm_base::inference_session: Starting inference request with max_token_count: 18446744073709551615
2023-07-16T11:56:12.273385Z TRACE infer:infer:infer:feed_prompt: llm_base::inference_session: Finished feed prompt
```
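
For anyone wanting to reproduce the OTEL setup from the screenshot above, exporting the same spans to a collector looks roughly like this (a sketch assuming the opentelemetry-otlp and tracing-opentelemetry crates and a collector listening on the default OTLP/gRPC endpoint; not the exact code from this PR):

```rust
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;

fn main() {
    // Export spans over OTLP/gRPC to a collector (localhost:4317 by default).
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(opentelemetry_otlp::new_exporter().tonic())
        .install_simple()
        .expect("failed to install OTLP tracer");

    // Bridge tracing spans into OpenTelemetry and install the subscriber.
    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(tracer))
        .init();

    // ... run inference as usual; spans now reach the collector
}
```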

@radu-matei marked this pull request as ready for review July 16, 2023 11:57
@philpax merged commit 0269796 into rustformers:main Jul 16, 2023
14 checks passed
@philpax (Collaborator) commented Jul 16, 2023

Excellent, thank you :)

@yangyaofei commented Aug 4, 2023

After this PR, running the following command prints nothing:

```
cargo run --release -- info -a bloom --model-path models/ggml-model-f16.bin
```

I had to add RUST_LOG=llm=trace to every command. Is that intended?
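
If the CLI builds its filter straight from RUST_LOG, an unset variable silences all output, including what the info subcommand used to print. A common fix is to fall back to a default level when the variable is absent (a sketch, not necessarily what the maintainers chose):

```rust
use tracing_subscriber::EnvFilter;

fn main() {
    // Honour RUST_LOG when set; otherwise default to `llm=info`
    // so commands like `llm info` still print their output.
    let filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new("llm=info"));
    tracing_subscriber::fmt().with_env_filter(filter).init();
}
```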

@hhamud mentioned this pull request Aug 7, 2023