Add token metrics: model_input_tokens and model_output_tokens #2006

Open
wants to merge 1 commit into
base: master

Conversation

luohua13

No description provided.

@CLAassistant

CLAassistant commented Dec 24, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@RobertSamoilescu
Contributor

RobertSamoilescu commented Jan 7, 2025

Hi @luohua13,

Thank you so much for opening this PR. This is indeed a useful metric to expose when working with LLMs. I would suggest abstracting away the use of Prometheus and using custom metrics instead, as presented in the docs here. The only issue with the custom metrics approach is that MLServer currently supports only histogram-type metrics, as seen here. If you think that a histogram might not be appropriate, it would be great if you could consider adding support for Counters as well.

Other things to consider, although those might be already addressed, are:

  • ensure that this PR either does not affect other pipeline tasks or generalizes across them (e.g., classification, QA, etc.)
  • padding: when performing batch inference, some inputs might be padded (how do we count the tokens in this case?)
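
To make the padding point concrete, here is a minimal sketch of one possible way to count only the real tokens in a padded batch. It assumes the tokenizer returns an attention mask with 1 for real tokens and 0 for padding; the function name `count_real_tokens` and the sample batch are hypothetical and not taken from this PR.

```python
from typing import List


def count_real_tokens(attention_mask: List[List[int]]) -> List[int]:
    """Return the number of non-padding tokens for each sequence in a batch.

    Assumes each row holds 1 for a real token and 0 for a padding token.
    """
    return [sum(row) for row in attention_mask]


# Hypothetical batch of two inputs, padded to a common length of 5.
attention_mask = [
    [1, 1, 1, 1, 1],  # 5 real tokens, no padding
    [1, 1, 0, 0, 0],  # 2 real tokens, 3 padding tokens
]

per_input = count_real_tokens(attention_mask)

# Summing the mask ignores padding; counting row lengths would include it.
total_input_tokens = sum(per_input)
padded_total = sum(len(row) for row in attention_mask)
```

Counting from the mask rather than from the padded tensor shape keeps the metric independent of batch composition, so the same request reports the same token count whether it is served alone or batched with longer inputs.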

Please also add a brief description of what this PR is trying to solve, what your approach is, etc.

Thank you again for your contribution!


3 participants