Iris: Track the token usage of requests #9282

Open
Hialus opened this issue Sep 4, 2024 · 2 comments · May be fixed by #9455 or ls1intum/Pyris#165

Comments

@Hialus
Member

Hialus commented Sep 4, 2024

Is your feature request related to a problem?

Iris currently does not track the token usage of requests at all. This means instructors/admins can't have an overview of the incurred costs of a course.

Describe the solution you'd like

  • Track the input and output tokens from LLMs in Pyris
    • This would be a part of the LLM subsystem
  • Send the token count to Artemis with the (final) response/result/artifact
  • Support cost calculation on Artemis's side
    • Each model already has a cost argument. This could be made mandatory and then sent to Artemis
    • This should support the usage of different models with different costs in a single request (e.g. self-hosted and GPT 4o)
    • This may mean that the token tracking needs to be model-aware and store the results in a map or list with an entry for each model
  • Artemis should allow admins to view the token usage and cost on an instance level
  • Instructors and Admins should be able to see the token usage and cost on a course level (and maybe exercise level)
  • The UI should be discussed further, but some initial ideas:
    • The UI should show stats about token usage/cost per request (min, max, median, etc.)
    • The UI could show some graphs about the historical data (e.g., cost per week, cost per exercise)
    • The UI should differentiate between features (and variants)
  • The tracking needs to be feature-independent and should work for exercise chats, course chats, competency generation, lecture ingestion, and any future features
  • This means that the data in Artemis must be stored separately from IrisSessions and IrisMessages with an optional link to the source
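
To make the model-aware tracking idea above more concrete, here is a minimal sketch of how token counts could be accumulated per model within a single request. The names `TokenUsage`, `TokenTracker`, `record`, and `as_payload` are hypothetical and do not exist in Pyris; the model identifiers and token numbers are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Token counts for one model (hypothetical type, not part of Pyris)."""
    input_tokens: int = 0
    output_tokens: int = 0


class TokenTracker:
    """Accumulates token usage per model over the lifetime of one request."""

    def __init__(self) -> None:
        # One entry per model, so mixed pipelines (e.g. self-hosted + GPT 4o)
        # keep their counts separate.
        self._usage: dict[str, TokenUsage] = defaultdict(TokenUsage)

    def record(self, model: str, input_tokens: int, output_tokens: int) -> None:
        """Add the tokens of a single LLM call to the running totals."""
        entry = self._usage[model]
        entry.input_tokens += input_tokens
        entry.output_tokens += output_tokens

    def as_payload(self) -> list[dict]:
        """Build a list that could be attached to the final response."""
        return [
            {
                "model": model,
                "input_tokens": usage.input_tokens,
                "output_tokens": usage.output_tokens,
            }
            for model, usage in self._usage.items()
        ]


# Example: three LLM calls in one request, two of them to the same model.
tracker = TokenTracker()
tracker.record("gpt-4o", 1200, 300)
tracker.record("self-hosted", 800, 150)
tracker.record("gpt-4o", 400, 100)
payload = tracker.as_payload()
```

Keeping one entry per model means the cost can later be computed per model on the Artemis side, rather than from a single aggregated count.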

Describe alternatives you've considered

LangSmith can track this, but this is a paid service and does not integrate into Artemis.

Additional context

This does not all have to be done in a single PR. The UI can especially be done and/or improved in separate PRs.

@alexjoham
Member

Hey @Hialus, where in the Pyris code can I find the cost argument that you say every model already has? Also, regarding storing the token usage: you expect me to store it in a database table, am I right?
About the UI, do you want me to add the overview in the Iris section and on the exercise level in a separate section, or is it enough to include the graphs in sections like "Statistics" on the Course level or near the score statistics in the exercise section?

@Hialus
Member Author

Hialus commented Sep 15, 2024

Hey @alexjoham,

  • The cost is currently a part of the capabilities of each model. This cost is currently not tied to any specific unit, but our own config afaik uses $/1M tokens. This can be changed to something more reasonable, should you see fit.
  • Yes, the used input/output tokens and the cost per (n) input/output token(s) should be stored in a separate database table. This table could then have an optional column to link to the source (e.g. IrisMessage). I'd also suggest first creating the entities in Java and the migration in Liquibase and then getting them reviewed by some developers before you implement the rest. Otherwise, it may lead to additional work if you have to change the database schema.
  • The UI is up for debate and should be discussed with @bassner and possibly also @krusche. Imo the already existing Iris settings pages could be extended to include these statistics. This would also be the place where I would put any future Iris-related information. However, as I already mentioned, the UI is not the most important thing right now, so you could already start with the server implementation and take care of the UI once this is discussed.
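
As a rough illustration of the table row described above, the following sketch shows one possible record shape and how a cost could be derived from it, assuming the $/1M tokens unit. It is written in Python only to show the shape; the actual entity would be a Java class with a Liquibase migration. `TokenUsageRecord`, its field names, `course_id`, and the prices used are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenUsageRecord:
    """One stored usage row (hypothetical shape, not the Artemis schema)."""
    model: str
    input_tokens: int
    output_tokens: int
    # Assumed unit: US dollars per 1M tokens, stored with the row so that
    # later price changes do not alter historical costs.
    cost_per_million_input: float
    cost_per_million_output: float
    course_id: Optional[int] = None
    # Optional link to the source, e.g. the id of an IrisMessage.
    source_message_id: Optional[int] = None

    def cost(self) -> float:
        """Total cost of this row in dollars."""
        return (
            self.input_tokens * self.cost_per_million_input
            + self.output_tokens * self.cost_per_million_output
        ) / 1_000_000


# Made-up prices and counts, for illustration only.
record = TokenUsageRecord("gpt-4o", 1600, 400, 5.0, 15.0, course_id=1)
total = record.cost()
```

Because the link to the source is optional, the same row shape works for features that have no chat message, such as lecture ingestion.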

If you have further questions, feel free to reach out on Slack.

@alexjoham alexjoham linked a pull request Oct 11, 2024 that will close this issue
17 tasks