[EPIC] Adapting Foundation Models #38

Open
11 tasks
Shreyanand opened this issue Apr 17, 2023 · 0 comments
Labels
enhancement New feature or request

Comments


Shreyanand commented Apr 17, 2023

Foundation models need to be adapted for specific use cases and domains. There are several questions around how to target different use cases. As part of this epic, we will find answers to the following questions:

  • How do different variants of LLMs compare with each other in terms of architecture (input tokens, hidden and attention layers, parameter counts, encoder/decoder variations), licenses, hardware utilization, etc.?
  • What is the difference between small FMs (<15B parameters) and large FMs (>50B)?
    • How does performance vary between few-shot prompting of large models and fine-tuning of smaller models?
    • Distributed fine tuning of LLMs  #49
    • Do we need a hierarchy of models for specific tasks? For example, one large base model for text generation and two smaller models, one for code generation and one for documentation QA? What's the difference between Bloom 13B and Bloom 3B?
    • Do smaller models have a smaller context window or token limit, and is that a limitation? How do the models use context? In other words, how is the model's learned knowledge complemented by the context to generate a response?
    • What is the relevance of vector databases in these solutions? Are they still relevant in smaller fine-tuned models with smaller context windows?
    • What are the production cost and performance comparisons of these approaches? Design experiments to demonstrate some of these comparisons.
  • What is the role of datasets in fine-tuning? Does fine-tuning for a domain require a QA-format dataset or a self-supervised masked-language-modeling dataset (recheck)? Can we try BERT-based models, which have a different architecture?
  • What are the various steps that take place in QA with FMs? A mechanism to introspect language chain operations #30
  • Adapt learnings from this epic to solve the ROSA use case: [spike] Fine-tuning options for LLMs #18
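The hardware-utilization and small-vs-large questions above can be made concrete with back-of-the-envelope memory math. A minimal sketch, assuming fp16 weights for inference and the common 16-bytes-per-parameter rule of thumb for mixed-precision Adam fine-tuning; the 50B entry is just an illustrative "large FM" size, not a specific model:

```python
# Rough GPU memory estimates for FM variants. Assumptions (not exact
# numbers for any real model): fp16 weights = 2 bytes/parameter;
# full fine-tuning with mixed-precision Adam adds fp16 gradients (2)
# plus fp32 master weights, momentum, and variance (4 + 4 + 4).

def inference_mem_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GB needed just to hold the weights at inference."""
    return n_params * bytes_per_param / 1e9

def finetune_mem_gb(n_params: float) -> float:
    """Approximate GB for full fine-tuning: weights + grads + Adam states."""
    return n_params * (2 + 2 + 4 + 4 + 4) / 1e9

for name, n in [("Bloom 3B", 3e9), ("Bloom 13B", 13e9), ("Large FM", 50e9)]:
    print(f"{name}: ~{inference_mem_gb(n):.0f} GB inference, "
          f"~{finetune_mem_gb(n):.0f} GB full fine-tuning")
```

These figures ignore activations and KV cache, but they already show why full fine-tuning of a 13B model needs multi-GPU setups (motivating the distributed fine-tuning issue #49) while a 3B model can fit on a single large accelerator.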
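For the few-shot-prompting-vs-fine-tuning question, note that the prompting side requires no training at all, only prompt construction. A minimal sketch of building a few-shot prompt, assuming a generic completion-style LLM downstream; the Q/A pairs and helper name are made up for illustration:

```python
# Hypothetical helper: turn labeled (question, answer) examples plus an
# unanswered query into a single few-shot completion prompt.

def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Concatenate answered examples, then the query with a trailing 'A:'
    so a completion model continues with the answer."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)

examples = [
    ("What does ROSA stand for?", "Red Hat OpenShift Service on AWS."),
    ("Is Bloom openly licensed?", "Yes, under the BigScience RAIL license."),
]
print(build_few_shot_prompt(examples, "What is a foundation model?"))
```

One experiment design for this epic would hold the task fixed and compare this zero-training approach on a large model against a small model fine-tuned on the same examples, measuring both accuracy and serving cost.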
@Shreyanand Shreyanand added the enhancement New feature or request label Apr 17, 2023