lesser powerful gpu #140

Closed
xprabhudayal opened this issue Oct 16, 2024 · 4 comments

Comments

@xprabhudayal

I've been running this model on a high-end instance (H100) for a while, and I have some questions:

  1. Can we run this model on a less powerful GPU, such as a V100?
  2. Can it run on Colab Pro with an A100?
@NathanaelTamirat

How long does it take on an H100? Have you tried any other, cheaper GPUs?
Also, can this project generate papers in fields other than IT?

@BradKML

BradKML commented Dec 21, 2024

@xprabhudayal You can ask the agent to run smaller-scale experiments that target fine-grained issues (e.g. activation functions, classical ML with HPO, easy NLP optimization) rather than reaching all the way into DL territory. As for idea generation, others have said it takes "a few hours" (2-4 hours) for more compute-heavy concepts. #145
One small thought: create a diverse benchmark covering a wider range of common targets.

@NathanaelTamirat Check this (currently it is made for data-centric experiments, not physical research): #145 (comment)

@xprabhudayal
Author

I have run this agent on Google Colab for free using the T4 GPU. The experiment part took me around 2 to 3 hours for NanoGPT lite (I had to reverse engineer Colab to open up a terminal).
I generated a paper using the Qwen 2.5 72B LLM via API.

When the LLM is accessed via API, the only part that uses the GPU is the experiment agent, as shown in the middle section of the diagram.

@BradKML

BradKML commented Dec 21, 2024

@xprabhudayal so the GPU is only used for writing code and running experiments? Could you share your setup and the time needed for each stage (paper fetching, ideation, code-gen, experiment, drafting, reviewing)?
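
For the per-stage timings, a rough sketch like the one below might be enough. It uses only the Python standard library and is not part of this project: the `timed` and `report` helpers are hypothetical, the stage names are just the ones listed above, and the `time.sleep` call is a placeholder for wherever each stage actually runs in your setup.

```python
import time
from contextlib import contextmanager

# Hypothetical stage names copied from the list above; not the project's API.
STAGES = ["paper fetching", "ideation", "code-gen",
          "experiment", "drafting", "reviewing"]

timings = {}


@contextmanager
def timed(stage, store=timings):
    """Add the wall-clock seconds spent inside the with-block to store[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[stage] = store.get(stage, 0.0) + time.perf_counter() - start


def report(store=timings):
    """Return one line per stage, slowest stage first."""
    ordered = sorted(store.items(), key=lambda kv: -kv[1])
    return [f"{name}: {secs:.1f} s" for name, secs in ordered]


if __name__ == "__main__":
    for stage in STAGES:
        with timed(stage):
            time.sleep(0.01)  # placeholder for the real work of this stage
    print("\n".join(report()))
```

Wrapping each stage in `with timed("experiment"):` accumulates time under that name, so repeated runs of the same stage are summed rather than overwritten.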
