
Option to use Colab disk instead of MyDrive #211

Open · Fatih120 opened this issue Sep 19, 2024 · 2 comments

Comments

@Fatih120

Fatih120 commented Sep 19, 2024

Mounting MyDrive is typically helpful, but reading and writing through it is much slower than using the rest of the content directories. I have my own personal set of images to train on; the files are tiny, about 30 KB each, and unpacking the archive should be no sweat for any consumer PC, but unpacking it through Colab's 7z from Drive is very slow, possibly due to rate limits. It's enough time to go out and get some breakfast. It's not worth uploading these images one by one through GDrive either, or moving them back and forth, so I believe it would be a benefit to provide the option to process the images through the Dataset Maker and then train entirely on Colab storage, downloading any results that are needed afterwards.

Glancing at the first cell of the Dataset Maker, it seems that simply preventing GDrive from connecting sets the directory to ~/Loras regardless, but it would still be useful to have this as a visible option, and to still be able to link GDrive if it's needed for other purposes.

(It should be noted that store-only (uncompressed) archive formats like .tar probably work best with Colab.)
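As a rough sketch of the workaround I mean (the archive path on Drive is hypothetical, and ~/Loras is just the folder the notebook already uses): copy the archive to the Colab disk in one go, then extract it locally, so the thousands of tiny files never touch the mounted Drive.

```python
import os
import shutil
import tarfile

drive_archive = "/content/drive/MyDrive/my_dataset.tar"  # hypothetical path on Drive
local_archive = "/content/my_dataset.tar"                 # Colab local disk
dataset_dir = os.path.expanduser("~/Loras/my_project/dataset")

# One big sequential copy from Drive is much faster than writing
# thousands of ~30 KB files through the mounted Drive.
shutil.copy(drive_archive, local_archive)

os.makedirs(dataset_dir, exist_ok=True)
with tarfile.open(local_archive) as tar:
    tar.extractall(dataset_dir)  # extraction now only touches local disk
```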

@hollowstrawberry
Owner

Could you explain why the following doesn't fit your use case? You can upload a zip file to your Drive and use the dedicated cell to extract it to a path in the Colab storage.
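For illustration, the suggested flow amounts to something like the following (this is not the actual notebook cell, and the zip path is hypothetical):

```python
from google.colab import drive
import os
import zipfile

drive.mount("/content/drive")

zip_on_drive = "/content/drive/MyDrive/my_dataset.zip"  # uploaded once to Drive
extract_to = "/content/dataset"                          # Colab local storage

os.makedirs(extract_to, exist_ok=True)
with zipfile.ZipFile(zip_on_drive) as zf:
    zf.extractall(extract_to)  # Drive is read once; files land on the fast local disk
```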

@uYouUs
Contributor

uYouUs commented Sep 21, 2024

Some friends and I have been testing things to get more performance, and this is one of the things we were experimenting with. You can try it here: https://colab.research.google.com/github/uYouUs/Hollowstrawberry-kohya-colab/blob/Threading/Lora_Trainer_XL_threaded.ipynb

So far we've only tested with Pony and not the other models, so I can't speak to those. But with Pony we're seeing 2-3 minutes saved; I think it depends on the hardware or timing you get. The colab is changed a little, because part of what makes it faster is unzipping on the go, so extraction is currently set up at the beginning.
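A minimal sketch of the "unzip on the go" idea (not the actual code from the threaded notebook, and the paths are hypothetical): start extraction in a background thread at the beginning, let the rest of the setup run in parallel, and join before training starts.

```python
import subprocess
import threading
import zipfile

def extract_dataset(zip_path="/content/drive/MyDrive/dataset.zip",  # hypothetical
                    dest="/content/dataset"):
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)

extract_thread = threading.Thread(target=extract_dataset, daemon=True)
extract_thread.start()

# ...dependency installs, model downloads, etc. run here in the meantime...
subprocess.run(["pip", "install", "-q", "accelerate"], check=True)  # example setup step

extract_thread.join()  # make sure the dataset is fully extracted before training
```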

I don't think @hollowstrawberry would want that many changes, though, so these changes will probably not be implemented, and I don't think I would want to support an actual fork, so think of it as a temporary beta test.


3 participants