-
I did not use the ooba trainer, no. I used Colab.
-
Freeze until killed by system sounds like OOM. How big is the dataset?

EDIT: Tested locally, got an OOM too. I think something got broken in the loader code?

EDIT2:

```python
def generate_prompt(data_point: dict[str, str]):
    print(f"Generating prompt for data-point: {data_point}")
    (...)

def generate_and_tokenize_prompt(data_point):
    prompt = generate_prompt(data_point)
    return tokenize(prompt)

print("Loading JSON datasets...")
data = load_dataset("json", data_files=clean_path('training/datasets', f'{dataset}.json'))
print("Start mapping it")
train_data = data['train'].map(generate_and_tokenize_prompt)
```

prints:

Our options:
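If anyone wants to check whether this is specific to the webui, here is a minimal standalone sketch of the same load-and-map path. The tokenizer name, dataset path, and field names below are placeholders I picked, not the real configuration:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Placeholder tokenizer; substitute whatever base model you are training against.
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

def generate_and_tokenize_prompt(data_point):
    # Same shape as the webui code above: build a prompt string, then tokenize it.
    prompt = f"{data_point.get('instruction', '')}\n{data_point.get('output', '')}"
    return tokenizer(prompt, truncation=True, max_length=256)

# Placeholder path to a small JSON dataset.
data = load_dataset("json", data_files="training/datasets/my_dataset.json")

# If the hang/OOM comes from the map/fingerprint step, it should reproduce here,
# before any per-example output appears.
train_data = data["train"].map(generate_and_tokenize_prompt)
print(train_data)
```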
-
I see you were able to reproduce the problem, but to answer the question: it was literally the exact text that I put in my first post. I was originally hoping the dataset was just too big for my computer to handle. I tried zetavg/LLaMA-LoRA-Tuner and it was able to parse the dataset, so maybe it uses an older version of the HF code, or maybe something different entirely. I'm kind of busy this week, so I don't have time to step through all of the code, but I did try stepping through the webui code to see exactly where it was hanging. I got to parts where I couldn't understand what was actually happening, then just held down the step button for a while and let it slowly run through the code. The last thing I remember was being in

FWIW, I am training in 8-bit, not 4.
-
I got it working. I'm a dummy and forgot to turn my swapfile on. It is weird watching the RAM usage crank up so high and then drop. Seems like there has to be something that could be better optimized in there. I guess some loop is cranking out temporary variables.
-
The RAM usage is unreasonably massive even on small datasets. 1 KiB of JSON should not be filling up my 64 GiB of RAM, no matter how stupid the internal code is.
-
I have the same issue. The process crashes when calling 'update_fingerprint' in 'arrow_dataset.py'. 'update_fingerprint' is called in the dataset's 'map' function, which is used in 'training.py'. 'update_fingerprint' calls a hasher to hash the function used for mapping text to tokens, which seems to be what leads to the problem. As a temporary workaround I passed a custom fingerprint to the 'map' call in 'training.py' in the modules folder.
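A minimal sketch of that kind of workaround, assuming the 'map' call shown earlier in this thread (the random hex string is just an arbitrary fingerprint value):

```python
import random

# Supplying new_fingerprint makes `datasets` skip hashing the mapping
# function, which is the step where update_fingerprint / the Hasher chokes.
# Note: a random fingerprint also means the cache will not be reused between runs.
train_data = data['train'].map(
    generate_and_tokenize_prompt,
    new_fingerprint='%030x' % random.randrange(16**30),
)
```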
-
I'm losing my mind. I changed nothing and I can't replicate it anymore. I was going to test the fix loanMaster suggested, but I can't replicate the "before" state of it overloading now. And I don't know why :(
-
It works now.
-
I don't really want to submit an issue, because it might be something on my end. If I use raw text, it works fine, but if I try using any kind of JSON file, my whole system freezes until eventually (half an hour or so) the program is killed by the system (running on Linux). My data is formatted like this:

And my formats file looks like this:
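For illustration only, here is a hypothetical alpaca-style dataset and format file of the general kind the training tab consumes; the field names, template layout, and file names are assumptions, not the actual files from this post:

```python
import json

# Hypothetical dataset: a list of records keyed by field name (not the poster's data).
dataset = [
    {"instruction": "Continue the story.", "output": "Once upon a time..."},
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
]

# Hypothetical format file: each key lists the fields a record may contain,
# each value is a prompt template with %field% placeholders (assumed layout).
formats = {
    "instruction,output": "### Instruction:\n%instruction%\n\n### Response:\n%output%",
    "instruction,input,output": "### Instruction:\n%instruction%\n\n### Input:\n%input%\n\n### Response:\n%output%",
}

with open("my_dataset.json", "w") as f:
    json.dump(dataset, f, indent=2)

with open("my_format.json", "w") as f:
    json.dump(formats, f, indent=2)
```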
@kaiokendev when you trained your LoRA, did you use the training tab in ooba, or something else? Did you ever have this problem?