[REVIEW] Fix Padding Related Bugs: Crossfit
#66
Conversation
Signed-off-by: Vibhu Jawa <[email protected]>
Crossfit
Could you please add an example test for `padding_side="right"` vs `padding_side="left"`?
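For illustration, a minimal sketch of what such a test could look like. The `pad_batch` helper and the token IDs here are hypothetical, standing in for Crossfit's actual padding logic:

```python
def pad_batch(seqs, max_len, pad_id=0, padding_side="right"):
    """Pad each token sequence to max_len on the chosen side."""
    padded = []
    for seq in seqs:
        pad = [pad_id] * (max_len - len(seq))
        padded.append(seq + pad if padding_side == "right" else pad + seq)
    return padded


def test_padding_side():
    seqs = [[101, 7592, 102], [101, 102]]
    right = pad_batch(seqs, 4, padding_side="right")
    left = pad_batch(seqs, 4, padding_side="left")
    # Right padding appends pad tokens; left padding prepends them.
    assert right == [[101, 7592, 102, 0], [101, 102, 0, 0]]
    assert left == [[0, 101, 7592, 102], [0, 0, 101, 102]]


test_padding_side()
```

A real test would exercise the tokenizer path end to end, but the shape of the two assertions would be the same.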
Added.
Just some minor comments, thanks for fixing this!
Resolved comments @ryantwolf.
Verified same results before and after memory changes:

```python
import os

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

new_model_dir = "/home/nfs/vjawa/.cf/memory"
old_model_dir = "/home/nfs/vjawa/bkp_memory_curves"
model_name = "microsoft/deberta-v3-base"
# model_name = "sentence-transformers/all-MiniLM-L6-v2"
model_fname = "mem_model.pkl"

old_model_path = os.path.join(old_model_dir, model_name, model_fname)
new_model_path = os.path.join(new_model_dir, model_name, model_fname)

old_model = joblib.load(old_model_path)
new_model = joblib.load(new_model_path)

# Compare the old fit against the new one. (Note: comparing a model
# against itself would trivially pass, so old vs. new is what matters.)
assert np.allclose(old_model.coef_, new_model.coef_)
assert np.isclose(new_model.intercept_, old_model.intercept_)
```
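For context, the memory-estimate model is a simple linear fit, so comparing two runs reduces to comparing coefficients and intercept. A self-contained sketch with NumPy alone, on synthetic data; the feature layout (batch size, sequence length, and their product) is an assumption, not necessarily Crossfit's actual one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (batch_size, seq_len, batch*seq) features and memory targets.
X = rng.integers(1, 64, size=(50, 2)).astype(float)
X = np.column_stack([X, X[:, 0] * X[:, 1]])
y = X @ np.array([2.0, 0.5, 0.01]) + 100.0

# Fit the same curve twice, as the old and new code paths would.
A = np.column_stack([X, np.ones(len(X))])  # append an intercept column
old_coef, *_ = np.linalg.lstsq(A, y, rcond=None)
new_coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# The check mirrors the snippet above: coefficients and intercept agree.
assert np.allclose(old_coef[:-1], new_coef[:-1])
assert np.isclose(old_coef[-1], new_coef[-1])
```

Identical training data should yield identical fits, which is exactly the invariant the before/after comparison verifies.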
Two minor things remain that you might want to fix, but other than that looks good to me.
Tests LGTM, thanks!
@ryantwolf, addressed your review and added improvements. Thanks again for the careful review and for helping make the performance and user experience better.
```python
X: list[list[int]] = []
y: list[float] = []

max_seq = min(AutoTokenizer.from_pretrained(path_or_name).model_max_length, end_seq_len)
```
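One subtlety with this line: some tokenizers report an effectively unbounded `model_max_length` (a very large sentinel value) when no real limit is configured, so clamping against `end_seq_len` alone may still leave an unusably large value. A hedged sketch of a guard; the function name, sentinel threshold, and fallback here are assumptions for illustration:

```python
# Hypothetical guard: some tokenizers report a huge sentinel (~1e30)
# for model_max_length when no real limit is configured.
VERY_LARGE = int(1e29)


def effective_max_seq(model_max_length: int, end_seq_len: int, fallback: int = 512) -> int:
    """Clamp the sequence length, using a fallback when the tokenizer reports no real limit."""
    if model_max_length > VERY_LARGE:
        model_max_length = fallback
    return min(model_max_length, end_seq_len)


print(effective_max_seq(512, 1024))        # tokenizer limit wins: 512
print(effective_max_seq(int(1e30), 256))   # sentinel -> fallback, then clamp: 256
```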
Similar to the other issue, I think every time `AutoTokenizer` or `AutoConfig` is used in this file it should be using the corresponding methods of the model. Now I'm getting this error with llama guard:
```
Traceback (most recent call last):
  File "/usr/local/bin/aegis_classifier_inference", line 8, in <module>
    sys.exit(console_script())
  File "/usr/local/lib/python3.10/dist-packages/nemo_curator/scripts/aegis_classifier_inference.py", line 126, in console_script
    main()
  File "/usr/local/lib/python3.10/dist-packages/nemo_curator/scripts/aegis_classifier_inference.py", line 62, in main
    domain_classifier = AegisClassifier(
  File "/usr/local/lib/python3.10/dist-packages/nemo_curator/classifiers/aegis.py", line 161, in __init__
    model = AegisHFModel(config=config)
  File "/usr/local/lib/python3.10/dist-packages/nemo_curator/classifiers/aegis.py", line 90, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/crossfit/backend/torch/hf/model.py", line 58, in __init__
    self.mem = fit_memory_estimate_curve(
  File "/usr/local/lib/python3.10/dist-packages/crossfit/backend/torch/hf/memory_curve_utils.py", line 44, in fit_memory_estimate_curve
    max_seq = min(AutoTokenizer.from_pretrained(path_or_name).model_max_length, end_seq_len)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 843, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 2032, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'meta-llama/LlamaGuard-7b'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'meta-llama/LlamaGuard-7b' is the correct path to a directory containing all relevant files for a LlamaTokenizerFast tokenizer.
```
Resolved by: 987836f
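The review asks the utility to go through the model's own methods rather than calling `AutoTokenizer` directly, so that models with special tokenizer requirements (such as gated llama guard checkpoints) can override loading. A hedged sketch of that pattern; the class and method names here are assumptions for illustration, not Crossfit's actual API:

```python
class BaseHFModel:
    def __init__(self, path_or_name: str):
        self.path_or_name = path_or_name

    def load_tokenizer(self):
        # Default path: in the real code this would call
        # AutoTokenizer.from_pretrained(self.path_or_name).
        return {"name": self.path_or_name, "model_max_length": 512}


class GatedModel(BaseHFModel):
    def load_tokenizer(self):
        # Override for models whose tokenizer needs special handling
        # (e.g. auth tokens or a different repo), instead of a bare
        # AutoTokenizer.from_pretrained(path_or_name) call.
        return {"name": "meta-llama/LlamaGuard-7b", "model_max_length": 4096}


def fit_memory_estimate_curve(model: BaseHFModel, end_seq_len: int) -> int:
    # The utility asks the model for its tokenizer rather than
    # constructing one itself, so subclasses can customize loading.
    tok = model.load_tokenizer()
    return min(tok["model_max_length"], end_seq_len)


print(fit_memory_estimate_curve(GatedModel("meta-llama/LlamaGuard-7b"), 2048))  # 2048
```

With this indirection, the `OSError` above disappears because the gated model supplies its own tokenizer-loading path instead of relying on the generic one.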
My tests ran successfully. Thanks so much for doing this @VibhuJawa!
This PR plans to fix:
TODO: