what's the specific meaning of dsir? #99
Comments
Hi @BBetteroff, DSIR stands for "Data Selection with Importance Resampling" (see the paper) and is used to compute importance weights for each sample with respect to different target domains. The screenshot you posted is from RedPajama-Data/configs/rp_v2.0.conf (lines 31 to 33 in bb594b0).
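To make "importance weight with respect to a target domain" concrete, here is a toy sketch. It assumes a bag-of-hashed-n-grams model for the target domain and another for the raw (source) data, and scores a document by the log of the probability ratio between the two. The function names, bucket count, and smoothing value are all hypothetical, and this is not the code used in this repository.

```python
import math
import zlib

NUM_BUCKETS = 2 ** 16  # hypothetical size of the hashed feature space


def hashed_ngram_counts(text, n_max=2, num_buckets=NUM_BUCKETS):
    """Count word 1..n_max-grams after hashing each one into a fixed bucket."""
    tokens = text.lower().split()
    counts = {}
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            ngram = " ".join(tokens[i:i + n])
            bucket = zlib.crc32(ngram.encode("utf-8")) % num_buckets
            counts[bucket] = counts.get(bucket, 0) + 1
    return counts


def fit_log_probs(docs, num_buckets=NUM_BUCKETS, smoothing=1.0):
    """Fit a smoothed categorical distribution over buckets (log-probabilities)."""
    totals = [smoothing] * num_buckets
    for doc in docs:
        for bucket, count in hashed_ngram_counts(doc, num_buckets=num_buckets).items():
            totals[bucket] += count
    normalizer = sum(totals)
    return [math.log(t / normalizer) for t in totals]


def log_importance_weight(text, target_log_probs, raw_log_probs):
    """log p_target(doc) - log p_raw(doc) under the two bag-of-n-grams models."""
    counts = hashed_ngram_counts(text, num_buckets=len(target_log_probs))
    return sum(c * (target_log_probs[b] - raw_log_probs[b]) for b, c in counts.items())


# Made-up toy corpora, purely for illustration.
target_docs = ["the theorem proof lemma", "proof of the lemma"]
raw_docs = ["buy cheap shoes now", "cheap shoes sale", "the theorem proof lemma"]

target_lp = fit_log_probs(target_docs)
raw_lp = fit_log_probs(raw_docs)

w_math = log_importance_weight("lemma proof", target_lp, raw_lp)
w_spam = log_importance_weight("cheap shoes", target_lp, raw_lp)
```

A higher weight means the document looks more like the target domain than like the raw data, so `w_math` should come out above `w_spam` here. Resampling according to these weights is the "importance resampling" step and is left out of the sketch.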
Thanks! I'll keep reproducing this repo and will stay in touch.
What is the content of the listing file? Can you show me an example? And what is it used for?
The listing files contain the IDs of the inputs which, when concatenated with the base URI, point to the location of the data. For example:
For example, if your data is stored locally under, e.g.,
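The concatenation described above can be sketched as follows. This assumes a listing file with one input ID per line; the base URI, the IDs, and the helper name `resolve_listing` are hypothetical placeholders rather than values taken from this repository.

```python
def resolve_listing(listing_text, base_uri):
    """Join each non-empty listing line (an input ID) onto the base URI."""
    ids = [line.strip() for line in listing_text.splitlines() if line.strip()]
    return [base_uri.rstrip("/") + "/" + input_id.lstrip("/") for input_id in ids]


# Hypothetical listing content and base URI, for illustration only.
listing_text = """\
shard_a/0000.json.gz
shard_a/0001.json.gz
"""
base_uri = "file:///path/to/data/"

locations = resolve_listing(listing_text, base_uri)
for location in locations:
    print(location)
```

With a local `file://` base URI like this, no AWS account would be involved in resolving the paths. Whether the pipeline accepts such a URI depends on its configuration, which this sketch does not cover.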
I am trying to reproduce this repo on macOS, and I don't have an AWS account. Can I get your help? I'd appreciate it.