Regression Fitting

This directory contains the code for fitting the regression model and for predicting the optimal data mixture.

Prepare Data

Before fitting the regression model, prepare the data collected from the proxy model training runs. If you have not trained the proxy models yet, refer to the mixture_config directory and the model_training directory for details.

In our paper, we use the validation loss on the Pile-CC subset as the Target and the domain weights as the Features for fitting the regression model. The prepared data is stored in the data directory. You can also prepare your own data by following the instructions in the mixture_config directory.
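The Features/Target layout described above can be sketched as follows. This is a minimal illustration, not the format of the files in the data directory: the record structure, the domain names, the helper `build_regression_data`, and all numbers are hypothetical placeholders.

```python
import math

# Hypothetical proxy-run records. Each run stores its domain weights (Features)
# and its validation loss on the target subset (Target).
# All values below are made-up placeholders, not real results.
runs = [
    {"weights": {"domain_a": 0.5, "domain_b": 0.3, "domain_c": 0.2}, "target_loss": 3.40},
    {"weights": {"domain_a": 0.1, "domain_b": 0.6, "domain_c": 0.3}, "target_loss": 3.55},
    {"weights": {"domain_a": 0.2, "domain_b": 0.2, "domain_c": 0.6}, "target_loss": 3.70},
]


def build_regression_data(runs):
    """Turn run records into a feature matrix X and a target vector y."""
    domains = sorted(runs[0]["weights"])  # fixed column order for every run
    X, y = [], []
    for run in runs:
        weights = run["weights"]
        if sorted(weights) != domains:
            raise ValueError("every run must cover the same domains")
        if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-6):
            raise ValueError("domain weights must sum to 1")
        X.append([weights[d] for d in domains])
        y.append(run["target_loss"])
    return domains, X, y


domains, X, y = build_regression_data(runs)
```

Each row of `X` is one proxy run's mixture and the matching entry of `y` is that run's loss, which is the shape most regression libraries expect.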

Model Fitting

You can follow the notebook to do both:

  • Fit the regression model on the proxy model training logs
  • Simulate and choose the optimal data mixture

With the notebook, you can fit the regression model and predict the optimal data mixture for training large language models.
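The two steps can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the notebook's implementation: a plain linear least-squares model stands in for whatever regressor the notebook uses, candidate mixtures are drawn from a uniform Dirichlet distribution, and averaging the best-scoring candidates is just one possible selection rule. All function names and the synthetic data are hypothetical.

```python
import random


def fit_linear(X, y):
    """Least-squares fit of loss ~ sum_j coef[j] * weight[j].

    No intercept is used because the weights already sum to 1.
    """
    k = len(X[0])
    # Normal equations (X^T X) coef = X^T y, stored as an augmented matrix.
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * t for row, t in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("feature matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coef, weights):
    """Predicted target loss for one mixture."""
    return sum(c * w for c, w in zip(coef, weights))


def sample_mixture(rng, k, alpha=1.0):
    """Draw one mixture from a symmetric Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def choose_mixture(coef, n_candidates=10000, top_k=100, seed=0):
    """Simulate candidate mixtures and average those with the lowest predicted loss."""
    rng = random.Random(seed)
    k = len(coef)
    candidates = [sample_mixture(rng, k) for _ in range(n_candidates)]
    candidates.sort(key=lambda w: predict(coef, w))
    best = candidates[:top_k]
    return [sum(w[j] for w in best) / top_k for j in range(k)]


# Synthetic stand-in for proxy-run logs: losses generated from known coefficients.
rng = random.Random(1)
true_coef = [2.0, 3.0, 4.0]  # placeholder per-domain loss contributions
X = [sample_mixture(rng, 3) for _ in range(50)]
y = [predict(true_coef, w) for w in X]

coef = fit_linear(X, y)
best_mixture = choose_mixture(coef)
```

In this toy setup the first domain has the smallest coefficient, so the selected mixture places most of its weight there. Averaging the top candidates instead of taking the single best one makes the result less sensitive to sampling noise.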