
OffsetBias: Leveraging Debiased Data for Tuning Evaluators

🤗 Dataset  | Generation Model  | Reward Model  | 📜 Paper 

Official implementation for the paper OffsetBias: Leveraging Debiased Data for Tuning Evaluators. In the paper we present:

  • EvalBiasBench, a meta-evaluation benchmark for testing judge models,
  • OffsetBias Data, a training dataset for pairwise preference evaluation,
  • OffsetBias Model, a judge model trained using the OffsetBias data.

This repository contains sample code for running the OffsetBias Model for evaluation, the EvalBiasBench dataset, and an inference script for running various evaluation models on multiple meta-evaluation benchmarks.

Requirements

pip install -r requirements.txt

Evaluation Inference with OffsetBias Model

OffsetBias Model works as a judge model that performs the pairwise preference evaluation task: given an Instruction, Output (a), and Output (b), it determines which output better addresses the instruction. You can use modules from this repository for simple and quick inference. Example code is in offsetbias_inference.py.

from module import VllmModule

instruction = "explain like im 5"
output_a = "Scientists are studying special cells that could help treat a sickness called prostate cancer. They even tried these cells on mice and it worked!"
output_b = "Sure, I'd be happy to help explain something to you! What would you like me to explain?"

model_name = "NCSOFT/Llama-3-OffsetBias-8B"
module = VllmModule(prompt_name="offsetbias", model_name=model_name)

conversation = module.make_conversation(
  instruction=instruction,
  response1=output_a,
  response2=output_b,
  swap=False)

output = module.generate([conversation])
print(output[0])
# The model should output "Output (b)"
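Because judge models can be sensitive to the order in which the two outputs are presented, you may also want to evaluate both orderings. The sketch below assumes that swap=True presents the two responses in reversed order (as the swap argument above suggests); the repository's actual behavior may differ.

# Hypothetical consistency check: evaluate both response orders and compare.
# Assumes swap=True swaps the positions of response1 and response2 in the prompt.
conversation_swapped = module.make_conversation(
  instruction=instruction,
  response1=output_a,
  response2=output_b,
  swap=True)

outputs = module.generate([conversation, conversation_swapped])
print(outputs[0], outputs[1])
# A position-consistent judge should prefer the same response in both orderings,
# i.e. "Output (b)" in the original order and "Output (a)" in the swapped order.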

Running EvalBiasBench

EvalBiasBench is a benchmark for testing judge models' robustness to evaluation scenarios containing biases. You can find the benchmark data under data/evalbiasbench/. The following sections show how to run inference with various judge models, including the OffsetBias model, on several benchmarks, including EvalBiasBench.
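If you want to inspect the benchmark data before running the full pipeline, a minimal sketch like the one below can help. It assumes the files under data/evalbiasbench/ are JSON; the actual file names and schema may differ, so adjust accordingly.

# Minimal sketch for inspecting EvalBiasBench data files.
# Assumes JSON files under data/evalbiasbench/; the actual format may differ.
import json
from pathlib import Path

data_dir = Path("data/evalbiasbench")
for path in sorted(data_dir.glob("*.json")):
    with path.open() as f:
        examples = json.load(f)
    print(path.name, "->", len(examples), "examples")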

Configuration

Prepare a model configuration file under config/. For OpenAI models, an API key is required.

A configuration file, offsetbias-8b.yaml, looks like the following:

prompt: llmbar # name of prompt file under prompt/

vllm_args:
  model_args: # args for vllm.LLM()
    model: NCSOFT/Llama-3-OffsetBias-8B
    dtype: float16
  sampling_params: # args for vllm.SamplingParams()
    temperature: 0
    max_tokens: 20

hf_args:
  model_args: # args for AutoModelForCausalLM.from_pretrained()
    model: NCSOFT/Llama-3-OffsetBias-8B
    dtype: float16
  generate_kwargs: # args for model.generate()
    max_new_tokens: 20
    pad_token_id: 128001
    do_sample: false
    temperature: 0
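To see how the vllm_args section corresponds to the underlying vLLM calls, here is a minimal sketch, assuming the loader passes model_args to vllm.LLM() and sampling_params to vllm.SamplingParams() as the comments in the config suggest; the repository's actual loading code may differ.

# Minimal sketch of how the vllm_args section maps onto the vLLM API.
# The repository's own config loader may construct these objects differently.
import yaml
from vllm import LLM, SamplingParams

with open("config/offsetbias-8b.yaml") as f:
    config = yaml.safe_load(f)

llm = LLM(**config["vllm_args"]["model_args"])                      # model, dtype, ...
params = SamplingParams(**config["vllm_args"]["sampling_params"])   # temperature, max_tokens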

Run Inference

Running inference automatically creates an inference result file and a score file under result/. Below are several example commands.

# run OffsetBias inference on the EvalBiasBench (biasbench) dataset
python run_bench.py --config config/offsetbias-8b.yaml

# run offsetbias with custom name on all benchmarks
python run_bench.py --name my_inference --config config/offsetbias-8b.yaml --benchmarks llmbar,hhh,mtbench,biasbench

# no inference, redo parsing on existing inference result
python run_bench.py --name my_inference --config config/offsetbias-8b.yaml --benchmarks biasbench --parse

# no inference, recalculate score
python run_bench.py --name my_inference --score

Citation

If you find our work useful, please cite our paper:

@misc{park2024offsetbias,
      title={OffsetBias: Leveraging Debiased Data for Tuning Evaluators},
      author={Junsoo Park and Seungyeon Jwa and Meiying Ren and Daeyoung Kim and Sanghyuk Choi},
      year={2024},
      eprint={2407.06551},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
