How to supervise non-math data? #56

Luodian · 2025-01-26T14:30:13Z

As far as I can tell, the accuracy reward can only check numerical equality. What if my question is multiple-choice (MCQ) and the expected answer is an option letter?

I did a quick check and found that it does not work:

from math_verify import parse, verify

# Parse the gold and answer
# If you know that gold will only contain latex or expr (no latex env), use
# parse(gold, extraction_config=[LatexExtractionConfig()]) or parse(gold, extraction_config=[ExprExtractionConfig()])

gold = parse("So the answer is B")
answer = parse("B")

print(gold)
print(answer)
# Order here is important!
print(verify(gold, answer))


Output:
[]
[]
False


qgallouedec · 2025-01-26T14:49:22Z

Might be solved by #55. Can you kindly verify?

Luodian · 2025-01-26T14:55:26Z

great, developing so fast.

qgallouedec · 2025-01-26T14:57:28Z

hynky1999 · 2025-01-26T17:46:46Z

Hi @Luodian,
You are right, the code won't work for anything but math. The good news is that the code for extracting indices is already available in lighteval; I just haven't ported it to math-verify, to keep the domain concise. See it here:
https://github.com/huggingface/lighteval/pull/495/files#diff-66345b827d57f4adb7b6736115827dd6bbe9383372e3d10f1c2bbdd9992d538aR198

I think I am OK with also adding it to math-verify. While it's not exactly math, it follows pretty much the same extraction logic.

One thing we do need, however, is to know which targets we are going to search for, to prevent false positives/negatives during retrieval.
cc @qgallouedec, would it be possible to pass some task specification (type: math/code/mcq + n_choices) through kwargs?
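
To make the point about known targets concrete, here is a minimal sketch of an MCQ reward that restricts the search to the valid option letters. It is not the lighteval or math-verify implementation: `extract_choice` and `mcq_reward` are hypothetical helpers, and `n_choices` is assumed to arrive through kwargs as suggested above.

```python
import re
import string
from typing import Optional


def extract_choice(text: str, n_choices: int = 4) -> Optional[str]:
    """Return the last standalone option letter found in `text`, or None.

    Hypothetical helper: only the first `n_choices` uppercase letters are
    searched, so letters outside the valid range are never matched.
    """
    letters = string.ascii_uppercase[:n_choices]
    # \b on both sides keeps letters inside words (e.g. the "B" in "Because")
    # from being picked up as an answer.
    matches = re.findall(rf"\b([{letters}])\b", text)
    # Take the last match, assuming the final answer comes at the end.
    return matches[-1] if matches else None


def mcq_reward(completion: str, gold: str, n_choices: int = 4) -> float:
    """Return 1.0 if the extracted option equals the gold option, else 0.0."""
    predicted = extract_choice(completion, n_choices)
    expected = extract_choice(gold, n_choices)
    if predicted is None or expected is None:
        return 0.0
    return 1.0 if predicted == expected else 0.0


print(mcq_reward("So the answer is B", "B"))
print(mcq_reward("I pick C", "B"))
```

Knowing `n_choices` matters here: with four choices, the pronoun "I" in "I pick C" is ignored, whereas an unrestricted A-Z search would have to disambiguate it. A sentence that starts with the article "A" can still cause a false positive, which is why this is only a sketch.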
