Default grader now crashes on empty test groups with mode min #189

thorehusfeldt · 2021-01-18T11:48:50Z

Commit 4251179 based on pull request #174 changed the behaviour of the default grader in certain corner cases from JE 0 to JE,see

4251179#commitcomment-46068430

This leads to verifyproblem with the -d data flag crashing in normal use:

> verifyproblem . -d a
[...]
File "/usr/lib/python3/dist-packages/problemtools/verifyproblem.py", line 567, in aggregate_results
    if not (min_score <= score <= max_score) and not self._seen_oob_scores:
TypeError: '<=' not supported between instances of 'float' and 'NoneType'

Indeed, min_score is typically 0.0, and score is now None (rather than 0).

Why does this happen?

A scoring problem with a grader aggregation mode min wants to compute the total score of an empty test group.

(The empty test group occurs during problem development because the -d flag to verifyproblem was used to only run the submissions on a certain test group, such as -d group3, or on certain input files not occurring in all test groups, such as -d huge. This flag correctly leads to empty test groups.)

The behaviour of min on aggregating the empty set is undefined in the specification, but the choice JE 0, though producing the unhelpful verdict JE also produced the score 0, making the behaviour confusing but useful during problem development.

With the choice JE, verifyproblem’s verdict is still the unhelpful JE, but now the variable score is set to None, leading to a run-time error in the comparison in line 567 quoted above and aborting verification.

Fix (quick)

Go back to JE 0 instead of JE.

Fix (better)

Discuss what min([]) should mean. The most useful behaviour would be AC 0 or even verdict AC and score accept_score, but with a warning “there were empty test groups.”

Minimal example

Here is a baby problem about adding two integers with two secret input files a.in and b.in, each in their own test group.

To crash:

verifyproblem . -d a

JE.zip

The text was updated successfully, but these errors were encountered:

thorehusfeldt · 2021-01-18T11:53:01Z

Possibly related:

Issue Scoring validator should not have to print score on non-AC #57
Pull Request Don't propagate JE on empty test groups #176

ghamerly · 2023-01-06T17:17:09Z

I ran into this this week, so now I care about it more 😄. In my case, the issue was avg rather than min, but the same issues apply.

Without reading this issue, I came to similar conclusions about the fixes that @thorehusfeldt suggested. I lean strongly towards defining "min([]) = 0" and "avg([]) = 0" and "max([]) = 0". These are not mathematically correct, but they are more helpful than crashing, and more helpful than reporting "JE 0", which I feel is confusing. Also, they are consistent with "sum([]) = 0".

My suggested change is to do the following. Instead of redefining min/max/avg, do this at https://github.com/Kattis/problemtools/blob/develop/support/default_grader/default_grader#L68:

Instead of

    score = aggregate_scores(scores)

do this

    if scores:
        score = aggregate_scores(scores)
    else:
        score = 0.0

It would be better to use the accept_score in place of 0.0, but the grader does not have access to that.

ghamerly · 2023-01-06T21:26:07Z

I had a discussion with @niemela about this today. We agreed that rather than define the min/max/avg of an empty set, it makes more sense to never call a grader when the set of results is empty. In other words, we should change verifyproblem and leave the default grader as it is. (It never makes sense to have an empty grading group in a problem, but when running verifyproblem with -d some of the groups are temporarily filtered out and the result sets for those are empty, so verifyproblem should avoid calling the grader in these cases.)

ghamerly mentioned this issue Jan 6, 2023

Do not grade empty groups #215

Merged

pehrsoderman closed this as completed in #215 May 1, 2023

thorehusfeldt mentioned this issue Nov 30, 2023

Expectations 0.6: Framework for specifying expectations in terms of timings and verdicts Kattis/problem-package-format#135

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Default grader now crashes on empty test groups with mode min #189

Default grader now crashes on empty test groups with mode min #189

thorehusfeldt commented Jan 18, 2021 •

edited

Loading

thorehusfeldt commented Jan 18, 2021

ghamerly commented Jan 6, 2023

ghamerly commented Jan 6, 2023

Default grader now crashes on empty test groups with mode min #189

Default grader now crashes on empty test groups with mode min #189

Comments

thorehusfeldt commented Jan 18, 2021 • edited Loading

Why does this happen?

Fix (quick)

Fix (better)

Minimal example

thorehusfeldt commented Jan 18, 2021

ghamerly commented Jan 6, 2023

ghamerly commented Jan 6, 2023

thorehusfeldt commented Jan 18, 2021 •

edited

Loading