Don't propagate JE on empty test groups #176

tyilo · 2020-07-01T14:17:59Z

Assuming a problem has two test groups, group1 and group2, running verifyproblem foo -d group1/ always results in a JE verdict, because the pattern doesn't match any test cases in group2.

This fixes it by returning the ? verdict for an empty group.

I don't know if this is the best solution.

thorehusfeldt · 2020-10-18T07:03:29Z

I agree that the current behaviour is suboptimal and would like to see it changed. verifyproblem’s ability to selectively remove test groups is very useful during problem development, and the JE verdict is highly misleading.

However, I’d rather see this changed at the level of the default grader, here:

problemtools/support/default_grader/default_grader

Line 64 in 307bcbf

if accept_if_any_accepted and 'AC' in verdicts:

The specification at https://www.problemarchive.org/wiki/index.php/Problem_Format#Graders is silent about how to handle this; it boils down to whether “no errors found because no tests were run” should mean AC. It’s an aesthetic choice as much as a moral one, to quote Bill Haydon, akin to whether the empty product is the multiplicative unity.
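For comparison, here is how Python's own built-ins answer the same question for an empty collection. This is only an illustration of the convention, not code from problemtools; the variable names are made up for the example.

```python
import math

# An empty list of verdicts, as produced when a filter matches nothing in a group.
verdicts = []

# "No errors found because no tests were run": all() over nothing is True.
no_errors_found = all(v == 'AC' for v in verdicts)

# Conversely, nothing was accepted either: any() over nothing is False.
something_accepted = any(v == 'AC' for v in verdicts)

# The empty product is the multiplicative identity.
empty_product = math.prod([])

# min() has no identity to fall back on and raises ValueError on an empty
# sequence unless the caller supplies a default.
try:
    min([])
    min_raised = False
except ValueError:
    min_raised = True

print(no_errors_found, something_accepted, empty_product, min_raised)
```

So the language itself takes both sides: vacuous truth for all(), an identity element for products, and a hard error for min().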

Suggestion

The cleanest solution would be to introduce the verdict ETG for empty test group. Currently, there is already a cornucopia of Verdict/Grade/Judgement in problemtools. At the grader level, these currently include:

problemtools/support/default_grader/default_grader

Line 7 in 307bcbf

sorting_order = ['JE', 'IF', 'RTE', 'MLE', 'TLE', 'OLE', 'WA', 'PE', 'AC']

but at other levels in the infrastructure, other verdicts are used. For instance, I don’t think PE and OLE ever make it through verifyproblem.

It would then be up to the default grader to decide how ETG propagates (and this can be described in the documentation). In particular, making ETG into an explicit verdict allows authors to change the grader if they have use cases with different preferences.
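A rough sketch of what such a grader rule could look like. ETG does not exist in problemtools; the function name aggregate_verdicts and the parameter etg_counts_as are hypothetical, and the sketch assumes the sorting_order list above runs from most to least severe.

```python
# Hypothetical sketch, not the actual problemtools default grader.
# 'ETG' is the verdict proposed in this comment and is an assumption.
SORTING_ORDER = ['JE', 'IF', 'RTE', 'MLE', 'TLE', 'OLE', 'WA', 'PE', 'AC']


def aggregate_verdicts(verdicts, etg_counts_as='AC'):
    """Return the most severe verdict of a group, or 'ETG' if it is empty.

    Sub-verdicts equal to 'ETG' are replaced by etg_counts_as before
    ranking; pass etg_counts_as=None to ignore them instead.
    """
    if not verdicts:
        return 'ETG'
    resolved = [etg_counts_as if v == 'ETG' else v for v in verdicts]
    resolved = [v for v in resolved if v is not None]
    if not resolved:
        # Every subgroup was empty, so this group is effectively empty too.
        return 'ETG'
    # Assumption: a lower index in SORTING_ORDER means a more severe verdict.
    return min(resolved, key=SORTING_ORDER.index)
```

An author who wants empty groups to fail could pass etg_counts_as='JE'; one who wants them ignored could pass None.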

My own preference would be that “passing the empty test” gives AC yet that verifyproblem emits a friendly warning (such as “no tests run”, or “there were empty test groups”). Alternatively, I can also see the new AC-ish verdict ANT (accepted with no tests), mimicking APE from https://clics.ecs.baylor.edu/index.php/Contest_API#Judgement_Types , but it seems heavy-handed to burden the top-level family of verdicts with that.

But my experience with this is limited, and there are many other problem construction traditions that I haven’t thought through at all.

thorehusfeldt · 2021-01-01T18:06:17Z

Concrete suggestion:

The Problem format does not specify the result of aggregating an empty set of scores with mode min. (Such empty sets of scores typically arise during problem development, when empty test groups arise from filtering.)

verifyproblem returns the code AC and the score specified by accept_score for this group, yet emits the warning Empty test groups: followed by a list of the affected test groups.
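A minimal sketch of that behaviour, assuming scores are plain numbers. The helper names min_score_or_default and empty_group_warning are hypothetical and are not part of verifyproblem or the default grader.

```python
# Hypothetical helpers illustrating the suggestion; not problemtools code.

def min_score_or_default(scores, accept_score, group_name, empty_groups):
    """Aggregate scores with mode min, falling back to accept_score.

    When the group has no scores, its name is recorded in empty_groups so
    that a warning can be reported once grading has finished.
    """
    if not scores:
        empty_groups.append(group_name)
        return accept_score
    return min(scores)


def empty_group_warning(empty_groups):
    """Build the warning text, or return None if no group was empty."""
    if not empty_groups:
        return None
    return 'Empty test groups: ' + ', '.join(empty_groups)


# Usage: group2 was filtered away entirely, group1 still has results.
empty = []
score1 = min_score_or_default([40, 25], 100, 'group1', empty)
score2 = min_score_or_default([], 100, 'group2', empty)
warning = empty_group_warning(empty)
```

Collecting the group names and warning once keeps the verdict itself clean while still telling the author that nothing was actually tested.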

Don't propagate JE on empty test groups

a97d975

thorehusfeldt referenced this pull request Jan 1, 2021

Remove dependencies on python2, move to python3

47277d5

Also replace libmozjs with nodejs.

thorehusfeldt mentioned this pull request Jan 18, 2021

Default grader now crashes on empty test groups with mode min #189

Closed
