Spec: Combine the "Test" and "Suite" concepts #129

Krinkle · 2021-01-22T05:35:29Z

Status quo

The TAP 13 specification does not standardise a way of describing parent-child relationships between tests, nor does it standardise how to group tests.

Yet all major test frameworks have a way to group tests (e.g. QUnit module and Mocha suite) and/or allow nesting tests inside other tests (like tape and node-tap). While the CRI draft provided a way to group tests, it did not accommodate TAP-style frameworks that nest tests. They would need either to flatten the tests with a separator symbol in the test name, or to create an implied "Suite" for every test that has children and then come up with an ad-hoc naming scheme for it.

Note that the TAP 13 reporter we ship still flattens the tests by default, even after this change, using the greater-than symbol (>) as a separator. But at least the event model itself now recognises the relationships, so other output formats can make use of them. Hopefully TAP 14 will recognise them as well, which we can then make use of.

Ref TestAnything/testanything.github.io#36.
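
As a rough illustration of the flattening described above, here is a minimal sketch. The helper name `formatTapLine` and the exact line layout are assumptions made for this example and are not taken from the shipped TAP reporter. It only assumes that a test event carries a `status` and a `fullName` array of test names.

```javascript
// Hypothetical helper (not the shipped reporter): flatten the fullName stack
// of a test into a single TAP 13 result line, using " > " as the separator.
// Skip and todo directives are deliberately left out of this sketch.
function formatTapLine (index, testEnd) {
  const prefix = testEnd.status === 'failed' ? 'not ok' : 'ok';
  const name = testEnd.fullName.join(' > ');
  return `${prefix} ${index} ${name}`;
}

const line = formatTapLine(1, {
  name: 'adds numbers',
  fullName: ['math', 'addition', 'adds numbers'],
  status: 'passed'
});
console.log(line);
```

The nesting survives only as text in the test name here, which is why the event model keeps the full stack for other output formats.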

Summary of changes

See the diff of test/integration/reference-data.js for the concrete changes this makes to the consumable events.

Remove suiteStart and suiteEnd events.

Instead, the spec now says that tests are permitted to have children.

The link from child to parent remains the same as before, using the fullName field which is now a stack of test names. Previously, it was a stack of suite names with a test name at the end.
Remove all "downward" links from parent to child. Tests don't describe their children upfront in detail, and neither does runStart. This information was very repetitive and tedious for implementors to satisfy, and encouraged or required inefficient use of memory.

I do recognise that a common use case might be to generate a single output file or stream where real-time updates are not needed, in which case you may want a convenient tree that is ready to traverse without needing to listen for async events and put it together. For this purpose, I have added a built-in reporter that simply listens to the new events and outputs a "summary" event with an object that is similar to the old "runEnd" event object where the entire run is described in a single large object.
New "SummaryReporter" for simple use cases that traverse a single structure, not in real time, after the test run has completed.
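
To make the idea concrete, here is a minimal sketch of how a tree could be assembled from the flat events using only the `fullName` stack. This is not the actual SummaryReporter source. The names `createTreeCollector`, `onTestStart`, `onTestEnd` and `getTree`, as well as the shape of the resulting nodes, are assumptions made for illustration.

```javascript
// Hypothetical sketch: rebuild a parent/child tree from flat test events.
// Assumes each event has a `fullName` array (stack of test names) and that
// testEnd events also carry `status` and `errors`.
function createTreeCollector () {
  const root = { name: null, status: null, errors: [], tests: [] };
  const nodes = new Map();

  // Look up, or lazily create, the node for a given fullName stack.
  function nodeFor (fullName) {
    if (fullName.length === 0) return root;
    const key = JSON.stringify(fullName);
    let node = nodes.get(key);
    if (!node) {
      node = { name: fullName[fullName.length - 1], status: null, errors: [], tests: [] };
      nodes.set(key, node);
      // The parent is the same stack without its last entry.
      nodeFor(fullName.slice(0, -1)).tests.push(node);
    }
    return node;
  }

  return {
    onTestStart (test) { nodeFor(test.fullName); },
    onTestEnd (test) {
      const node = nodeFor(test.fullName);
      node.status = test.status;
      node.errors = test.errors;
    },
    getTree () { return root; }
  };
}

// Usage with hand-written events.
const collector = createTreeCollector();
collector.onTestStart({ name: 'outer', fullName: ['outer'] });
collector.onTestStart({ name: 'inner', fullName: ['outer', 'inner'] });
collector.onTestEnd({ name: 'inner', fullName: ['outer', 'inner'], status: 'passed', errors: [] });
collector.onTestEnd({ name: 'outer', fullName: ['outer'], status: 'passed', errors: [] });
const tree = collector.getTree();
console.log(JSON.stringify(tree, null, 2));
```

Because the child-to-parent link is the only one in the event model, a consumer that wants a tree pays the memory cost itself, while streaming reporters do not.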

Caveats

A test with the "failed" status is no longer expected to always have an error directly associated with it.

Now that tests aggregate into other tests rather than into suites, tests that merely have other tests as children still have to send a full testEnd event, and thus include both an errors array and an assertions array.

I considered specifying that errors have to propagate, but this seemed messy and could lead to duplicate diagnostic output in reporters, as well as ambiguity or uncertainty over where errors originated.
A suite containing only "skipped" tests now aggregates as "passed" instead of "skipped". Given we can't know whether a suite is its own test with its own assertions, we also can't assume that if a test parent has only "skipped" children that the parent was also skipped.

This applies to our built-in adapters, but individual frameworks, if they know that a suite was skipped in its entirety, can of course still set the status of parents however they see fit.
Graphical reporters (such as QUnit and Mocha's HTML reporters) may no longer assume that a test parent has either assertions/errors or other tests. A test parent can now have both its own assertions/errors, as well as other tests beneath it.

This restricts the freedom and possibilities for visualisation. My recommendation is that, if a visual reporter wants to keep using different visual shapes for "group of assertions" and "group of tests", it buffers the information internally so that it can first render all of the test's own assertions and then render the children, even if they originally ran interleaved and/or the other way around.

Ref Allow for tests nested within tests #126.
The "Console" reporter that comes with js-reporter no longer uses console.group() for collapsing nested tests.
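
The first two caveats can be sketched as follows. This is only an illustration of the aggregation rule as described above, not the code of the built-in adapters. The function names `aggregateStatus` and `buildParentTestEnd` are hypothetical, and treating every child status other than "failed" as non-failing is an assumption of this sketch.

```javascript
// Hypothetical aggregation rule for a test parent.
// A parent fails if it has errors of its own or if any child failed.
// Otherwise it passes, even when every child was skipped.
function aggregateStatus (ownErrors, childStatuses) {
  if (ownErrors.length > 0 || childStatuses.includes('failed')) {
    return 'failed';
  }
  return 'passed';
}

// Build the full testEnd event that a parent still has to send.
// Errors of children are not copied into the parent.
function buildParentTestEnd (fullName, own, childStatuses) {
  return {
    name: fullName[fullName.length - 1],
    fullName: fullName,
    status: aggregateStatus(own.errors, childStatuses),
    errors: own.errors,
    assertions: own.assertions
  };
}

const onlySkipped = aggregateStatus([], ['skipped', 'skipped']);
const failedParent = buildParentTestEnd(
  ['outer'],
  { errors: [], assertions: [] },
  ['passed', 'failed']
);
console.log(onlySkipped, failedParent.status, failedParent.errors.length);
```

In this sketch `failedParent` has the "failed" status with an empty errors array, which is the situation reporters now need to handle.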

Misc

Add definitions for the "Adapter" and "Producer" terms.
Use terms "producer" and "reporter" consistently, instead of "framework", "runner", or "adapter".
Remove mention that the spec is for reporting information from "JavaScript test frameworks". CRI can be used to report information about any kind of test that can be represented in CRI's event model, including linting and end-to-end tests for JS programs, as well as non-JS programs. It describes a JS interface for reporters, but the information can come from anywhere.

This further solidifies that CRI is not meant to be used for "hooking" into a framework, and sets no expectation about timing or run-time environment being shared with whatever is executing tests in some form or another. This was already the intent originally, since it could be used to report information from other processes or from a cloud-based test runner like BrowserStack, but this removes any remaining confusion or doubt there may have been.

Fixes #126.


Krinkle · 2021-01-24T07:34:26Z

For a quick overview, see diff of the test fixture: reference-data.js and summary-reporter.js.

/cc @isaacs @ljharb
Could use a second pair of eyes on this. The effective difference is quite straightforward, but there are a number of caveats that I outlined above. It'd be great if you could confirm that the listed caveats are indeed accepted status quo with node-tap and TAP, or whether there's perhaps another way that I missed. Thanks!

Krinkle added 2 commits January 22, 2021 05:39

[BREAKING CHANGE] Helpers: Remove `collectSuite{Start,StartData,EndDa…

6c32397

…ta}` methods No longer used.

Krinkle force-pushed the nested-tests branch from 57ba186 to 6c32397 Compare January 22, 2021 05:39

Krinkle merged commit 05fc407 into main Feb 14, 2021

Krinkle deleted the nested-tests branch February 14, 2021 21:30

This was referenced Feb 14, 2021

Recommend purging actual/expected values of assertions #100

Closed

Request for comment: CRI Version 1 #117

Closed
