Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: add restrictions around node:test usage #56027

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

anonrig
Copy link
Member

@anonrig anonrig commented Nov 27, 2024

Adds a contributing guideline around the usage of node:test in tests/ folder. This pull-request is open as a result of last weeks TSC meeting.

Potentially unblocks #55716

cc @nodejs/tsc

@anonrig anonrig requested a review from lpinca November 27, 2024 16:07
@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/tsc

@nodejs-github-bot nodejs-github-bot added the doc Issues and PRs related to the documentations. label Nov 27, 2024

These dependencies are:

- `node:async_hooks`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, async_hooks is probably the only thing I would include here (and I may update the test runner to migrate off of that in the future).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, but before removing the rest, what's your reasoning for keeping only async_hooks here?

Copy link
Contributor

@cjihrig cjihrig Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_hooks actually changes how things work.

child_process and fs are so heavily depended on by other things that if they stop working we will definitely notice. The test runner also doesn't do anything "fancy" with them. You can also use the test runner without spawning child processes. But, child processes are only used by the test runner CLI, which Node core doesn't use at all anyway.

The only place the test runner uses a stream is for emitting events. If you were going to include that, you may as well include event emitter as well since it is part of streams.

The vm module is only used (directly) for evaluating snapshot files.

Also worth noting that the test runner is already used to test the test runner itself 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you recommend changes to the text, please?

- `node:child_process`
- `node:fs`
- ReadableStream in `node:streams`
- `node:vm`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the files listed in test/parallel/test-bootstrap-modules.js can be a good measure here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? I don't follow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth adding anything related to the bootstrapping process to the list of things not to test with the test runner since I'm not sure you can be 100% certain the test runner itself is bootstrapped properly at that point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Most of) the files listed there are essential parts of the Node.js functionality that are used more ubiquitously, hence more likely to be depended on by node:test itself (e.g. when we talk about node:async_hooks, that's actually built on top of other modules, not just itself, test/parallel/test-bootstrap-modules.js list a set of files that are generally used everywhere)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you recommend changes to the text, please?

@joyeecheung
Copy link
Member

joyeecheung commented Nov 27, 2024

Some issues I've found with node:test that I think worth looking into before recommending using node:test.

  1. The test does not immediately stop when a test case fails, instead it continue running other test cases and output a bunch of lines in the form of ✔ <test description> (time spent on test cases) that I have to ignore. In some files the lines I have to ignore can be multi page and it can be very frustrating. When the test is written in the "classic" style there's no fishing required - you only see failure information in stderr, and the passed test cases are silent unless you deliberately log things. I find that behavior a lot more efficient. Is there a way to make that the default in the test reporter we use to reduce the noise level?

  2. 1 is amplified in the CI, where the special unicode characters used by the test reporters are not decoded properly on e.g. Windows and adds a lot of gibberish to the logs.

  3. IIUC if the tests cases are async and you console.log() in them, the logs would be mixed up when there are multiple test cases logging. It encourage a test pattern where one has to comment stuff out of the test to get any meaningful logs, which makes working on test failures harder and is not always practical e.g. when trying to decypher logs in the CI especially when investigating old flakes.

  4. I also wonder how much longer it takes to run the tests structured with just blocks v.s. running them using node:test and whether that would make the tests run even longer if we do a mass migration. I did some testing locally and it suggests that not using node:test would make the test run 30% faster.

    'use strict';
    
    require('../common');
    const assert = require('assert');
    const test = require('node:test');
    
    test('test', () => {
      assert.strictEqual(typeof process.argv[0], 'string');
    });
    'use strict';
    
    require('../common');
    const assert = require('assert');
    const test = require('node:test');
    assert.strictEqual(typeof process.argv[0], 'string');
    hyperfine "out/Release/node test/parallel/test-b.js" "out/Release/node test/parallel/test-a.js"
    Benchmark 1: out/Release/node test/parallel/test-b.js
      Time (mean ± σ):      49.8 ms ±  26.6 ms    [User: 40.8 ms, System: 4.5 ms]
      Range (min … max):    42.7 ms … 156.3 ms    18 runs
    
    Benchmark 2: out/Release/node test/parallel/test-a.js
      Time (mean ± σ):      38.4 ms ±   2.5 ms    [User: 35.0 ms, System: 2.8 ms]
      Range (min … max):    36.8 ms …  50.6 ms    71 runs
    
    Summary
      'out/Release/node test/parallel/test-a.js' ran
        1.30 ± 0.70 times faster than 'out/Release/node test/parallel/test-b.js'
    

@@ -141,6 +141,26 @@ request. Interesting things to notice:

## General recommendations

### Usage of `node:test`

It is optional to use `node:test` in tests outside of testing the `node:test`
Copy link
Member

@joyeecheung joyeecheung Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems important to document things like #52177 or otherwise we would see more flakes coming up once people start to spawn hundreds of child processes in parallel and overloading the machine using spawnPromisified + node:test....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Using the concurrency option is fine though unless you are specifically planning to spawn child processes. But that applies to things like Promise.all() as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you recommend changes to the text, please?

@cjihrig
Copy link
Contributor

cjihrig commented Nov 27, 2024

The test does not immediately stop when a test case fails

I think people will have different opinions on this. When there are a number of tests in a single file and I am relying on the CI for some platform other than macOS I actually want to see everything that passes and fails before pushing up another commit. Another general solution is to not test multiple things in a single file.

In some files the lines I have to ignore can be multi page and it can be very frustrating

The failures should be at the very bottom?

Windows and adds a lot of gibberish to the logs

This drives me crazy as well. I'm not sure if this is something specific to Jenkins that can be fixed or what. I haven't noticed it in GitHub action for example.

@cjihrig
Copy link
Contributor

cjihrig commented Nov 27, 2024

From a quick check, it does appear that Deno and Bun both use their own test runners in at least some places. Of course, I didn't check every test and I don't know what policies they might have in place around that.

@cjihrig
Copy link
Contributor

cjihrig commented Nov 27, 2024

it suggests that not using node:test would make the test run 30% faster.

I certainly expect using a test runner to add overhead that isn't there when not using a test runner. A few things to note: this sounds like web frameworks claiming to be x% faster at serving an empty response, but this often goes away once any real logic is introduced. Also, the test runner bootstraps itself when the first test is run, so I would expect subsequent test() calls to be faster.

@lpinca
Copy link
Member

lpinca commented Nov 27, 2024

I think people will have different opinions on this. When there are a number of tests in a single file and I am relying on the CI for some platform other than macOS I actually want to see everything that passes and fails before pushing up another commit.

It is indeed opinionated. I prefer to stop at the first failure, fix it, rerun, repeat.

Another general solution is to not test multiple things in a single file.

Definitely.

From a quick check, it does appear that Deno and Bun both use their own test runners in at least some places.

I honestly don't care how other projects run their tests and I think there is nothing wrong with our tests. I actually think the way our tests are currently written and run is one of the best part of the project. No bullshit, only the strictly needed code and dependencies. I am very convinced that a refactor to use node:test is a mistake. What does this bring to the project? There are multiple comments here and in other threads with concerns and real downsides. What are are the upsides? No specific Node.js API? That's a lie, for example see #56027. More details in the tests? Comments work better.

Instead of harmful refactors, I think that our time is better spent on investigating and fixing dozen of tests marked flaky and issues like #54918.

@anonrig
Copy link
Member Author

anonrig commented Nov 27, 2024

@lpinca I've mentioned couple of upsides of using node:test using the last TSC meeting. I recommend watching it since it also includes several different opinions from other TSC members as well.

@joyeecheung
Copy link
Member

joyeecheung commented Nov 27, 2024

I think people will have different opinions on this. When there are a number of tests in a single file and I am relying on the CI for some platform other than macOS I actually want to see everything that passes and fails before pushing up another commit.

There is a difference between running it on CI and locally. For example we use different output settings in the python test runner locally and in the CI as well. I think it would make more sense to align with what we do with the Python test runner: low-noise output when run locally, more details in the CI.

The failures should be at the very bottom?

The logs are not, and are in the middle of a bunch of passing test descriptions that you need to ignore - and when you are debugging a test failures, you mostly care about the assertion failure and logs, not the test descriptions (especially when there's no requirement about writing good test descriptions and they might just be random words that people put together...).

Also, this is assuming that only a single test is run. When multiple tests are failing during a run by the Python test runner, you are still going to have to scroll and fish out failure form pages of noise from later tests, instead of just looking at only relevant error information from all the tests that are failing.

At the very least, is there e.g. an environment variable that allows us to skip the logs about successful tests? It can be opinionated but personally I find them rather counter productive especially when I put any logs in the tests to aid debugging.

@joyeecheung
Copy link
Member

joyeecheung commented Nov 27, 2024

A few things to note: this sounds like web frameworks claiming to be x% faster at serving an empty response, but this often goes away once any real logic is introduced. Also, the test runner bootstraps itself when the first test is run, so I would expect subsequent test() calls to be faster.

Isn't that going to be in conflict with the recommendation of:

Another general solution is to not test multiple things in a single file.

? The more singled out tests we have, the more overhead we will introduce; but if we squeeze the tests in one file, the reporter will make the test failures harder to fish out from the noise?

Also many core tests are just very light weight - they are core tests, after all, and many of them don't test complex operations but just trivial edge cases (validation errors, simple calls to deps, pure JS computations etc.). In many cases the biggest part of the overhead is the bootstrapping overhead and the actual tests actually take less time than the bootstrap itself. Of course there are also tests that are more complex and async e.g. the http tests, which I think might benefit from using node:test. But I think we should also have some guidelines about when to avoid using node:test on other smaller tests (e.g. many of the util tests).

@cjihrig
Copy link
Contributor

cjihrig commented Nov 27, 2024

I am very convinced that a refactor to use node:test is a mistake.

I'm not the person advocating for a massive refactor and never have been. I will say that I would not put that in the top 10 all time worst mistakes made by the project, or even top 3 in the past 6 months 😄

Isn't that going to be in conflict with the recommendation of

Yes. Some of the tests are currently written with many tests in the same file. We should aim to change that.

Also many core tests are just very light weight...

Of course there are also tests that are more complex and async...

This. It would be nice to have more subtlety around this topic.

I was mostly just commenting here because I think these threads are mixing valid feedback, personal opinions, and things presented as fact that are incorrect based on people's opinions. I'll stop now.

@lpinca
Copy link
Member

lpinca commented Nov 27, 2024

@anonrig I've just finished watching it. The following arguments in favor of node:test are raised

  1. The current tests depends on things other runtimes don't have like process hooks.
  2. There are duplicate tests and by enforcing a description node:test will help find them.
  3. node:test improves readability.

  1. Yes, Node.js tests use Node.js features. This was also discussed in Discussion/Tracking: Adding more structure to Node.js' tests #54796.
  2. I struggle to understand how node:test can help with duplicate tests. Nothing will change for file names and as Joyee pointed out, there is no requirement for the the description, it can be just few letters. Comments are a better alternative.
  3. It might for some tests, but definitely not here https://github.com/nodejs/node/pull/55716/files or here https://github.com/nodejs/node/pull/55748/files or here https://github.com/nodejs/node/pull/55751/files.

I'm getting bored of repeating myself but don't add complexity where it is not needed, especially in tests. We only hurt ourselves with those refactors.

@joyeecheung
Copy link
Member

joyeecheung commented Nov 28, 2024

This. It would be nice to have more subtlety around this topic.

I think maybe a good measure about this might be: only the people maintaining what the test is testing get to choose what format the test should be written in, and forbid test-only changes from PRs that don't simultaneously change the features that the test is testing (unless they obviously had many commits in said feature or they reach consensus about this if there are multiple people maintaining said feature).

The reasoning is that those who maintain the feature being tested would be the ones impacted the most by the test change and they should have better judgement about whether the test format changes can make their life easier or harder. That's what I have been doing so far as well - if I am touching an existing test, I respect whatever the test format it is in and just follow it. But I otherwise would not author node:test tests by myself at this point because of the UX issues mentioned in #56027 (comment) , especially the noise about successful tests. I don't mind tests for things that I don't maintain to be rewritten since I normally don't break them anyway, and don't normally need to spend minutes fishing out useful information out of the noise from dozens of tests being broken in a single test run because I make a mistake in the implementation. And I trust people who maintain things that I am not familiar with to write better tests using node:test if they believe this makes their life eaiser.

@JakobJingleheimer
Copy link
Member

  1. The test does not immediately stop when a test case fails

This is exactly what I want

  1. Is there a way to make that the default in the test reporter we use to reduce the noise level?

I've seen other test runners with a reporter that outputs only on failure. I think that would be good here and addresses the issue Joyee raised (CI output is indeed quite long). Writing to stdout is also not free, so that should help with performance too.

2. 1 is amplified in the CI, where the special unicode characters used by the test reporters are not decoded properly on e.g. Windows and adds a lot of gibberish to the logs.

I think the dot reporter won't have that issue.

3. IIUC if the tests cases are async and you console.log() in them, the logs would be mixed up when there are multiple test cases logging.

Oof, yes, this is not ideal.

doc/contributing/writing-tests.md Outdated Show resolved Hide resolved
doc/contributing/writing-tests.md Show resolved Hide resolved
@legendecas
Copy link
Member

My personal experience with node:test in the core repo is that it outputs super long failures because a single failure is printed twice: #56316. I find this really hurts the experience.

@JakobJingleheimer
Copy link
Member

JakobJingleheimer commented Dec 21, 2024

It is indeed opinionated. I prefer to stop at the first failure, fix it, rerun, repeat.

I absolutely do not want this in the strongest of terms. IMO this an atrocious DX. I do not have a Windows PC, which is the least reliable platform. Whenever there are cross-platform issues, it's always Windows. If CI stops on the first of 50 failures and I have to discover and fix them one by one, with CI taking 4 hours each run? Absolutely. Not. I would never send another PR again.

@JakobJingleheimer
Copy link
Member

These concerns seem very actionable though 🙂

I expect we could even fairly easily facilitate the kill-all-on-first-failure via a label (I think it shouldn't be the default though, and definitely not the only option).

For the output noise, I don't know enough about test reporters, but if we can't configure an existing one for a quiet mode, could we fork one to facilitate it?

@JakobJingleheimer

This comment was marked as duplicate.

@joyeecheung
Copy link
Member

joyeecheung commented Dec 22, 2024

I do not have a Windows PC, which is the least reliable platform. Whenever there are cross-platform issues, it's always Windows. If CI stops on the first of 50 failures and I have to discover and fix them one by one, with CI taking 4 hours each run? Absolutely. Not. I would never send another PR again.

I think we are talking about local behavior, which is different from the CI, and obviously the settings in the CI is different from the local one, just like when you use the Python runner it produces TAP output in the CI, but a progress bar locally, and it can skip or adapt to the environment as needed/with flags.

@lpinca
Copy link
Member

lpinca commented Dec 22, 2024

I absolutely do not want this in the strongest of terms. IMO this an atrocious DX.

In most cases it is a single test, so it does not matter and even if there are subtests, usually different failures depends on different things that are investigated and fixed separately, so the test is still run multiple times.

After months of discussions (#54796, here, several refactoring PRs) I still can't see any good/solid reason in favor of using node:test for existing (and new) tests. All the reasons listed so far by proponents (nodejs/TSC@3de15b8, #56027 (comment)) are weak (sometimes invalid) and don't even remotely outweigh the disadvantages.

@@ -141,6 +141,27 @@ request. Interesting things to notice:

## General recommendations

### Usage of `node:test`

It is optional to use `node:test` in tests outside of testing the `node:test`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have thought about this a bit, and I don't believe we should include this list.

  • First, it is inaccurate and subject to change. For example, child_process is only used by the run() API, which is not something we would be using in our tests anyway.
  • Second, by following this logic, we would need to update our current test suite to not use test/common in many places. For example, test/common/index.js, which is included in almost every test file imports a number of core modules.
  • Third, some modules are almost impossible to avoid and our current testing setup does not try to do that. For example, we use fs to read test files to look for test flags, we use child_process to respawn the test file with the correct flags, we use an 'exit' event to check for leaked globals and mustCalls().
  • Fourth, if we are going to maintain a list of problematic things to test, then that list should probably be based on real problems that have been encountered (such as spawning a bunch of child processes concurrently, which coincidentally the test runner can actually help with, and we have a workaround for running tests locally vs. in CI already).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fourth, if we are going to maintain a list of problematic things to test, then that list should probably be based on real problems that have been encountered

I agree about this statement. Let's wait for tomorrow's @nodejs/tsc meeting to add this.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mcollina
Copy link
Member

@cjihrig are you against using the test runner itself in core?

@cjihrig
Copy link
Contributor

cjihrig commented Jan 22, 2025

Not at all. I have been happily using it.

Copy link
Member

@joyeecheung joyeecheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran into https://ci.nodejs.org/job/node-test-commit-osx/63165/nodes=osx11-x64/testReport/junit/(root)/parallel/test_runner_module_mocking/

Compared to the other test that does not use node:test which was broken by the same reason (I think the main branch was broken, to reproduce just revert #56685):

https://ci.nodejs.org/job/node-test-commit-osx/63165/nodes=osx11-x64/testReport/junit/(root)/parallel/test_permission_dc_worker_threads/

I think output of node:test doesn't work with our Jenkins setups. I am blocking until this gets fixed somehow, either the Jenkins get fixed, or some workaround developed in the test runner, or at least amend this style guide to in a way that makes people write tests with node:test that do not produce a mess like https://ci.nodejs.org/job/node-test-commit-osx/63165/nodes=osx11-x64/testReport/junit/(root)/parallel/test_runner_module_mocking/ when it gets broken (this is a test runner test so I assume people knew what they doing when writing this test)

@@ -141,6 +141,27 @@ request. Interesting things to notice:

## General recommendations

### Usage of `node:test`

It is optional to use `node:test` in tests outside of testing the `node:test`
Copy link
Member

@joyeecheung joyeecheung Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It is optional to use `node:test` in tests outside of testing the `node:test`
It is discouraged to change existing tests that do not use `node:test` to use it, if there is no
other better motivation. For newly added tests, is optional to use `node:test` in tests outside of testing
the `node:test`

I think we should mention that this is optional for newly added tests, to avoid encouraging people update existing tests solely for stylistic preferences like what happened in #56671. If people have to update it to use node:test based on stylistic preferences, IMO we should require them to break what it is testing deliberately in the PR and and run the CI both before and after the update, and see what the output looks like, to see if the use of node:test makes the output look worse than before, so that other people don't get hit by something like https://ci.nodejs.org/job/node-test-commit-osx/63165/nodes=osx11-x64/testReport/junit/(root)/parallel/test_runner_module_mocking/ afterwards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants