Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VSCode Test Explorer - bugs running tests for packages with multiple targets #18955

Open
duncanawoods opened this issue Jan 16, 2025 · 8 comments · May be fixed by #19005
Open

VSCode Test Explorer - bugs running tests for packages with multiple targets #18955

duncanawoods opened this issue Jan 16, 2025 · 8 comments · May be fixed by #19005
Labels
C-bug Category: bug

Comments

@duncanawoods
Copy link

duncanawoods commented Jan 16, 2025

There are a number of bugs running tests for packages that define multiple targets:

Running tests for a target can run unrelated tests in other packages / targets.

RA tries to run a test from a fully qualified name like:

root::module::test_name

a) RA assumes the root is a package name but it can be the name of a target inside a package.

b) RA does not supply cargo test with target parameters e.g. --bin name to run in the correct test

c) if test code is shared between multiple targets:

  • RA will be asked to run a test for each target
  • RA will try to combine the name of all tests into a single pattern with unpredictable results

e.g. if there are targets named "target1", "target2", "target3", RA will run every test in the workspace containing the word "target".

The results of a test are often assigned against the wrong target or crate

This is because the cargo test json output does not contain the package or target of the test it has run. RA assigns the result using hack_recover_crate_name which guesses at an answer.

@duncanawoods duncanawoods added the C-bug Category: bug label Jan 16, 2025
@duncanawoods
Copy link
Author

duncanawoods commented Jan 16, 2025

Suggested fixes

1. Running Tests

1 a) and 1 b) can be solved by extending test_runner::TestTarget and request::find_package_name to find a matching target not just the package.

c) When invoking a test from source code shared by multiple targets, there are two main options:

i) run the test for all the targets => this is what vscode is trying to do right now and would be fixed by replacing RA's effort to combine tests into a single path and running each requested test with separate invocations of cargo test.

ii) run the test for one target only => when developing a test, it's often undesirable to run it for every target. It's likely slow and redundant e.g. a common reason for multiple targets is a library with a cli wrapper. Identifying a default target to use would likely match the developer's intention.

2. Test Results

The current RA design tries to parse cargo command output without the context of the original command. This is insufficient to match test output to targets. The required fix is to extend the cargo command running infrastructure to make command context available when the output is being interpreted.

@duncanawoods
Copy link
Author

Background Info

  • the default target in a package can be a lib or a bin
  • there can be multiple bin targets but only one lib
  • tests can be specific to a target or in shared code used by multiple targets

I am testing against a workspace with two members that define additional targets:

  • test_bin
    • bin target
    • contains
      • test_bin_lib : a lib target
      • test_bin2 : am extra bin target
  • test_lib
    • lib target
    • contains
      • test_lib_bin1 : a bin target
      • test_lib_bin2 : a second bin target

Each with shared and target specific tests.

@duncanawoods
Copy link
Author

@HKalbasi

@HKalbasi
Copy link
Member

@duncanawoods Thanks for the write up.

This is because the cargo test json output does not contain the package or target of the test it has run.

I think the way to go is to make cargo test include the target name in the json, and we shouldn't extend our hacks to support cargo's incomplete output.

Last time I checked, people are working on cargo test json output, but cargo is slow to update, for example each PR needs 12 weeks to reach stable.

Another promising project is cargo nextest. It seems they already have a non broken json output, and even if we need more data, I guess they are willing to add (merge PRs) quickly. @duncanawoods are you willing to add support for running cargo nextest in addition of cargo test in the test explorer?

@duncanawoods
Copy link
Author

duncanawoods commented Jan 16, 2025

Hi @HKalbasi

I think the way to go is to make cargo test include the target name in the json

I'm even more pessimistic about cargo given the json output feature has been unstable for 7 years, I don't think it's changed much since then and I don't fancy our chances getting a breaking format change incorporated.

nextest

Yeah, I have taken a look at it. Most of it's features benefit CI (test-per-process, performance, retries, timeouts, build archives) or the terminal (can connect an interactive session to interrogate status).

I don't see too much it can offer RA. It's coverage and miri integrations intrigue me and might be a useful path for RA to support "test with coverage" and "test with leak detection" type features.

It has two unstable json formats (libtest-json, libtest-json-plus) and both contain the target name. I agree the prospect of getting PRs adopted feels brighter! I'll take a look at integrating it.

we shouldn't extend our hacks to support cargo's incomplete output.

I believe the change to CargoActor (or an equivalent) is an appropriate bug fix and likely necessary beyond this issue.

It's unusual for RA to not know anything about what triggered a cargo command when it handles it's output. I don't think it's sustainable for a user interface and would expect future scenarios e.g. error reporting or needing to know which UI element triggered an action so that appropriate visual updates can be made. Retaining information about the source of an async command in order to handle it's progress/results is just a basic building-block to me.

If it appears as a hack, are there some adjustments that would make it look more intentional?

(note, the fixes for identifying and running tests in the correct target are unrelated to cargo output).

@HKalbasi
Copy link
Member

I believe the change to CargoActor (or an equivalent) is an appropriate bug fix and likely necessary beyond this issue.

Is it necessary even if cargo did provide the target in its json output?

If it appears as a hack, are there some adjustments that would make it look more intentional?

By hack, I mean it is a problem that only cargo can fix it completely, and we can only workaround it in some cases. That is, currently it works 80% of the time, and with your PR it will work 90% of the time. We can make it working 95% of the time by throwing more code at it, but I think instead of that we should focus on fixing cargo json output (or switching to something like nextest) to make it working 100% of time and simplifying our code at the same time.

@duncanawoods duncanawoods linked a pull request Jan 22, 2025 that will close this issue
@duncanawoods
Copy link
Author

with your PR it will work 90% of the time.

I believe the updated PR fixes 100% of the cases I am aware of where RA can run the wrong test, assign results to the wrong test or build/test more than is required. hack_recover_crate_name can be removed.

e.g.

  • r-click a specific target in test explorer
    • RA will run the test only for that target of a package
  • r-click a specific package in file explorer
    • RA will run the tests for each target in the package
  • r-click a folder of crates in file explorer
    • RA will run the tests for all crates in the folder
  • from lines of code
    • RA will run the test for each target the code is used in

The previous PR fixed everything except where RA failed to reduce a list of tests to a single target so it executed all the tests in the workspace with no context. I consider that buggy behaviour. RA doesn't need to do this reduction and can execute the list of tests as provided. By invoking cargo for targets and not the workspace, all tests can be successfully identified.

By hack, I mean it is a problem that only cargo can fix it completely

I don't consider cargo at fault here. It's more of an RA bug because it discards what it knows about the command before it parses the output. A typical implementation would be a command pattern that retains context during execution. RA is atypical, requiring commands to be a stateless stream of events where each line of stdout from an external tool is fully self-describing.

@duncanawoods
Copy link
Author

duncanawoods commented Jan 22, 2025

cargo-nextest

I have created a sample integration.

https://github.com/duncanawoods/rust-analyzer/tree/nextest-integration

cargo install cargo-nextest

I have used the bazel inspired DSL to execute tests across targets and can identify tests from the additional information it adds to the test name in the emitted json. It fixes the same bugs as the PR.

  • I suspect there will be difficulty replacing cargo test. The bundled toolchain is likely just too sticky due to platform support, plug-in support, trust etc. For example, it looks like nextest doesn't support BSD.

  • A deal-breaker for this DSL method is that it requires successfully building the whole workspace. I believe it's essential to be able to run tests while unrelated packages are broken during tasks like refactoring or updating dependencies. This would mean switching to a per-target invocation of nextest instead.

  • I haven't done final polish like creating vscode settings etc.

    • toggling between cargo-test and cargo-nextest is hard-coded in request::handle_run_test
    • for my convenience, it's logging the test list and command-line at error level

Overall, I would recommend the PR given it fixes the same bugs without needing a new dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants