Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mutation testing - pr differences #4178

Merged
merged 26 commits into from
Jan 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
8370213
feat: mutation testing initial integration
ASuciuX Nov 24, 2023
06d68b2
Merge branch 'develop' into test/cargo-mutants-testing
ASuciuX Nov 24, 2023
1c5a75c
fix: made functions discoverable to be mutants
ASuciuX Nov 24, 2023
19f56ea
feat: added mutants output before fix clarity package
ASuciuX Nov 27, 2023
a3caebb
feat: added mutants output after fix clarity package
ASuciuX Nov 27, 2023
e41e89a
Update mutants-testing-general.sh
ASuciuX Nov 27, 2023
b1649a3
feat: renamed mod to lib.rs
ASuciuX Nov 29, 2023
32aa967
Delete Dockerfile.mutation-testing as it is also run locally with cargo
ASuciuX Nov 30, 2023
92b4e23
feat: modular mutations on shell
ASuciuX Dec 1, 2023
5387f44
feat: restructure mutants to new CI workflows
ASuciuX Dec 11, 2023
0bf76dd
feat: update link to stacks action repo
ASuciuX Dec 11, 2023
9616e07
Merge branch 'develop' into test/cargo-mutants-testing
ASuciuX Dec 11, 2023
7c91912
feat: keep only filter pr workflow file
ASuciuX Dec 15, 2023
491557d
added specific triggers for the CI action on PR
ASuciuX Dec 15, 2023
ff3124e
modularize mutants' runs for different package size cases
ASuciuX Jan 7, 2024
b90dc06
feat: rename from `filter-pr` to `pr-differences`
ASuciuX Jan 8, 2024
13f7197
feat: shorter names for steps
ASuciuX Jan 8, 2024
921a01d
feat: check if `tac`, `awk` and `sed` commands exist on host
ASuciuX Jan 8, 2024
c6927a7
feat: check files before accessing them
ASuciuX Jan 9, 2024
298e8e0
feat: move shell runs from main workflow to composite action
ASuciuX Jan 10, 2024
12388b7
feat: add documentation for mutation testing
ASuciuX Jan 11, 2024
5a4f9b3
feat: mutation testing - update composite branch
ASuciuX Jan 16, 2024
16b7bff
feat: renamed back the lib files as cargo-mutants supports them now
ASuciuX Jan 17, 2024
9b54fcb
Merge branch 'develop' into test/mutants-filter-pr
wileyj Jan 18, 2024
0875d10
feat: mutants docs - time related outcomes
ASuciuX Jan 23, 2024
c16d87c
feat: update mutants doc format & some extra context
ASuciuX Jan 25, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
139 changes: 139 additions & 0 deletions .github/workflows/pr-differences-mutants.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,139 @@
name: PR Differences Mutants

on:
pull_request:
types:
- opened
- reopened
- synchronize
- ready_for_review
paths:
- "**.rs"

concurrency:
group: pr-differences-${{ github.head_ref || github.ref || github.run_id }}
# Always cancel duplicate jobs
cancel-in-progress: true

jobs:
# Check and output whether to run big (`stacks-node`/`stackslib`) or small (others) packages with or without shards
check-big-packages-and-shards:
name: Check Packages and Shards

runs-on: ubuntu-latest

outputs:
run_big_packages: ${{ steps.check_packages_and_shards.outputs.run_big_packages }}
big_packages_with_shards: ${{ steps.check_packages_and_shards.outputs.big_packages_with_shards }}
run_small_packages: ${{ steps.check_packages_and_shards.outputs.run_small_packages }}
small_packages_with_shards: ${{ steps.check_packages_and_shards.outputs.small_packages_with_shards }}

steps:
- id: check_packages_and_shards
uses: stacks-network/actions/stacks-core/mutation-testing/check-packages-and-shards@main

# Mutation testing - Execute on PR on small packages that have functions modified (normal run, no shards)
pr-differences-mutants-small-normal:
name: Mutation Testing - Normal, Small

needs: check-big-packages-and-shards

if: ${{ needs.check-big-packages-and-shards.outputs.run_small_packages == 'true' && needs.check-big-packages-and-shards.outputs.small_packages_with_shards == 'false' }}

runs-on: ubuntu-latest

steps:
- name: Run mutants on diffs
uses: stacks-network/actions/stacks-core/mutation-testing/pr-differences@main
with:
package-dimension: "small"

# Mutation testing - Execute on PR on small packages that have functions modified (run with strategy matrix shards)
pr-differences-mutants-small-shards:
name: Mutation Testing - Shards, Small

needs: check-big-packages-and-shards

if: ${{ needs.check-big-packages-and-shards.outputs.run_small_packages == 'true' && needs.check-big-packages-and-shards.outputs.small_packages_with_shards == 'true' }}

runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
shard: [0, 1, 2, 3]

steps:
- name: Run mutants on diffs
uses: stacks-network/actions/stacks-core/mutation-testing/pr-differences@main
with:
shard: ${{ matrix.shard }}
package-dimension: "small"

# Mutation testing - Execute on PR on big packages that have functions modified (normal run, no shards)
pr-differences-mutants-big-normal:
name: Mutation Testing - Normal, Big

needs: check-big-packages-and-shards

if: ${{ needs.check-big-packages-and-shards.outputs.run_big_packages == 'true' && needs.check-big-packages-and-shards.outputs.big_packages_with_shards == 'false' }}

runs-on: ubuntu-latest

steps:
- name: Run Run mutants on diffs
env:
BITCOIND_TEST: 1
RUST_BACKTRACE: full
uses: stacks-network/actions/stacks-core/mutation-testing/pr-differences@main
with:
package-dimension: "big"

# Mutation testing - Execute on PR on big packages that have functions modified (run with strategy matrix shards)
pr-differences-mutants-big-shards:
name: Mutation Testing - Shards, Big

needs: check-big-packages-and-shards

if: ${{ needs.check-big-packages-and-shards.outputs.run_big_packages == 'true' && needs.check-big-packages-and-shards.outputs.big_packages_with_shards == 'true' }}

runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
shard: [0, 1, 2, 3, 4, 5, 6, 7]

steps:
- name: Run mutants on diffs
env:
BITCOIND_TEST: 1
RUST_BACKTRACE: full
uses: stacks-network/actions/stacks-core/mutation-testing/pr-differences@main
with:
shard: ${{ matrix.shard }}
package-dimension: "big"

# Output the mutants and fail the workflow if there are missed/timeout/unviable mutants
output-mutants:
name: Output Mutants

runs-on: ubuntu-latest

needs:
[
check-big-packages-and-shards,
pr-differences-mutants-small-normal,
pr-differences-mutants-small-shards,
pr-differences-mutants-big-normal,
pr-differences-mutants-big-shards,
]

steps:
- name: Output Mutants
uses: stacks-network/actions/stacks-core/mutation-testing/output-pr-mutants@main
with:
big_packages: ${{ needs.check-big-packages-and-shards.outputs.run_big_packages }}
shards_for_big_packages: ${{ needs.check-big-packages-and-shards.outputs.big_packages_with_shards }}
small_packages: ${{ needs.check-big-packages-and-shards.outputs.run_small_packages }}
shards_for_small_packages: ${{ needs.check-big-packages-and-shards.outputs.small_packages_with_shards }}
95 changes: 95 additions & 0 deletions docs/ci-release.md
Original file line number Diff line number Diff line change
Expand Up @@ -228,4 +228,99 @@ ex: Branch is named `develop` and the PR is numbered `113`
- `stacks-core:2.1.0.0.0`
- `stacks-core:latest`

## Mutation Testing

When a new Pull Request (PR) is submitted, this feature evaluates the quality of the tests added or modified in the PR.
It checks the new and altered functions through mutation testing.
Mutation testing involves making small changes (mutations) to the code to check if the tests can detect these changes.

The mutations are run with or without a [Github Actions matrix](https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs).
The matrix is used when there is a large number of mutations to run ([check doc specific cases](https://github.com/stacks-network/actions/blob/main/stacks-core/mutation-testing/check-packages-and-shards/README.md#outputs)).
We utilize a matrix strategy with shards to enable parallel execution in GitHub Actions.
This approach allows for the concurrent execution of multiple jobs across various runners.
The total workload is divided across all shards, effectively reducing the overall duration of a workflow because the time taken is approximately the total time divided by the number of shards (+ initial build & test time).
This is particularly advantageous for large packages that have significant build and test times, as it enhances efficiency and speeds up the process.

Since mutation testing is directly correlated to the written tests, there are slower packages (due to the quantity or time it takes to run the tests) like `stackslib` or `stacks-node`.
These mutations are run separately from the others, with one or more parallel jobs, depending on the amount of mutations found.

Once all the jobs have finished testing mutants, the last job collects all the tested mutations from the previous jobs, combines them and outputs them to the `Summary` section of the workflow, at the bottom of the page.
There, you can find all mutants on categories, with links to the function they tested, and a short description on how to fix the issue.
The PR should only be approved/merged after all the mutants tested are in the `Caught` category.

### Time required to run the workflow based on mutants outcome and packages' size

- Small packages typically completed in under 30 minutes, aided by the use of shards.
- Large packages like stackslib and stacks-node initially required about 20-25 minutes for build and test processes.
- Each "missed" and "caught" mutant took approximately 15 minutes. Using shards, this meant about 50-55 minutes for processing around 32 mutants (10-16 functions modified). Every additional 8 mutants added another 15 minutes to the runtime.
ASuciuX marked this conversation as resolved.
Show resolved Hide resolved
- "Unviable" mutants, which are functions lacking a Default implementation for their returned struct type, took less than a minute each.
- "Timeout" mutants typically required more time. However, these should be marked to be skipped (by adding a skip flag to their header) since they indicate functions unable to proceed in their test workflow with mutated values, as opposed to the original implementations.

File:

- [PR Differences Mutants](../.github/workflows/pr-differences-mutants.yml)

### Mutant Outcomes

- caught — A test failed with this mutant applied.
This is a good sign about test coverage.

- missed — No test failed with this mutation applied, which seems to indicate a gap in test coverage.
Or, it may be that the mutant is undistinguishable from the correct code.
In any case, you may wish to add a better test.

- unviable — The attempted mutation doesn't compile.
This is inconclusive about test coverage, since the function's return structure may not implement `Default::default()` (one of the mutations applied), hence causing the compile to fail.
It is recommended to add `Default` implementation for the return structures of these functions, only mark that the function should be skipped as a last resort.

- timeout — The mutation caused the test suite to run for a long time, until it was eventually killed.
You might want to investigate the cause and only mark the function to be skipped if necessary.

### Skipping Mutations

Some functions may be inherently hard to cover with tests, for example if:

- Generated mutants cause tests to hang.
- You've chosen to test the functionality by human inspection or some higher-level integration tests.
- The function has side effects or performance characteristics that are hard to test.
- You've decided that the function is not important to test.

To mark functions as skipped, so they are not mutated:

- Add a Cargo dependency of the [mutants](https://crates.io/crates/mutants) crate, version `0.0.3` or later (this must be a regular `dependency`, not a `dev-dependency`, because the annotation will be on non-test code) and mark functions with `#[mutants::skip]`, or

- You can avoid adding the dependency by using the slightly longer `#[cfg_attr(test, mutants::skip)]`.

### Example

```rust
use std::time::{Duration, Instant};

/// Returns true if the program should stop
#[cfg_attr(test, mutants::skip)] // Returning false would cause a hang
fn should_stop() -> bool {
true
}

pub fn controlled_loop() {
let start = Instant::now();
for i in 0.. {
println!("{}", i);
if should_stop() {
break;
}
if start.elapsed() > Duration::from_secs(60 * 5) {
panic!("timed out");
}
}
}

mod test {
#[test]
fn controlled_loop_terminates() {
super::controlled_loop()
}
}
```

ASuciuX marked this conversation as resolved.
Show resolved Hide resolved
---