SEMu (KLEE-SEMu) is a (Dynamic) Symbolic Execution-based Mutant analysis framework build on top of KLEE Symbolic Virtual Machine.
SEMu uses a form of differential symbolic execution to generate test inputs to kill mutants. The relevant paper can be found here (the paper's pdf here and the data-set here), and cited as following:
[x] Thierry Titcheu Chekam, Mike Papadakis, Maxime Cordy, Yves Le Traon. Killing Stubborn Mutants.
ACM Transactions on Software Engineering and Methodology. January 2021.
Article No.: 19. https://doi.org/10.1145/3425497
SEMu is easily used with the Muteria Framework and generate test to kill mutants generated by the Mart mutation tool.
- SEMu Symbolic Execution-based Mutant Analysis Framework
A. Using pre-build Docker image
This tool has a Docker image available here https://hub.docker.com/r/thierrytct/klee-semu. Start the Docker container in interactive mode with the following command:
sudo docker run -it --rm thierrytct/klee-semu:llvm-3.4.2 bash
The Dockerfile
used to build SEMu's Docker image is found in the code location lib/Mutation/Dockerfile
.
Follow the procedure of KLEE to build from source. The procedure using configure/make is available here. The procedure using cmake is available here.
The usage of SEMu is similar to that of KLEE. For integration with other testing tools and/or perform automated test execution, consider using Muteria (find some examples here).
The command line is of this form:
klee-semu <KLEE and SEMu specific control arguments> \
<LLVM bitcode (.bc) file> \
<Optional Symbolic arguments>
These are the 7 main arguments described in the SEMu paper.
Argument | Type | Description |
---|---|---|
--semu-precondition-length | value | The Precondition Length (PL) whose value is an integer >= 0, or value -1 for GMD2MS or value -2 for SMD2MS. |
--semu-checkpoint-window | value | The Checkpoint Window (CW) whose value is an integer >= 0. |
--semu-propagation-proportion | value | Propagation Proportion (PP) whose value is a real number in the interval [0, 1]. |
--semu-MDO-propagation-selection-strategy | boolean | Propagation Selection Strategy (PSS) set to Minimum Distance to Output (MDO). By default, Random strategy (RND) is used. Note that this is used only if PP is neither 0 nor 1. |
--semu-minimum-propagation-depth | value | The Minimum Propagation Depth (MPD) whose value is an integer >= 0. |
--semu-no-state-difference | boolean | No State Difference (NSD) disables state comparison for test generation. |
--semu-number-of-tests-per-mutant | value | The Number Of Tests Per Mutant (NTPM) whose value is the maximum number of tests to generate per mutants. |
These arguments affect how the values of some main arguments are interpreted, for more flexibility.
Argument | Type | Description |
---|---|---|
--semu-use-basicblock-for-distance | boolean | This argument enable the use of the number of basic blocks as distance for the Minimum Distance to Output (MDO) strategy. By default, the number of statements is used. |
--semu-use-only-multi-branching-for-depth | boolean | This argument enable to only count the branching points where both branches are feasible when calculating the checkpoint Window (CW). By default, all branching points are used. |
--semu-no-environment-output-diff | boolean | This argument disables the consideration of calls to the output environment (library calls that changes the environment, like printf for example) when comparing states at the checkpoints. |
--semu-max-total-tests-gen | value | This argument sets the total number of tests to generate across all mutants. Setting this to an integer > 0 invalidate the NTPM argument such that the maximum number of test per mutant is not fixed anymore. |
--semu-custom-output-function | value | Specify the functions to consider as output functions, besides the standard ones. Specify this multiple times for multiple functions. |
These are arguments that were added do to extra stuffs.
Argument | Type | Description |
---|---|---|
--semu-disable-post-mutation-check | boolean | This argument disable the conservative path pruning. With this enabled, SEMu would not check for infection to discard non infected states. This is important for higher order mutation. Make sure to enable this to generate tests inputs for higher order mutants. |
--semu-testsgen-only-for-critical-diffs | boolean | This argument enable to generate tests only for state difference that are severe (these do not include variable values difference) |
--semu-forkprocessfor-segv-externalcalls | boolean | This argument enable SEMu to create a ne process when making a system call, to avoid the whole execution to be stopped due to an exception in the system call execution. This happens which invalid calls to printf (when the string format does not match with the values given). |
--semu-no-error-on-memory-limit | boolean | This argument allows SEMu to discard states, without an error, when the specified memory limit is exceeded. note that the states are discarded randomly. Not setting this enable to ensure that no state is discarded randomly, due to memory limit. |
--semu-candidate-mutants-list-file | value | This argument specify a file that contains a new line-separated list of mutants ID to consider for test generation. This is important to filter the mutants and generate test only for the few specified mutants. This can also be useful, if there are too many mutants, to split the test generation in rounds (example 10 mutants per round). |
--semu-quiet | boolean | This argument, when set, suppress extra log when running SEMu. |
--semu-loop-break-delay | value | This argument stops SEMu from staying on a loop forever, by giving the amount of time it should stay exploring a loop. Once the time is exceeded, the paths that stay in the loop are pruned. The specified value is the time in seconds. |
These KLEE arguments and the specified values are required to be set when running SEMu.
--search bfs
In seeding mode (When using seeds), the following must additionally be set:
--only-replay-seeds
--seed-out-dir <path-to-seed_dir>
Note (for maintenance - TODO) that for the newer version of KLEE, --seed-out-dir
is changed to seed-dir
.
The LLVM bitcode file must contain the following:
- The mutants IDs must go from 1 to N, where N is the number of mutants.
- The module must contain a mutant ID selector global variable, named
klee_semu_GenMu_Mutant_ID_Selector
, which is a 32 bit integer (i32
in LLVM). - The mutant ID selector needs to be initialized to the highest mutant ID + 1 (N + 1). This is so that a direct execution will execute the original program (Mutant ID varaible not corresponding to any mutant).
- The mutants need to be represented using a
switch
statement controlled by the mutant ID global variable. - There must be a declaration of the selection function function named
klee_semu_GenMu_Mutant_ID_Selector_Func
, which takes two mutant IDs as arguments (twoi32
values), representing a range of mutant IDs, and returnsvoid
. - Right before any mutant selection
switch
statement, there must be calls to the selection functionklee_semu_GenMu_Mutant_ID_Selector_Func
where the arguments represent the range of mutants ID present in theswitch
statement. In case the IDs are not continuous, make multiple consecutive calls toklee_semu_GenMu_Mutant_ID_Selector_Func
for each interval. e.g. make two calls with (4,6) and (8,8) as arguments, respectively, if the mutants in the correspondingswitch
statement are with IDs 4, 5, 6 and 8. - If
--semu-disable-post-mutation-check
is not used, there must be a declaration of the post mutant functionklee_semu_GenMu_Post_Mutation_Point_Func
, which is similar in signature withklee_semu_GenMu_Mutant_ID_Selector_Func
, but instead of being called before the mutant, is called right after the mutants. - If
--semu-disable-post-mutation-check
is not used, there must be call to the functionklee_semu_GenMu_Post_Mutation_Point_Func
right after the corresponding mutants. This enable SEMu to do conservative pruning and remove the mutant states that are uninfected (same as corresponding original ones). Note that each mutant point'sklee_semu_GenMu_Post_Mutation_Point_Func
mush also have the call for the original program as followingklee_semu_GenMu_Post_Mutation_Point_Func(0,0)
.
Here is an example of Meta-mutant code that is expected by SEMu. This is represented in C for simplicity and can be compiled to LLVM for use with SEMu.
Initial code:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char ** argv) {
int x = atoi(argv[1]);
if (x > 0) {
x = x + 10;
printf ("Changed\n");
}
printf ("DONE!\n");
return x;
}
After creating 4 mutants, such that Mutant 1 changes x = x + 10
into x = x - 10
, mutant 2 changes x = x + 10
into x = 10
, mutant 3 deletes the statement printf ("DONE!\n")
, and mutant 4 deletes the statement x = x + 10
. The meta mutant file will be following:
#include <stdio.h>
#include <stdlib.h>
// Mutant ID selector global initialized to N+1 (N=4)
unsigned long klee_semu_GenMu_Mutant_ID_Selector = 5;
// mutant ID selector function. SEMu forks mutant states
// at calls to this
void klee_semu_GenMu_Mutant_ID_Selector_Func (unsigned long fromID, unsigned long toID);
// mutated code successor code. First code after mutant code.
// This is used by SEMu to do conservative pruning
// of no infection
void klee_semu_GenMu_Post_Mutation_Point_Func (unsigned long fromID, unsigned long toID);
int main (int argc, char ** argv) {
int x = atoi(argv[1]);
if (x > 0) {
klee_semu_GenMu_Mutant_ID_Selector_Func(1,2);
klee_semu_GenMu_Mutant_ID_Selector_Func(4,4);
switch (klee_semu_GenMu_Mutant_ID_Selector) {
case 1: x = x - 10; break;
case 2: x = 10; break;
case 4: break;
default: x = x + 10;
}
klee_semu_GenMu_Post_Mutation_Point_Func(0,0);
klee_semu_GenMu_Post_Mutation_Point_Func(1,2);
klee_semu_GenMu_Post_Mutation_Point_Func(4,4);
printf ("Changed\n");
}
klee_semu_GenMu_Mutant_ID_Selector_Func(3,3);
switch (klee_semu_GenMu_Mutant_ID_Selector) {
case 3: break;
default: printf ("DONE!\n");
}
klee_semu_GenMu_Post_Mutation_Point_Func(0,0);
klee_semu_GenMu_Post_Mutation_Point_Func(3,3);
return x;
}
Assuming that we have an example program named example.c
and that was compiled into LLVM bitcode to produce example.bc
file as following:
clang -g -c -emit-llvm example.c -o example.bc
Then Mart was used to generate the meta mutant file for SEMu, named example.MetaMu.bc
, as following:
mart example.bc
The file example.MetaMu.bc
is located into the folder mart-out-0
.
Following are example of commands.
klee-semu \
--external-calls all \
--posix-runtime \
--semu-no-error-on-memory-limit \
--solver-backend z3 \
--max-memory 2048 \
--max-time 300 \
--libc uclibc \
--search bfs \
--semu-precondition-length 0 \
--semu-checkpoint-window 1 \
--semu-propagation-proportion 0.25 \
--semu-minimum-propagation-depth 1 \
--semu-number-of-tests-per-mutant 1 \
example.MetaMu.bc --sym-arg 4
Need to specify the directory that contains the seeds (assuming example_path/seeds_dir
for this example) and --only-replay-seeds
. The seeds can be created by running KLEE first. The output of KLEE (the .ktest
files) can be used as seeds for SEMu.
klee-semu \
--external-calls all \
--posix-runtime \
--semu-no-error-on-memory-limit \
--solver-backend z3 \
--max-memory 2048 \
--max-time 300 \
--libc uclibc \
--search bfs \
--semu-precondition-length 0 \
--semu-checkpoint-window 1 \
--semu-propagation-proportion 0.25 \
--semu-minimum-propagation-depth 1 \
--semu-number-of-tests-per-mutant 1 \
--only-replay-seeds \
--seed-out-dir example_path/seeds_dir
example.MetaMu.bc --sym-arg 4
The main components added in KLEE to create SEMu are located in in the code in lib/Mutation
(mainly the source code files ExecutionState_KS.h
, ExecutionState_KS.cpp
, Executor_KS.h
, and Executor_KS.cpp
), and tools/klee-semu
(mainly the source code file klee-semu.cpp
).
This tool need to be integrated into newer versions of KLEE. The master branch correspond to the master branch of klee. For each use version of klee, the branch semu-klee-<version>
corresponding to SEMu ported to the version <version>
of KLEE.
- lib/Core/Mutation/ (Folder added)
- KLEE-README.md (KLEE README copied)
- README.md (Created SEMu README)
- lib/Core/CMakeLists.txt
- lib/Core/ExecutionState.h
- lib/Core/ExecutionState.cpp
- lib/Core/Executor.h
- lib/Core/Executor.cpp
- tools/klee/CMakeLists.txt
Update the master branch from the KLEE repository and update the tags as following (assuming that the git remote klee
was added as following git remote add klee https://github.com/klee/klee.git
):
git checkout master
git pull klee master
git push
git fetch --tags klee master
git push origin --tags
After SEMu is ported to a newer version of klee, set the default branch of the remote SEMu repository to the newest semu-klee-<version>
branch