Skip to content

martineberlein/avicenna

Repository files navigation

Python Version GitHub release PyPI Tests LicenseCode style: black  

Avicenna

This repo contains the code to execute, develop and test our debugging prototype Avicenna.

Avicenna is a debugging tool designed to automatically determine the causes and conditions of program failures. It leverages both generative and predictive models to satisfy constraints over grammar elements and detect relations of input elements. Our tool uses the ISLa specification language to express complex failure circumstances as predicates over input elements. Avicenna learns input properties that are common across failing inputs and employs a feedback loop to refine the current debugging diagnoses by systematic experimentation. The result is crisp and precise diagnoses that closely match those determined by human experts, offering a significant advancement in the realm of automated debugging.

Quickstart

To demonstrate Avicenna's capabilities, we'll walk through a simple example using a program we call The Calculator.

The Calculator is a Python program designed to evaluate various mathematical expressions, including arithmetic equations and trigonometric functions:

import math

def arith_eval(inp) -> float:
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )

To identify bugs in The Calculator, we'll implement an oracle function. This function tests inputs and categorizes them as producing expected behavior (OracleResult.NO_BUG) or a bug (OracleResult.BUG):

from avicenna.avicenna import OracleResult
from avicenna.input import Input

def oracle(inp: str | Input) -> OracleResult:
    try:
        arith_eval(inp)
        return OracleResult.PASSING
    except ValueError:
        return OracleResult.FAILING

    return OracleResult.PASSING

To see the oracle function in action, we'll test a few sample inputs:

initial_inputs = ['sqrt(1)', 'cos(912)', 'tan(4)', 'sqrt(-3)']

for inp in initial_inputs:
    print(inp.ljust(30), oracle(inp))

Executing this code provides the following output:

sqrt(1)                        PASSING
cos(912)                       PASSING
tan(4)                         PASSING
sqrt(-3)                       FAILING

As we can see, the input sqrt(-3) triggers a bug in The Calculator.

In the following steps, we'll leverage Avicenna to pinpoint the root cause of this bug. We'll employ Avicenna's capabilities to identify the root cause of the bug and provide potential fixes. This will involve defining the input structure for our calculator and initializing Avicenna with the appropriate grammar, sample inputs, and oracle function.

To define the input structure, we utilize a grammar:

grammar = {
    "<start>": ["<arith_expr>"],
    "<arith_expr>": ["<function>(<number>)"],
    "<function>": ["sqrt", "sin", "cos", "tan"],
    "<number>": ["<maybe_minus><onenine><maybe_digits><maybe_frac>"],
    "<maybe_minus>": ["", "-"],
    "<onenine>": [str(num) for num in range(1, 10)],
    "<digit>": [str(num) for num in range(0, 10)],
    "<maybe_digits>": ["", "<digits>"],
    "<digits>": ["<digit>", "<digit><digits>"],
    "<maybe_frac>": ["", ".<digits>"],
}

The grammar represents the structure of valid inputs that can be given to the Calculator. For instance, it recognizes mathematical expressions involving functions (sqrt, sin, cos, tan) applied to numbers, which can be positive or negative.

Now, we're ready to initiate Avicenna with this grammar, the sample inputs (Note that at least one failure-inducing sample input is required), and the oracle function:

from avicenna import Avicenna

avicenna = Avicenna(
    grammar=grammar,
    initial_inputs=initial_inputs,
    oracle=oracle
)

diagnosis = avicenna.explain()

In the code above, we've created an instance of the Avicenna class and started the debugging process by invoking the explain() method. Avicenna will utilize its feedback loop to systematically probe and test the Calculator program, identify the root cause of the bug on the analysis of the bug's behavior.

from isla.language import ISLaUnparser

print(ISLaUnparser(diagnosis[0]).unparse())

This output is a symbolic representation -- an ISLa Constraint -- of the root cause of the failure detected by Avicenna in the Calculator program. Here's a breakdown of what it means:

(forall <number> elem in start:
        (<= (str.to.int elem) (str.to.int "-1")) and
  exists <function> elem_0 in start:
        (= elem_0 "sqrt"))

This output, expressed in first-order logic, is saying:

  • For all numbers (elements of type <number> in the grammar), if the integer representation of the number is less than or equal to -1 (<= (str.to.int elem) (str.to.int "-1")), and
  • There exists a function (an element of type <function> in the grammar) that equals to "sqrt" (= elem_0 "sqrt"),

then a bug is likely to occur.

In plain English, the output is indicating that the failure in our Calculator program occurs when trying to take the square root (sqrt) of a negative number (a number less than or equal to -1).

This is consistent with our expectations, since the square root of a negative number is not defined in the realm of real numbers. Consequently, Python's math.sqrt() function, which we've used in our Calculator program, throws a ValueError when given a negative number as input.

With this information, we can address the issue in our Calculator program to prevent crashes when dealing with such inputs. We might decide to handle such errors gracefully or implement support for complex numbers, depending on the requirements of our program.

Remember, these results are generated based on the information provided to Avicenna, such as the grammar and the oracle function, as well as the results of Avicenna's systematic testing of the Calculator program. So the more accurate and comprehensive these inputs are, the more helpful Avicenna's outputs will be.

More Examples:

If you want to explore more of how Avicenna works and how it can, for instance, be used to generate new and unseen behavior-triggering inputs, make sure to have a look at the jupyter notebooks in the notebooks folder:

  • Calculator.ipynb: A quick and more detailed tutorial on how to setup up Avicenna. It also showcases how to use the ISLa-Solver to generate new inputs from the final diagnosis.
  • Heartbleed.ipynb: This notebook demonstrates the capabilities of Avicenna on the infamous Heartbleed Bug.

Install, Development, Testing

Install

If all external dependencies are available, a simple pip install avicenna suffices. We recommend installing Avicenna inside a virtual environment (virtualenv):

python3.10 -m venv venv
source venv/bin/activate

pip install --upgrade pip
pip install avicenna

Development and Testing

For development and testing, we recommend using Avicenna inside a virtual environment (virtualenv). By doing the following steps in a standard shell (bash), one can run the Avicenna tests:

git clone https://github.com/martineberlein/avicenna.git
cd avicenna/

python3.10 -m venv venv
source venv/bin/activate

pip install --upgrade pip

# Run tests
pip install -e .[dev]
python3 -m pytest

Docker

For users who want to test Avicenna without managing external dependencies like Python, we offer a Docker container setup. Docker allows you to package and run applications in lightweight containers. Our Docker container already includes all necessary dependencies for Avicenna.

Step 1: Build the Docker image. Ensure you are in the directory containing the Dockerfile and execute:

docker build -t avicenna_image .

Step 2: Run a Docker container using the image you just built:

docker run -it --name avicenna avicenna_image

After executing the above command, you'll be inside the Docker container's shell.

Step 3: Update the Avicenna repository:

cd avicenna/
git pull

Step 4: Install Avicenna in editable mode and execute its tests:

pip install -e .[dev]
python3.10 -m pytest ./tests

Avicenna Research Paper

To reference Avicenna in your research, we request you to cite our ESEC/FSE'23 paper:

Martin Eberlein, Marius Smytzek, Dominic Steinhöfel, Lars Grunske, and Andreas Zeller. 2023. Semantic Debugging. In the Proceedings of the 31th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2023.

If you want to cite the Avicenna implementation of this repository, we request you to cite our GitHub repository as follows:

@misc{avicenna2023,
  author       = {Martin Eberlein and Marius Smytzek and Dominic Steinh{\"o}fel and Lars Grunske and Andreas Zeller},
  url          = {https://github.com/martineberlein/avicenna},
  title        = {AVICENNA},
  year         = {2023}
}