- The results for each algorithm and parameter setting can be found in `algorithm_results.csv`.
- The resulting evaluation measures (for a lag window of 200), as described in the paper, are listed in `evaluation_measures200.csv`.
- The corresponding figures for the individual evaluation measures are in the `Evaluation_Results` folder.
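For a quick programmatic look at these files, a minimal sketch using pandas (an assumption; any CSV reader works):

```python
import pandas as pd

# Load the per-run results and the aggregated evaluation measures
results = pd.read_csv("algorithm_results.csv")
measures = pd.read_csv("evaluation_measures200.csv")

# For example, list the evaluated algorithms
print(results["Algorithm"].unique())
```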
To incorporate results from your approach into the evaluation, you can either:

- Run your approach separately, and create a CSV containing at minimum the columns:
  - `Algorithm` (the algorithm name)
  - `Log Source` (where the log was sourced; in our case, "Ostovar", "Ceravolo", or "Bose")
  - `Log` (the log name, e.g., "bose_log.xes.gz")
  - `Detected Changepoints` (the indices (of cases) in the event log where changes were detected)
  - `Actual Changepoints for Log` (the ground-truth changepoint indices)
  - `Duration (Seconds)` (the duration of the algorithm run in seconds)
  - `Duration` (the duration as an hh:mm:ss string)
  - Columns specifying the parameter settings
- Append the results to the `algorithm_results.csv` file, e.g., as sketched below.
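As a sketch of this append step, assuming pandas is available (the row values below are placeholders, and `Window Size` stands in for whatever parameter columns your approach uses):

```python
import pandas as pd

# One row per algorithm execution, with the columns described above
row = {
    "Algorithm": "MyApproach",               # hypothetical approach name
    "Log Source": "Bose",
    "Log": "bose_log.xes.gz",
    "Detected Changepoints": [1199, 2402],   # placeholder indices
    "Actual Changepoints for Log": [1199, 2399],
    "Duration (Seconds)": 42.0,
    "Duration": "00:00:42",
    "Window Size": 200,                      # example parameter-setting column
}

results = pd.read_csv("algorithm_results.csv")
results = pd.concat([results, pd.DataFrame([row])], ignore_index=True)
results.to_csv("algorithm_results.csv", index=False)
```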
Or:

- Add your approach to the `testAll_reproducibility.py` file.
  - Define a function to run your approach on a single log that returns a list of dictionaries containing the columns mentioned above (see the sketch after this list).
- Configure the parameters of your approach in the `testAll_config.yml` file by adding an entry to "approaches" containing:
  - `function`: the name of the function in `testAll_reproducibility.py`
  - `params`: all the parameters that your function takes, specified as lists of possible values.
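As a rough illustration, such a function might look as follows; the function name, the `my_detect` stub, and the `window_size` parameter are hypothetical and not part of the repository:

```python
import time
from datetime import timedelta

from pm4py import read_xes

def my_detect(log, window_size):
    # Placeholder for your actual detection logic
    return []

def testMyApproach(log_path, log_source, log_name, known_cps, window_size):
    """Run the approach on a single log and return a list of result-row dictionaries."""
    log = read_xes(log_path)
    start = time.time()
    detected = my_detect(log, window_size)
    seconds = time.time() - start
    return [{
        "Algorithm": "MyApproach",
        "Log Source": log_source,
        "Log": log_name,
        "Detected Changepoints": detected,
        "Actual Changepoints for Log": known_cps,
        "Duration (Seconds)": seconds,
        "Duration": str(timedelta(seconds=round(seconds))),
        "Window Size": window_size,  # parameter-setting column
    }]
```

The matching `testAll_config.yml` entry would then set `function` to `testMyApproach` and list the candidate `window_size` values under `params`.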
The required packages can be installed using Anaconda:

> conda env create -f ./environment.yaml
...
> conda activate cdrift-evaluation

Or with the `requirements.txt` file:

> pip install -r requirements.txt
- Additionally, a docker image is provided. To use this image:
  - Pull the image: `docker pull cpitsch/cdrift-evaluation`
  - Run the container with `docker run --name cdrift cpitsch/cdrift-evaluation`. Possible arguments are:
    - `--evaluate`: Run only the evaluation script. Assumes that the algorithm results are already present in the container.
    - `--runApproaches`: Run all the approaches (this will take a long time).
    - `--runAndEvaluate`: Run all the approaches and then evaluate the results.
  - After the algorithm runs and/or the evaluation, the results can be retrieved from the docker container through:
    - `docker cp cdrift:cdrift_docker/algorithm_results.csv .`
    - `docker cp cdrift:cdrift_docker/Evaluation_Results .`
- To run the algorithms on all event logs and parameter settings, execute the `testAll_reproducibility.py` file (see the command below).
- This will create a CSV file, `algorithm_results.csv`, containing the detected change points for every algorithm, event log, and parameter setting.
- During the execution, CSV files containing the results of individual algorithm executions will also be created.
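Assuming the environment is activated, the script can be started from the repository root with:

> python testAll_reproducibility.py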
After running all the algorithms, the evaluation can be performed using the notebook `evaluate_results.ipynb`. This will create a folder `Evaluation_Results` containing:

- A CSV file `evaluation_measures<lag-window>.csv` containing all evaluation metrics as defined in the paper
- All the generated figures
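The notebook can be opened with Jupyter (assuming Jupyter is installed in the environment):

> jupyter notebook evaluate_results.ipynb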
- The algorithms and their parameters can be explored in the provided Jupyter notebooks (`<algorithm-name>_example.ipynb`) in the Examples directory.
Event logs can be found in the `EvaluationLogs` folder. These are the event logs used in the evaluation, which are sourced from:

- Ostovar et al., Robust Drift Characterization from Event Streams of Business Processes [^1] (Source)
- Ceravolo et al., Evaluation Goals for Online Process Mining: a Concept Drift Perspective [^2] (Source)
- Bose et al., Handling Concept Drift in Process Mining [^3]
- Log-splitting (for Relation Type Count and Relation Entropy) is implemented in `logsplitter.py`
- Extraction functions for J-Measure, Window Count, Relation Type Count, and Relation Entropy are in `bose.py`
- Apply the full detection pipeline with (example for J-Measure + KS Test):
```python
from cdrift.approaches import bose
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
pvalues = bose.detectChange_JMeasure_KS(log, windowSize, measure_window, activityName_key)  # Extract J-Measure, apply sliding-window KS tests
changepoints = bose.visualInspection(pvalues, trim=windowSize)  # Visual inspection of the p-value series
```
- Extraction functions are in `bose.py`
- Apply the full pipeline with (example for J-Measure + KS Test):
```python
from cdrift.approaches import martjushev
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
# Extract the J-Measure and apply a sliding window with recursive bisection
changepoints, pvalues = martjushev.detectChange_JMeasure_KS(log, windowSize, pvalue, j_measure_window=j_measure_window, return_pvalues=True)
```
- Apply the full pipeline with:
```python
from cdrift.approaches import maaradji
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
changepoints, pvalues = maaradji.detectChangepoints(log, windowSize, pvalue, return_pvalues=True)
```
- Apply the full pipeline with:
```python
from cdrift.approaches import earthmover
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
traces = earthmover.extractTraces(log)  # Extract the time series of traces
em_dists = earthmover.calculateDistSeries(traces, windowSize)  # Calculate Earth Mover's Distances
changepoints = earthmover.visualInspection(em_dists, trim=windowSize)  # Visual inspection
```
- Apply the full pipeline with:
```python
from cdrift.approaches import process_graph_metrics as pgm
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
changepoints = pgm.detectChange(log, windowSize, maxWinSize, pvalue)
```
- Apply the full pipeline with:
```python
from cdrift.approaches import zheng
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
changepoints = zheng.apply(log, mrid, epsilon)
```
- Apply the full pipeline with:
```python
from cdrift.approaches import lcdd
from pm4py import read_xes

log = read_xes(PATH_TO_LOG)  # Import the log
changepoints = lcdd.calculate(log, complete_window_size, detection_window_size, stable_period)
```
- For a single execution of an algorithm on a single event log, a simple evaluation can be performed using functions from `evaluation.py`:
```python
from cdrift import evaluation

f1 = evaluation.F1_Score(detected_cps, known_cps, lag_window)  # Calculate the F1 score
# Or:
tp, fp = evaluation.getTP_FP(detected_cps, known_cps, lag_window)  # Calculate true/false positives
# Or:
from numpy import nan
precision, recall = evaluation.calcPrecision_Recall(detected_cps, known_cps, lag_window, zero_division=nan)  # Calculate precision/recall
# Or:
avg_lag = evaluation.get_avg_lag(detected_cps, known_cps, lag=lag_window)  # Calculate the average lag between detected and actual change points
```
- The full evaluation for multiple algorithms, multiple parameter settings, and multiple event logs is performed by running the evaluation notebook.
- This takes the `algorithm_results.csv` file as input and generates the folder `Evaluation_Results` containing the evaluation results.
[^1]: Ostovar et al. Robust Drift Characterization from Event Streams of Business Processes. URL https://www.doi.org/10.1145/3375398
[^2]: Ceravolo et al. Evaluation Goals for Online Process Mining: a Concept Drift Perspective. URL https://www.doi.org/10.1109/TSC.2020.3004532
[^3]: Bose et al. Handling Concept Drift in Process Mining. URL https://www.doi.org/10.1007/978-3-642-21640-4_30