
Evaluation of heuristicsmineR's causal nets using R's pm4py package #6

Open
vyoann opened this issue Feb 26, 2020 · 3 comments
Labels: bug

vyoann commented Feb 26, 2020

With R versions 3.6.1 and 3.6.2, using pm4py's evaluation functions on heuristicsmineR causal nets converted to Petri nets seems to give evaluation results that vary randomly between runs.

For example, using the L_heur_1 event log that ships with heuristicsmineR, following the example at https://github.com/bupaverse/heuristicsmineR, we get the following Petri net:

library(heuristicsmineR)
library(petrinetR)

data("L_heur_1")

# Discover a causal net and convert it to a Petri net
cn <- causal_net(L_heur_1, threshold = 0.7)
pn <- as.petrinet(cn)
render_PN(pn)

(Rendered Petri net for L_heur_1: l_heur_1_pn)
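Before evaluating, it can help to inspect the converted object: the initial marking passed to pm4py below comes from the conversion (pn$marking), while the final marking is written by hand, and the exact sink place name (here "p_in_6") depends on the net produced. A minimal check, using only base R and the slots already used in this report:

str(pn)      # structure of the converted Petri net object
pn$marking   # initial marking from as.petrinet(); the final marking ("p_in_6") is supplied manually below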
Now, using the evaluation_all() function provided by pm4py, with the net's final marking written out by hand:

library(pm4py)

evaluation_all(L_heur_1,
               pn,
               pn$marking,
               c("p_in_6"))

#> $fitness
#> $fitness$perc_fit_traces
#> [1] 72.5
#> 
#> $fitness$average_trace_fitness
#> [1] 0.9692162
#> 
#> $fitness$log_fitness
#> [1] 0.9678538
#> 
#> 
#> $precision
#> [1] 0.9963899
#> 
#> $generalization
#> [1] 0.6225678
#> 
#> $simplicity
#> [1] 0.7777778
#> 
#> $metricsAverageWeight
#> [1] 0.8411473
#> 
#> $fscore
#> [1] 0.9819146

The same command executed once more gives the following result:

evaluation_all(L_heur_1,
               pn,
               pn$marking,
               c("p_in_6"))
#> $fitness
#> $fitness$perc_fit_traces
#> [1] 97.5
#> 
#> $fitness$average_trace_fitness
#> [1] 0.9801938
#> 
#> $fitness$log_fitness
#> [1] 0.9784578
#> 
#> 
#> $precision
#> [1] 0.9966443
#> 
#> $generalization
#> [1] 0.6320084
#> 
#> $simplicity
#> [1] 0.7777778
#> 
#> $metricsAverageWeight
#> [1] 0.8462221
#> 
#> $fscore
#> [1] 0.9874673

All values have changed, most notably perc_fit_traces.
However, the number of distinct values the function returns appears to be finite and to depend on the number of unique traces in the original log.
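A minimal sketch to quantify this, reusing only the objects and calls shown above (the number of repetitions is arbitrary):

# Repeat the evaluation and tally the distinct perc_fit_traces values;
# a deterministic evaluation would yield a single value here
runs <- replicate(
  20,
  evaluation_all(L_heur_1, pn, pn$marking, c("p_in_6"))$fitness$perc_fit_traces
)
table(runs)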

fmannhardt self-assigned this Mar 3, 2020

fmannhardt (Member) commented

Thanks for the very detailed bug report. This may be an issue in the source library PM4Py or the R bridge package pm4py. I am moving the issue to the pm4py repository, as I think this has nothing to do with the heuristicsmineR package. I will look into this in the next few days.

fmannhardt transferred this issue from bupaverse/heuristicsmineR Mar 3, 2020

fmannhardt (Member) commented

I tried to reproduce this and got the following results:

x <- purrr::map(1:10, ~ evaluation_all(L_heur_1,
               pn,
               pn$marking,
               c("p_in_6")))

This gives the following differences:

Fitness

> purrr::map_dbl(x, ~ .x$fitness$average_trace_fitness)

 [1] 0.9592657 0.9935606 0.9787879 0.9714744 0.9935606 0.9620130 0.9935606
 [8] 0.9615385 0.9620130 0.9787879

Precision

> purrr::map_dbl(x, ~ .x$precision)

 [1] 0.9960938 0.9966443 0.9960938 0.9963899 0.9966443 0.9960938 0.9966443
 [8] 0.9960938 0.9960938 0.9960938

The same happens when using the individual fitness evaluation function, evaluation_fitness():

x <- purrr::map(1:10, ~ evaluation_fitness(L_heur_1,
               pn,
               pn$marking,
               c("p_in_6")))

> purrr::map_dbl(x, ~ .x$average_trace_fitness)
 [1] 0.9787879 0.9861742 0.9711107 0.9692016 0.9615385 0.9714744 0.9692016
 [8] 0.9692016 0.9801938 0.9787879

So, something is definitely wrong.

fmannhardt (Member) commented Mar 19, 2020

Looking into this further, I see that the default variant (in PM4Py terminology) is set to variant_fitness_token_based().

When comparing the results of token replay and alignment, it is clear that the alignment-based fitness is consistent across runs and shows that the model perfectly fits the log:

x_token <- purrr::map_dbl(1:10, ~ evaluation_fitness(L_heur_1,
               pn,
               pn$marking,
               c("p_in_6"), variant = variant_fitness_token_based())$average_trace_fitness)
abs(max(x_token) - min(x_token))

x_alignment <- purrr::map_dbl(1:10, ~ evaluation_fitness(L_heur_1,
               pn,
               pn$marking,
               c("p_in_6"), variant = variant_fitness_alignment_based())$averageFitness)
abs(max(x_alignment) - min(x_alignment))

Gives:

[1] 0.03154762
[1] 0

I will ask the PM4Py project whether the token replay is expected to be randomised. It may make sense to change the default to alignment-based fitness, as alignments are the gold standard in many ways. Some more information from them is here:
https://pm4py.fit.fraunhofer.de/documentation#conformance
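In the meantime, a possible workaround is to request the alignment-based variant explicitly (a sketch reusing only the calls already shown in this thread; note that this variant returns averageFitness rather than average_trace_fitness):

# Avoid the non-deterministic token replay by asking for alignment-based fitness
fit <- evaluation_fitness(L_heur_1, pn, pn$marking, c("p_in_6"),
                          variant = variant_fitness_alignment_based())
fit$averageFitness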
