Correlations widget returns wrong values if fields are missing #6891

Open
ruvilonix opened this issue Sep 15, 2024 · 0 comments
Labels: bug report (bug is reported by user, not yet confirmed by the core team)


What's wrong?

Maybe I'm just a novice at statistics and this is how it's supposed to work, but it seems like a bug. When I connect a Correlations widget to data in which one of the features has missing values, the correlations differ from those I get for the same data with the incomplete row removed. I would expect that when correlating two features over six rows, one of which is missing the second feature, only the five complete rows would factor into the correlation. But the value is not the same as for a file containing only those five rows.

I'll try to illustrate. Here is the test CSV file, which is missing one value for score:

name,age,score
Bob,24,78
Gill,32,89
Fred,33,93
Julie,25,75
Sandra,20,98
Lucy,45,

Here is Orange:
[Screenshot of the Orange workflow]

The Scatter Plot shows the correct r value for the regression line of age and score (0.09). The upper Correlations widget shows an incorrect Pearson correlation (0.050). The lower Select Rows excludes the undefined "Lucy" row, then connects to another Correlations widget, which shows the correct value (0.090).
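
The two values can be checked outside Orange. The sketch below uses only the Python standard library and the six rows from the CSV above. It computes Pearson's r twice: once over the five complete rows, and once after replacing the missing score with the mean of the known scores. The mean-imputation step is an assumption made for illustration; this issue does not confirm how the Correlations widget treats missing values. The helper `pearson` and the variable names are made up for this example.

```python
import csv
import io
import math

# The test file from this issue; Lucy's score is missing.
CSV_TEXT = """name,age,score
Bob,24,78
Gill,32,89
Fred,33,93
Julie,25,75
Sandra,20,98
Lucy,45,
"""


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
ages = [float(r["age"]) for r in rows]
# An empty score field becomes None.
scores = [float(r["score"]) if r["score"] else None for r in rows]

# Expected behaviour: use only rows where both values are present.
complete = [(a, s) for a, s in zip(ages, scores) if s is not None]
r_complete = pearson([a for a, _ in complete], [s for _, s in complete])

# Assumed alternative: fill the missing score with the mean of the known scores.
mean_score = sum(s for _, s in complete) / len(complete)
imputed_scores = [mean_score if s is None else s for s in scores]
r_imputed = pearson(ages, imputed_scores)

print(f"complete rows only: {r_complete:.3f}")  # 0.090
print(f"mean-imputed score: {r_imputed:.3f}")   # 0.050
```

Dropping the incomplete row gives 0.090, which matches the Scatter Plot and the lower Correlations widget. Mean imputation gives 0.050, which matches the upper Correlations widget, so imputing the missing value before correlating is one plausible explanation for the discrepancy.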

How can we reproduce the problem?

missing_row_test.ows.zip

Look at the correlations in the two Correlations widgets.

What's your environment?

  • Operating system: Ubuntu 24.04.1 LTS
  • Orange version: 3.37.0
  • How you installed Orange: mamba/conda

ruvilonix added the "bug report" label on Sep 15, 2024