Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix expression column aggregate calculation with group_by after remove() #2311

Merged
merged 1 commit into from
Jul 20, 2023

Conversation

texodus
Copy link
Member

@texodus texodus commented Jul 20, 2023

Fixes a bug which causes expression columns to show null or 0 results when a remove() had taken place before.

When a View with expressions is created from a Table that has had remove() applied to it, some aggregates (like sum) are calculated incorrectly, as in the example below. Interestingly, this issue only occurs when the remove() occurs before the view(). If all remove() calls occur after the view is created, a slightly different code path correctly calculates the expressions.

Ultimately this issue was caused by a few under-tested code paths, in which he wrong mapping was chosen deep in a giant block of data processing code., given a choice between the "original" and remove()-corrected indices when matching primary keys to column's row index. The repro (below), simple as it is, was instrumental in identifying and fixing it.

schema = {"a": str, "b": float, "c": str}
data = [
    {
        "a": "A",
        "b": 46412.3804275,
        "c": "X"
    },
    {
        "a": "B",
        "b": 2317615.875,
        "c": "X"
    },
]

table = Table(schema, index="key")
table.update(data)
table.remove(["A"])
view = table.view(
    group_by=["c"],
    columns=["b", "alias"],
    expressions=['//alias\n"b"'],
)

# This has the wrong value for the `alias` column
view.to_records()

# This however would've worked, `.remove()` after `.view()`
table.remove(["A"])
view.to_records()

Fixes #1710 (???)

@texodus texodus merged commit 8e1ab71 into master Jul 20, 2023
12 checks passed
@texodus texodus deleted the fix-remove-expression-bug branch July 20, 2023 05:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Concrete, reproducible bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG in Expression] Null or incorrect expression field in JSON output after delete operation
1 participant