wrapdims(::DataFrame) produces incorrect results when not all key combinations are present #105

Open
fredcallaway opened this issue Mar 11, 2022 · 4 comments

Comments

@fredcallaway

using AxisKeys, DataFrames

df = DataFrame([
    (x="a", y="a", z=1),
    (x="a", y="b", z=2),
    (x="b", y="b", z=3),
])
# No row has x="b", y="a", so that cell is never written.
wrapdims(df, :z, :x, :y)

The result varies, but here's one example:

                 ("a")  ("b")
  ("a")           1      2
  ("b")  4615108224      3

I think the matrix should be initialized with an explicit placeholder (maybe missing) rather than an arbitrary value?

@fredcallaway
Author

Sorry, I should have read the documentation more carefully. I think the default argument should be set to missing unless the caller specifies otherwise. The current behavior can lead to errors from careless users like myself (and let's be honest, there are many of us).

@rofinn
Collaborator

rofinn commented Mar 11, 2022

The issue with making the default missing is that this introduces unnecessary type complexity in many cases. Even if there are no missing values, the returned element type is still Union{T, Missing}, which may break downstream code. I'm not particularly bothered either way; that's just the main reason we didn't make it the default.
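
To make the trade-off concrete, here is a minimal sketch using only Base arrays. It does not call AxisKeys and is not meant to show how wrapdims is implemented; the names known, A, and B are made up for illustration.

```julia
# Base-only sketch of the trade-off; AxisKeys is deliberately not used here.
# Three of the four cells of a 2×2 table are known; cell (2, 1) is absent.
known = [((1, 1), 1), ((1, 2), 2), ((2, 2), 3)]

# Buffer allocated with `undef`: the untouched cell keeps whatever was in memory.
A = Array{Int}(undef, 2, 2)
for (idx, val) in known
    A[idx...] = val
end

# Buffer pre-filled with `missing`: the gap is explicit, but the eltype widens.
B = Array{Union{Int, Missing}}(missing, 2, 2)
for (idx, val) in known
    B[idx...] = val
end

eltype(A)           # Int
eltype(B)           # Union of Int and Missing
ismissing(B[2, 1])  # true

# The widened eltype persists even once every cell has been filled,
# so a method restricted to Matrix{Int} would not accept B.
B[2, 1] = 4
B isa Matrix{Int}   # false
```

With the undef buffer, A[2, 1] is unpredictable, which matches the stray 4615108224 in the report above. With the missing buffer the gap is visible, at the cost of the wider element type.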

@fredcallaway
Author

Yeah, I totally see why you would want to not use missing if you know there are no missing values. But I guess I think it's better to fail obviously (with an error) than to produce incorrect values. One could even make an argument for requiring the default value be passed explicitly.

The minimal action would be to add a more salient warning to the documentation (both the README and the docstring). I can draft that if you like.

@rofinn
Collaborator

rofinn commented Mar 11, 2022

> default value be passed explicitly

Okay, yeah, that's probably the best path forward.
