Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for issue #734 #735

Merged
merged 5 commits into from
Feb 11, 2024

Conversation

kevin-duclos
Copy link
Contributor

@kevin-duclos kevin-duclos commented May 24, 2023

Change traj.set_index call to avoid Pandas error: 'ValueError: cannot reindex on an axis with duplicate labels'.

Copy link
Member

@caspervdw caspervdw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR @kevin-duclos. I haven't worked with pandas for a while so I am not sure why this error is happening. But looking at the code, I think it might be a fix for your specific issue, but in general, the code should work for inputs without the "particle" column too. This is not reflected by the unittests however (they all use inputs with the "particle" column).

Maybe @nkeim can comment on the specific intended usage of this function?

@nkeim nkeim added this to the 0.6.2 milestone Jan 4, 2024
@nkeim
Copy link
Contributor

nkeim commented Jan 16, 2024

Thanks @kevin-duclos . The error in #734 makes sense, except it should occur any time that there are many particles, including in the trackpy tests. So I'm confused.

I made changes to subtract_drift() and its docstring that I think address @caspervdw 's concern. However, at least temporarily, I added some lines to the tests that are supposed to trigger this error. We'll see if any of the CI environments catch it.

@nkeim
Copy link
Contributor

nkeim commented Jan 17, 2024

OK, I think this is ready to merge. In addition to the fix by @kevin-duclos , we now include two tests with no particle column:

  • DataFrame with only one particle. This is unlikely to cause any reindexing problem.
  • DataFrame with many (unlabeled) particles. This would be the case if one tried to use subtract_drift() with feature data prior to tracking. Note that compute_drift() requires a particle column, so the user would need to compute the drift some other way. I still don't know why this doesn't cause an error, but since it's a reasonable use case, we might as well test it.

I think this is ready to merge. @caspervdw does this address your question? @kevin-duclos do you think this still fixes the bug?

@nkeim
Copy link
Contributor

nkeim commented Feb 10, 2024

Just bumping this issue, especially for @kevin-duclos : Is this OK to merge?

@nkeim nkeim mentioned this pull request Feb 10, 2024
10 tasks
@caspervdw caspervdw merged commit 8883b8a into soft-matter:master Feb 11, 2024
17 checks passed
@kevin-duclos
Copy link
Contributor Author

All good with me, @nkeim. Thanks!

@kevin-duclos kevin-duclos deleted the fix-subtract_drift-error branch February 13, 2024 17:21
@xyc2718
Copy link

xyc2718 commented Mar 9, 2024

After the 0.6.2 update, the dataframe processed by subtract_drift() poses an issue for imsd() due to the presence of "particle" in both row and column indices. To address this, it needs to use dataframe.reset_index(drop=True) before running the function to ensure proper execution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants