Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree making overrun in UMAP - fixes #277 and #278 #279

Merged
merged 1 commit into from
Oct 10, 2024

Conversation

tremblap
Copy link
Member

@tremblap tremblap commented Oct 9, 2024

The tree-making seems to need to remove the front of the matrix - there is a boolean in the algo, but since tree making can be run on a square matrix elsewhere in the codebase, this was creating an overflow in UMAP as the client was not checking for a smaller number of neighbour than the number of points.

@weefuzzy and @g-roma if the rejection of the first item at UMAP.hpp:373 in fluid::algorithm::UMAP::makeGraph is right for the algo (I am not certain) then this fixes everything.

@weefuzzy
Copy link
Member

weefuzzy commented Oct 9, 2024

Certainly not familiar enough with the internals to pronounce authoritatively, but support for this change comes from the fact that this is what the reference implementation checks for too:

https://github.com/lmcinnes/umap/blob/c72ac2f4bc1b107a055a30e9582b37ea881d95ca/umap/umap_.py#L2430

@g-roma
Copy link
Member

g-roma commented Oct 9, 2024

The number of neighbors obviously needs to be smaller than the dataset. I think the reason for discarding the first neighbor is that it is the same as the query point, but IIRC it is used when transforming new data, so the option is there. But in the fitTransform case it is true that the number of neighbours needs to be at most (size -1), to account for the query point when making the graph.

@tremblap
Copy link
Member Author

ok then I'll merge, thanks for the explanation

@tremblap tremblap merged commit abc0bd7 into main Oct 10, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants