Some questions when Inference with InferenceModel and get_nearest_neighbors #429

Closed
lilium513 opened this issue Feb 16, 2022 · 4 comments
Labels
question A general question about the library

Comments

@lilium513

Hello! Thank you for developing this great library. It has helped me a lot in trying out metric learning.

A problem came up when I ran inference after training.
I want to retrieve the five images in target_dataset that are closest to a query image (taken from query_dataset).

Like this:

from pytorch_metric_learning.distances import CosineSimilarity
from pytorch_metric_learning.utils.inference import InferenceModel, MatchFinder

# expect to get images whose CosineSimilarity is greater than 0.9
match_finder = MatchFinder(distance=CosineSimilarity(), threshold=0.9)

# some_model was trained on target_dataset
inference_model = InferenceModel(some_model, match_finder=match_finder)

inference_model.train_knn(target_dataset)

# find the 5 images closest to query_img
query_img = query_dataset[0].unsqueeze(0)
distances, indices = inference_model.get_nearest_neighbors(query_img, k=5)

The variable distances then holds a tensor like:
tensor([[0.5145, 0.5301, 0.5498, 0.5565, 0.5691]], device='cuda:0')

I have two questions.

Q1
The variable distances holds tensor([[0.5145, 0.5301, 0.5498, 0.5565, 0.5691]], device='cuda:0').
All of these values are below the threshold of 0.9.
I assumed that "MatchFinder(distance=CosineSimilarity(), threshold=0.9)"
means "find points whose CosineSimilarity is greater than 0.9".
Could you tell me what is wrong?

Q2
If these values are cosine similarities, the first value (0.5145) is less similar than the last value (0.5691).
What is happening here?
As far as I know, the larger the CosineSimilarity, the more similar the items.
At a glance, the images seem to get more and more similar to the query image from left to right.
Is the first of these images the worst match and the last one the best?

Please let me know if I have misunderstood something.
If you need more information to answer these questions, I'll happily provide it.

Thank you so much!

@KevinMusgrave
Owner

Sorry, this is definitely confusing, and I don't think it's documented yet. Currently InferenceModel has 2 components:

  • A match finder, which is used by get_matches and is_match.
  • A knn function, which is used by get_nearest_neighbors.
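
The difference between the two components can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helpers `is_match` and `nearest_neighbors` below are hypothetical stand-ins, and whether the real threshold comparison is strict or inclusive is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def is_match(query, ref, threshold):
    """Threshold test in the spirit of a match finder: no ranking involved."""
    return cosine_similarity(query, ref) >= threshold

def nearest_neighbors(query, refs, k):
    """Top-k search by L2 distance: always returns k results, no threshold."""
    ranked = sorted((math.dist(query, r), i) for i, r in enumerate(refs))[:k]
    return [d for d, _ in ranked], [i for _, i in ranked]

# Made-up 2-D embeddings, purely for illustration.
query = [1.0, 0.0]
refs = [[0.6, 0.8], [0.0, 1.0], [-1.0, 0.0]]

matches = [is_match(query, r, threshold=0.9) for r in refs]
distances, indices = nearest_neighbors(query, refs, k=2)
print(matches)             # no reference passes the 0.9 threshold
print(distances, indices)  # yet the k-NN search still returns 2 neighbors
```

In this sketch the threshold only affects `is_match`; the k-NN search never sees it, which mirrors why the threshold of 0.9 had no effect on get_nearest_neighbors.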

The default knn function uses L2 distance, which is why the returned tensor starts small and gets larger (the nearest neighbor has the smallest distance).
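
As a small sanity check of how the two metrics relate: for unit-length vectors, the squared L2 distance d and the cosine similarity c satisfy d² = 2 − 2c, so sorting by ascending L2 distance gives the same ranking as sorting by descending cosine similarity. The sketch below assumes the embeddings are normalized, which may not hold for your model, and uses made-up vectors.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.hypot(*v)
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Made-up embeddings, normalized to unit length (an assumption).
query = normalize([1.0, 2.0, 0.5])
refs = [normalize(v) for v in ([1.0, 2.1, 0.4], [0.2, 1.0, 3.0], [-1.0, 0.5, 0.0])]

l2 = [math.dist(query, r) for r in refs]
cos = [dot(query, r) for r in refs]  # equals cosine similarity for unit vectors

# Identity for unit vectors: d^2 = 2 - 2 * cos
for d, c in zip(l2, cos):
    assert math.isclose(d * d, 2 - 2 * c, abs_tol=1e-9)

order_by_l2 = sorted(range(len(refs)), key=lambda i: l2[i])    # smallest first
order_by_cos = sorted(range(len(refs)), key=lambda i: -cos[i])  # largest first
print(order_by_l2, order_by_cos)
```

Without normalization the two rankings can differ, so it is worth checking which one your downstream code expects.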

To use cosine similarity for the knn search, you can pass in this knn function:

import faiss
from pytorch_metric_learning.utils.inference import FaissKNN, InferenceModel

# IndexFlatIP ranks neighbors by inner product instead of L2 distance
knn_func = FaissKNN(reset_before=False, reset_after=False, index_init_fn=faiss.IndexFlatIP)
inference_model = InferenceModel(some_model, knn_func=knn_func)
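
Two things are worth keeping in mind with an inner-product index. First, the inner product equals cosine similarity only when the embeddings have unit length. Second, the results come back largest-first, and the k-NN search still applies no threshold. If you want "top k, but only those above 0.9", one option is to filter the k-NN output yourself. The following plain-Python sketch shows the idea; `top_k_by_inner_product` and `keep_above` are hypothetical helpers, not part of the library, and the vectors are made up.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k_by_inner_product(query, refs, k):
    """Rank references by inner product, largest (most similar) first."""
    ranked = sorted(((dot(query, r), i) for i, r in enumerate(refs)), reverse=True)[:k]
    return [s for s, _ in ranked], [i for _, i in ranked]

def keep_above(similarities, indices, threshold):
    """Post-filter k-NN results with a similarity threshold."""
    return [(s, i) for s, i in zip(similarities, indices) if s >= threshold]

# Unit-length vectors, so the inner product is the cosine similarity.
query = [1.0, 0.0]
refs = [[0.6, 0.8], [0.96, 0.28], [0.0, 1.0], [0.8, 0.6]]

similarities, indices = top_k_by_inner_product(query, refs, k=3)
kept = keep_above(similarities, indices, threshold=0.9)
print(similarities, indices)
print(kept)
```

With the real InferenceModel, the same filtering could presumably be applied to the tensors returned by get_nearest_neighbors, for example with a boolean mask.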

To reduce confusion, I should probably change the default match_finder and knn_func to use the same distance metric. I could also look into removing the match_finder and using knn_func in get_matches and is_match, though I'm not sure if using faiss on small batches is good performance-wise. Anyway, I've created a separate issue to keep track of that: #430

@lilium513
Author

Thank you for your really quick and kind answer!
I totally understand!

@virusperfect

This is definitely confusing. I thought the match finder would also be used for get_nearest_neighbors.

I also don't quite understand what the purpose of CustomKNN is in this context, because I tried using knn_func=CustomKNN(CosineSimilarity()) but that doesn't work.

@KevinMusgrave
Owner

@virusperfect Sorry about the confusion regarding match finder and CustomKNN. You're right, it should be compatible with InferenceModel. I've created an issue for this.
