-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Nearest Neighbor] Code and basic test for nearest neighbor #28
base: main
Are you sure you want to change the base?
Conversation
test/basic.ml
Outdated
let mindist (p:floatarray) (e:envelope) = | ||
let (x0,y0,x1,y1) = Rtree.Rectangle.coords e in | ||
let gete ind = if ind==0 then x0 | ||
else if ind==1 then y0 else if ind==2 then x1 else y1 in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multiple if-else is not a good idea IMO
Maybe you can use a match statement here, something like this
let gete = match ind with
...
| 1 -> ...
|2 -> ...
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @AryanGodara here, also note the difference between =
and ==
in OCaml (both can be used to check for equality): https://cs3110.github.io/textbook/chapters/mut/refs.html?highlight=physical#physical-equality
test/basic.ml
Outdated
|
||
let minmaxdist p e = | ||
let (x0,y0,x1,y1) = Rtree.Rectangle.coords e in | ||
let gete ind = if ind==0 then x0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @AryanGodara. I also need some feedback on how the point shall be stored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @AryanGodara, I will change this but I also need feedback on where the point is stored in the code. @patricoferris I am waiting for the feedback
src/rtree_intf.ml
Outdated
@@ -43,6 +43,9 @@ module type Value = sig | |||
|
|||
val envelope : t -> envelope | |||
(** Given a value, calculates the envelope for it. *) | |||
|
|||
val mindist: t -> envelope -> float | |||
val minmaxdist: t -> envelope -> float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be exposed in the interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to test these functions as well. Is that not necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that they're a part of the Value
signature then we need them here and can be tested individually. Could we add some documentation strings to them ? minmaxdist
is a little opaque I think ?
test/basic.ml
Outdated
end | ||
|
||
module R = Rtree.Make (Rtree.Rectangle) (V) | ||
|
||
module Vp = struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is Vp short for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is short for (Value for Point)
Hi @patricoferris, please review the latest changes. I have listed my approach below which is based on the paper you provided.
There might be some mistakes as I have only tested a basic case for nearest neighbor. Please guide me on how to create a deterministic large R tree to correctly test the |
@patricoferris, today is the last day for some contribution, if you could review this and hopefully we could resolve this, it would be an addition for the outreachy application |
@harshey1103 please do still reference unmerged PRs in your application :)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great step in the right direction @kushalpokharel. When I was thinking about the interface for this feature, I thought the distance functions would end up in the Rtree_intf.Value
module? Users would then provide their own distance functions from whatever item they're storing to whatever envelope they are using? This would allow us to have a nearest neighbour function that works for all Rtree's and not just points, what do you think? Am I missing something?
src/rectangle.ml
Outdated
|
||
let mindist p e = | ||
let x0, y0, x1, y1 = (x0 e, y0 e, x1 e, y1 e) in | ||
let gete ind = match ind with 0 -> x0 | 1 -> y0 | 2 -> x1 | _ -> y1 in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these indirections are potentially a little more confusing than helpful. I think it would be clearer and more maintainable to write these in full, even if it seems a little more verbose. Seem my overview comment before making changes to this directly though.
src/rectangle.ml
Outdated
|
||
let minmaxdist p e = | ||
let x0, y0, x1, y1 = (x0 e, y0 e, x1 e, y1 e) in | ||
let gete ind = match ind with 0 -> x0 | 1 -> y0 | 2 -> x1 | _ -> y1 in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous comment.
src/rectangle.ml
Outdated
let minmaxdist p e = | ||
let x0, y0, x1, y1 = (x0 e, y0 e, x1 e, y1 e) in | ||
let gete ind = match ind with 0 -> x0 | 1 -> y0 | 2 -> x1 | _ -> y1 in | ||
let rm k = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A slightly more descriptive name would be helpful here, and also so we don't confuse rm
and rM
.
src/rectangle.ml
Outdated
in | ||
|
||
let rM k = | ||
if get p k >= (gete k +. gete (k + 2)) /. 2. then gete k else gete (k + 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The gete
indirection could just be replace with direct use of the x0
etc. variables no ?
src/rtree_intf.ml
Outdated
@@ -77,6 +89,9 @@ module type Envelope = sig | |||
|
|||
val contains : t -> t -> bool | |||
(** [contains a b] asks whether [b] is contained by [a]. *) | |||
|
|||
val minmaxdist : float list -> t -> float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these needed in envelope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could put this in a separate module called Point, or create a function in an envelope, to calculate the distance from a point to that envelope. This seemed easier to me. I know this is not the appropriate place for the function. Need to find its home.
Exactly my thought and my question in the first comment. I wanted to do the same. However, calculating distances from a value to an envelope was confusing to me. I think the users can give the distance function between two envelopes in the R-tree because we could turn the value into an envelope. Is this approach correct? |
Hi @patricoferris, I made some changes to the code especially the location of distance functions inside V module. Can you review this? Regarding other comments:
|
Hi @patricoferris, can you provide some feedback on this? |
test/distance
Outdated
let gete ind = match ind with 0 -> x0 | 1 -> y0 | 2 -> x1 | _ -> y1 in | ||
let rm k = | ||
if get p k <= (gete k +. gete (k + 2)) /. 2. then gete k else gete (k + 2) | ||
in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this file committed by mistake? If so can it be removed ?
test/persist.ml
Outdated
@@ -10,6 +10,9 @@ module Line = struct | |||
let y0 = Float.min arr.(1) arr.(3) in | |||
let y1 = Float.max arr.(1) arr.(3) in | |||
Rtree.Rectangle.v ~x0 ~y0 ~x1 ~y1 | |||
|
|||
let mindist _a _b= 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do run a dune build @fmt --auto
to run the formatter as currently the CI will not like the _b=
part I think
src/rtree_intf.ml
Outdated
@@ -43,6 +43,9 @@ module type Value = sig | |||
|
|||
val envelope : t -> envelope | |||
(** Given a value, calculates the envelope for it. *) | |||
|
|||
val mindist: t -> envelope -> float | |||
val minmaxdist: t -> envelope -> float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that they're a part of the Value
signature then we need them here and can be tested individually. Could we add some documentation strings to them ? minmaxdist
is a little opaque I think ?
I don't quite understand, don't you already use let gete ind = match ind with 0 -> x0 | 1 -> y0 | 2 -> x1 | _ -> y1 in |
Sorry for the late reply, I missed the notification for this. I am using gete inside of the functions like |
@patricoferris, I have made some variable name changes and deleted the file. Please review these changes. |
I have added two functions: mindist and minmaxdistance, which will be helpful in pruning the tree while searching for the nearest neighbor. The code doesn't work yet. I have some doubts about the implementation: