Spatial Support #90

Open
vegaed opened this issue Aug 29, 2016 · 6 comments

Comments

@vegaed

vegaed commented Aug 29, 2016

I've just started looking at the code base. I am a Lucene noob, so there is probably something I missed on initial inspection. Is there spatial support for queries? An example query I would like to register would be the terms "dog" and "brown" within Kansas (which would be a polygon); it should match if the text contains both "dog" and "brown" and the document has a geo point or polygon that is in Kansas. If not, is this something that is being considered for the future? If I were to do this work, where would I start?

@dsmiley

dsmiley commented Aug 29, 2016

I haven't used Luwak yet, but I believe it should be possible. Ideally, Luwak would be smart enough to let the Kansas polygon in the query benefit from the pre-searcher by indexing the polygon (CompositeSpatialStrategy), and then the point document becomes a lookup.

@romseygeek
Collaborator

Luwak doesn't have explicit spatial query support in the presearcher yet - your example would work fine, but the presearcher wouldn't end up filtering out "brown dog" queries in Missouri, for example.

One way to move forward might be to index polygons using the new Points-based APIs in Lucene 6, and then filter things out using intersects-type queries. David will probably understand these better than me, though :)

@dsmiley

dsmiley commented Aug 29, 2016

Lucene's Points API is for multi-dimensional points, not for polygons. Maybe some day that will change, but that is what it is now. Note that with multi-dimensional points, you could index rectangles using 4 dimensions.

With Points you can query by polygons, but not index them. With Luwak, query and index flip around.
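A minimal sketch of the four-dimension idea, in plain Java with no Lucene calls. The class `RectAsPoint` and its methods are hypothetical names made up for illustration, and the two boxes are rough, non-authoritative stand-ins for state outlines. It shows how a rectangle stored as the 4-D point (minX, minY, maxX, maxY) turns "which rectangles contain this point?" into an ordinary 4-D box query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: plain Java, no Lucene API involved.
class RectAsPoint {
    // A rectangle is stored as one 4-dimensional point: {minX, minY, maxX, maxY}.
    static double[] encode(double minX, double minY, double maxX, double maxY) {
        return new double[] {minX, minY, maxX, maxY};
    }

    // Generic n-dimensional box test: lower[d] <= p[d] <= upper[d] for every d.
    static boolean inBox(double[] p, double[] lower, double[] upper) {
        for (int d = 0; d < p.length; d++) {
            if (p[d] < lower[d] || p[d] > upper[d]) {
                return false;
            }
        }
        return true;
    }

    // "Which rectangles contain (x, y)?" expressed as a 4-D box query:
    //   minX <= x, minY <= y, maxX >= x, maxY >= y
    // A linear scan stands in for whatever index structure would be used.
    static List<Integer> rectanglesContaining(List<double[]> indexed, double x, double y) {
        double inf = Double.POSITIVE_INFINITY;
        double[] lower = {-inf, -inf, x, y};
        double[] upper = {x, y, inf, inf};
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < indexed.size(); i++) {
            if (inBox(indexed.get(i), lower, upper)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Rough (lon, lat) boxes used purely as example data.
        List<double[]> indexed = Arrays.asList(
            encode(-102.05, 36.99, -94.59, 40.00),   // index 0: approximate Kansas box
            encode(-95.77, 35.99, -89.10, 40.61));   // index 1: approximate Missouri box
        System.out.println(rectanglesContaining(indexed, -97.33, 37.69)); // prints [0]
    }
}
```

The same lower/upper bounds could in principle be handed to any structure that answers multi-dimensional range queries; the linear scan here only keeps the sketch self-contained.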

@vegaed
Author

vegaed commented Aug 30, 2016

Alan and David, thanks for such quick responses.

Alan, what I think you're saying is that the presearcher won't stop the query from running against the document, which slows the process down, but it would still work.

David, suppose I have a document with a point or a polygon, and my queries are all polygons that I want the document's point or polygon to be inside. I think you are saying that, because things are reversed in Luwak, I can have indexes that contain both terms and polygons, but the single document in the MemoryIndex can only have points.

Also, it seems that Luwak only has Core as a dependency. Wouldn't I have to add Spatial as well?

@dsmiley

dsmiley commented Aug 30, 2016

I'm not up to speed on some of the Luwak details, so I'm not sure what limitations it introduces with its pre-searcher.

When Alan or I talk about Points with a capital 'P', we are talking about the so-called Points/PointValues API feature new to Lucene 6. You can use PointValues for spatial if it meets your requirements. Based on wanting polygons at index time (and query?), it won't. Although, depending on whether the pre-searcher is a fast approximation, it plausibly might if you use a bounding-box rectangle as an approximation of the actual polygon shape desired. Again, I'm out of my element talking about Luwak.
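To make the bounding-box approximation concrete, here is a small sketch in plain Java. `PolygonBoundingBox` and its methods are hypothetical names, not part of Lucene, Luwak, Spatial4j or JTS, and the geometry is flat and planar (no dateline or pole handling). It shows that the box never misses a point that is inside the polygon, but can accept points that are outside it, which is why an exact check is still needed afterwards.

```java
// Hypothetical helper, plain Java: not part of Lucene, Luwak, Spatial4j or JTS.
class PolygonBoundingBox {
    // Smallest axis-aligned rectangle {minX, minY, maxX, maxY} around the vertices.
    static double[] boundingBox(double[][] polygon) {
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] v : polygon) {
            minX = Math.min(minX, v[0]);
            minY = Math.min(minY, v[1]);
            maxX = Math.max(maxX, v[0]);
            maxY = Math.max(maxY, v[1]);
        }
        return new double[] {minX, minY, maxX, maxY};
    }

    // Cheap approximate test against the rectangle.
    static boolean boxContains(double[] box, double x, double y) {
        return x >= box[0] && x <= box[2] && y >= box[1] && y <= box[3];
    }

    // Exact test (ray casting): count polygon edges crossed by a ray going right from (x, y).
    static boolean polygonContains(double[][] polygon, double x, double y) {
        boolean inside = false;
        int n = polygon.length;
        for (int i = 0, j = n - 1; i < n; j = i++) {
            double xi = polygon[i][0], yi = polygon[i][1];
            double xj = polygon[j][0], yj = polygon[j][1];
            boolean crosses = (yi > y) != (yj > y)
                && x < (xj - xi) * (y - yi) / (yj - yi) + xi;
            if (crosses) {
                inside = !inside;
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        double[][] triangle = {{0, 0}, {4, 0}, {0, 4}};
        double[] box = boundingBox(triangle);               // {0, 0, 4, 4}
        // (3, 3) passes the cheap box test but fails the exact polygon test.
        System.out.println(boxContains(box, 3, 3));          // prints true
        System.out.println(polygonContains(triangle, 3, 3)); // prints false
        System.out.println(polygonContains(triangle, 1, 1)); // prints true
    }
}
```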

Lucene's current "spatial" module is not for you. "sandbox" might be, to use PointValues via the user-friendly LatLonType. Again, it only indexes points. I think there's something new in core v6.2+ that uses PointValues for indexing rectangles.

In Lucene "spatial-extras" you will find a source file that shows some of the concepts of that module: https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java It requires Spatial4j and if you want polygons too then JTS as well. This spatial-extras module honors predicates like the distinction between Intersects vs Contains vs Within.

@romseygeek
Collaborator

The way the presearcher works is to invert the domain and range of a query - where before you're saying 'find all documents that match this query', you're now saying 'find all queries that could potentially match this document'. For spatial queries, what you'd essentially need to do is index a range; the question you want answered is, 'given the following point, which ranges in our index contain it?'. And as David says, the presearcher is just an approximation, so we can index bounding-boxes rather than complex shapes to keep things as simple as possible.
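A rough sketch of that inversion, in plain Java. Everything here (`InvertedSpatialSearch`, `StoredQuery`, `Shape`, `presearch`, `match`) is a hypothetical name invented for illustration and is not Luwak's or Lucene's API. Stored queries carry a bounding box as the cheap approximation; given a document point, the presearch step keeps only the queries whose box contains it, and the full term-plus-shape check then runs on those candidates alone. The filter is safe only under the assumption that each box really encloses its exact shape.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are Luwak or Lucene classes.
class InvertedSpatialSearch {
    interface Shape {
        boolean contains(double x, double y);
    }

    static final class StoredQuery {
        final String id;
        final Set<String> terms;  // every term must appear in the document
        final double[] bbox;      // {minX, minY, maxX, maxY}, assumed to enclose 'exact'
        final Shape exact;        // precise shape, checked only for candidates

        StoredQuery(String id, Set<String> terms, double[] bbox, Shape exact) {
            this.id = id;
            this.terms = terms;
            this.bbox = bbox;
            this.exact = exact;
        }
    }

    // Phase 1 (presearch): which stored ranges contain this point?
    // May keep false positives; drops nothing that matches as long as bbox encloses exact.
    static List<StoredQuery> presearch(List<StoredQuery> queries, double x, double y) {
        List<StoredQuery> candidates = new ArrayList<>();
        for (StoredQuery q : queries) {
            double[] b = q.bbox;
            if (x >= b[0] && x <= b[2] && y >= b[1] && y <= b[3]) {
                candidates.add(q);
            }
        }
        return candidates;
    }

    // Phase 2: run the full (expensive) check on the surviving candidates only.
    static List<String> match(List<StoredQuery> queries, Set<String> docTerms, double x, double y) {
        List<String> ids = new ArrayList<>();
        for (StoredQuery q : presearch(queries, x, y)) {
            if (docTerms.containsAll(q.terms) && q.exact.contains(x, y)) {
                ids.add(q.id);
            }
        }
        return ids;
    }

    // Two example queries with the same terms but different regions.
    static List<StoredQuery> demoQueries() {
        Set<String> terms = new HashSet<>(Arrays.asList("dog", "brown"));
        Shape triangle = (x, y) -> x >= 0 && y >= 0 && x + y <= 4;
        Shape square = (x, y) -> x >= 10 && x <= 14 && y >= 10 && y <= 14;
        return Arrays.asList(
            new StoredQuery("near", terms, new double[] {0, 0, 4, 4}, triangle),
            new StoredQuery("far", terms, new double[] {10, 10, 14, 14}, square));
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("the", "brown", "dog"));
        List<StoredQuery> queries = demoQueries();
        System.out.println(presearch(queries, 3, 3).size()); // prints 1 ("far" is filtered out)
        System.out.println(match(queries, doc, 3, 3));       // prints [] (box hit, shape miss)
        System.out.println(match(queries, doc, 1, 1));       // prints [near]
    }
}
```

In the second call the "near" query survives the presearch but fails the exact check, which is the false positive an approximate presearcher is allowed to produce; the "far" query is never evaluated in full.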

So there are two questions here, really:

  1. can we use the various Lucene spatial implementations to index n-dimensional ranges? and
  2. can we create a QueryTreeBuilder that will extract and build these ranges from a spatial Query?

I don't really know the various spatial modules well enough to answer either of these questions, but it sounds like a fun problem to work on!


3 participants