Spatial Support #90
Comments
I haven't used Luwak yet, but I believe it should be possible. Ideally, Luwak would be smart enough to let the Kansas polygon in the query benefit from the pre-searcher by indexing the polygon (CompositeSpatialStrategy), so that the point in the document becomes a lookup.
Luwak doesn't have explicit spatial query support in the presearcher yet. Your example would work fine, but the presearcher wouldn't end up filtering out "brown dog" queries in Missouri, for example. One way to move forward might be to index polygons using the new Points-based APIs in Lucene 6, and then filter things out using intersects-type queries. David will probably understand these better than me, though :)
Lucene's Points API is for multi-dimensional points, not for polygons. Maybe some day that will change, but that is what it is now. Note that with multi-dimensional points, you could index rectangles using 4 dimensions. With Points you can query by polygons, but not index them. With Luwak, query and index flip around.
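To illustrate the "rectangles using 4 dimensions" remark, here is a minimal sketch in plain Java. It does not use Lucene's Points API or any Luwak code; the class `RectAsPoint` and its methods are hypothetical names made up for this example. The idea it tries to show: a rectangle stored as the 4-dimensional point (minX, minY, maxX, maxY) contains a location (x, y) exactly when that 4D point falls inside the range minX <= x, minY <= y, maxX >= x, maxY >= y, so "which rectangles contain this point?" becomes an ordinary multi-dimensional range test.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: plain Java, not Lucene's Points API or Luwak code. */
final class RectAsPoint {

    /** Encode a rectangle as one 4-dimensional point: {minX, minY, maxX, maxY}. */
    static double[] encode(double minX, double minY, double maxX, double maxY) {
        return new double[] {minX, minY, maxX, maxY};
    }

    /** True if every coordinate of p lies within [lower[d], upper[d]]. */
    static boolean inRange(double[] p, double[] lower, double[] upper) {
        for (int d = 0; d < p.length; d++) {
            if (p[d] < lower[d] || p[d] > upper[d]) {
                return false;
            }
        }
        return true;
    }

    /** Positions (in the input list) of the stored rectangles that contain (x, y). */
    static List<Integer> rectanglesContaining(List<double[]> rects, double x, double y) {
        double inf = Double.POSITIVE_INFINITY;
        // The 4D query box: minX <= x, minY <= y, maxX >= x, maxY >= y.
        double[] lower = {-inf, -inf, x, y};
        double[] upper = {x, y, inf, inf};
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < rects.size(); i++) {
            if (inRange(rects.get(i), lower, upper)) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

A real index would answer the range test with a tree structure rather than the linear scan used here; the scan only keeps the sketch short.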
Alan and David, thanks for such quick responses. Alan, what I think you're saying is that the presearcher won't stop the query from running against the document, which slows the process down, but it would work. David, say I have a document with a point or a polygon, and my queries are all polygons that I want the document's point or polygon to be inside. I think you are saying that in Luwak, because things are reversed, I can have indexes that have both terms and polygons, but the single document in the MemoryIndex can only have points. Also, it seems like Luwak only has Core as a dependency. Wouldn't I have to add Spatial as well?
I'm not up to speed on some of the Luwak details, so I'm not sure what limitations its pre-searcher introduces. When Alan or I talk about Points with a capital 'P', we are talking about the so-called Points/PointValues API feature new to Lucene 6. You can use PointValues for spatial if it meets your requirements. Based on wanting polygons at index time (and query?) -- it won't. Although... depending on whether the pre-searcher is a fast approximation, it's plausible that it might, if you use a bounding-box rectangle as an approximation for the actual polygon shape desired. Again, I'm out of my element talking about Luwak. Lucene's current "spatial" module is not for you. "sandbox" might be, to use PointValues via the user-friendly LatLonType. Again, that only indexes points. There's something new in core v6.2+ that uses PointValues for indexing rectangles, I think. In Lucene "spatial-extras" you will find a source file that shows some of the concepts of that module: https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java It requires Spatial4j, and if you want polygons too, then JTS as well. This spatial-extras module honors predicates like the distinction between Intersects vs Contains vs Within.
The way the presearcher works is to invert the domain and range of a query - where before you're saying 'find all documents that match this query', you're now saying 'find all queries that could potentially match this document'. For spatial queries, what you'd essentially need to do is index a range; the question you want answered is, 'given the following point, which ranges in our index contain it?'. And as David says, the presearcher is just an approximation, so we can index bounding-boxes rather than complex shapes to keep things as simple as possible. So there are two questions here, really:
I don't really know the various spatial modules well enough to answer either of these questions, but it sounds like a fun problem to work on!
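The inversion described above, with bounding boxes as the cheap approximation, could be sketched roughly as follows. This is plain Java and not Luwak's actual presearcher API; `BoundingBoxPresearcherSketch`, `StoredQuery`, `candidates` and `matches` are hypothetical names invented for the example. Each registered query keeps its terms and polygon plus a precomputed bounding box. Given a document's point, the box test selects candidate queries, and only those candidates pay for the exact term and point-in-polygon check (ray casting here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustration only: not Luwak's presearcher API. */
final class BoundingBoxPresearcherSketch {

    /** A registered query: required terms plus a polygon given as {x, y} vertices. */
    static final class StoredQuery {
        final String id;
        final Set<String> terms;
        final double[][] polygon;
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;

        StoredQuery(String id, Set<String> terms, double[][] polygon) {
            this.id = id;
            this.terms = terms;
            this.polygon = polygon;
            for (double[] v : polygon) { // precompute the bounding box once
                minX = Math.min(minX, v[0]);
                maxX = Math.max(maxX, v[0]);
                minY = Math.min(minY, v[1]);
                maxY = Math.max(maxY, v[1]);
            }
        }

        /** Cheap approximation: may accept points that lie outside the polygon. */
        boolean boxContains(double x, double y) {
            return x >= minX && x <= maxX && y >= minY && y <= maxY;
        }

        /** Exact test by ray casting (even-odd rule). */
        boolean polygonContains(double x, double y) {
            boolean inside = false;
            for (int i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
                double xi = polygon[i][0], yi = polygon[i][1];
                double xj = polygon[j][0], yj = polygon[j][1];
                if ((yi > y) != (yj > y) && x < (xj - xi) * (y - yi) / (yj - yi) + xi) {
                    inside = !inside;
                }
            }
            return inside;
        }
    }

    /** Presearch step: ids of queries whose bounding box contains the document's point. */
    static List<String> candidates(List<StoredQuery> queries, double x, double y) {
        List<String> ids = new ArrayList<>();
        for (StoredQuery q : queries) {
            if (q.boxContains(x, y)) {
                ids.add(q.id);
            }
        }
        return ids;
    }

    /** Full match: run the exact term and polygon checks on the candidates only. */
    static List<String> matches(List<StoredQuery> queries, Set<String> docTerms, double x, double y) {
        List<String> ids = new ArrayList<>();
        for (StoredQuery q : queries) {
            if (q.boxContains(x, y) && docTerms.containsAll(q.terms) && q.polygonContains(x, y)) {
                ids.add(q.id);
            }
        }
        return ids;
    }
}
```

Because the bounding box can only over-select, the presearch step may let through a query that the exact check later rejects, but it should never drop a query that would have matched. A query whose box lies far from the document's point is skipped without running the polygon test at all.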
Just started to look at the code base. I am a Lucene noob, so there is probably something I missed on initial inspection. Is there spatial support for queries? An example query I would like to register would be the terms "dog" and "brown" in Kansas (which would be a polygon), matching if the text has dog and brown and there is a geo point or polygon that is in Kansas. If not, is this something that is being considered for the future? If I were going to do this work, where would I start?