Solr 3.4: GeoHash field query performance
I am using Solr 3.4 with a 20M-document index that stores a latitude/longitude point for each document. There is a pre-existing indexed field of type solr.LatLonType called locLatLon, and we are trying to compare its performance with solr.GeoHashField. I've added a new field to our schema, locLatLon_geohash, which uses the GeoHash field type and is populated by a copyField from locLatLon. I did a selective reload of the Solr index, loading a number of documents, and I was able to search against both fields (I removed the actual Solr server name).
On the surface, these two queries should return very similar results. The LatLonType (geospatial) query takes 62 ms and returns 179k documents. The GeoHash query takes 34,081 ms and returns 121k documents. I am not too concerned about the number of results returned (yet); I am more concerned about the time it takes to generate them.
From what I have read about GeoHash, this Solr query method should be very fast, but in practice it is very slow.
I tried debugging by adding the debugQuery=on query parameter, but the output doesn't tell me anything I can use without digging through the source code. Below are snippets of the Solr debug output, showing only the parsed filter query.
GeoHash debug output:
<arr name="parsed_filter_queries"> <str>ConstantScore(frange(ghhsin(str(locLatLon_geohash),literal(9q5cfxwybswp))):[0 TO 10.0])</str> </arr>
Solr geospatial (LatLonType) debug output:
<arr name="parsed_filter_queries"> <str>+locLatLon_0_coordinate:[34.01006796645071 TO 34.18993203354929] +locLatLon_1_coordinate:[-118.46600561233814 TO -118.24879438766185]</str> </arr>
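For what it's worth, the two parsed filters above look structurally different. The LatLonType one is a pair of range clauses on indexed coordinate subfields, while the GeoHash one is an frange over a ghhsin distance function, which as far as I understand has to be computed per document. The numbers in the LatLonType filter are consistent with a plain bounding box around a center near (34.1, -118.3574) with a 10 km radius. Both the center and the radius are inferred from the debug output, not taken from the original request. Below is a minimal Python sketch of that box computation, assuming a spherical Earth. The helper name bounding_box and the radius constant are my own, and this is not Solr's code:

```python
import math

# Assumed mean Earth radius in km (not taken from the Solr source).
EARTH_RADIUS_KM = 6371.0087714


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees.

    Hypothetical helper: valid only away from the poles and the
    antimeridian, which is enough for this illustration.
    """
    ang = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(ang)
    # Longitude spans widen as latitude moves away from the equator.
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)


def in_box(box, lat, lon):
    """Cheap range test, analogous to the two range clauses in the filter."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


# Center and radius inferred from the debug output above (an assumption).
box = bounding_box(34.1, -118.3574, 10.0)
print(box)
```

The point of the sketch is only that a box test reduces to two range comparisons, which an index can answer without computing a distance for each document.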
Question(s): Is there something I didn't take into account when using the GeoHash type in Solr? Is there anything else I should try in order to debug this?
Read the comments on SOLR-2155. The attached patch has never been applied and the ticket is still unresolved, but the attached zip is a plugin that provides the functionality, so there is no need to actually patch Solr. The patch is meant to allow multiple points to be indexed on the same document, but it also seems to implement geohash prefix matching for fast bounding-box searches.
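To illustrate why prefix matching can be fast, here is a minimal Python sketch of the standard geohash encoding. It is not the SOLR-2155 plugin's code, and the function name geohash_encode is hypothetical. Points that fall in the same grid cell share a hash prefix, so a coarse spatial filter becomes a string-prefix lookup on indexed terms instead of a per-document distance computation. The center point (34.1, -118.3574) is inferred from the debug output in the question.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=12):
    """Encode a lat/lon pair as a geohash string (hypothetical helper)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


center = geohash_encode(34.1, -118.3574)
nearby = geohash_encode(34.09, -118.36)
far_away = geohash_encode(40.7128, -74.0060)

# A prefix test stands in for the coarse spatial filter.
cell = center[:5]
print(cell, nearby.startswith(cell), far_away.startswith(cell))
```

One caveat with this approach: two points that are close together can still sit on opposite sides of a cell boundary and share only a short prefix, so a real implementation generally has to check neighboring cells as well.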