Solr >> mail # user >> Benchmark Solr vs Elastic Search vs Sensei


Volodymyr Zhabiuk 2012-04-27, 01:50
Re: Benchmark Solr vs Elastic Search vs Sensei
Some observations:
1> I suspect some of your queries aren't doing what you expect, but
     I'm not sure if that matters. For example, !tags:chick magnet will be
     parsed as -tags:chick defaultField:magnet. To match the two-word
     tag, quote it as a phrase: !tags:"chick magnet".
2> Production Solr setups are usually master/slave.
     Your indexing process (the commits) is causing
     new searchers to be opened/warmed/etc. quite regularly,
     reducing your throughput. It's not surprising at all that
     your QPS rate increases when not indexing.
3> Near Real Time search with "soft commits" on trunk should change
     the characteristics of the test with background indexing. You
     might try that.
4> Examine your cache usage on the Solr admin page. Caches
     are quite important. Also consider autowarming characteristics.
5> There's a ton of stuff you can do to tune query rate. Unfortunately,
     which specific thing would help your situation is hard to
     say. You might start with:
    http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
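
As a sketch of point 1: the negated tags in the benchmark query contain spaces, so each multi-word value has to be quoted as a phrase before the query is URL-encoded. The helper name `negated_term` is hypothetical and not part of the search-perf project; this only illustrates one way the clauses could be built.

```python
from urllib.parse import quote

def negated_term(field, value):
    """Return a negated field clause, quoting multi-word values as a phrase."""
    if " " in value:
        # Escape embedded double quotes, then wrap the value in quotes
        # so the query parser keeps both words inside the same field.
        value = '"' + value.replace('"', '\\"') + '"'
    return "!" + field + ":" + value

# The three excluded tags from the benchmark query.
clauses = [negated_term("tags", v) for v in ("family", "chick magnet", "soccer mom")]
query = " AND ".join(clauses)

# Percent-encode everything, including the quotes and spaces.
encoded = quote(query, safe="")
print(query)
print(encoded)
```

Without the quotes, the second word of each tag would be searched in the default field, which is the behaviour described in point 1.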

Best
Erick
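
One low-effort way to reduce the searcher churn described in point 2 is to commit less often than once per 500-doc batch. The sketch below rests on several assumptions: the host, port and `/update/json` path are guesses at a default Solr 3.x setup with the JSON update handler enabled, and `update_urls` and `commit_every` are hypothetical names. It only generates the request URLs and does not talk to Solr.

```python
from urllib.parse import urlencode

def update_urls(base_url, num_batches, commit_every):
    """Yield one update URL per batch.

    commit=true is added only on every commit_every-th batch and on the
    final batch, so new searchers are opened far less often.
    """
    for i in range(1, num_batches + 1):
        is_commit = (i % commit_every == 0) or (i == num_batches)
        params = {"commit": "true"} if is_commit else {}
        yield base_url + ("?" + urlencode(params) if params else "")

# Assumed endpoint; adjust to the actual Solr instance.
BASE = "http://localhost:8983/solr/update/json"

# 500k docs in 500-doc batches is 1000 batches.
urls = list(update_urls(BASE, num_batches=1000, commit_every=100))
commits = sum(u.endswith("commit=true") for u in urls)
print(commits, "commits for", len(urls), "batches")
```

Whether a longer commit interval is acceptable depends on how fresh the search results need to be during the benchmark.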

On Thu, Apr 26, 2012 at 9:50 PM, Volodymyr Zhabiuk <[EMAIL PROTECTED]> wrote:
> Hi Solr users
>
> I've implemented a project to compare the performance of
> Solr, Elastic Search and SenseiDB:
> https://github.com/vzhabiuk/search-perf
> Solr version 3.5.0 was used. I've used the default configuration,
> just enabled JSON updates and used the following schema:
> https://github.com/vzhabiuk/search-perf/blob/master/configs/solr/schema.xml
> 2.5 million documents were put into the index; after
> that I launched the indexing process to add another 500k docs. I
> was issuing a commit after each 500-doc batch. At the
> same time I launched a concurrent client that sent the
> following type of query:
> ((tags:moon-roof%20or%20tags:electric%20or%20tags:highend%20or%20tags:hybrid)%20AND%20(!tags:family%20AND%20!tags:chick%20magnet%20AND%20!tags:soccer%20mom))%20
> OR%20((color:red%20or%20color:green%20or%20color:white%20or%20color:yellow)%20AND%20(!color:gold%20AND%20!color:silver%20AND%20!color:black))%20
> OR%20mileage:[15001%20TO%2017500]%20OR%20mileage:[17501%20TO%20*]%20
> OR%20city:u.s.a.*
> &facet=true&facet.field=tags&facet.field=color
> The query is a top-level "OR" query, consisting of 2 terms, 2
> ranges and 1 prefix. It is designed to hit ~60-70% of all the docs.
> Here are the performance results with indexing running in the background:
> #Threads  min       median    mean      75%       qps
> 1         208.95ms  332.66ms  350.48ms  422.92ms   2.8
> 2         188.68ms  338.09ms  339.22ms  402.15ms   5.9
> 3         151.06ms  326.64ms  336.20ms  418.61ms   8.8
> 4         125.13ms  332.90ms  332.18ms  396.14ms  12.0
> With no indexing process running in the background,
> the results for 2.6 million docs are as follows:
> #Threads  min       median    mean      75%       qps
> 1         106.70ms  199.66ms  199.40ms  234.89ms   5.1
> 2         128.61ms  199.12ms  201.81ms  229.89ms   9.9
> 3         110.99ms  197.43ms  203.13ms  232.25ms  14.7
> 4          90.24ms  201.46ms  200.46ms  227.75ms  19.9
> 5         106.14ms  208.75ms  207.69ms  242.88ms  24.0
> 6         103.75ms  208.91ms  211.23ms  238.60ms  28.3
> 7         113.54ms  207.07ms  209.69ms  239.99ms  33.3
> 8         117.32ms  216.38ms  224.74ms  258.74ms  35.5
> I've got three questions so far:
> 1. With background indexing the latency is almost 2 times
> higher. Is there any way to overcome this?
> 2. How can we tune Solr to get better results?
> 3. What, in your opinion, is the preferred type of query that I can
> use for the benchmark?
>
> With many thanks,
> Volodymyr
>
>
> BTW, here is the spec of my machine:
> RedHat 6.1 64bit
> Intel Xeon E5620 @ 2.40 GHz, 8 cores
> 63 GB RAM
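
The two tables in the quoted message can be sanity-checked with simple arithmetic. If the client is closed-loop (each thread sends its next query as soon as the previous one returns, which is an assumption about the benchmark client), throughput should be roughly the thread count divided by the mean latency. The sketch below applies that to the reported numbers; `expected_qps` and `relative_errors` are hypothetical helpers.

```python
def expected_qps(threads, mean_latency_ms):
    """Throughput a closed-loop client would see: threads / mean latency."""
    return threads / (mean_latency_ms / 1000.0)

# (threads, mean latency in ms, reported qps), copied from the two tables.
with_indexing = [
    (1, 350.48, 2.8), (2, 339.22, 5.9), (3, 336.20, 8.8), (4, 332.18, 12.0),
]
without_indexing = [
    (1, 199.40, 5.1), (2, 201.81, 9.9), (3, 203.13, 14.7), (4, 200.46, 19.9),
    (5, 207.69, 24.0), (6, 211.23, 28.3), (7, 209.69, 33.3), (8, 224.74, 35.5),
]

def relative_errors(rows):
    """Relative gap between the estimated and the reported qps per row."""
    return [abs(expected_qps(t, mean) - qps) / qps for t, mean, qps in rows]

for label, rows in (("with indexing", with_indexing),
                    ("without indexing", without_indexing)):
    worst = max(relative_errors(rows))
    print(f"{label}: worst relative gap {worst:.1%}")
```

For the reported rows the estimate stays within a few percent of the measured qps, so at these thread counts the qps column adds little information beyond the mean latency.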
Jeremy Taylor 2012-04-27, 17:59
Volodymyr Zhabiuk 2012-04-27, 18:33
Radim Kolar 2012-04-27, 19:39
Walter Underwood 2012-04-27, 19:46
Jeff Schmidt 2012-04-27, 19:58
Jason Rutherglen 2012-04-27, 20:12
Andy 2012-04-27, 20:49
Jake Luciani 2012-04-27, 21:02
Andy 2012-04-27, 20:55
Volodymyr Zhabiuk 2012-04-28, 00:06