-
Benchmark Solr vs Elastic Search vs Sensei
Volodymyr Zhabiuk 2012-04-27, 01:50
Hi Solr users,

I've implemented a project to compare the performance of Solr, Elastic Search and SenseiDB:
https://github.com/vzhabiuk/search-perf

Solr version 3.5.0 was used. I used the default configuration, only enabled JSON updates, and used the following schema:
https://github.com/vzhabiuk/search-perf/blob/master/configs/solr/schema.xml

2.5 mln documents were put into the index. After that I launched the indexing process to add another 500k docs, issuing a commit after each 500-doc batch. At the same time I launched a concurrent client that sent queries of the following type:

((tags:moon-roof%20or%20tags:electric%20or%20tags:highend%20or%20tags:hybrid)%20AND%20(!tags:family%20AND%20!tags:chick%20magnet%20AND%20!tags:soccer%20mom))%20
OR%20((color:red%20or%20color:green%20or%20color:white%20or%20color:yellow)%20AND%20(!color:gold%20AND%20!color:silver%20AND%20!color:black))%20
OR%20mileage:[15001%20TO%2017500]%20OR%20mileage:[17501%20TO%20*]%20
OR%20city:u.s.a.*
&facet=true&facet.field=tags&facet.field=color

The query is a top-level "OR" query consisting of 2 terms, 2 ranges and 1 prefix. It is designed to hit ~60-70% of all the docs.

Here is the performance result with background indexing:

#Threads  min       median    mean      75%       qps
1         208.95ms  332.66ms  350.48ms  422.92ms  2.8
2         188.68ms  338.09ms  339.22ms  402.15ms  5.9
3         151.06ms  326.64ms  336.20ms  418.61ms  8.8
4         125.13ms  332.90ms  332.18ms  396.14ms  12.0

If there is no indexing process in the background, the result for 2.6 mln docs is as follows:

#Threads  min       median    mean      75%       qps
1         106.70ms  199.66ms  199.40ms  234.89ms  5.1
2         128.61ms  199.12ms  201.81ms  229.89ms  9.9
3         110.99ms  197.43ms  203.13ms  232.25ms  14.7
4         90.24ms   201.46ms  200.46ms  227.75ms  19.9
5         106.14ms  208.75ms  207.69ms  242.88ms  24.0
6         103.75ms  208.91ms  211.23ms  238.60ms  28.3
7         113.54ms  207.07ms  209.69ms  239.99ms  33.3
8         117.32ms  216.38ms  224.74ms  258.74ms  35.5

I've got three questions so far:
1. With background indexing the latency is almost 2 times higher. Is there any way to overcome this?
2. How can we tune Solr to get better results?
3. In your opinion, what is the preferred type of queries to use for the benchmark?

With many thanks,
Volodymyr

BTW, here is the spec of my machine:
RedHat 6.1 64bit
Intel XEON e5620 @2.40 GHz, 8 cores
63 GB RAM
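The two result tables can be sanity-checked with a little arithmetic. The sketch below assumes a closed-loop client, meaning each thread sends its next query only after the previous one returns; the post does not state this, so treat it as an assumption. Under that assumption throughput should be roughly threads divided by mean latency. The function names are made up for illustration; the numbers are copied from the tables above.

```python
# Rough consistency check of the reported benchmark numbers.
# Assumption (not stated in the post): the load client is closed-loop, so
# throughput is approximately threads / mean latency.

# (threads, mean latency in ms, reported qps), copied from the post.
WITH_INDEXING = [
    (1, 350.48, 2.8), (2, 339.22, 5.9), (3, 336.20, 8.8), (4, 332.18, 12.0),
]
WITHOUT_INDEXING = [
    (1, 199.40, 5.1), (2, 201.81, 9.9), (3, 203.13, 14.7), (4, 200.46, 19.9),
    (5, 207.69, 24.0), (6, 211.23, 28.3), (7, 209.69, 33.3), (8, 224.74, 35.5),
]


def estimated_qps(threads, mean_latency_ms):
    """Queries per second expected from a closed-loop client."""
    return threads / (mean_latency_ms / 1000.0)


def max_relative_error(rows):
    """Largest relative gap between estimated and reported qps."""
    return max(abs(estimated_qps(t, m) - q) / q for t, m, q in rows)


def latency_ratio(with_indexing_ms, without_indexing_ms):
    """How many times slower queries are while indexing runs."""
    return with_indexing_ms / without_indexing_ms


if __name__ == "__main__":
    for label, rows in (("with indexing", WITH_INDEXING),
                        ("without indexing", WITHOUT_INDEXING)):
        for threads, mean_ms, reported in rows:
            print(f"{label}: {threads} threads, "
                  f"estimated {estimated_qps(threads, mean_ms):.1f} qps, "
                  f"reported {reported} qps")
    # Single-thread median latency, with vs. without background indexing.
    print(f"median latency ratio: {latency_ratio(332.66, 199.66):.2f}")
```

If the estimates track the reported qps closely, the tables are at least internally consistent with a closed-loop client, and the "almost 2 times" slowdown can be read directly off the median latencies.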
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Erick Erickson 2012-04-27, 12:25
Some observations:

1> I suspect some of your queries aren't doing what you expect, but I'm not sure if that matters. E.g. !tags:chick magnet will be parsed as -tags:chick defaultField:magnet.
2> Typical Solr setups in production are usually master/slave setups. Your indexing process (the commits) is causing new searchers to be opened/warmed/etc. quite regularly, reducing your throughput. It's not surprising at all that your QPS rate increases when not indexing.
3> The trunk Near Real Time with "soft commits" should change the characteristics of the test with background indexing. You might try that.
4> Examine your cache usage; see the Solr admin page. Caches are quite important. Also consider autowarming characteristics.
5> There's a ton of stuff you can do to tune query rate. Unfortunately, which specific thing would help your situation is hard to say. You might start with:
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed

Best
Erick
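The parsing problem in Erick's first observation comes from the unquoted space in multi-word tag values: once the URL is decoded, everything after the space is no longer attached to the tags field. A minimal sketch of one way to build such a clause on the client side follows, assuming the intent was to match the whole tag as a phrase. The helper names `term` and `encode_clause` are hypothetical, and the escaping only covers whitespace, double quotes and backslashes, not every query-syntax special character.

```python
from urllib.parse import quote


def term(field, value):
    """Return a field:value clause, phrase-quoting values that contain whitespace.

    Hypothetical helper: only whitespace, double quotes and backslashes are
    handled; other query-syntax special characters are left untouched.
    """
    if any(ch.isspace() for ch in value):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{field}:"{escaped}"'
    return f"{field}:{value}"


def encode_clause(clause):
    """Percent-encode a clause for use inside the q= parameter of a URL."""
    return quote(clause, safe="")


if __name__ == "__main__":
    # Exclusion clause for a two-word tag, kept together as one phrase.
    clause = "-" + term("tags", "chick magnet")
    print(clause)
    print(encode_clause(clause))
```

Whether a phrase query matches the indexed tags still depends on how the tags field is analyzed in schema.xml, so this only addresses the parsing side of the issue.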
-
RE: Benchmark Solr vs Elastic Search vs Sensei
Jeremy Taylor 2012-04-27, 17:59
DataStax offers a Solr integration that isn't master/slave and is NearRealTimes. Essentially, the software offers the great features of Solr without the major shortcomings.

Jeremy
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Volodymyr Zhabiuk 2012-04-27, 18:33
Hi Erick,

Thanks for the extensive answers. I will try to tune my Solr installation according to your advice and the wiki page you mentioned.

Best regards,
Volodymyr
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Radim Kolar 2012-04-27, 19:39
On 27.4.2012 19:59, Jeremy Taylor wrote:
> DataStax offers a Solr integration that isn't master/slave and is
> NearRealTimes.

Is it rebranded Solandra?
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Walter Underwood 2012-04-27, 19:46
On Apr 27, 2012, at 12:39 PM, Radim Kolar wrote:

> On 27.4.2012 19:59, Jeremy Taylor wrote:
>> DataStax offers a Solr integration that isn't master/slave and is
>> NearRealTimes.
> its rebranded solandra?

No, it is a rewrite.

http://www.datastax.com/dev/blog/cassandra-with-solr-integration-details

wunder
--
Walter Underwood
[EMAIL PROTECTED]
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Jeff Schmidt 2012-04-27, 19:58
This is a pretty awesome combination, actually. I'm getting started using it myself, and I'd be very interested in what kind of benchmark results you get vs. Solr and your other candidates. DataStax Enterprise 2.0 was released in March and is based on Solr 4.0 and Cassandra 1.0.7 or 1.0.8. I'm looking forward to the Cassandra 1.1 based release.

Note: I am not affiliated with DataStax in any way, other than being a satisfied customer for the past few months. I am just trying to selfishly fuel your interest so you'll consider benchmarking it.

My project is already using Cassandra, and we had to manage Solr separately. Having the Solr indexes and core configuration (solrconfig.xml, schema.xml, synonyms.txt, etc.) in Cassandra, distributed and replicated among the various nodes and, eventually for us, multiple data centers, is fantastic.

Jeff

--
Jeff Schmidt
535 Consulting
[EMAIL PROTECTED]
http://www.535consulting.com
(650) 423-1068
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Jason Rutherglen 2012-04-27, 20:12
I think DataStax Enterprise is faster than Solr Cloud with transaction logging turned on. Cassandra has its own fast(er) transaction logging mechanism. Of course it's best to use two HDs when testing, e.g., one for the data, the other for the transaction log.
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Andy 2012-04-27, 20:49
So the Cassandra integration brings distributed index and replication to Solr? Is that different from what Solr Cloud does?
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Jake Luciani 2012-04-27, 21:02
Yes, the replication, failover and distribution are managed by Cassandra; it makes Solr more Dynamo-like. For example, scaling involves adding another node to the Cassandra cluster. Finally, since the field data is in Cassandra, you can access it from Cassandra, Hadoop or Solr.

Jake
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Andy 2012-04-27, 20:55
What is the performance of Elasticsearch and SenseiDB in your benchmark?
-
Re: Benchmark Solr vs Elastic Search vs Sensei
Volodymyr Zhabiuk 2012-04-28, 00:06
Hi Andy,

I don't want to publish the results, since there are still some mistakes in the benchmark. Also, this would be controversial, because there are too many parameters to tune and take into consideration. Nevertheless, you can go to the Sensei Google group to see the preliminary results for Sensei.

At first I was using the benchmark to stress test Sensei. We needed to identify possible memory leaks and bottlenecks in the new release. After that I extended the tool to test Solr and Elastic Search.

With many thanks,
Volodymyr