Michael McCandless
2012-08-10, 15:54
Adrien Grand
2012-08-10, 16:24
Michael McCandless
2012-08-10, 16:52
Han Jiang
2012-08-10, 17:26
Michael McCandless
2012-08-11, 13:23
Robert Muir
2012-08-11, 14:31
Michael McCandless
2012-08-11, 18:58
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Michael McCandless 2012-08-10, 15:54
Replying to dev@ because Jira keeps being unavailable:

Seems like we should default BlockPostingsFormat to COMPACT.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Aug 10, 2012 at 8:14 AM, Adrien Grand (JIRA) <[EMAIL PROTECTED]> wrote:
>
>     [ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432716#comment-13432716 ]
>
> Adrien Grand commented on LUCENE-3892:
> --------------------------------------
>
> I ran the comparison between acceptableOverheadRatio=PackedInts.COMPACT (0%) and PackedInts.DEFAULT (20%) and it seems to be much faster with PackedInts.COMPACT:
>
> {noformat}
> base=COMPACT, challenger=DEFAULT
> Task              QPS base  StdDev base   QPS def  StdDev def  Pct diff
> IntNRQ               81.83         5.43     74.14        2.94  -18% - 0%
> HighTerm            146.55        10.34    133.57        9.02  -20% - 4%
> LowPhrase            93.91         1.63     86.90        1.67  -10% - -4%
> MedTerm             824.58        43.48    766.35       38.78  -16% - 3%
> LowSloppyPhrase      83.29         1.99     77.65        1.18  -10% - -3%
> OrHighMed            94.15         5.28     88.34        4.54  -15% - 4%
> OrHighHigh          100.63         5.42     94.57        4.20  -14% - 3%
> OrHighLow           128.62         7.21    120.92        6.07  -15% - 4%
> HighPhrase           13.05         0.45     12.29        0.39  -11% - 0%
> Prefix3             217.06         6.82    205.05        4.62  -10% - 0%
> MedPhrase            27.50         0.97     26.33        0.79  -10% - 2%
> Wildcard            183.20         4.87    175.58        3.89  -8% - 0%
> LowTerm            1763.31        43.24   1693.31       39.29  -8% - 0%
> HighSloppyPhrase     10.05         0.48      9.67        0.40  -11% - 5%
> AndHighHigh         111.59         1.15    107.45        1.66  -6% - -1%
> LowSpanNear          56.16         1.32     54.25        1.01  -7% - 0%
> AndHighMed          423.44         7.40    409.32        5.10  -6% - 0%
> MedSpanNear          33.14         0.91     32.32        0.74  -7% - 2%
> AndHighLow         2177.50        30.79   2134.05       28.64  -4% - 0%
> Fuzzy1               95.34         2.41     93.66        2.32  -6% - 3%
> HighSpanNear          5.28         0.17      5.21        0.11  -6% - 3%
> MedSloppyPhrase      18.41         0.72     18.19        0.70  -8% - 6%
> Fuzzy2               37.73         1.31     37.31        1.14  -7% - 5%
> Respell             109.71         3.09    108.64        2.76  -6% - 4%
> PKLookup            257.32         6.64    260.00        7.15  -4% - 6%
> {noformat}
>
>> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
>> -------------------------------------------------------------------------------------
>>
>>                 Key: LUCENE-3892
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>>             Project: Lucene - Core
>>          Issue Type: Improvement
>>            Reporter: Michael McCandless
>>              Labels: gsoc2012, lucene-gsoc-12
>>             Fix For: 4.1
>>
>>         Attachments: LUCENE-3892-BlockTermScorer.patch, LUCENE-3892-blockFor&hardcode(base).patch, LUCENE-3892-blockFor&packedecoder(comp).patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints.patch, LUCENE-3892-blockpfor.patch, LUCENE-3892-bulkVInt.patch, LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch, LUCENE-3892-handle_open_files.patch, LUCENE-3892-non-specialized.patch, LUCENE-3892-pfor-compress-iterate-numbits.patch, LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch, LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
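For readers of the benchmark output, the "Pct diff" column appears to be a range rather than a single number. The sketch below reproduces it from the mean QPS and StdDev columns. The formula is an assumption inferred from the rows in this thread (it matches the rows checked, including truncation toward zero), not something taken from the benchmark tool's source, and the class and method names are hypothetical.

```java
// Sketch only: reproduces the "Pct diff" range from mean QPS and StdDev.
// The formula is an assumption inferred from the table rows, and
// PctDiffRange is a hypothetical name, not part of any benchmark tool.
final class PctDiffRange {
    /** Pessimistic bound: challenger one StdDev low vs. base one StdDev high. */
    static int worst(double qpsBase, double sdBase, double qpsChal, double sdChal) {
        double base = qpsBase + sdBase;
        return (int) (100.0 * ((qpsChal - sdChal) - base) / base); // truncates toward zero
    }

    /** Optimistic bound: challenger one StdDev high vs. base one StdDev low. */
    static int best(double qpsBase, double sdBase, double qpsChal, double sdChal) {
        double base = qpsBase - sdBase;
        return (int) (100.0 * ((qpsChal + sdChal) - base) / base); // truncates toward zero
    }

    /** A row is only a clear win or loss when both bounds share a sign. */
    static boolean isClear(int worst, int best) {
        return (worst < 0 && best < 0) || (worst > 0 && best > 0);
    }

    public static void main(String[] args) {
        // IntNRQ row: base 81.83 +/- 5.43, challenger 74.14 +/- 2.94
        int w = worst(81.83, 5.43, 74.14, 2.94);
        int b = best(81.83, 5.43, 74.14, 2.94);
        System.out.println(w + "% - " + b + "%"); // prints "-18% - 0%"
    }
}
```

Read this way, a row such as LowPhrase (-10% - -4%) is a clear slowdown for the challenger, while a row whose range straddles zero is within the noise of the run.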
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Adrien Grand 2012-08-10, 16:24
On 10/08/2012 17:54, Michael McCandless wrote:
> Replying to dev@ because Jira keeps being unavailable:
> Seems like we should default BlockPostingsFormat to COMPACT.

Something that worries me too is that I tried to reproduce this benchmark (but with a lower jvmCount as I was running out of time) and I got very similar results between COMPACT and DEFAULT, I am not sure why...

Moreover Toke pointed out that PackedInts had very different performance characteristics depending on the hardware in LUCENE-4062, I'd be interested to know how COMPACT and DEFAULT compare on other computers.

--
Adrien
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Michael McCandless 2012-08-10, 16:52
On Fri, Aug 10, 2012 at 12:24 PM, Adrien Grand <[EMAIL PROTECTED]> wrote:
> On 10/08/2012 17:54, Michael McCandless wrote:
>> Replying to dev@ because Jira keeps being unavailable:
>> Seems like we should default BlockPostingsFormat to COMPACT.
>
> Something that worries me too is that I tried to reproduce this
> benchmark (but with a lower jvmCount as I was running out of time) and
> I got very similar results between COMPACT and DEFAULT, I am not sure
> why...
>
> Moreover Toke pointed out that PackedInts had very different
> performance characteristics depending on the hardware in LUCENE-4062,

Maybe on startup we need to run N possibilities and select the fastest! Kind of like how software raid does when it starts...

It wouldn't be perfect though, since whatever env does the searching can be totally different from whatever is doing the indexing...

> I'd be interested to know how COMPACT and DEFAULT compare on other
> computers.

I'll test on my beast (2X Xeon 5680) ...

Mike McCandless

http://blog.mikemccandless.com
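The "run N possibilities on startup and select the fastest" idea could look roughly like the sketch below. It is only an illustration: `FastestPicker` and its methods are hypothetical names, not Lucene APIs, the candidates are plain `Runnable`s standing in for decoder variants, and there is no JIT warm-up, so real timings would need more care.

```java
import java.util.List;

// Sketch only: time each candidate once at startup and keep the fastest.
// FastestPicker and its methods are hypothetical names, not Lucene APIs.
final class FastestPicker {
    /** Runs each candidate `iterations` times; returns elapsed nanoseconds per candidate. */
    static long[] timeCandidates(List<Runnable> candidates, int iterations) {
        long[] nanos = new long[candidates.size()];
        for (int i = 0; i < nanos.length; i++) {
            Runnable candidate = candidates.get(i);
            long start = System.nanoTime();
            for (int n = 0; n < iterations; n++) {
                candidate.run();
            }
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    /** Index of the smallest timing; ties go to the earliest candidate. */
    static int indexOfFastest(long[] nanos) {
        int bestIdx = 0;
        for (int i = 1; i < nanos.length; i++) {
            if (nanos[i] < nanos[bestIdx]) {
                bestIdx = i;
            }
        }
        return bestIdx;
    }
}
```

As noted in the message, the choice only helps if it is made on the machine that does the searching; a format picked this way at indexing time would reflect the indexing hardware instead.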
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Han Jiang 2012-08-10, 17:26
This is a test on my computer (1M wiki data, on AMD Athlon(tm) II X2 240e, 2G memory), hope this helps! :)

Task              QPS default  StdDev default  QPS compact  StdDev compact  Pct diff
AndHighHigh            124.93            2.99       123.98            3.09  -5% - 4%
AndHighLow            2308.86           29.85      2303.38           17.93  -2% - 1%
AndHighMed             340.91            7.46       341.68            7.18  -3% - 4%
Fuzzy1                  86.35            2.45        87.53            3.42  -5% - 8%
Fuzzy2                  31.28            0.81        31.84            1.50  -5% - 9%
HighPhrase               9.68            0.67         9.31            0.46  -14% - 8%
HighSloppyPhrase         5.28            0.30         5.17            0.22  -11% - 8%
HighSpanNear            10.97            0.38        10.54            0.38  -10% - 3%
HighTerm               180.75            6.92       181.39            6.26  -6% - 7%
IntNRQ                  61.35            4.47        62.25            3.76  -11% - 16%
LowPhrase               45.47            1.91        44.31            1.55  -9% - 5%
LowSloppyPhrase         69.09            1.12        68.90            2.94  -6% - 5%
LowSpanNear             91.13            1.43        87.62            3.36  -8% - 1%
LowTerm               1757.52           30.29      1772.45           29.51  -2% - 4%
MedPhrase               29.37            1.34        28.38            1.01  -10% - 4%
MedSloppyPhrase         34.33            1.10        32.77            1.33  -11% - 2%
MedSpanNear             25.43            0.56        24.14            0.95  -10% - 0%
MedTerm                672.27           23.18       676.32           20.46  -5% - 7%
OrHighHigh              28.07            2.17        27.51            1.24  -13% - 10%
OrHighLow              159.13           11.84       154.76            7.53  -13% - 10%
OrHighMed              107.43            8.08       104.95            5.01  -13% - 10%
PKLookup               214.36            3.14       216.26            2.59  -1% - 3%
Prefix3                169.37            7.05       169.22            4.99  -6% - 7%
Respell                 83.48            1.77        84.83            3.31  -4% - 7%
Wildcard               156.70            5.55       155.41            2.60  -5% - 4%

--
Han Jiang

EECS, Peking University, China
Every Effort Creates Smile
Senior Student
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Michael McCandless 2012-08-11, 13:23
Here are my results ... base = DEFAULT (0.2 acceptable overhead), competitor = compact (0.0 overhead, ie PACKED):

Dual Xeon x5680:

Task              QPS base  StdDev base  QPS compact  StdDev compact  Pct diff
Prefix3              77.85         5.86        72.79            2.24  -15% - 4%
LowSpanNear           9.18         0.15         8.60            0.22  -10% - -2%
IntNRQ               10.45         1.55         9.83            0.46  -21% - 15%
Wildcard             47.43         3.52        45.15            1.05  -13% - 5%
MedPhrase            13.69         0.19        13.03            0.22  -7% - -1%
LowPhrase            22.06         0.16        21.28            0.31  -5% - -1%
LowSloppyPhrase       6.91         0.09         6.67            0.09  -5% - 0%
HighPhrase            1.57         0.06         1.53            0.06  -9% - 5%
MedSpanNear           4.34         0.10         4.27            0.13  -6% - 3%
AndHighMed           72.51         0.20        71.43            0.80  -2% - 0%
HighSloppyPhrase      1.86         0.04         1.83            0.03  -4% - 2%
MedSloppyPhrase       7.68         0.10         7.66            0.08  -2% - 1%
AndHighHigh          25.40         0.18        25.36            0.21  -1% - 1%
PKLookup            163.94         3.99       164.06            3.57  -4% - 4%
HighSpanNear          1.52         0.04         1.52            0.04  -5% - 5%
AndHighLow          670.18        17.12       672.56            8.54  -3% - 4%
Fuzzy1               63.60         1.73        64.21            1.80  -4% - 6%
Respell              56.92         1.66        57.54            1.59  -4% - 7%
OrHighLow            23.24         1.53        23.86            0.47  -5% - 12%
Fuzzy2               60.47         2.80        62.24            2.06  -4% - 11%
OrHighMed            17.75         1.17        18.30            0.38  -5% - 12%
OrHighHigh            8.97         0.62         9.28            0.20  -5% - 13%
HighTerm             30.96         4.19        32.64            4.44  -19% - 38%
MedTerm             157.41        19.47       166.61           19.08  -16% - 34%
LowTerm             404.16        66.47       508.14           30.98  1% - 59%

Single i7-3770k (ivy bridge):

Task              QPS base  StdDev base  QPS compact  StdDev compact  Pct diff
LowSpanNear          10.98         0.03        10.27            0.20  -8% - -4%
LowSloppyPhrase       8.21         0.23         7.68            0.31  -12% - 0%
MedPhrase            14.90         0.04        14.08            0.11  -6% - -4%
HighSloppyPhrase      2.06         0.06         1.96            0.10  -12% - 2%
HighPhrase            1.98         0.04         1.89            0.04  -8% - 0%
LowPhrase            24.28         0.01        23.41            0.14  -4% - -2%
MedSloppyPhrase       7.24         0.25         7.05            0.32  -10% - 5%
AndHighLow          730.83        24.83       715.02            6.55  -6% - 2%
Respell              63.03         1.26        61.79            1.64  -6% - 2%
MedSpanNear           5.04         0.03         4.94            0.09  -4% - 0%
Fuzzy1               75.98         0.95        74.53            0.86  -4% - 0%
Fuzzy2               63.50         0.97        62.38            1.15  -5% - 1%
OrHighLow            27.52         0.89        27.19            0.72  -6% - 4%
OrHighMed            23.84         0.75        23.57            0.61  -6% - 4%
OrHighHigh           11.54         0.38        11.41            0.31  -6% - 5%
AndHighMed           76.73         1.07        76.06            0.24  -2% - 0%
PKLookup            190.77         1.66       189.29            1.89  -2% - 1%
AndHighHigh          24.20         0.32        24.08            0.12  -2% - 1%
HighSpanNear          1.70         0.01         1.69            0.03  -3% - 2%
HighTerm             35.63         1.13        35.51            0.87  -5% - 5%
LowTerm             513.05         8.17       511.49            8.09  -3% - 2%
MedTerm             198.89         5.29       198.83            4.49  -4% - 5%
Wildcard             52.47         1.14        54.44            3.09  -4% - 12%
Prefix3              82.00         2.27        86.23            5.01  -3% - 14%
IntNRQ               11.52         0.51        12.54            1.47  -8% - 27%

Hard to know what to conclude!

Mike McCandless

http://blog.mikemccandless.com

On Fri, Aug 10, 2012 at 1:26 PM, Han Jiang <[EMAIL PROTECTED]> wrote:
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Robert Muir 2012-08-11, 14:31
I'm having a tough time remembering what these packed ints options do (I thought the perf boost from allowing overhead came from upgrading to the next byte boundary?)

Anyway: again I'm a little concerned about the wikipedia benchmark here for this purpose.

For e.g. structured content from databases (tiny fields) where the numbers are much tinier on average the numbers could be different. I'm also worried about the fact that decode speed is over-emphasized in the wikipedia benchmark since all the I/O is hot.

So I think if its this ambiguous for wikipedia we should shoot for the most COMPACT form as a safe default.

On Sat, Aug 11, 2012 at 9:23 AM, Michael McCandless <[EMAIL PROTECTED]> wrote:
> Here are my results ... base = DEFAULT (0.2 acceptable overhead),
> competitor = compact (0.0 overhead, ie PACKED):
>
> Dual Xeon x5680:
>
> Task              QPS base  StdDev base  QPS compact  StdDev compact  Pct diff
> Prefix3              77.85         5.86        72.79            2.24  -15% - 4%
> LowSpanNear           9.18         0.15         8.60            0.22  -10% - -2%
> IntNRQ               10.45         1.55         9.83            0.46  -21% - 15%
> Wildcard             47.43         3.52        45.15            1.05  -13% - 5%
> MedPhrase            13.69         0.19        13.03            0.22  -7% - -1%
> LowPhrase            22.06         0.16        21.28            0.31  -5% - -1%
> LowSloppyPhrase       6.91         0.09         6.67            0.09  -5% - 0%
> HighPhrase            1.57         0.06         1.53            0.06  -9% - 5%
> MedSpanNear           4.34         0.10         4.27            0.13  -6% - 3%
> AndHighMed           72.51         0.20        71.43            0.80  -2% - 0%
> HighSloppyPhrase      1.86         0.04         1.83            0.03  -4% - 2%
> MedSloppyPhrase       7.68         0.10         7.66            0.08  -2% - 1%
> AndHighHigh          25.40         0.18        25.36            0.21  -1% - 1%
> PKLookup            163.94         3.99       164.06            3.57  -4% - 4%
> HighSpanNear          1.52         0.04         1.52            0.04  -5% - 5%
> AndHighLow          670.18        17.12       672.56            8.54  -3% - 4%
> Fuzzy1               63.60         1.73        64.21            1.80  -4% - 6%
> Respell              56.92         1.66        57.54            1.59  -4% - 7%
> OrHighLow            23.24         1.53        23.86            0.47  -5% - 12%
> Fuzzy2               60.47         2.80        62.24            2.06  -4% - 11%
> OrHighMed            17.75         1.17        18.30            0.38  -5% - 12%
> OrHighHigh            8.97         0.62         9.28            0.20  -5% - 13%
> HighTerm             30.96         4.19        32.64            4.44  -19% - 38%
> MedTerm             157.41        19.47       166.61           19.08  -16% - 34%
> LowTerm             404.16        66.47       508.14           30.98  1% - 59%
>
> Single i7-3770k (ivy bridge):
>
> Task              QPS base  StdDev base  QPS compact  StdDev compact  Pct diff
> LowSpanNear          10.98         0.03        10.27            0.20  -8% - -4%
> LowSloppyPhrase       8.21         0.23         7.68            0.31  -12% - 0%
> MedPhrase            14.90         0.04        14.08            0.11  -6% - -4%
> HighSloppyPhrase      2.06         0.06         1.96            0.10  -12% - 2%
> HighPhrase            1.98         0.04         1.89            0.04  -8% - 0%
> LowPhrase            24.28         0.01        23.41            0.14  -4% - -2%
> MedSloppyPhrase       7.24         0.25         7.05            0.32  -10% - 5%
> AndHighLow          730.83        24.83       715.02            6.55  -6% - 2%
> Respell              63.03         1.26        61.79            1.64  -6% - 2%
> MedSpanNear           5.04         0.03         4.94            0.09

lucidimagination.com
-
Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Michael McCandless 2012-08-11, 18:58
On Sat, Aug 11, 2012 at 10:31 AM, Robert Muir <[EMAIL PROTECTED]> wrote:
> I'm having a tough time remembering what these packed ints options do
> (I thought the perf boost from allowing overhead came from upgrading
> to the next byte boundary?)

Upgrading to the next byte boundary, or using PACKED_SINGLE_BLOCK when possible.

> Anyway: again I'm a little concerned about the wikipedia benchmark
> here for this purpose.

We should find another corpus/corpora to also test...

> For e.g. structured content from databases (tiny fields) where the
> numbers are much tinier on average the numbers could be different. I'm
> also worried about the fact
> that decode speed is over-emphasized in the wikipedia benchmark since
> all the I/O is hot.

True.

> So I think if its this ambiguous for wikipedia we should shoot for the
> most COMPACT form as a safe default.

+1

Mike McCandless

http://blog.mikemccandless.com
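To make the "upgrading to the next byte boundary" point concrete, here is a simplified sketch of how an acceptable overhead ratio could translate into a stored bit width. It is an illustration under stated assumptions, not Lucene's implementation: `OverheadSketch` and `chooseBitsPerValue` are hypothetical names, only byte-aligned widths are considered, and the PACKED_SINGLE_BLOCK case mentioned above is left out entirely.

```java
// Simplified sketch, not Lucene's implementation: pick a stored width from the
// bits actually required and an acceptable overhead ratio (0.0 = COMPACT-like,
// 0.2 = DEFAULT-like, per the percentages quoted in this thread).
// OverheadSketch and chooseBitsPerValue are hypothetical names.
final class OverheadSketch {
    static int chooseBitsPerValue(int bitsRequired, double acceptableOverheadRatio) {
        // Extra bits we are allowed to waste per value, rounded down.
        int maxBits = bitsRequired + (int) (bitsRequired * acceptableOverheadRatio);
        // Prefer a byte-aligned width when it fits inside the allowed overhead.
        for (int aligned : new int[] {8, 16, 32, 64}) {
            if (bitsRequired <= aligned && aligned <= maxBits) {
                return aligned;
            }
        }
        // Otherwise store exactly the bits required (fully packed).
        return bitsRequired;
    }

    public static void main(String[] args) {
        System.out.println(chooseBitsPerValue(7, 0.2));  // 7 bits may grow to 8
        System.out.println(chooseBitsPerValue(13, 0.2)); // 16 is too far away, stays 13
        System.out.println(chooseBitsPerValue(7, 0.0));  // no overhead allowed, stays 7
    }
}
```

Under this reading, a ratio of 0.0 never changes the width, which is why it yields the most compact encoding, while 0.2 trades some space for aligned reads only when a byte boundary is close.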