Lucene, mail # dev - RE: svn commit: r1311373 - in /lucene/dev/branches/lucene3969: lucene/test-framework/src/java/org/apache/lucene/analysis/ modules/analysis/common/src/java/org/apache/lucene/analysis/shingle/ modules/analysis/common/src/test/org/apache/lucene/analysis/core


Re: svn commit: r1311373 - in /lucene/dev/branches/lucene3969: lucene/test-framework/src/java/org/apache/lucene/analysis/ modules/analysis/common/src/java/org/apache/lucene/analysis/shingle/ modules/analysis/common/src/test/org/apache/lucene/analysis
Michael McCandless 2012-04-09, 20:11
On Mon, Apr 9, 2012 at 3:41 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> On 4/9/2012 at 3:06 PM, [EMAIL PROTECTED] wrote:
>> LUCENE-3969: [...] tentatively add posLen to ShingleFilter
>> [...]
>> +++ lucene/dev/branches/lucene3969/modules/analysis/common/src/java/org/
>> +++ apache/lucene/analysis/shingle/ShingleFilter.java Mon Apr  9  19:05:47 2012
>> [...]
>> @@ -319,6 +321,8 @@ public final class ShingleFilter extends
>>            noShingleOutput = false;
>>          }
>>          offsetAtt.setOffset(offsetAtt.startOffset(), nextToken.offsetAtt.endOffset());
>> +        // nocommit is this right!?  i'm just guessing...
>> +        posLenAtt.setPositionLength(builtGramSize);
>>          isOutputHere = true;
>>          gramSize.advance();
>>          tokenAvailable = true;
>
> +1 - looks right to me.
>
> builtGramSize is the position length of the output shingle - missing positions (e.g. from stop words) are represented as "filler" tokens.

Thanks, Steve. I removed the nocommit. This fixed at least the one
test failure I was working on at the time.
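
To illustrate the point above about position length, here is a minimal, Lucene-free sketch of why a shingle built from N positions has position length N even when some of those positions are holes left by removed stop words. It is not ShingleFilter's actual code: the class ShingleSketch, its nested Token and Shingle types, the "_" filler string, and the method names are all hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of shingle construction. Every name here is hypothetical;
// this is not how Lucene's ShingleFilter is implemented.
final class ShingleSketch {

    /** Input token: term text plus position increment (1 = adjacent, 2 = one hole before it). */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Output shingle: joined text plus the number of positions it spans. */
    static final class Shingle {
        final String text;
        final int positionLength;

        Shingle(String text, int positionLength) {
            this.text = text;
            this.positionLength = positionLength;
        }
    }

    /** Assumed filler text for a missing position. */
    static final String FILLER = "_";

    /** Expands position gaps into filler terms so each position holds exactly one term. */
    static List<String> expandGaps(List<Token> tokens) {
        List<String> positions = new ArrayList<>();
        for (Token t : tokens) {
            // An increment of k means k - 1 positions were skipped before this token.
            for (int i = 1; i < t.positionIncrement; i++) {
                positions.add(FILLER);
            }
            positions.add(t.term);
        }
        return positions;
    }

    /** Builds all shingles of exactly gramSize positions; each one spans gramSize positions. */
    static List<Shingle> shingles(List<Token> tokens, int gramSize) {
        List<String> positions = expandGaps(tokens);
        List<Shingle> out = new ArrayList<>();
        for (int start = 0; start + gramSize <= positions.size(); start++) {
            String text = String.join(" ", positions.subList(start, start + gramSize));
            // Fillers occupy a position too, so the span equals the gram size.
            out.add(new Shingle(text, gramSize));
        }
        return out;
    }

    public static void main(String[] args) {
        // "please divide this sentence" with the stop word "this" removed upstream.
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("please", 1));
        tokens.add(new Token("divide", 1));
        tokens.add(new Token("sentence", 2));
        for (Shingle s : shingles(tokens, 2)) {
            System.out.println(s.text + " (posLen=" + s.positionLength + ")");
        }
    }
}
```

In this model the hole left by "this" becomes a filler that occupies one position, so a two-term shingle such as "divide _" still spans two positions, which is the role builtGramSize plays in the patch.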

Mike McCandless

http://blog.mikemccandless.com
