Home | About | Sematext search-lucene.com search-hadoop.com
 Search Lucene and all its subprojects:

Switch to Threaded View
Solr >> mail # user >> search.highlight.InvalidTokenOffsetsException in Solr 3.5


Copy link to this message
-
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
On Fri, Mar 2, 2012 at 7:37 AM, andrew <[EMAIL PROTECTED]> wrote:
> I was able to create a test case.
>
> We are querying ranges of documents. When I tried to isolate the document
> that causes trouble, I found it happens with exactly every second request
> only for a single document query (it fails constantly when requesting a
> range of documents where that document is included). I could also reproduce
> the exception with only that single document in the index.
>
> I think it is not a good idea to post the Solr <add/> XML here - it is very
> long (text extract of a newspaper page) and may not reproduce verbatim
> (whitespace etc.) if I paste it here.
>
> iorixxx, koji - is it ok if I send the necessary artifacts (add XML, schema,
> config) via email?
>

You can also open a jira issue
(https://issues.apache.org/jira/browse/SOLR), and upload everything as
attachments.

I would also be very interested if you can test a nightly 3.6 build
(https://builds.apache.org/job/Solr-3.x/lastSuccessfulBuild/artifact/artifacts/)

There have been *numerous* offsets bugs fixed in 3.6 in a variety of
tokenizers/tokenfilters besides the HTMLStripCharFilter:
https://issues.apache.org/jira/browse/LUCENE-3642
https://issues.apache.org/jira/browse/SOLR-2891
https://issues.apache.org/jira/browse/LUCENE-3717

--
lucidimagination.com