Lucene, mail # user - Join between indexes


Arnon Mazza 2012-02-01, 14:05
Simon Willnauer 2012-02-01, 16:09
Francisco A. Lozano 2012-02-01, 17:56
Re: Join between indexes
Arnon Mazza 2012-02-02, 20:56
Thanks, that's a very nice feature.
 
Would it also enable joining at the docId level, where part of a document is kept in one index and another part of the same document is kept in a different index?
 
In the articles and comments example from that link, this could be, for instance:
articles index:
- docId=1: "(1) this (2) paper (3) is (4) about (5) lucene". (numbers are positions in the doc).
comments index:
- docId=1: "(3) very (4) recommended".
 
That way one could tell that the comment "very recommended" was written next to the word "paper".
(Conceptually the query could be: articles.paper NEAR comments."very recommended").
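
To make the idea concrete, here is a minimal sketch in plain Python. It is not Lucene code and says nothing about how the join feature is implemented. It assumes each index can be modelled as a mapping from docId to term positions; the names `articles`, `comments`, `phrase_starts`, `near`, and the `slop` parameter are all hypothetical.

```python
# Toy model of two positional indexes that share docIds (not the Lucene API).
# Each index maps docId -> {term: [positions]}.
articles = {1: {"this": [1], "paper": [2], "is": [3], "about": [4], "lucene": [5]}}
comments = {1: {"very": [3], "recommended": [4]}}


def phrase_starts(postings, phrase):
    """Return positions where the words of `phrase` occur consecutively."""
    first, *rest = phrase
    return [
        start
        for start in postings.get(first, [])
        if all(start + offset in postings.get(word, [])
               for offset, word in enumerate(rest, start=1))
    ]


def near(left_index, term, right_index, phrase, slop=1):
    """Yield (docId, term_pos, phrase_pos) where the phrase starts within
    `slop` positions of the term, joining the two indexes on docId."""
    for doc_id in sorted(left_index.keys() & right_index.keys()):
        term_positions = left_index[doc_id].get(term, [])
        for start in phrase_starts(right_index[doc_id], phrase):
            for pos in term_positions:
                if abs(start - pos) <= slop:
                    yield doc_id, pos, start


matches = list(near(articles, "paper", comments, ["very", "recommended"]))
print(matches)  # [(1, 2, 3)]
```

The sketch only works because both indexes use the same docIds and the same position numbering for a given document, which is exactly the property I am asking about.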
 
Is this also part of the feature?
 
Thanks,
Arnon.

From: Francisco A. Lozano <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Wednesday, February 1, 2012 7:56 PM
Subject: Re: Join between indexes

Wow, thanks for pointing this out. I didn't know such a feature was in progress.

I see a mention that there is some chance this will be released in
3.6... crossing my fingers :)

Francisco A. Lozano

On Wed, Feb 1, 2012 at 17:09, Simon Willnauer
<[EMAIL PROTECTED]> wrote:
> maybe this link will help: http://bit.ly/AhwIw6
>
> simon
>
> On Wed, Feb 1, 2012 at 3:05 PM, Arnon Mazza <[EMAIL PROTECTED]> wrote:
>> Assume we have a Lucene index over which several types of analyses are performed.
>>
>> Assume that the conclusions of some analysis require that new tokens be added to existing documents in the index.
>> For example, a repeating pattern p (sequence of words) that appears in a large part of the documents should be tagged in every document in its exact position.
>>
>> Now it is required to execute proximity queries involving standard terms and also the pattern p (e.g. find all documents in which the word "hello" is adjacent to the pattern p).
>>
>> Is there a way of achieving this without re-indexing all the documents where the pattern p was found?
>> In other words, is it possible to maintain a separate index that would keep only patterns->docIds/positions, and then join between the two indexes?
>>
>> If not, is there a plan to support this in the future?
>>
>> Thanks,
>> Arnon.
