Lucene, mail # dev - Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)


Yonik Seeley 2006-09-06, 20:37
On 9/6/06, Ning Li <[EMAIL PROTECTED]> wrote:
> > So, I *think* most of our hypothetical problems go away with a simple
> > adjustment to f(n):
> >
> > f(n) = floor(log_M((n-1)/B))
>
> Correct. And nice. :-)
>
> Equivalently,
> f(n) = ceil(log_M (n / B)). If f(n) = c, it means B*(M^(c-1)) < n <= B*(M^(c)).
>
> So f(n) = 0 means n <= B.

Cool. Or, for all n > 0, f(n) = ceil(log_M(ceil(n/B))),
which avoids a negative f(n) when n < B.
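
As a rough illustration of the formula above, here is a minimal Java sketch that computes f(n) = ceil(log_M(ceil(n/B))) with integer arithmetic only, so there is no floating-point rounding at exact powers of M. The class and method names are made up for this example and are not part of any patch in this thread; B and M are just the symbols used above.

```java
// Hypothetical helper, not taken from the LUCENE-565 patch.
final class SegmentLevel {
    /**
     * Returns f(n) = ceil(log_M(ceil(n/B))) for n > 0.
     * n is a segment's doc count; B and M are the parameters from the thread.
     */
    static int level(int n, int B, int M) {
        if (n <= 0 || B <= 0 || M < 2) {
            throw new IllegalArgumentException("need n > 0, B > 0, M >= 2");
        }
        long units = ((long) n + B - 1) / B; // ceil(n/B), always >= 1
        int c = 0;
        long cap = 1;                        // cap == M^c
        while (cap < units) {                // smallest c with M^c >= ceil(n/B)
            cap *= M;
            c++;
        }
        return c;
    }
}
```

With B = 10 and M = 10, this gives level 0 for n = 1..10, level 1 for n = 11..100, and level 2 for n = 101..1000, matching B*(M^(c-1)) < n <= B*(M^c).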

So what's left... maxMergeDocs I guess.
Capping the segment size breaks the simple invariants a bit.

If the first M segments of any given f(n) total more than
maxMergeDocs, they can't be merged into one segment, so the total
number of segments with that same f(n) may be >= M.
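
To make that case concrete, here is a small Java sketch of the check. It assumes a merge is simply skipped whenever the combined doc count of the first M segments at a level would exceed maxMergeDocs. That rule is an assumption for illustration only, and the class and method names are hypothetical.

```java
// Hypothetical check, assuming merges over maxMergeDocs are skipped.
final class MergeCap {
    /**
     * docCounts holds the doc counts of the segments that share one f(n),
     * in index order. Returns true if the first M of them exist and
     * together stay within maxMergeDocs.
     */
    static boolean firstMWithinCap(int[] docCounts, int M, int maxMergeDocs) {
        if (docCounts.length < M) {
            return false;                 // not enough segments to merge yet
        }
        long total = 0;                   // long avoids int overflow on the sum
        for (int i = 0; i < M; i++) {
            total += docCounts[i];
        }
        return total <= maxMergeDocs;
    }
}
```

For example, with M = 3 and four 10-doc segments at the same level, a maxMergeDocs of 25 blocks the merge (30 > 25), so four segments, more than M, stay at that level.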

We also need to be able to handle changes to M and maxMergeDocs
between different IndexWriter sessions.  When checking for a merge for
a particular f(n) level, always checking and merging leftmost*
segments first solves some potential problems here.

*when I say leftmost, I envision the index with the largest segments
on the left.
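
A minimal sketch of the leftmost-first check, in Java. It assumes each segment's f(n) has already been computed into an array with index 0 as the leftmost (largest) segment, and that segments sharing a level sit next to each other. The names are hypothetical; this is one way to read the idea, not a proposed implementation.

```java
// Hypothetical leftmost-first merge check; levels[] is assumed precomputed.
final class LeftmostMerge {
    /**
     * levels[i] is f(n) for segment i, with index 0 the leftmost segment.
     * Returns the index of the leftmost segment at the given level if at
     * least M segments are at that level, otherwise -1.
     */
    static int leftmostMergeStart(int[] levels, int level, int M) {
        int first = -1;
        int count = 0;
        for (int i = 0; i < levels.length; i++) {
            if (levels[i] == level) {
                if (first < 0) {
                    first = i;            // remember the leftmost match
                }
                count++;
            }
        }
        return count >= M ? first : -1;
    }
}
```

If M was lowered between sessions so that a level now holds more than M segments, starting at the returned index and merging M segments at a time works through the surplus from the left.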

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
