roz dev 2012-02-27, 08:26
Erick Erickson 2012-02-28, 03:01
eks dev 2012-02-28, 08:38
Re: Solr Cloud, Commits and Master/Slave configuration
Mark Miller 2012-03-01, 04:24
We actually do currently batch updates - we are being somewhat loose when we say a document at a time. There is a buffer of updates per replica that gets flushed depending on the requests coming through and the buffer size.
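To make the buffering idea concrete, here is a minimal sketch in Python. It is not the actual SolrCloud code: the class name ReplicaUpdateBuffer, the send callback, the max_buffered threshold and the replica names are all hypothetical, and the real flush conditions may differ. It only illustrates keeping one buffer per replica and flushing it when it fills up or when the incoming request ends.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ReplicaUpdateBuffer:
    """Hypothetical per-replica update buffer (illustration only, not Solr's API)."""

    def __init__(self, send: Callable[[str, List[dict]], None], max_buffered: int = 10):
        self._send = send                  # assumed transport: one call = one network request
        self._max_buffered = max_buffered  # assumed size threshold that triggers a flush
        self._pending: Dict[str, List[dict]] = defaultdict(list)

    def add(self, replicas: List[str], doc: dict) -> None:
        # Queue the document for every replica; flush a replica's buffer once it is full.
        for replica in replicas:
            pending = self._pending[replica]
            pending.append(doc)
            if len(pending) >= self._max_buffered:
                self.flush(replica)

    def flush(self, replica: str) -> None:
        # Send whatever is buffered for this replica as a single batched request.
        docs = self._pending.pop(replica, [])
        if docs:
            self._send(replica, docs)

    def flush_all(self) -> None:
        # Called when the incoming request ends, so no update is left behind.
        for replica in list(self._pending):
            self.flush(replica)


def demo() -> List[tuple]:
    sent = []  # records (replica, batch size) for each simulated request
    buf = ReplicaUpdateBuffer(lambda r, docs: sent.append((r, len(docs))), max_buffered=1000)
    for i in range(2500):
        buf.add(["replica1", "replica2"], {"id": str(i)})
    buf.flush_all()
    return sent
```

In this sketch, 2500 documents forwarded to two replicas go out as 6 batched requests rather than 5000 single-document ones. Each replica still analyzes every document itself.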
- Mark Miller
On Feb 28, 2012, at 3:38 AM, eks dev wrote:
> SolrCloud is going to be great, the NRT feature is a really huge step
> forward, as well as central configuration, elasticity ...
> The only thing I do not yet understand is the treatment of cases that were
> traditionally covered by a Master/Slave setup, i.e. batch updates.
> If I get it right (?), updates to replicas are sent one by one,
> meaning when one server receives an update, it gets forwarded to all
> replicas. This is great for the reduced update latency case, but I do not
> know how it is implemented if you hit it with a "batch" update. This
> would cause a huge number of update commands going to replicas. Not so
> good for throughput.
> - Master/Slave does distribution at the segment level (no need to
> replicate analysis, far less network traffic). Good for batch updates.
> - SolrCloud distributes per update command (low latency, but chatty, and
> the analysis step is done N_Servers times). Good for incremental updates.
> Ideally, some sort of "batching" is going to be available in
> SolrCloud, and some control over it, e.g. forward batches of 1000
> documents (basically keep update log slightly longer and forward it as
> a batch update command). This would still cause duplicate analysis,
> but would reduce network traffic.
> Please bear in mind, this is more of a question than a statement, I
> didn't look at the cloud code. It might be I am completely wrong here!
> On Tue, Feb 28, 2012 at 4:01 AM, Erick Erickson <[EMAIL PROTECTED]> wrote:
>> As I understand it (and I'm just getting into SolrCloud myself), you can
>> essentially forget about master/slave stuff. If you're using NRT,
>> the soft commit will make the docs visible, you don't need to do a hard
>> commit (unlike the master/slave days). Essentially, the update is sent
>> to each shard leader and then fanned out into the replicas for that
>> leader. All automatically. Leaders are elected automatically. ZooKeeper
>> is used to keep the cluster information.
>> Additionally, SolrCloud keeps a transaction log of the updates, and replays
>> them if the indexing is interrupted, so you don't risk data loss the way
>> you used to.
>> There aren't really masters/slaves in the old sense any more, so
>> you have to get out of that thought-mode (it's hard, I know).
>> The code is under pretty active development, so any feedback is
>> On Mon, Feb 27, 2012 at 3:26 AM, roz dev <[EMAIL PROTECTED]> wrote:
>>> Hi All,
>>> I am trying to understand features of Solr Cloud, regarding commits and
>>> - If I am using Solr Cloud then do I need to explicitly call commit
>>> (hard-commit)? Or, a soft commit is okay and Solr Cloud will do the job of
>>> writing to disk?
>>> - Do we still need to use a Master/Slave setup to scale searching? If we
>>> have to use a Master/Slave setup then do I need to issue a hard commit to make
>>> my changes visible to slaves?
>>> - If I were to use NRT with Master/Slave setup with soft commit then
>>> will the slave be able to see changes made on master with soft commit?
>>> Any inputs are welcome.
eks dev 2012-03-01, 08:35