Ravi Solr
2011-05-06, 20:52
Bill Bell
2011-05-08, 03:49
Ravi Solr
2011-05-09, 15:24
Bill Bell
2011-05-11, 05:22
Ravi Solr
2011-05-11, 13:25
Alexander Kanarsky
2011-05-10, 08:10
Ravi Solr
2011-05-10, 16:45
Alexander Kanarsky
2011-05-11, 22:00
Ravi Solr
2011-05-12, 22:42
Ravi Solr
2011-05-13, 22:34
Alexander Kanarsky
2011-05-15, 07:12
Ravi Solr
2011-05-18, 20:24
Replication Clarification Please
Ravi Solr 2011-05-06, 20:52

Hello,

Pardon me if this has already been answered somewhere, and I apologize for the lengthy post. I was wondering if anybody could help me understand replication internals a bit more. We have a single master-slave setup (Solr 1.4.1) with the configurations shown below. Our environment is quite commit heavy (hundreds of docs every 5 minutes); all indexing is done on the master and all searches go to the slave. We are seeing that slave replication performance gradually degrades: the speed drops below 1 kbps and replication ultimately gets backed up. Once we reload the core on the slave it works fine for some time, and then it gets backed up again. We have mergeFactor set to 10 and ramBufferSizeMB set to 32, Solr itself is running with 2GB of memory, and lockType is simple on both master and slave.

I am hoping that the following questions might help me understand the replication performance issue better (the replication configuration is given at the end of the email).

1. Does the slave get the whole index every time during replication, or just the delta since the last replication happened?
2. If a huge number of queries are being run on the slave, will it affect replication? How can I improve the performance? (See the replication details at the bottom of the page.)
3. Will the segment names be the same on master and slave after replication? I see that they are different. Is this correct? If it is correct, how does the slave know what to fetch the next time, i.e. the delta?
4. When and why does the index.<TIMESTAMP> folder get created? I see this type of folder getting created only on the slave, and the slave instance is pointing to it.
5. Does the replication process copy both the index and index.<TIMESTAMP> folders?
6. What happens if replication kicks off before the previous invocation has completed? Will the second invocation block, or will it go through and cause more confusion?
7. If I have to prep a new master-slave combination, is it OK to copy the respective contents into the new master and slave and start Solr? Or do I have to wipe the new slave and let it replicate from its new master?
8. Doing an 'ls | wc -l' on the index folder of master and slave gave 194 and 17968 respectively. The slave has a lot of segments_xxx files. Is this normal?

MASTER

<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
    <str name="commitReserveDuration">00:00:10</str>
  </lst>
</requestHandler>

SLAVE

<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="slave">
    <str name="masterUrl">master core url</str>
    <str name="pollInterval">00:03:00</str>
    <str name="compression">internal</str>
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>

REPLICATION DETAILS FROM PAGE

Master          master core url
Poll Interval   00:03:00
Local Index     Index Version: 1296217104577, Generation: 20190
                Location: /data/solr/core/search-data/index.20110429042508
                Size: 2.1 GB
                Times Replicated Since Startup: 672
                Previous Replication Done At: Fri May 06 15:41:01 EDT 2011
                Config Files Replicated At: null
                Config Files Replicated: null
                Times Config Files Replicated Since Startup: null
                Next Replication Cycle At: Fri May 06 15:44:00 EDT 2011

Current Replication Status
                Start Time: Fri May 06 15:41:00 EDT 2011
                Files Downloaded: 43 / 197
                Downloaded: 477.08 KB / 588.82 MB [0.0%]
                Downloading File: _hdm.prx, Downloaded: 9.3 KB / 9.3 KB [100.0%]
                Time Elapsed: 967s, Estimated Time Remaining: 1221166s, Speed: 505 bytes/s

Ravi Kiran Bhaskar
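The speed and remaining-time figures on the status page above can be roughly reproduced from the byte counts and the elapsed time. The sketch below only illustrates that arithmetic. It assumes the page reports average throughput since the fetch started and that 1 KB is 1024 bytes; `replication_eta` is a hypothetical helper name, not part of Solr.

```python
KB = 1024
MB = 1024 * KB


def replication_eta(downloaded_bytes, total_bytes, elapsed_seconds):
    """Return (average bytes per second, estimated seconds remaining)."""
    # Assumption: speed is the plain average over the whole fetch so far.
    speed = downloaded_bytes / elapsed_seconds
    remaining = (total_bytes - downloaded_bytes) / speed
    return speed, remaining


# Figures from the status page: 477.08 KB of 588.82 MB after 967 s.
speed, remaining = replication_eta(477.08 * KB, 588.82 * MB, 967)
print(f"{speed:.0f} bytes/s, about {remaining / 86400:.1f} days remaining")
```

At that rate the fetch would need roughly two weeks, while the poll interval is three minutes, which is why the replication backs up.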
Re: Replication Clarification Please
Bill Bell 2011-05-08, 03:49

I did not see answers... I am not an authority, but will tell you what I think....

Did you get some answers?

On 5/6/11 2:52 PM, "Ravi Solr" <[EMAIL PROTECTED]> wrote:

>Hello,
> Pardon me if this has been already answered somewhere and I
>apologize for a lengthy post. I was wondering if anybody could help me
>understand Replication internals a bit more. We have a single
>master-slave setup (solr 1.4.1) with the configurations as shown
>below. Our environment is quite commit heavy (almost 100s of docs
>every 5 minutes), and all indexing is done on Master and all searches
>go to the Slave. We are seeing that the slave replication performance
>gradually decreases and the speed decreases < 1kbps and ultimately
>gets backed up. Once we reload the core on slave it will be work fine
>for sometime and then it again gets backed up. We have mergeFactor set
>to 10 and ramBufferSizeMB is set to 32MB and solr itself is running
>with 2GB memory and locktype is simple on both master and slave.

How big is your index? How many rows and GB?

Every time you replicate, there are several resets on caching. So if you are constantly indexing, you need to be careful about how that performance impact will apply.

>
>I am hoping that the following questions might help me understand the
>replication performance issue better (Replication Configuration is
>given at the end of the email)
>
>1. Does the Slave get the whole index every time during replication or
>just the delta since the last replication happened ?

It depends. If you do an OPTIMIZE every time you index, then you will be sending the whole index down. If the amount of time if > 10 segments, I believe that might also trigger a whole index, since you cycled all the segments. In that case I think you might want to increase the mergeFactor.

>
>2. If there are huge number of queries being done on slave will it
>affect the replication ? How can I improve the performance ? (see the
>replications details at he bottom of the page)

It seems that might be one way that you get the index.* directories. At least I see it more frequently when there is huge load and you are trying to replicate. You could replicate less frequently.

>
>3. Will the segment names be same be same on master and slave after
>replication ? I see that they are different. Is this correct ? If it
>is correct how does the slave know what to fetch the next time i.e.
>the delta.

Yes, they better be. In the old days you could just rsync the data directory from master to slave and reload the core; that worked fine.

>
>4. When and why does the index.<TIMESTAMP> folder get created ? I see
>this type of folder getting created only on slave and the slave
>instance is pointing to it.

I would love to know all the conditions... I believe it is supposed to replicate to index.*, then reload to point to it. But sometimes it gets stuck in index.* land and never goes back to straight index. There are several bug fixes for this in 3.1.

>
>5. Does replication process copy both the index and index.<TIMESTAMP>
>folder ?

I believe it is supposed to copy the segment or whole index/ from master to index.* on slave.

>
>6. what happens if the replication kicks off even before the previous
>invocation has not completed ? will the 2nd invocation block or will
>it go through causing more confusion ?

That is not supposed to happen. If a replication is in process, it should not copy again until that one is complete. Try it: just delete the data/*, restart SOLR, and force a replication; while it is syncing, force it again. Does not seem to work for me.

>
>7. If I have to prep a new master-slave combination is it OK to copy
>the respective contents into the new master-slave and start solr ? or
>do I have have to wipe the new slave and let it replicate from its new
>master ?

If you shut down the slave, copy the data/* directory and restart, you should be fine. That is how we fix the data/ dir when there is corruption.

>
>8. Doing an 'ls | wc -l' on index folder of master and slave gave 194

Several bugs fixed in 3.1 for this one. Not a good thing.... You are getting leftover segments or index.* directories.
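The behaviour described for question 6 (a second replication request does nothing while one is already running) is commonly obtained with a non-blocking lock. The following is a minimal sketch of that pattern and not Solr's own code; `ReplicationGuard`, `try_replicate` and the `fetch` callback are hypothetical names chosen for the illustration.

```python
import threading


class ReplicationGuard:
    """Lets at most one replication run at a time; later requests are skipped."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_replicate(self, fetch):
        """Run fetch() unless a replication is already in progress.

        Returns True if fetch ran, False if the request was skipped.
        """
        # Non-blocking acquire: do not queue behind the running replication.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            fetch()
            return True
        finally:
            self._lock.release()


guard = ReplicationGuard()
results = []


def fetch():
    # Simulate a poll that fires while this fetch is still running.
    results.append(guard.try_replicate(lambda: None))


results.append(guard.try_replicate(fetch))
print(results)
```

The inner request is skipped rather than blocked, so a slow fetch cannot pile up a queue of further fetches behind it.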
Re: Replication Clarification Please
Ravi Solr 2011-05-09, 15:24

Hello Mr. Bell,

Thank you very much for patiently responding to my questions. We optimize once every 2 days. Can you kindly rephrase your answer? I could not understand this part: "if the amount of time if > 10 segments, I believe that might also trigger a whole index, since you cycled all the segments. In that case I think you might want to increase the mergeFactor."

The current index folder details and sizes are given below.

MASTER
--------------
5K      search-data/spellchecker2
480M    search-data/index
5K      search-data/spellchecker1
5K      search-data/spellcheckerFile
480M    search-data

SLAVE
----------
2K      search-data/index.20110509103950
419M    search-data/index
2.3G    search-data/index.20110429042508   ----> SLAVE is pointing to this directory
5K      search-data/spellchecker1
5K      search-data/spellchecker2
5K      search-data/spellcheckerFile
2.7G    search-data

Thanks,
Ravi Kiran Bhaskar
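The slave listing above shows three index directories, only one of which the slave is actually using. A small script can list the others for inspection. This is a sketch under assumptions: `leftover_index_dirs` is a hypothetical helper, the live directory name must be supplied by hand from whatever the slave reports as its current location, and nothing is deleted.

```python
import os
import tempfile


def leftover_index_dirs(data_dir, current):
    """Names of index directories under data_dir other than `current`."""
    return sorted(
        name
        for name in os.listdir(data_dir)
        if (name == "index" or name.startswith("index."))
        and name != current
        # Skip plain files whose names merely start with "index.".
        and os.path.isdir(os.path.join(data_dir, name))
    )


# Recreate the slave layout from the listing in a scratch directory.
with tempfile.TemporaryDirectory() as data_dir:
    for name in ["index", "index.20110429042508",
                 "index.20110509103950", "spellchecker1"]:
        os.mkdir(os.path.join(data_dir, name))
    leftovers = leftover_index_dirs(data_dir, "index.20110429042508")

print(leftovers)
```

Whether a leftover directory is safe to remove depends on the Solr version and on whether a replication is in flight, so the result is best treated as a list to review rather than a list to delete.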
Re: Replication Clarification Please
Bill Bell 2011-05-11, 05:22

OK let me rephrase.

In solrconfig.xml there is a setting called mergeFactor. The default is usually 10. Practically it means there are 10 segments. If you are doing fast delta indexing (adding a couple documents, then committing), you will cycle through all 10 segments pretty fast.

It appears that if you do go past the 10 segments without replicating, the only recourse is for the replicator to do a full index replication instead of a delta index replication...

Does that help?
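One way to picture the mergeFactor effect described above is a toy simulation in which every commit flushes one small segment and, whenever mergeFactor equally sized segments pile up, they are merged into one larger segment. This is a deliberately simplified model and an assumption for illustration only; Lucene's real merge policy works on segment sizes and levels and is more involved. `segments_after` is a hypothetical name.

```python
def segments_after(commits, merge_factor=10):
    """Segment sizes (in flushed units, oldest first) after `commits` commits."""
    segments = []
    for _ in range(commits):
        segments.append(1)  # each commit flushes one new small segment
        # Cascade: merge whenever the newest merge_factor segments are equal.
        while (len(segments) >= merge_factor
               and len(set(segments[-merge_factor:])) == 1):
            merged = sum(segments[-merge_factor:])
            del segments[-merge_factor:]
            segments.append(merged)
    return segments


print(segments_after(9))    # nine small segments, no merge yet
print(segments_after(15))   # one merged segment plus five small ones
# Commit counts at which the whole index collapses into a single new segment.
print([n for n in range(2, 1001) if len(segments_after(n)) == 1])
```

In this model a merge rewrites the merged data into new files, so a slave that only fetches changed files still has to download everything that was merged. When the cascade reaches the top level, that is the whole index.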
Re: Replication Clarification Please
Ravi Solr 2011-05-11, 13:25

Mr. Bell,

Thank you for your help. Yes, the full index is replicated every 1000, 10000, 100000, etc. if mergeFactor is 10, as per its definition. We index every 5 minutes and replicate every 3 minutes just to make sure consumers have immediate access to the indexed docs.

Thanks,
Ravi Kiran Bhaskar
Re: Replication Clarification Please
Alexander Kanarsky 2011-05-10, 08:10

Ravi,

as far as I remember, this is how the replication logic works (see the SnapPuller class, fetchLatestIndex method):

> 1. Does the Slave get the whole index every time during replication or
> just the delta since the last replication happened ?

It looks at the index version AND the index generation. If both the slave's version and generation are the same as on the master, nothing gets replicated. If the master's generation is greater than the slave's, the slave fetches the delta files only (even if a partial merge was done on the master) and puts the new files from the master into the same index folder on the slave (either index or index.<timestamp>, see further explanation). However, if the master's index generation is equal to or less than the one on the slave, the slave does a full replication by fetching all files of the master's index and placing them into a separate folder on the slave (index.<timestamp>). Then, if the fetch is successful, the slave updates (or creates) the index.properties file and puts the name of the "current" index folder there. The "old" index.<timestamp> folder(s) are kept in 1.4.x, which was treated as a bug; see SOLR-2156 (this was fixed in 3.1). After this, the slave does a commit or a core reload, depending on whether the config files were replicated. There is another bug in 1.4.x that fails replication if the slave needs to do a full replication AND the config files were changed; it is also fixed in 3.1 (see SOLR-1983).

> 2. If there are huge number of queries being done on slave will it
> affect the replication ? How can I improve the performance ? (see the
> replications details at he bottom of the page)

From my experience, half of the replication time is spent flushing the transferred data to disk. So the IO impact is important.

> 3. Will the segment names be same be same on master and slave after
> replication ? I see that they are different. Is this correct ? If it
> is correct how does the slave know what to fetch the next time i.e.
> the delta.

They should be the same. The slave fetches the changed files only (see above); also look at the SnapPuller code.

> 4. When and why does the index.<TIMESTAMP> folder get created ? I see
> this type of folder getting created only on slave and the slave
> instance is pointing to it.

See above.

> 5. Does replication process copy both the index and index.<TIMESTAMP> folder ?

The index.<timestamp> folder gets created only if a full replication has happened at least once. Otherwise, the slave will use the index folder.

> 6. what happens if the replication kicks off even before the previous
> invocation has not completed ? will the 2nd invocation block or will
> it go through causing more confusion ?

There is a lock (snapPullLock in ReplicationHandler) that prevents two replications from running simultaneously. If there is no bug, the second call should just return silently. (I personally never had a problem with this, so it looks like there is no bug :)

> 7. If I have to prep a new master-slave combination is it OK to copy
> the respective contents into the new master-slave and start solr ? or
> do I have have to wipe the new slave and let it replicate from its new
> master ?

If the new master has a different index, the slave will create a new index.<timestamp> folder. There is no need to wipe it.

> 8. Doing an 'ls | wc -l' on index folder of master and slave gave 194
> and 17968 respectively...I slave has lot of segments_xxx files. Is
> this normal ?

No. It looks like in your case the slave has continued to replicate into the same folder for a long time, but the old files are not getting deleted for some reason. Try restarting the slave or doing a core reload on it to see if the old segments are gone.

-Alexander
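The decision rules in the answer to question 1 can be written down compactly. The sketch below restates only what the message says and is not the SnapPuller code; the function names, the string results and the sample file names are assumptions made for illustration, and the delta is computed on file names alone.

```python
def plan_replication(master_version, master_generation,
                     slave_version, slave_generation):
    """Return 'none', 'delta' or 'full' following the rules described above."""
    if (master_version, master_generation) == (slave_version, slave_generation):
        return "none"      # slave is already in sync
    if master_generation > slave_generation:
        return "delta"     # fetch only the files the slave lacks
    return "full"          # generation equal or lower: fetch everything


def files_to_fetch(master_files, slave_files, mode):
    """File names the slave would download for the given mode."""
    if mode == "none":
        return []
    if mode == "full":
        return sorted(master_files)
    # Delta: compare by name only (a simplification).
    return sorted(set(master_files) - set(slave_files))


# Illustrative values, not taken from a real index.
mode = plan_replication(master_version=6, master_generation=4,
                        slave_version=5, slave_generation=3)
print(mode, files_to_fetch(["_a.cfs", "_b.cfs", "segments_4"],
                           ["_a.cfs", "segments_3"], mode))
```

Note that a "delta" can still be as large as the whole index if a merge on the master has replaced every segment file since the last fetch.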
Re: Replication Clarification Please
Ravi Solr 2011-05-10, 16:45

Hello Mr. Kanarsky,

Thank you very much for the detailed explanation, probably the best explanation I have found regarding replication. Just to be sure, I wanted to test Solr 3.1 to see if it alleviates the problems... I don't think it helped. The master index version and generation are greater than the slave's, yet the slave still replicates the entire index from the master (see the replication admin screen output below). Any idea why it would get the whole index every time even in 3.1, or am I misinterpreting the output? However, I must admit that 3.1 finished the replication, unlike 1.4.1, which would hang and be backed up forever.

Master          http://masterurl:post/solr-admin/searchcore/replication
                Latest Index Version:null, Generation: null
                Replicatable Index Version:1296217097572, Generation: 12726
Poll Interval   00:03:00
Local Index     Index Version: 1296217097569, Generation: 12725
                Location: /data/solr/core/search-data/index
                Size: 944.32 MB
                Times Replicated Since Startup: 148
                Previous Replication Done At: Tue May 10 12:32:42 EDT 2011
                Config Files Replicated At: null
                Config Files Replicated: null
                Times Config Files Replicated Since Startup: null
                Next Replication Cycle At: Tue May 10 12:35:41 EDT 2011

Current Replication Status
                Start Time: Tue May 10 12:32:41 EDT 2011
                Files Downloaded: 18 / 108
                Downloaded: 317.48 KB / 436.24 MB [0.0%]
                Downloading File: _ayu.nrm, Downloaded: 4 bytes / 4 bytes [100.0%]
                Time Elapsed: 17s, Estimated Time Remaining: 23902s, Speed: 18.67 KB/s

Thanks,
Ravi Kiran Bhaskar
Re: Replication Clarification Please
Alexander Kanarsky 2011-05-11, 22:00

Ravi,

if you get what looks like a full replication each time even though the master generation is greater than the slave's, try watching the index on both master and slave at the same time to see which files are getting replicated. You may need to adjust your merge factor, as Bill mentioned.

-Alexander
-
Re: Replication Clarification Please
Ravi Solr 2011-05-12, 22:42
Thank you Mr. Bell and Mr. Kanarsky. As per your advice we have moved from 1.4.1 to 3.1 and have made several changes to the configuration. So far the changes have worked nicely: replication is finishing within the poll interval and not backing up. The changes we made are as follows:

1. Increased the mergeFactor from 10 to 15
2. Increased ramBufferSizeMB to 1024
3. Changed lockType to single (previously it was simple)
4. Set maxCommitsToKeep to 1 in the deletionPolicy
5. Set maxPendingDeletes to 0
6. Changed caches from LRUCache to FastLRUCache, as we had hit ratios well over 75%, to increase warming speed
7. Increased the poll interval to 6 minutes and re-indexed all content.

Thanks,
Ravi Kiran Bhaskar
-
Re: Replication Clarification Please
Ravi Solr 2011-05-13, 22:34
Sorry guys, I spoke too soon I guess. Replication still remains very slow even after upgrading to 3.1 and turning compression off. Now I am totally clueless. I have tried everything I know of to increase the speed of replication, but failed. If anybody has faced the same issue, can you please tell me how you solved it?

Ravi Kiran Bhaskar
-
Re: Replication Clarification Please
Alexander Kanarsky 2011-05-15, 07:12
Ravi,
what is the replication configuration on both master and slave? Also, could you list the files in the index folder on master and slave before and after the replication?

-Alexander
-
Re: Replication Clarification Please
Ravi Solr 2011-05-18, 20:24
Alexander, sorry for the delay in replying. I wanted to test out a few hunches that I had before getting back to you.

Hurray!!! I was able to resolve the issue. The problem was with the cache settings in solrconfig.xml. It was taking almost 15-20 minutes to warm up the caches on each commit. As we are commit heavy (every 5 minutes), the replication was waiting for the new searcher to be warmed and never got a chance to finish, so it was perennially backed up. We reduced the cache sizes and autowarm counts, and now replication happily finishes within 20 seconds!!

Thank you again for all your support.

Thanks,
Ravi Kiran Bhaskar
The Washington Post
1150 15th St. NW
Washington, DC 20071
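The arithmetic behind this diagnosis is simple enough to spell out. The sketch below is not a model of Solr's searcher handling; it only compares the warm-up time with the commit interval, using the figures from the message (15-20 minutes of warming, a commit every 5 minutes). The function names are hypothetical, and treating the 20-second figure as an upper bound on warming after the fix is an assumption.

```python
import math


def warming_keeps_up(warm_seconds, commit_interval_seconds):
    """True if a new searcher finishes warming before the next commit arrives."""
    return warm_seconds < commit_interval_seconds


def overlapping_warmups(warm_seconds, commit_interval_seconds):
    """Warm-ups that would be in flight at once if none were cancelled."""
    return math.ceil(warm_seconds / commit_interval_seconds)


COMMIT_INTERVAL = 5 * 60  # commits roughly every 5 minutes

# Before the fix: 15 to 20 minutes of cache warming per commit.
for warm in (15 * 60, 20 * 60):
    print(warm, warming_keeps_up(warm, COMMIT_INTERVAL),
          overlapping_warmups(warm, COMMIT_INTERVAL))

# After the fix: the whole cycle is reported to finish within 20 seconds.
print(20, warming_keeps_up(20, COMMIT_INTERVAL),
      overlapping_warmups(20, COMMIT_INTERVAL))
```

With warming three to four times longer than the commit interval, each new index arrives while earlier searchers are still warming, which matches the "perennially backed up" behaviour described; once warming fits inside the interval, each cycle can complete before the next one starts.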
|