RE: DIH import and postImportDeleteQuery
Dyer, James 2011-05-25, 19:40
Great. I wasn't aware of the other issue. I put a link on the 2 issues in JIRA so people can know in the future.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Alexandre Rocco [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 25, 2011 2:34 PM
To: [EMAIL PROTECTED]
Subject: Re: DIH import and postImportDeleteQuery

Hi James,

Thanks for the heads up!
I am currently on version 1.4.1, so I can apply this patch and see if it works.
Just need to assess if it's best to apply the patch or to check on the backend
system to see if only delete requests were generated and then do not call DIH.

Previously, I found another open issue, created by Ephraim:
https://issues.apache.org/jira/browse/SOLR-2104
It's the same issue, but it hasn't had any updates yet.

Regards,
Alexandre

On Wed, May 25, 2011 at 3:17 PM, Dyer, James <[EMAIL PROTECTED]> wrote:

> The "failure to commit" bug with $deleteDocById can be fixed by applying
> patch SOLR-2492. This patch also partially fixes the "no updated stats" bug
> in that it increments 1 for every call to $deleteDocById and
> $deleteDocByQuery. Note that this might result in inaccurate counts if the
> id given with $deleteDocById doesn't exist or is duplicated. Obviously this
> is not a complete fix for stats using $deleteDocByQuery as this command
> would normally be used to delete >1 doc at a time.
>
> The patch is for Trunk but it might work with 3.1 also. If not, it likely
> only needs minor tweaking.
>
> The jira ticket is here: https://issues.apache.org/jira/browse/SOLR-2492
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: Alexandre Rocco [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 25, 2011 12:54 PM
> To: [EMAIL PROTECTED]
> Subject: Re: DIH import and postImportDeleteQuery
>
> Hi Ephraim,
>
> Thank you so much for the input.
> I was able to find your thread on the archives and got your solution to
> work.
>
> In fact, when using $deleteDocById and $skipDoc it worked like a charm.
> This feature is very useful; it's a shame it's not properly documented.
> The only downside is the one you mentioned that the stats are not updated,
> so if I update 13 documents and delete 2, DIH would tell me that only 13
> documents were processed. This is bad in my case because I check the end
> result to generate an error e-mail if needed.
>
> You also mentioned that if the query contains only deletion records, a
> commit would not be automatically executed and it would be necessary to
> commit manually.
>
> How can I commit manually via DIH? I was not able to find any references in
> the documentation.
>
> Thanks!
> Alexandre
>
> On Wed, May 25, 2011 at 5:14 AM, Ephraim Ofir <[EMAIL PROTECTED]> wrote:
>
> > Search the list for my post "DIH - deleting documents, high performance
> > (delta) imports, and passing parameters" which shows my solution to a
> > similar problem.
> >
> > Ephraim Ofir
> >
> > -----Original Message-----
> > From: Alexandre Rocco [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, May 24, 2011 11:24 PM
> > To: [EMAIL PROTECTED]
> > Subject: DIH import and postImportDeleteQuery
> >
> > Guys,
> >
> > I am facing a situation in one of our projects that I need to perform a
> > cleanup to remove some documents after we perform an update via DIH.
> > The big issue right now comes from the fact that when we call the DIH
> > with clean=false, the postImportDeleteQuery is not executed.
> >
> > My setup is currently arranged like this:
> > - A SQL Server stored procedure that receives a parameter (specified in
> >   the URL) and returns the records to be indexed
> > - The procedure is able to return all the records (for a full-import) or
> >   only the updated records (for a delta-import)
> > - This procedure returns valid and deleted records, from this point
> >   comes the need to run a postImportDeleteQuery to remove the deleted ones.
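
On the manual-commit question raised in the thread: one possible workaround, outside DIH itself, is to send Solr's standard <commit/> update message to the update handler once the import has finished. The Python sketch below only builds the requests and does not send them. The base URL, the /dataimport handler path (it depends on how the handler is registered in solrconfig.xml), and both function names are assumptions made for illustration, not something taken from the thread.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL of the Solr instance; adjust for the actual deployment.
SOLR_BASE = "http://localhost:8983/solr"


def dih_import_url(command="delta-import", clean=False, commit=True, **extra):
    """Build the URL for a DIH run (hypothetical helper).

    Extra keyword arguments become additional request parameters, for
    example a value that the data-config passes on to a stored procedure.
    """
    params = {
        "command": command,
        "clean": str(clean).lower(),
        "commit": str(commit).lower(),
    }
    params.update(extra)
    # The /dataimport path is an assumption; it is whatever name the
    # DataImportHandler was given in solrconfig.xml.
    return f"{SOLR_BASE}/dataimport?{urlencode(params)}"


def explicit_commit_request():
    """Build (but do not send) a POST of <commit/> to the update handler."""
    return Request(
        f"{SOLR_BASE}/update",
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
```

To actually issue the commit, the request object would be passed to urllib.request.urlopen after the DIH status shows the import is complete. Whether this is preferable to applying the SOLR-2492 patch depends on the setup.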