|
|
-
Restoring a corrupt index
wallen@... 2004-08-16, 18:15
Dear fellow Luceners,

I had a disk failure while indexing and am now unable to get ANY of the documents stored in my index. I am interested in restoring as many documents as possible from what is a mostly complete index. Is there something I can alter by hand to at least get most of the data back?

I am getting an EOF error on the file/segment _cu0v, which was presumably the file that was being written when the index crashed. Is there a reference to that file in segments that I could edit out? I have included what I hope is useful information below.

Thank you,
Will

--------------------------------------------------------------------
This is the call stack from an optimize call:

    IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), false);
    ------> writer.optimize();
    logger.debug(writer.docCount() + "");
    writer.close();

------------Call Stack-----------------------
java.io.IOException: read past EOF
    at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
    at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:66)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:104)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:94)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:480)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:366)
    at TryStuff.tryFixingLuceneIndex(TryStuff.java:60)
    at TryStuff.main(TryStuff.java:49)

-------------Directory listing-------------
-rw-rw-r-- 1 wallen devs 383461 Jul 27 16:48 _1wtg.cfs
-rw-rw-r-- 1 wallen devs 754131765 Jul 27 21:12 _262q.cfs
-rw-rw-r-- 1 wallen devs 754345785 Jul 29 11:43 _4c49.cfs
-rw-rw-r-- 1 wallen devs 719608798 Jul 31 04:38 _6i6l.cfs
-rw-rw-r-- 1 wallen devs 773242798 Aug 2 03:05 _8o79.cfs
-rw-rw-r-- 1 wallen devs 791843591 Aug 3 12:13 _au8j.cfs
-rw-rw-r-- 1 wallen devs 77665301 Aug 3 14:35 _b21n.cfs
-rw-rw-r-- 1 wallen devs 79123000 Aug 3 17:49 _b9uk.cfs
-rw-rw-r-- 1 wallen devs 71718714 Aug 3 22:05 _bhnf.cfs
-rw-rw-r-- 1 wallen devs 81537292 Aug 4 02:50 _bpga.cfs
-rw-rw-r-- 1 wallen devs 80611946 Aug 4 07:44 _bx95.cfs
-rw-rw-r-- 1 wallen devs 77923836 Aug 4 13:23 _c523.cfs
-rw-rw-r-- 1 wallen devs 0 Aug 4 14:20 _caip.fnm
-rw-rw-r-- 1 wallen devs 79987096 Aug 4 15:29 _ccxt.cfs
-rw-rw-r-- 1 wallen devs 84966054 Aug 4 16:25 _ckqo.cfs
-rw-rw-r-- 1 wallen devs 90829602 Aug 4 19:14 _csjj.cfs
-rw-rw-r-- 1 wallen devs 7486317 Aug 4 19:23 _ctbm.cfs
-rw-rw-r-- 1 wallen devs 1148765 Aug 4 19:24 _ctef.cfs
-rw-rw-r-- 1 wallen devs 958149 Aug 4 19:27 _cth8.cfs
-rw-rw-r-- 1 wallen devs 909911 Aug 4 19:28 _ctk1.cfs
-rw-rw-r-- 1 wallen devs 918952 Aug 4 19:28 _ctmu.cfs
-rw-rw-r-- 1 wallen devs 957856 Aug 4 19:31 _ctpn.cfs
-rw-rw-r-- 1 wallen devs 651717 Aug 4 19:32 _ctsg.cfs
-rw-rw-r-- 1 wallen devs 790354 Aug 4 19:32 _ctv9.cfs
-rw-rw-r-- 1 wallen devs 890058 Aug 4 19:35 _cty2.cfs
-rw-rw-r-- 1 wallen devs 0 Aug 4 19:35 _cu0v.cfs
-rw-rw-r-- 1 wallen devs 891397 Aug 5 13:36 _cu3o.cfs
-rw-rw-r-- 1 wallen devs 1085511 Aug 5 13:40 _cu6h.cfs
-rw-rw-r-- 1 wallen devs 754877 Aug 5 13:40 _cu9b.cfs
-rw-rw-r-- 1 wallen devs 1610682 Aug 5 13:40 _cuc5.cfs
-rw-rw-r-- 1 wallen devs 1039577 Aug 5 13:41 _cuez.cfs
-rw-rw-r-- 1 wallen devs 831174 Aug 5 13:41 _cuht.cfs
-rw-rw-r-- 1 wallen devs 930858 Aug 5 13:56 _cuko.cfs
-rw-rw-r-- 1 wallen devs 911844 Aug 5 13:56 _cuni.cfs
-rw-rw-r-- 1 wallen devs 340 Aug 5 13:56 segments
-rw-rw-r-- 1 wallen devs 4 Aug 5 13:56 deletable
drwxrwxrwx 2 wallen devs 929792 Aug 5 13:56 .
drwxrwxr-x 5 wallen devs 40 Aug 10 14:13 ..
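
Editor's sketch: the directory listing in this message contains two zero-length files, _caip.fnm and _cu0v.cfs, and the EOF error points at the second one. With a directory this large it may be easier to let a small program pick out the empty files than to read the listing by eye. The class name ZeroLengthFiles and the method find are made up for this illustration, and an empty file is only a hint that a segment was interrupted, not proof.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: lists the zero-length files in an index directory. */
class ZeroLengthFiles {

    /** Returns the names of all regular files of length 0, sorted by path. */
    static List<String> find(File indexDir) {
        List<String> names = new ArrayList<>();
        File[] files = indexDir.listFiles();
        if (files == null) {
            return names; // not a directory, or not readable
        }
        Arrays.sort(files);
        for (File f : files) {
            if (f.isFile() && f.length() == 0) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : find(dir)) {
            System.out.println("0 bytes: " + name);
        }
    }
}
```

Run against a copy of the index directory; the sketch only reads file sizes and changes nothing.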
-
RE: Restoring a corrupt index
wallen@... 2004-08-16, 19:55
I fixed my own problem, but hope this might help someone else in the future:

I went into my segments file (with a hex editor), deleted the record for _cu0v and changed the length 0x20 to be 0x1f, and it seems I have most of my index back!

Maybe a developer could elaborate on this?
-
RE: Restoring a corrupt index
Honey George 2004-08-17, 14:35
Wallen,

Which hex editor did you use? I am also facing a similar problem. I tried to use KHexEdit and it doesn't seem to help. I am attaching my segments file to this email. I think only the segment with name _ung is a valid one; I wanted to delete the rest but couldn't. Can you help?

-George
-
RE: Restoring a corrupt index
wallen@... 2004-08-17, 14:39
http://www.ultraedit.com/ is the best!

However, I cannot imagine how another hex editor wouldn't work.
-
RE: Restoring a corrupt index
Honey George 2004-08-17, 14:39
I think attachments are filtered. This is what I see when I open in the hex editor.

0000:0000 00 04 e0 af 00 00 00 02 05 5f 36 75 6e 67 00 04 ..à¯....._6ung..
0000:0010 1e fb 05 5f 36 75 6e 69 00 00 00 01 00 00 00 00 .û._6uni........
0000:0020 00 00 c1 b4 ..Á´

-George
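
Editor's sketch: the 36 bytes George posted can be read field by field, which shows where each record starts. The layout assumed here is the older segments layout as I understand the Lucene 1.x file-format notes: a 4-byte name counter, a 4-byte segment count, then one record per segment (a VInt length, the name characters, a 4-byte segment size), optionally followed by an 8-byte version. That layout, the class name SegmentsDump and its methods are assumptions for illustration; files that start with a negative format marker are not handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical decoder for the hex dump in this thread (layout is an assumption). */
class SegmentsDump {

    /** The 36 bytes from the posted dump. */
    static final byte[] DUMP = hex(
        "00 04 e0 af 00 00 00 02 05 5f 36 75 6e 67 00 04 "
      + "1e fb 05 5f 36 75 6e 69 00 00 00 01 00 00 00 00 "
      + "00 00 c1 b4");

    /** Parses space-separated hex byte values. */
    static byte[] hex(String s) {
        String[] parts = s.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    /** VInt: 7 data bits per byte, low-order group first, high bit set means "more bytes". */
    static int readVInt(ByteBuffer in) {
        int b = in.get() & 0xFF;
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get() & 0xFF;
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Returns one description per field, in file order. ByteBuffer reads big-endian by default. */
    static List<String> describe(byte[] file) {
        List<String> out = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(file);
        out.add("nameCounter=" + in.getInt());
        int count = in.getInt();
        out.add("segmentCount=" + count);
        for (int i = 0; i < count; i++) {
            byte[] name = new byte[readVInt(in)]; // segment names here are plain ASCII
            in.get(name);
            out.add("segment " + new String(name, StandardCharsets.US_ASCII) + " size=" + in.getInt());
        }
        if (in.remaining() >= 8) {
            out.add("version=" + in.getLong());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describe(DUMP)) {
            System.out.println(line);
        }
    }
}
```

Under these assumptions the dump holds two records, _6ung and _6uni, each 10 bytes long and starting with the length byte 05.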
-
RE: Restoring a corrupt index
wallen@... 2004-08-17, 14:42
Change 02 to be 01 and delete the bytes that represent the one record that is bad. It was easier to see what a record was in my file because I had about 30 _files.
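
Editor's sketch: the same edit (lower the count by one and cut out the bad record) can be done on a byte array instead of in a hex editor. This assumes the older segments layout described in the Lucene 1.x file-format notes (4-byte counter, 4-byte segment count, records made of a VInt-length name plus a 4-byte size, then any trailing bytes). The class SegmentsEdit and its methods are hypothetical names, and the input bytes are the ones from George's dump. Work on a copy of the segments file, never the original.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical byte-level version of the hex-editor fix (layout is an assumption). */
class SegmentsEdit {

    /** Parses space-separated hex byte values. */
    static byte[] hex(String s) {
        String[] parts = s.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    /** Formats bytes as space-separated lowercase hex. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    /** VInt: 7 data bits per byte, low-order group first, high bit set means "more bytes". */
    static int readVInt(ByteBuffer in) {
        int b = in.get() & 0xFF;
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get() & 0xFF;
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Copies the file, skipping the record named badName and patching the segment count. */
    static byte[] dropSegment(byte[] file, String badName) {
        ByteBuffer in = ByteBuffer.wrap(file);
        ByteBuffer out = ByteBuffer.allocate(file.length);
        out.putInt(in.getInt());              // name counter, copied unchanged
        int count = in.getInt();
        int countPos = out.position();
        out.putInt(count);                    // overwritten below with the kept count
        int kept = 0;
        for (int i = 0; i < count; i++) {
            int start = in.position();
            byte[] name = new byte[readVInt(in)];
            in.get(name);
            in.getInt();                      // segment size, not needed here
            if (!new String(name, StandardCharsets.US_ASCII).equals(badName)) {
                out.put(file, start, in.position() - start);
                kept++;
            }
        }
        out.put(file, in.position(), in.remaining()); // trailing bytes, copied as-is
        out.putInt(countPos, kept);
        return Arrays.copyOf(out.array(), out.position());
    }

    public static void main(String[] args) {
        byte[] dump = hex("00 04 e0 af 00 00 00 02 05 5f 36 75 6e 67 00 04 "
                        + "1e fb 05 5f 36 75 6e 69 00 00 00 01 00 00 00 00 00 00 c1 b4");
        System.out.println(toHex(dropSegment(dump, "_6uni")));
    }
}
```

For the posted dump, dropping _6uni shrinks the file from 36 to 26 bytes and turns the count bytes 00 00 00 02 into 00 00 00 01, which matches the manual instruction in this message.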
-
RE: Restoring a corrupt index
Honey George 2004-08-18, 10:50
Looks like the problem is not with the hex editor; even in UltraEdit (I had access to a Windows box) I am seeing the same display. The problem is that I am not able to identify where a record starts with just 1 record in the file.

Need to try some alternate approach.

Thanks,
George
-
Re: Restoring a corrupt index
Erik Hatcher 2004-08-18, 10:55
The details of the segments file (and all the others) are freely available here:

http://jakarta.apache.org/lucene/docs/fileformats.html

Also, there is Java code in Lucene, of course, that manipulates the segments file which could be leveraged (although probably package scoped and not easily usable in a standalone repair tool).

Erik
-
Re: Restoring a corrupt index
Honey George 2004-08-18, 12:20
Thanks Erik, that worked. I was able to remove the corrupt segment and now it looks like the index is OK. I was able to view the number of documents in the index. Before that I was getting the error:

java.io.IOException: read past EOF

I am yet to find out how my index got corrupted. There is another thread going on about this topic:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03165.html

If anybody is facing a similar problem and is interested in the code, I can post it here.

Thanks,
George
-
RE: Restoring a corrupt index
Karthik N S 2004-08-18, 13:05
Hi guys,

In our situation we will be indexing millions and millions of documents, with many gigabytes of data indexed, which will finally be put into a MERGED INDEX, categorized accordingly.

There may be a possibility of corruption, so please do post the code.

Thx
Karthik
-
RE: Restoring a corrupt index
Honey George 2004-08-19, 08:38
This is what I did.

There are 2 classes in the Lucene source which are not public and therefore cannot be accessed from outside the package. The classes are:
1. org.apache.lucene.index.SegmentInfos - collection of segments
2. org.apache.lucene.index.SegmentInfo - represents a single segment

I took these two files and moved them to a separate folder. Then I created a class with the following code fragment.

    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    // Prints the names of all segments listed in the segments file.
    public void displaySegments(String indexDir) throws Exception
    {
        Directory dir = (Directory)FSDirectory.getDirectory(indexDir, false);
        SegmentInfos segments = new SegmentInfos();
        segments.read(dir);

        StringBuffer str = new StringBuffer();
        int size = segments.size();
        str.append("Index Dir = " + indexDir );
        str.append("\nTotal Number of Segments " + size);
        str.append("\n--------------------------------------");
        for(int i=0;i<size;i++)
        {
            str.append("\n");
            str.append((i+1) + ". ");
            str.append(((SegmentInfo)segments.get(i)).name);
        }
        str.append("\n--------------------------------------");
        System.out.println(str.toString());
    }

    // Removes the named segment from the segments file and writes the file back.
    public void deleteSegment(String indexDir, String segmentName) throws Exception
    {
        Directory dir = (Directory)FSDirectory.getDirectory(indexDir, false);
        SegmentInfos segments = new SegmentInfos();
        segments.read(dir);

        int size = segments.size();
        String name = null;
        boolean found = false;
        for(int i=0;i<size;i++)
        {
            name = ((SegmentInfo)segments.get(i)).name;
            if (segmentName.equals(name))
            {
                found = true;
                segments.remove(i);
                System.out.println("Deleted the segment with name " + name
                    + " from the segments file");
                break;
            }
        }
        if (found)
        {
            segments.write(dir);
        }
        else
        {
            System.out.println("Invalid segment name: " + segmentName);
        }
    }

Use the displaySegments() method to display the segments and deleteSegment() to delete the corrupt segment.

Thanks,
George
-
RE: Restoring a corrupt index
Karthik N S 2004-08-19, 09:41
Hi George,

Do you think the same would work for MERGED indexes? Can you please suggest a solution?

Karthik
-
RE: Restoring a corrupt index
Honey George 2004-08-19, 10:37
If I understand correctly, you have a situation where you have a large main index, and you create small indexes and finally merge them into the main index. It can happen that halfway through merging, the system crashes and the index gets corrupted. I do not think you can use my solution in this case.

What I am trying to do is remove a corrupt segment and its associated files from the index folder, not fix a corrupt segment. This way I can at least add new documents to the index. Of course, I am sure I didn't lose anything, because my index file size was actually 0 bytes.

Thanks,
George
-
restoring a corrupt index?
Ryan McKinley 2007-11-10, 21:01
Using solr, we have been running an indexing process for a while and when I checked on it today, it spits out an error:

java.lang.RuntimeException: java.io.FileNotFoundException: /path/to/index/_cf9.fnm (No such file or directory)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:584)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:475)

Looking through the archives, it looks like we are up a creek. Any thoughts on what could have caused this?

The log files contain some 'too many open files' errors; I can't tell if that corresponds with when the index went bad though. The startup script includes:

    ulimit -n 100000

which seems generous, no? It is a 22GB index; ls -l | wc shows 180K files (oh my).

So my questions:
1. Anything I can do to use this index while I rebuild another? (takes a long time!)
2. Does the ulimit number explain how the index got corrupted? If so, it seems like a problem.

thanks
ryan
|