Apparently far from last question :)
Tolga 2012-05-23, 10:44
Hi,
I put the lines

<mimeType name="application/x-excel">
  <plugin id="parse-tika" />
  <plugin id="feed" />
</mimeType>

in parse-plugins.xml, but I still can't crawl xls files. Why is that?
Regards,
Re: Apparently far from last question :)
Lewis John Mcgibbney 2012-05-23, 11:05
There is absolutely no requirement to add this configuration to this file. If you look at the XML file in question, one of the first XML configuration blocks says

<!-- by default if the mimeType is set to *, or if it can't be determined, use parse-tika -->
<mimeType name="*">
  <plugin id="parse-tika" />
</mimeType>

Just remove your unnecessary config and Tika will do the work for you :0)
Lewis
On Wed, May 23, 2012 at 11:44 AM, Tolga <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I put the lines <mimeType name="application/x-excel">
> <plugin id="parse-tika" />
> <plugin id="feed" />
> </mimeType>
>
> in parse-plugins.xml, but I still can't crawl xls files. Why is that?
>
> Regards,
-- Lewis
Re: Apparently far from last question :)
Tolga 2012-05-23, 11:19
I added that because I noticed the xls file wasn't being crawled. After I added it, the file still wasn't crawled.
Re: Apparently far from last question :)
Tolga 2012-05-23, 12:20
My colleague has just made me realize something. Is it possible that this xls file wasn't crawled because there isn't a link to it within the website?
Regards,
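
A crawler can only fetch URLs it has been told about, either as injected seeds or as links found on pages it has already fetched. If the file really is unlinked, one option is to inject its URL directly. The following is a minimal sketch, assuming a Nutch 1.x install with seed lists in a `urls/` directory and the CrawlDB in `crawl/crawldb`. Those paths, the helper name `add_seed_url` and the example URL are illustrative assumptions, not details from this thread.

```shell
# Path to the Nutch launcher script; bin/nutch is an assumed default.
NUTCH="${NUTCH:-bin/nutch}"
# Assumed CrawlDB location; adjust to your own crawl directory.
CRAWLDB="${CRAWLDB:-crawl/crawldb}"
# Assumed directory holding the seed list files.
SEED_DIR="${SEED_DIR:-urls}"

# Hypothetical helper: append one URL to the seed list, then inject the
# whole seed directory into the CrawlDB so the next fetch cycle knows it.
add_seed_url() {
  mkdir -p "$SEED_DIR"
  printf '%s\n' "$1" >> "$SEED_DIR/seed.txt"
  "$NUTCH" inject "$CRAWLDB" "$SEED_DIR"
}
```

For example, `add_seed_url http://www.example.com/files/report.xls` would add that (made-up) URL to `urls/seed.txt` and run the inject step. The URL still has to pass the configured URL filters before it is fetched.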
RE: Apparently far from last question :)
Markus Jelsma 2012-05-23, 12:26
You can inspect the CrawlDB with the readdb tool to check whether the URL is there.
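
As a rough sketch of the readdb check, assuming a Nutch 1.x layout where the launcher is `bin/nutch` and the CrawlDB lives in `crawl/crawldb` (both paths, the helper names and the example URL below are placeholders, not taken from this thread):

```shell
# Path to the Nutch launcher script; bin/nutch is an assumed default.
NUTCH="${NUTCH:-bin/nutch}"
# Assumed CrawlDB location; adjust to your own crawl directory.
CRAWLDB="${CRAWLDB:-crawl/crawldb}"

# Hypothetical helper: look up a single URL in the CrawlDB.
check_url_in_crawldb() {
  "$NUTCH" readdb "$CRAWLDB" -url "$1"
}

# Hypothetical helper: print overall CrawlDB statistics,
# including how many URLs are in each fetch status.
crawldb_stats() {
  "$NUTCH" readdb "$CRAWLDB" -stats
}
```

Running `check_url_in_crawldb http://www.example.com/files/report.xls` should print the stored entry if the crawler knows the URL. If the URL is not found, it was never discovered, which points to a missing link or seed rather than a parsing problem.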