Nutch, mail # user - Can't retrieve Tika parser for mime-type text/javascript


Re: Can't retrieve Tika parser for mime-type text/javascript
Markus Jelsma 2012-05-15, 11:04
I see, it doesn't work. The JSParser is known to work poorly, if at all.
Why do you want to parse JS anyway? It's not a very common practice
to do so.

On Monday 14 May 2012 01:35:01 forwardswing wrote:
> I modified the parse-plugins.xml snippet from:
> <mimeType name="text/javascript">
> <plugin id="parse-tika" />
> </mimeType>
>
> to:
> <mimeType name="text/javascript">
> <plugin id="parse-js" />
> </mimeType>
>
> but then another error occurs:
> Error parsing: http://10.31.8.29:8080/AWIsys/dtree.js: UNKNOWN!(-53,0):
> Content not JavaScript: 'text/javascript'
>  fetch of http://10.31.8.29:8080/AWIsys/dtree.js failed with:
> java.lang.ArrayIndexOutOfBoundsException: -53
>
> Error parsing: http://10.31.8.29:8080/AWIsys/main.js: UNKNOWN!(-53,0):
> Content not JavaScript: 'text/javascript'
> fetch of http://10.31.8.29:8080/AWIsys/main.js failed with:
> java.lang.ArrayIndexOutOfBoundsException: -53
>
> Error parsing: http://10.31.8.29:8080/AWIsys/Progress.js: UNKNOWN!(-53,0):
> Content not JavaScript: 'text/javascript'
> fetch of http://10.31.8.29:8080/AWIsys/Progress.js failed with:
> java.lang.ArrayIndexOutOfBoundsException: -53
>
> Error parsing: http://10.31.8.29:8080/AWIsys/table_sorter_script.js:
> UNKNOWN!(-53,0): Content not JavaScript: 'text/javascript'
> fetch of http://10.31.8.29:8080/AWIsys/table_sorter_script.js failed with:
> java.lang.ArrayIndexOutOfBoundsException: -53
>
>
> What's the meaning of "-53"?
>
> If necessary, I can provide the JS files.
>
> Thank you for your help.
>
--
Markus Jelsma - CTO - Openindex
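
On the "-53" question quoted above: the thread does not answer it, but one plausible reading is integer narrowing. The sketch below assumes (this is not confirmed in the thread) that the parser passes a status constant of 203 into a field stored as a Java byte, and that the status name is then looked up in a small array. The class name, method names, and array contents are hypothetical and only illustrate the arithmetic; they are not Nutch's actual code.

```java
// Hypothetical sketch: how a status value of 203 can surface as -53.
// None of these names are taken from Nutch; they are illustration only.
public class ByteNarrowingSketch {

    // Assumed lookup table of major status names (hypothetical contents).
    static final String[] MAJOR_CODES = {"notparsed", "success", "failed"};

    // Narrowing an int to a byte keeps only the low 8 bits,
    // so 203 wraps around to 203 - 256 = -53.
    static byte narrow(int code) {
        return (byte) code;
    }

    // Guarded formatting: out-of-range codes are reported as UNKNOWN
    // instead of being used as an array index.
    static String describe(byte major) {
        if (major < 0 || major >= MAJOR_CODES.length) {
            return "UNKNOWN!(" + major + ",0)";
        }
        return MAJOR_CODES[major] + "(" + major + ",0)";
    }

    public static void main(String[] args) {
        byte major = narrow(203);
        System.out.println(describe(major)); // prints UNKNOWN!(-53,0)

        try {
            // Unguarded lookup with the same value: a negative index throws.
            System.out.println(MAJOR_CODES[major]);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

If that assumption holds, both the "UNKNOWN!(-53,0)" text and the ArrayIndexOutOfBoundsException in the log would come from the same out-of-range status code rather than from the contents of the .js files.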