Nutch, mail # user - Exception org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/nutch/1.4/runtime/local/crawl/segments/20111209174842/parse_data


Re: Exception org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/nutch/1.4/runtime/local/crawl/segments/20111209174842/parse_data
remi tassing 2011-12-23, 13:52
Is just deleting the segment folders enough?

On Fri, Dec 23, 2011 at 3:49 PM, Markus Jelsma <[EMAIL PROTECTED]> wrote:

> You have to get rid of the bad segments; they cannot be recovered. With
> Nutch 1.x it is never a good idea to use extremely large segments that
> take days to run.
>
> On Friday 23 December 2011 14:45:39 remi tassing wrote:
> > My computer shut down yesterday and I'm having the same problem. The
> > problem this time is that I can't just delete everything and start
> > again: I've been crawling for days!
> >
> > Any other ways to handle this? Remove segments? Sanitize the database?
> >
> > On Sat, Dec 10, 2011 at 3:54 PM, M.Rizwan <[EMAIL PROTECTED]> wrote:
> > > Thanks Remi. Yes, not a good solution, but this worked for me too.
> > >
> > > Thanks for sharing.
> > >
> > > On Fri, Dec 9, 2011 at 5:13 PM, remi tassing <[EMAIL PROTECTED]> wrote:
> > > > Sorry, I forgot to change the title...
> > > >
> > > > However, I had the same error, "Exception
> > > > org.apache.hadoop.mapred.InvalidInputException: Input path does not
> > > > exist: file:/home/nutch/1.4/runtime/local/crawl/segments/...", this
> > > > morning.
> > > >
> > > > I believe it's because I stopped Nutch while it was crawling and the
> > > > data were not saved properly.
> > > >
> > > > I couldn't find an alternative and just had to delete my "crawl"
> > > > folder; then it worked... Not a good solution!
> > > >
> > > > On Fri, Dec 9, 2011 at 2:08 PM, Lewis John Mcgibbney <[EMAIL PROTECTED]> wrote:
> > > > > Hi Remi,
> > > > >
> > > > > Please don't hijack someone's thread, start your own.
> > > > >
> > > > > Thank you
> > > > >
> > > > > Lewis
> > > > >
> > > > > On Fri, Dec 9, 2011 at 8:26 AM, remi tassing <[EMAIL PROTECTED]> wrote:
> > > > > > Hello guys,
> > > > > >
> > > > > > How do you use "org.apache.nutch.net.URLFilterChecker"? It's not
> > > > > > documented, and it always shows me "Checking combination of all
> > > > > > URLFilters available" and then gets stuck.
> > > > > >
> > > > > > Remi
> > > > >
> > > > > --
> > > > > *Lewis*
> > > >
> > > > --
> > > > Remi Tassing
>
> --
> Markus Jelsma - CTO - Openindex
>

--
Remi Tassing
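
For anyone in the same situation, here is a rough sketch of how the bad segments could be located before deleting them, so that the rest of the crawl folder can be kept. It is not part of Nutch; the function name and the list of expected subdirectories are my own assumptions, based on the usual layout of a fetched and parsed Nutch 1.x segment. A segment that has not been parsed yet, or a crawl configured not to store content, will legitimately lack some of these directories, so treat the result as a list of candidates to inspect, not as a verdict.

```python
from pathlib import Path

# Assumption: subdirectories normally present in a Nutch 1.x segment once it
# has been generated, fetched and parsed. Adjust to match your own setup.
REQUIRED_SUBDIRS = (
    "content",
    "crawl_fetch",
    "crawl_generate",
    "crawl_parse",
    "parse_data",
    "parse_text",
)


def find_incomplete_segments(segments_dir):
    """Return {segment name: [missing subdirectories]} for incomplete segments.

    Hypothetical helper: it only inspects the directory layout and never
    deletes or modifies anything.
    """
    incomplete = {}
    for segment in sorted(Path(segments_dir).iterdir()):
        if not segment.is_dir():
            continue  # ignore stray files next to the segment directories
        missing = [
            name for name in REQUIRED_SUBDIRS if not (segment / name).is_dir()
        ]
        if missing:
            incomplete[segment.name] = missing
    return incomplete
```

Calling `find_incomplete_segments("crawl/segments")` (the path is only an example) returns a dictionary keyed by segment name. A segment reported as missing `parse_data` is the kind that triggers the "Input path does not exist" exception above, and is a candidate for removal as Markus suggests.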