Nutch, mail # user - crawldb modifications


Re: crawldb modifications
Markus Jelsma 2012-02-28, 11:51
I may be missing something but rm -r crawl/crawldb works fine here.
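
For anyone who wants a guard around that rm, here is a minimal sketch. It assumes the crawl/crawldb layout from the path above; the function name reset_crawldb is made up for this example and is not part of Nutch. It removes only the crawldb directory and leaves everything else under the crawl directory (segments, for example) alone.

```shell
#!/bin/sh
# Sketch: delete only the crawldb under a crawl directory.
# Assumption: the crawldb lives at <crawl_dir>/crawldb, as in the
# path used above. reset_crawldb is a hypothetical helper name.
reset_crawldb() {
    crawl_dir="${1:-crawl}"      # default to ./crawl
    db="$crawl_dir/crawldb"

    # Refuse to do anything if there is no crawldb directory there.
    if [ ! -d "$db" ]; then
        echo "no crawldb at $db, nothing to remove" >&2
        return 1
    fi

    rm -r -- "$db"
    echo "removed $db"
}

# Usage (uncomment to run against ./crawl):
# reset_crawldb crawl
```

As far as I know the next inject run creates a fresh crawldb at that path, so nothing else should be needed before re-injecting the production seed list.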

On Tuesday 28 February 2012 07:03:39 remi tassing wrote:
> What I do in this case is erase the db, run the mergesegs command with
> the -filter option, and then run updatedb.
>
> I would love to know if there is a simpler way.
>
> Remi
>
> On Monday, February 27, 2012, Charles Thomas <[EMAIL PROTECTED]> wrote:
> > Is there a way to clear out the various databases that Nutch uses (e.g.
> > crawldb)?  I did some testing which injected a lot of URLs into the DB that
> > I want to clear out as I move toward production.
> >
> > Thanks!
> >
> > CT
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/crawldb-modifications-tp3781740p3781740.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.

--
Markus Jelsma - CTO - Openindex