Re: crawldb modifications
remi tassing 2012-02-28, 12:04
I think he meant to remove some specific URLs, not everything.
On Tue, Feb 28, 2012 at 1:51 PM, Markus Jelsma <[EMAIL PROTECTED]> wrote:
> I may be missing something but rm -r crawl/crawldb works fine here.
>
> On Tuesday 28 February 2012 07:03:39 remi tassing wrote:
> > What I do in this case is to erase the db, use the command mergesegs with
> > the -filter option and then updatedb.
> >
> > I would love to know if there is a simpler way
> >
> > Remi
> >
> > On Monday, February 27, 2012, Charles Thomas <[EMAIL PROTECTED]> wrote:
> > > Is there a way to clear out the various databases that Nutch uses (e.g.
> > > crawldb)? I did some testing which injected a lot of URLs into the DB that
> > > I want to clear out as I move toward production.
> > >
> > > Thanks!
> > >
> > > CT
> > >
> > > --
> > > View this message in context:
> > > http://lucene.472066.n3.nabble.com/crawldb-modifications-tp3781740p3781740.html
> > > Sent from the Nutch - User mailing list archive at Nabble.com.
>
> --
> Markus Jelsma - CTO - Openindex
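
For reference, the procedure Remi describes (erase the db, mergesegs with -filter, then updatedb) might look roughly like the sketch below. This is only an illustration under assumptions that are not in the thread: the crawl/ directory layout, the crawl/segments_filtered output directory, and the test.example.com pattern are hypothetical placeholders. The script is a dry run by default and only prints the commands; it also assumes the generated filter rules have been copied into Nutch's conf/regex-urlfilter.txt before anything is run for real.

```shell
#!/bin/sh
# Sketch: remove specific URLs from a Nutch crawldb by filtering the segments
# and rebuilding the db. All paths and the URL pattern are hypothetical.
CRAWL_DIR="${CRAWL_DIR:-crawl}"
NUTCH="${NUTCH:-bin/nutch}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# URL filter rules: the first matching rule wins.
# '-' rejects a URL, '+' accepts it.
# Copy these rules into conf/regex-urlfilter.txt so that -filter applies them.
cat > regex-urlfilter.txt <<'EOF'
# reject the test URLs that were injected earlier (hypothetical pattern)
-^https?://test\.example\.com/
# accept everything else
+.
EOF

run() {
  # Always show the command; execute it only when DRY_RUN=0.
  echo "$@"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# 1. Erase the existing crawldb (as in Markus's reply).
run rm -r "$CRAWL_DIR/crawldb"

# 2. Merge the segments, dropping every URL the filter rejects.
run "$NUTCH" mergesegs "$CRAWL_DIR/segments_filtered" -dir "$CRAWL_DIR/segments" -filter

# 3. Rebuild the crawldb from the filtered segments.
run "$NUTCH" updatedb "$CRAWL_DIR/crawldb" -dir "$CRAWL_DIR/segments_filtered"
```

Running it as-is just prints the three commands for review; setting DRY_RUN=0 from the Nutch installation directory would actually execute them, so back up the crawl directory first.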