1) success 2) how to tell Nutch "index everything"
Fred Zimmerman 2011-10-26, 14:37
1) I resolved the issues with solrindex. It turned out to be a matter of
adding all the Nutch-specific schema fields to Solr's schema.xml. There was
one gotcha: the latest Solr schema does not have a default
fieldtype "text" as in the Nutch 1.3 schema.xml; you must use "text_general". A
comment for developers: copying the Nutch schema over the Solr one only
works for people who are beginning their
indexing with a crawl. More detailed instructions on how to modify
solr/schema.xml for Nutch would be helpful, or better yet, a script to add
the appropriate fields.
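
For what it's worth, the kind of script I have in mind could look roughly like the Python sketch below. It is only an illustration under some assumptions: that both schema files keep their <field> entries under a <fields> element, and that "text" is the only fieldtype that needs renaming. The function name merge_nutch_fields and the type_map parameter are made up for this example; they are not part of Nutch or Solr.

```python
import xml.etree.ElementTree as ET


def merge_nutch_fields(nutch_schema_xml, solr_schema_xml, type_map=None):
    """Return the Solr schema XML with any missing Nutch <field> entries added.

    type_map renames fieldtypes that do not exist in the target schema.
    The default mapping ("text" -> "text_general") is an assumption based
    on the gotcha described above.
    """
    if type_map is None:
        type_map = {"text": "text_general"}

    nutch_root = ET.fromstring(nutch_schema_xml)
    solr_root = ET.fromstring(solr_schema_xml)

    # Assumes the target schema groups its fields under <fields>.
    solr_fields = solr_root.find("fields")
    if solr_fields is None:
        solr_fields = ET.SubElement(solr_root, "fields")

    # Fields already defined in the Solr schema are left untouched.
    existing = {f.get("name") for f in solr_fields.findall("field")}

    for field in nutch_root.iter("field"):
        name = field.get("name")
        if name is None or name in existing:
            continue
        attrs = dict(field.attrib)
        if "type" in attrs:
            attrs["type"] = type_map.get(attrs["type"], attrs["type"])
        ET.SubElement(solr_fields, "field", attrs)
        existing.add(name)

    return ET.tostring(solr_root, encoding="unicode")


if __name__ == "__main__":
    nutch = '<schema><fields><field name="content" type="text"/></fields></schema>'
    solr = '<schema><fields><field name="id" type="string"/></fields></schema>'
    print(merge_nutch_fields(nutch, solr))
```

Note that ElementTree drops XML comments when parsing, so the output loses the comments from the original schema.xml, and the sketch does not check that the other Nutch fieldtypes exist in the Solr schema. Back up the file and review the result by hand before using it.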
2) Is there a way to tell Nutch to index everything at a given site? I am
crawling a couple of my own sites and it seems rather clumsy just to give
Nutch a big "TopN" value. Wouldn't an "all" value be helpful?