I am currently attempting to dump the contents of a crawl into multiple
WARC files using

./bin/nutch commoncrawldump -outputDir nameOfOutputDir -segment
crawl/segments/segmentDir -warc

However, I get multiple occurrences of

URL skipped. Content of size X was truncated to Y.

I have set both http.content.limit and file.content.limit to -1 in order
to remove any limits, but I'm guessing neither applies in this
situation. Is there any way to remove this cap?
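For reference, Nutch configuration overrides like these normally go in conf/nutch-site.xml. A minimal sketch of the overrides described above (the property names come from the question; whether they cover the WARC export path is exactly what's in doubt):

```xml
<!-- conf/nutch-site.xml: sketch of the overrides mentioned above; -1 disables the size limit -->
<configuration>
  <property>
    <name>http.content.limit</name>
    <value>-1</value>
  </property>
  <property>
    <name>file.content.limit</name>
    <value>-1</value>
  </property>
</configuration>
```

Note that these limits are applied at fetch time, so if they were raised only after the crawl ran, the existing segments may already contain truncated content and would need to be re-fetched regardless of what the dump tool does.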
