Nutch, mail # user - Using multi cores on local machines


Marek Bachmann 2011-06-10, 12:41
MilleBii 2011-06-10, 13:57
Marek Bachmann 2011-06-10, 14:44
Re: Using multi cores on local machines
Andrzej Bialecki 2011-06-10, 14:25
On 6/10/11 3:57 PM, MilleBii wrote:
> Hadoop is using a map/reduce algorithm, the reduce phase is that phase which
> collects the results from parallel execution.
> It is inherently not possible to parallelize that phase.

Actually, this is not true at all - it's perfectly ok to have multiple
reduce tasks and have them run in parallel.

The reason it didn't work in this case is the LocalJobRunner: it's
limited to running only one map and one reduce task at a time, because
it's not meant to be used for anything serious.

In order to have multiple tasks running in parallel you need to use the
distributed JobTracker/TaskTracker, even if it's just a single node.
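
As a rough illustration of what that switch could look like, here is a
minimal sketch of a conf/mapred-site.xml for a single-node setup. It
assumes a Hadoop 0.20-era installation; the host and port (localhost:9001)
and the slot counts of 4 are placeholder values to adjust to your machine,
not settings taken from this thread.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes Hadoop 0.20-era property names. -->
<configuration>
  <!-- The default value "local" selects the LocalJobRunner. Pointing this
       at a host:port makes jobs go to a real JobTracker instead.
       localhost:9001 is an assumed example address. -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <!-- Number of map tasks one TaskTracker may run at the same time.
       4 is an assumed value, e.g. for a quad-core machine. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Number of reduce tasks one TaskTracker may run at the same time. -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Default number of reduce tasks per job; with the default of 1 only
       one reducer would run regardless of the free slots. -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>4</value>
  </property>
</configuration>
```

With a configuration along these lines, the JobTracker and TaskTracker
daemons have to be running on the node before a job is submitted;
otherwise the client has nothing to connect to.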

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Julien Nioche 2011-06-10, 14:26
Marek Bachmann 2011-06-10, 14:50
Ken Krugler 2011-06-10, 15:51
MilleBii 2011-06-13, 18:55