Mahout, mail # user - Re: Parallel ALS-WR on very large matrix -- crashing (I think)


Re: Parallel ALS-WR on very large matrix -- crashing (I think)
Nicholas Kolegraff 2012-02-02, 16:37
Sounds good. Thanks Sebastian

The interesting thing is -- I tried sampling the matrix down once to
about 10% of the non-zeros -- and it worked with no problem.

On Thu, Feb 2, 2012 at 8:31 AM, Sebastian Schelter <[EMAIL PROTECTED]> wrote:

> Your parameters look good, except if you have binary data, you should
> set --implicitFeedback=true. You could also set numFeatures to a very
> small value (like 5) just to see if that helps.
>
> Each mapper loads one of the feature matrices into memory, and these are
> dense (#items x #features entries or #users x #features entries). Are you
> sure that the mappers have enough memory for that?
>
> It's really strange that you have problems with such small data. I
> tested this with Netflix (> 100M non-zeros) on a few machines and it
> worked quite well.
>
> --sebastian
>
>
>
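A rough sketch of the memory arithmetic behind the question above. It assumes each entry of the dense feature matrix is stored as an 8-byte double and ignores JVM object overhead, so real usage will be higher. The row count is hypothetical, since the thread does not state the number of users or items; `dense_feature_matrix_bytes` and `to_mib` are illustrative names, not part of Mahout.

```python
def dense_feature_matrix_bytes(num_rows, num_features, bytes_per_entry=8):
    """Lower bound on the size of a dense (num_rows x num_features) matrix.

    Assumes one 8-byte double per entry and no per-object overhead.
    """
    return num_rows * num_features * bytes_per_entry


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024.0 * 1024.0)


if __name__ == "__main__":
    hypothetical_rows = 1000000  # assumed count of users (or items)
    for num_features in (5, 25):
        size = dense_feature_matrix_bytes(hypothetical_rows, num_features)
        print("numFeatures=%d: %d bytes (%.1f MiB)"
              % (num_features, size, to_mib(size)))
```

Under these assumptions the matrix grows linearly with --numFeatures, so dropping from 25 features to 5 cuts this lower bound by a factor of five. Comparing the result against the heap actually available to each mapper is the check being suggested.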
> On 02.02.2012 17:25, Nicholas Kolegraff wrote:
> > I will up the ante with the timeout and report back -- thanks all for the
> > suggestions
> >
> > Hey, Sebastian -- Here are the arguments I am using:
> > --input matrix --output ALS --numFeatures 25 --numIterations 10 --lambda 0.065
> > When the mapper loads the matrix into memory it only loads the actual
> > non-zero data, correct?
> >
> > Hey Ted -- I messed up on the sparsity.  Turns out there are only 70M
> > non-zero elements.
> >
> > Oh, and I only have binary data -- I wasn't sure of the implications with
> > ALS-WR on binary data -- I couldn't find anything to suggest otherwise.
> > I am using data of the format user,item,1
> > I have read about probabilistic factorization -- which works with binary
> > data -- and, perhaps naively, thought ALS-WR was similar, so what-the-heck :-)
> >
> > I'd love nothing more than to share the data, however, I'd probably get in
> > some trouble :-)
> > Perhaps I could generate a matrix with a similar distribution? -- I'll have
> > to check on that and see if it is ok #bureaucracy
> >
> > Stay tuned...
> >
> > On Thu, Feb 2, 2012 at 1:47 AM, Sebastian Schelter <[EMAIL PROTECTED]> wrote:
> >
> >> Nicholas,
> >>
> >> can you give us the detailed arguments you start the job with? I'd
> >> especially be interested in the number of features (--numFeatures) you
> >> use. Do you use the job with implicit feedback data
> >> (--implicitFeedback=true)?
> >>
> >> The memory requirements of the job are the following:
> >>
> >> In each iteration either the item-features matrix (items x features) or
> >> the user-features matrix (users x features) is loaded into the memory of
> >> each mapper. Then the original user-item matrix (or its transpose) is
> >> read row-wise by the mappers and they recompute the features via
> >> AlternatingLeastSquaresSolver/ImplicitFeedbackAlternatingLeastSquaresSolver.
> >>
> >> --sebastian
> >>
> >>
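To make the per-row recomputation described above concrete, here is a minimal pure-Python sketch of one explicit-feedback ALS-WR update for a single user. It assumes the usual weighted-lambda formulation, in which the regularization term is scaled by the number of ratings in the row. It is not Mahout's AlternatingLeastSquaresSolver; `solve_linear_system` and `recompute_user_features` are hypothetical names used only for illustration.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are left untouched.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def recompute_user_features(item_features, ratings, lam):
    """Recompute one user's feature vector from the fixed item features.

    item_features: dict item_id -> list of floats (the dense matrix in memory)
    ratings:       dict item_id -> rating (only this user's non-zero entries)
    lam:           regularization parameter (the --lambda argument)
    """
    num_features = len(next(iter(item_features.values())))
    num_rated = len(ratings)
    a = [[0.0] * num_features for _ in range(num_features)]
    b = [0.0] * num_features
    # Only the items this user rated contribute to the normal equations.
    for item_id, rating in ratings.items():
        f = item_features[item_id]
        for i in range(num_features):
            b[i] += f[i] * rating
            for j in range(num_features):
                a[i][j] += f[i] * f[j]
    # Weighted-lambda regularization: lambda scaled by the rating count.
    for i in range(num_features):
        a[i][i] += lam * num_rated
    return solve_linear_system(a, b)


if __name__ == "__main__":
    items = {1: [1.0, 0.0], 2: [0.0, 1.0]}
    print(recompute_user_features(items, {1: 3.0, 2: 4.0}, 0.065))
```

The sketch also shows the two memory profiles in play: the rating row is sparse and holds only non-zero entries, while the fixed feature matrix it is solved against is held in full.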
> >> On 02.02.2012 09:53, Sean Owen wrote:
> >>> I have seen this happen in "normal" operation when the sorting on the
> >>> mapper is taking a long long time, because the output is large. You can
> >>> tell it to increase the timeout.  If this is what is happening, you won't
> >>> have a chance to update a counter as a keep-alive ping, but yes that is
> >>> generally right otherwise. If this is the case it's that a mapper is
> >>> outputting a whole lot of info, perhaps 'too much'. I don't know for sure,
> >>> just another guess for the pile.
> >>>
> >>> On Thu, Feb 2, 2012 at 1:44 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Status reporting happens automatically when output is generated.  In a long
> >>>> computation, it is good form to occasionally update a counter or otherwise
> >>>> indicate that the computation is still progressing.
> >>>>
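The keep-alive idea above can be sketched as follows. This is an analogy, not the Mahout job itself: it assumes a Hadoop Streaming task written in Python, where a line of the form `reporter:counter:<group>,<counter>,<amount>` written to stderr updates a counter and so signals that the task is still alive. A Java mapper would instead call something like `context.progress()` or increment a counter on its context. The group and counter names and the `report_progress` helper are hypothetical.

```python
import io
import sys


def report_progress(rows_done, every=1000, stream=sys.stderr):
    """Emit a Hadoop Streaming counter update once per `every` rows.

    Returns True when an update was written, False otherwise.
    The "ALS" group and "RowsProcessed" counter names are illustrative.
    """
    if rows_done > 0 and rows_done % every == 0:
        stream.write("reporter:counter:ALS,RowsProcessed,%d\n" % every)
        stream.flush()
        return True
    return False


if __name__ == "__main__":
    # Simulate a long computation and capture what would go to stderr.
    captured = io.StringIO()
    for row in range(1, 3001):
        report_progress(row, every=1000, stream=captured)
    print(captured.getvalue(), end="")
```

Reporting once per batch of rows rather than on every row keeps the overhead small while still updating well inside the task timeout. As noted in the thread, this does not help when the time is spent in the framework's sort phase, where user code does not get a chance to run.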
> >>>> On Wed, Feb 1, 2012 at 5:23 PM, Nicholas Kolegraff <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Do you know if it should still report status in the midst of a complex
> >>>>> task?  Seems questionable that it wouldn't just send a friendly hello?