Mahout, mail # user - Logistic Regression: number of positives and negatives


Re: Logistic Regression: number of positives and negatives
Ted Dunning 2011-07-11, 14:29
Downsampling negatives should make little difference to accuracy. It can, however, substantially affect training time.
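
For illustration, here is a minimal Python sketch of what downsampling negatives could look like. It is not Mahout code, and the names `downsample_negatives`, `correct_probability`, and `keep_rate` are hypothetical, chosen only for this example. It assumes each training example is a `(features, label)` pair with label 1 for positives and 0 for negatives. Because the goal in this thread is a predicted probability, the sketch also includes the standard odds correction: keeping a fraction `keep_rate` of the negatives divides the true odds by `keep_rate`, so a probability from a model trained on the downsampled data can be mapped back by multiplying its odds by `keep_rate`.

```python
import random


def downsample_negatives(examples, keep_rate, seed=0):
    """Keep every positive (label 1); keep each negative (label 0)
    independently with probability keep_rate."""
    rng = random.Random(seed)
    return [(x, y) for x, y in examples if y == 1 or rng.random() < keep_rate]


def correct_probability(p_sampled, keep_rate):
    """Map a probability predicted by a model trained on downsampled data
    back to the original class balance.

    Downsampling negatives inflates the odds by 1 / keep_rate, so the
    corrected odds are odds_sampled * keep_rate.
    """
    return p_sampled / (p_sampled + (1.0 - p_sampled) / keep_rate)


# Hypothetical data: 10 positives and 1000 negatives.
examples = [(i, 1) for i in range(10)] + [(i, 0) for i in range(1000)]
sampled = downsample_negatives(examples, keep_rate=0.1)
```

With `keep_rate=1.0` the correction leaves the probability unchanged, which is a quick sanity check that nothing is adjusted when all negatives are kept.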


On Jul 11, 2011, at 6:56, Svetlomir Kasabov <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I plan to use logistic regression to predict the probability that a patient will be given a drug Y. The problem is that patients don't get that drug very often, so I have many more training examples with Y=0 than with Y=1. Do you think I should keep the number of negative examples equal to the number of positive examples? Or should I ignore the difference and give my logistic regression model all of the training examples?
>
> Thanks!
>
> Svetlomir.