Course: Advanced Predictive Modeling: Mastering Ensembles and Metamodeling


Curse of dimensionality

- [Instructor] Let's briefly talk about what I think we'd all agree is a modeling fundamental, but one I think is important to include in our discussion of ensembles. As we revisit the miles per gallon and weight scatter plot, we can contemplate how we can better tackle bias here. Well, we can increase the variance by adding variables. We can go to a more flexible model, like a curvilinear fit. But we want to be careful. Sometimes, if we have too much faith in the algorithm, we start throwing all of our variables at the problem. I really love the way Gordon Linoff and Michael Berry put this in their book Data Mining Techniques. They remind us that in data mining, having more data is better. More variables give models more power; they make it possible to capture more nuances of customer behavior and to build stable models. But as any lover of dessert knows, they remind us, more is not always better. The same may be true of data mining, particularly in regard to the number of variables. So…
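To make the caution concrete, here is a minimal sketch in Python with NumPy. It is not taken from the course: the data are synthetic stand-ins for the miles per gallon and weight scatter plot, the coefficients are invented, and names such as `make_data`, `design`, and `fit_and_score` are hypothetical. It fits a curvilinear (quadratic) least-squares model and then keeps appending irrelevant variables, so the training error can be compared with the error on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

N_TRAIN, N_TEST, MAX_NOISE_VARS = 40, 200, 30


def make_data(n):
    """Synthetic mpg-vs-weight data (assumed shape and coefficients, not real cars)."""
    weight = rng.uniform(1.5, 5.0, size=n)  # hypothetical weight, in 1000s of lbs
    mpg = 50.0 - 12.0 * weight + 1.2 * weight**2 + rng.normal(0.0, 2.0, size=n)
    noise_vars = rng.normal(size=(n, MAX_NOISE_VARS))  # variables unrelated to mpg
    return weight, mpg, noise_vars


def design(weight, extra):
    """Columns: intercept, weight, weight^2 (the curvilinear term), then extras."""
    return np.column_stack([np.ones_like(weight), weight, weight**2, extra])


w_tr, y_tr, z_tr = make_data(N_TRAIN)
w_te, y_te, z_te = make_data(N_TEST)


def fit_and_score(k):
    """Fit ordinary least squares using the first k irrelevant variables."""
    x_tr = design(w_tr, z_tr[:, :k])
    x_te = design(w_te, z_te[:, :k])
    coef, *_ = np.linalg.lstsq(x_tr, y_tr, rcond=None)
    train_mse = float(np.mean((x_tr @ coef - y_tr) ** 2))
    test_mse = float(np.mean((x_te @ coef - y_te) ** 2))
    return train_mse, test_mse


results = []
for k in (0, 5, 15, 30):
    train_mse, test_mse = fit_and_score(k)
    results.append((k, train_mse, test_mse))
    print(f"extra variables={k:2d}  train MSE={train_mse:7.3f}  test MSE={test_mse:7.3f}")
```

Because each model's columns contain the previous model's columns, the training error can only stay flat or fall as variables are added. The held-out error carries no such guarantee and, in a setup like this one, will typically get worse as the irrelevant variables pile up, though the exact numbers depend on the random seed.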
