Course: Predictive Analytics Essential Training: Data Mining
Selecting relevant data
- [Instructor] This next topic is one of my favorites because there's so much confusion around it. Folks often think that it's easier or better, or both, to use all of your data. When big data first became a popular phrase, a widely read book came out that tried to suggest that drawing a sample from a population was old-fashioned, and that the only reason we used a sample was that computers at the time couldn't handle large datasets. That creates the image that we just throw the data in and let the algorithm figure it out. We still sample for lots of reasons. One good one: you wouldn't want to drain the whole river to test the water. But there are other reasons you can't or shouldn't use all the data. So our next element is that you have to be thoughtful about the data that you select. And here, we're focused on selecting the cases or instances, in other words, the rows of the dataset. The most important reason that we…
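To make the idea of selecting rows concrete, here is a minimal sketch of drawing a simple random sample of cases from a dataset. It is not taken from the course: the `sample_rows` helper, the `case_id` and `value` fields, and the sample size are all hypothetical, and simple random sampling is only one of several ways to select cases.

```python
import random


def sample_rows(rows, n, seed=0):
    """Return a simple random sample of up to n rows, without replacement.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(rows, min(n, len(rows)))


# Hypothetical dataset: one dict per case (row).
rows = [{"case_id": i, "value": i * 2} for i in range(1000)]

subset = sample_rows(rows, 100)
print(len(subset))
```

Fixing the seed is a design choice that makes the selection reproducible, so the same rows can be drawn again if the analysis is rerun.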
Contents
- Understanding data requirements (1 min 9 s)
- Gathering historical data (1 min 45 s)
- Meeting the flat file requirement (1 min 42 s)
- Determining your target variable (1 min 40 s)
- Selecting relevant data (3 min 14 s)
- Hints on effective data integration (2 min 49 s)
- Understanding feature engineering (2 min 45 s)
- Developing your craft (1 min 20 s)