Course: Machine Learning with Python: Foundations

Sampling your data

- [Instructor] As we prepare our data for machine learning, we sometimes have to reduce the number of rows in our data or split the data into two or more partitions. We do this because the data we have is too large or too complex to use in its current form, or because we need to hold on to some of our data for later use. In supervised machine learning, our goal is to create a model that maps a given input, which we call the independent variables, to a given output, which we call the dependent variable. To properly evaluate whether our model is learning, we have to get an unbiased estimate of its performance using data that it has not previously seen. To do this, we must first split our previously labeled historical data into training and test datasets. We hold out the test data and use the training data to build, or train, our model. Then we evaluate our model's performance using the test data. There are several ways…
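
The hold-out split described above can be sketched in a few lines of plain Python. This is only an illustration, not the instructor's own code: the helper name `train_test_split_rows`, the 25% test fraction, the seed, and the toy `labeled_rows` data are all assumptions made for the example. In practice, a library routine such as scikit-learn's `train_test_split` is the more common choice.

```python
import random


def train_test_split_rows(rows, test_fraction=0.25, seed=42):
    """Randomly partition rows into (train, test) lists.

    Hypothetical helper for illustration: shuffles the row indices with a
    fixed seed, then holds out `test_fraction` of the rows as the test set.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable

    n_test = int(round(len(rows) * test_fraction))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


# Toy labeled data (assumed): each row pairs the independent variables
# with the dependent variable.
labeled_rows = [((x, x * 2), x % 2) for x in range(20)]

train_rows, test_rows = train_test_split_rows(labeled_rows, test_fraction=0.25)
print(len(train_rows), len(test_rows))  # 15 5
```

The model would be fit on `train_rows` only, and `test_rows` would be set aside until evaluation, so the performance estimate comes from data the model has not seen.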
