Course: Machine Learning with Python: Foundations
Sampling your data
- [Instructor] As we prepare our data for machine learning, we sometimes have to reduce the number of rows in our data or split the data into two or more partitions. We do this because the data we have is too large or too complex to use in its current form, or because we need to hold on to some of our data for later use. In supervised machine learning, our goal is to create a model that maps a given input, which we call the independent variables, to a given output, which we call the dependent variable. To properly evaluate whether our model is learning, we have to get an unbiased estimate of its performance using data that it has not previously seen. To do this, we must first split our previously labeled historical data into training and test datasets. We hold out the test data and use the training data to build, or train, our model. Then we evaluate our model's performance using the test data. There are several ways…
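The holdout split described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not necessarily the approach the course itself uses. The function name `train_test_split_rows`, the 25% test fraction, and the toy `labeled_rows` data are all assumptions made up for the example.

```python
import random


def train_test_split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as test data."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]   # held out, used only for evaluation
    train = shuffled[n_test:]  # used to build (train) the model
    return train, test


# Hypothetical labeled historical data:
# each row is (independent variables, dependent variable).
labeled_rows = [((x, x * 2), x % 2) for x in range(20)]

train_rows, test_rows = train_test_split_rows(labeled_rows, test_fraction=0.25, seed=42)
print(len(train_rows), len(test_rows))
```

Shuffling before slicing keeps the test rows from simply being the first or last rows in the original order. Because no row appears in both partitions, the test data stays unseen by the model during training.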