Course: Machine Learning with Python: Foundations


Common data quality issues


- [Instructor] An ideal dataset is one that has no missing values and no values that deviate from what is expected. Such a dataset hardly exists, if at all. In reality, most datasets have to be transformed, or have data quality issues that need to be dealt with, before they can be used for machine learning. This is what the third stage in the machine learning process is all about: data preparation. Data preparation is the process of making sure that our data is suitable for the machine learning approach that we choose to use. In computing, the saying "Garbage in, garbage out" is used to express the idea that incorrect or poor-quality input will invariably result in incorrect or poor-quality output. This concept is fundamentally important in machine learning. If proper care is not taken on the front end to deal with data quality issues before building the model, then the model output will be unreliable, misleading…
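
To make the two issues named above concrete (missing values, and values that deviate from what is expected), here is a minimal sketch in plain Python. It is not taken from the course: the `ages` list, the expected range of 0 to 120, the helper names `summarize_quality` and `prepare`, and the choice to fill gaps with the median are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical toy column; None marks a missing value.
ages = [34, 29, None, 41, 38, 230, None, 36]


def summarize_quality(values, low, high):
    """Count missing entries and collect values outside the expected [low, high] range."""
    missing = sum(1 for v in values if v is None)
    out_of_range = [v for v in values if v is not None and not (low <= v <= high)]
    return missing, out_of_range


def prepare(values, low, high):
    """Replace missing and out-of-range entries with the median of the valid values."""
    valid = [v for v in values if v is not None and low <= v <= high]
    fill = median(valid)  # assumed strategy; other choices are equally possible
    return [v if v is not None and low <= v <= high else fill for v in values]


# Inspect the data first, then clean it before any model sees it.
missing, out_of_range = summarize_quality(ages, low=0, high=120)
cleaned = prepare(ages, low=0, high=120)
```

Inspecting before cleaning is deliberate: the counts show how much of the data a given fix would touch, which helps decide whether replacing values is reasonable or whether the affected records should be handled some other way.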
