Course: Spark for Machine Learning & AI


Summary of preprocessing

- [Instructor] To summarize preprocessing, let's review the numeric and text transformations discussed in this lesson. The three numeric transformations are MinMaxScaler, which by default rescales attribute values to the zero-to-one range; StandardScaler, which rescales attribute values to unit standard deviation and, optionally, a mean of zero; and Bucketizer, which groups continuous values into buckets defined by a set of split points. Two useful text transformations are Tokenizer, which splits a string into a list of words, and HashingTF, which maps those words to term-frequency vectors that the IDF estimator can then rescale into term frequency-inverse document frequency vectors.
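
To make the arithmetic behind these five transformations concrete, here is a minimal sketch in plain Python. It is not the Spark MLlib API: the function names (`min_max_scale`, `standardize`, `bucketize`, `tokenize`, `hashing_tf`) are hypothetical, the data is held in ordinary lists rather than DataFrames, and the CRC32 hash is a stand-in chosen only because it is deterministic. The handling of a constant column in `min_max_scale` is likewise an assumption made for this illustration.

```python
import bisect
import statistics
import zlib


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: returning the midpoint is a choice made for this sketch.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Center values on their mean and divide by the sample standard deviation.

    Assumes at least two values that are not all equal.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values]


def bucketize(value, splits):
    """Return the index i of the bucket [splits[i], splits[i + 1]) holding value.

    splits must be sorted in increasing order; the last bucket also
    includes its upper bound.
    """
    if not splits[0] <= value <= splits[-1]:
        raise ValueError("value lies outside the split points")
    return min(bisect.bisect_right(splits, value) - 1, len(splits) - 2)


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def hashing_tf(tokens, num_features=16):
    """Count tokens into a fixed-length vector, using a hash to pick each index.

    CRC32 is an assumption of this sketch, used because it gives the same
    result on every run. Distinct tokens may collide in the same slot.
    """
    vector = [0] * num_features
    for token in tokens:
        index = zlib.crc32(token.encode("utf-8")) % num_features
        vector[index] += 1
    return vector
```

For instance, `hashing_tf(tokenize("Spark spark ML"))` returns a 16-element vector whose entries sum to three, with both occurrences of "spark" counted in the same slot. The result is a raw term-frequency count; inverse-document-frequency weighting would be a separate step applied across a whole collection of documents.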
