Course: Complete Guide to NLP with R
N-grams
- [Instructor] N-grams are a special type of token. They're actually combinations of adjacent tokens. You might consider them to be phrases. The question is, how do you combine tokens into phrases? Let's take a look at how to use an N-gram command. In line three, I bring in the text mining package, tm, and then in line six, I define some sample text, very simple this time. In line seven, I use the Boost tokenizer, Boost_tokenizer, to break the text into individual tokens, and we can take a look at that just by typing in ngram_tokens. This shows that the original text is now a vector of individual words. In line eight, I use the ngrams command, available once the tm package is loaded, against ngram_tokens, and I've given ngrams a number of three. Three is the number of tokens I want combined into each N-gram. I'll run line eight and then we'll use line nine to show the contents of what just resulted. Start with the first item, which was "Brillig…
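The steps described in the transcript can be sketched roughly as follows. This is a minimal sketch, not the instructor's actual script: the sample sentence, the variable name `sometext`, and the result name `three_grams` are assumptions, since the transcript does not show the code itself. `Boost_tokenizer()` is provided by tm, and `ngrams()` comes from the NLP package, which is attached when tm is loaded.

```r
# Load the text mining package; this also attaches NLP, which provides ngrams()
library(tm)

# Assumed sample text; the course's actual text is not shown in the transcript
sometext <- "Brillig and the slithy toves did gyre and gimble in the wabe"

# Break the text into a character vector of individual tokens
ngram_tokens <- Boost_tokenizer(sometext)
ngram_tokens

# Combine every run of 3 adjacent tokens into one N-gram
# (the result is a list of character vectors, one per N-gram)
three_grams <- ngrams(ngram_tokens, 3L)

# Show the first N-gram
three_grams[[1]]

# Optionally join each N-gram into a single phrase string
three_gram_phrases <- vapply(three_grams, paste, character(1), collapse = " ")
three_gram_phrases
```

With N set to three, a vector of k tokens yields k - 2 N-grams, because each N-gram starts one token later than the previous one.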