Course: Complete Guide to NLP with R

Weighting the document-term matrix

- [Instructor] A typical document-term matrix indicates the frequency of a term within a particular document. Let's take a quick look at a standard DTM. To do that, I'll need to bring in the tm library, and I'll pull in the poetCorpus, which we've used in previous sessions. Now in line eight, I create a document-term matrix. And in line 15, we'll inspect that document-term matrix. This is what you've seen before. There's a matrix. Each row is the name of a document and each column is a term. So we have 217 occurrences of "day" in document 12759. tm provides us with three, actually four, built-in weighting options. The first option, which is what we're looking at right now and is the default, is simply the term frequency. The third option, called weightBin, is a logical option. It provides us with a true or false: does the term appear in this document, yes or no? The second one is term frequency-inverse document frequency and…
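The weighting options described above can be tried on a small example. This is a minimal sketch, not the instructor's script: the three-sentence toy corpus is an assumption standing in for poetCorpus, which is not available here, and the variable names are illustrative. It uses tm's `weightBin` and `weightTfIdf` functions, passed through the `weighting` entry of the `control` list; the default, when no weighting is given, is plain term frequency (`weightTf`).

```r
library(tm)

# Toy corpus (an assumption for illustration; the course uses poetCorpus).
docs <- c("the day is long and the day is bright",
          "night follows day",
          "the night is quiet")
corpus <- VCorpus(VectorSource(docs))

# Default weighting: raw term frequency.
dtm_tf <- DocumentTermMatrix(corpus)

# Binary weighting: 1 if the term appears in the document, 0 otherwise.
dtm_bin <- DocumentTermMatrix(corpus, control = list(weighting = weightBin))

# Term frequency-inverse document frequency weighting.
dtm_tfidf <- DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf))

# Compare the same matrix under each weighting.
inspect(dtm_tf)
inspect(dtm_bin)
inspect(dtm_tfidf)
```

With the default settings, terms shorter than three characters (such as "is") are dropped, so they do not appear as columns. Note that `weightBin` stores the yes/no answer as 1 or 0 rather than as a logical value.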
