Course: Complete Guide to NLP with R

Focus the document-term matrix

- [Host] A document-term matrix is a sparse matrix, which means that the empty cells in the matrix don't actually exist; this is how the object itself is structured. Let's take a look at sparse matrices, how to trim them, and how this affects natural language processing. In line three, I'm going to bring in the tm library, and then in line six, I bring in our old friend, the poet corpus. In line eight, I create the simplest possible document-term matrix, and then let's inspect it. Notice that at the top of this inspected document-term matrix we have non-sparse entries, a sparsity level, a maximal term length, and weighting. We've already talked about weighting, and we've talked about term length. Let's talk about the sparsity and the non-sparse entries. If we look at the actual breakdown of the structure of our document-term matrix, you'll notice that we have i, j, and v. The i and j components are the row and column indices of elements in the matrix. V…
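The steps the host describes can be sketched roughly as follows. This is a minimal illustration, not the course's own script: the three-sentence `docs` vector is a hypothetical stand-in for the poet corpus, and `removeSparseTerms()` with a `sparse` threshold of 0.5 is only one assumed way of trimming, since the excerpt does not say which function or threshold the course uses.

```r
library(tm)

# Hypothetical stand-in for the course's poet corpus (not the real data).
docs <- c("the rose is red", "the violet is blue", "the rose is sweet")
corpus <- VCorpus(VectorSource(docs))

# The simplest possible document-term matrix, with default settings.
dtm <- DocumentTermMatrix(corpus)

# The header reports non-sparse entries, sparsity, maximal term length,
# and weighting.
inspect(dtm)

# The object stores only its non-zero cells, as three parallel vectors:
#   i = row (document) index, j = column (term) index, v = the count.
str(dtm)
triplets <- data.frame(
  doc   = dtm$i,
  term  = Terms(dtm)[dtm$j],
  count = dtm$v
)
print(triplets)

# Assumed trimming step: keep only terms whose sparsity is below the
# threshold, that is, terms that appear in enough documents.
trimmed <- removeSparseTerms(dtm, sparse = 0.5)
inspect(trimmed)
```

Each row of `triplets` corresponds to one non-sparse entry, so a cell with a count of zero simply has no row. Lowering the `sparse` argument trims more aggressively, because a term then has to appear in a larger share of the documents to be kept.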
