Course: Complete Guide to NLP with R
Focus the document-term matrix
- [Host] A document-term matrix is actually a sparse matrix, which means that empty cells in the matrix are not stored at all, and this is how the actual object is structured. Let's take a look at sparse matrices, how to trim them, and how this affects natural language processing. In line three, I'm going to bring in the tm library, and then in line six, I bring in our old friend, the poet corpus. In line eight, I create the simplest possible document-term matrix, and then let's inspect it. Notice that at the top of this inspected document-term matrix, we have non-sparse entries, a sparsity level, a maximal term length, and weighting. We've already talked about weighting and about term length. Let's talk about the sparsity and the non-sparse entries. If we look at the actual breakdown of the structure of our document-term matrix, you'll notice that we have i, j, and v. i and j are the actual grid numbers for elements in a matrix. v…
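To make the structure described above concrete, here is a minimal sketch in R using the tm package. The course's poet corpus is not reproduced here, so the three short strings in `docs` are a hypothetical stand-in, and the 0.5 threshold passed to `removeSparseTerms()` is only an illustrative choice, not necessarily the value or the trimming approach used in the video.

```r
library(tm)

# Hypothetical stand-in for the poet corpus used in the course.
docs <- c("the raven sat above the door",
          "the road less traveled",
          "hope is the thing with feathers")
corpus <- VCorpus(VectorSource(docs))

# Build a plain document-term matrix with default settings.
dtm <- DocumentTermMatrix(corpus)

# The header of inspect() reports non-/sparse entries, sparsity,
# maximal term length, and weighting.
inspect(dtm)

# The underlying object stores only the non-empty cells as three
# parallel vectors: i (row index), j (column index), v (value).
str(dtm)

# Trim terms that are absent from too many documents. Assumed threshold:
# drop terms whose sparsity is 0.5 or higher.
trimmed <- removeSparseTerms(dtm, 0.5)
inspect(trimmed)
```

Because only the filled cells are stored, the lengths of `dtm$i`, `dtm$j`, and `dtm$v` all equal the number of non-sparse entries rather than the number of rows times columns.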