Course: Complete Guide to NLP with R
Corpus subsets and groups
Quanteda provides tools to subset and group corpora: corpus_subset, corpus_sample, and corpus_group. Let's take a look at how these work. In line four, I bring in the quanteda library, and then in line seven, I pull in a sample corpus that I've saved as sampleCorpus.RDS. This should be in your exercise files. In line 11, I've set up a corpus_subset command, and you'll notice that the first argument to corpus_subset is sampleCorpus, which we just pulled in. The second argument, a call to startsWith, creates a logical vector. Let's take a second to look at how that logical vector is built. The first thing we're doing is looking at the contents of someInfo, a vector contained in sampleCorpus, and you'll see down in the console that someInfo contains the values "this", "that", "another", and "one more". startsWith examines that vector and checks to see if each of these…
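The walkthrough above can be sketched roughly as follows. This is not the course's exercise file: the corpus is a small hypothetical stand-in built inline instead of being read from sampleCorpus.RDS, someInfo is assumed to be a document variable (docvar), and the prefix "th" is an illustrative assumption, since the transcript is cut off before it names the prefix being tested.

```r
library(quanteda)

# Hypothetical stand-in for sampleCorpus.RDS: four short documents with a
# docvar named someInfo holding the values mentioned in the transcript.
sampleCorpus <- corpus(
  c(doc1 = "First text.", doc2 = "Second text.",
    doc3 = "Third text.", doc4 = "Fourth text."),
  docvars = data.frame(
    someInfo = c("this", "that", "another", "one more"),
    stringsAsFactors = FALSE
  )
)

# startsWith() is base R: it returns one TRUE/FALSE per element of the vector.
# The prefix "th" is an assumption made for this sketch.
keep <- startsWith(docvars(sampleCorpus, "someInfo"), "th")
print(keep)  # TRUE TRUE FALSE FALSE

# corpus_subset() evaluates the condition against the docvars and keeps
# only the documents for which it is TRUE.
subsetCorpus <- corpus_subset(sampleCorpus, startsWith(someInfo, "th"))
print(ndoc(subsetCorpus))  # 2
```

With the real exercise file, the inline corpus would presumably be replaced by something like readRDS("sampleCorpus.RDS"), and the condition passed to corpus_subset would be whatever line 11 of the course script uses.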