Course: Complete Guide to NLP with R

Create a quanteda corpus

Natural language processing works best when documents are collected into a consistent data structure. These data objects are called corpora. quanteda provides five ways to create a corpus; a character vector and a data.frame are the most common. quanteda also provides methods for the quanteda keyword-in-context object, or kwic, a VCorpus from the tm package, and another quanteda corpus object. Creating corpora is easy. Let's take a look at some examples. We're looking at some R code in RStudio. In line one, the first thing we do, of course, is to load the quanteda library. In line eight, we're going to create a named vector, which we'll eventually turn into a corpus for quanteda. To create myNamedVector, I simply select line eight, and on my Macintosh, I hit Command-Return to run that code. I'm going to add some names to that vector. And let's take a look at what myNamedVector actually appears to be. If I open up the…
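The on-screen code isn't reproduced in this transcript, so here is a minimal sketch of the two most common routes described above: a named character vector and a data.frame. The name myNamedVector comes from the narration. The example texts, the document names, and the objects myCorpus, myDataFrame, and dfCorpus are assumptions made up for illustration, not the course's actual script.

```r
# Load quanteda (assumes the package is already installed)
library(quanteda)

# Hypothetical texts: the transcript does not show the vector's contents
myNamedVector <- c(
  "Natural language processing works best with consistent data.",
  "A corpus collects documents into one object."
)

# Add names to the vector; quanteda uses them as document names
names(myNamedVector) <- c("doc_nlp", "doc_corpus")

# Route 1: build a corpus from the named character vector
myCorpus <- corpus(myNamedVector)
docnames(myCorpus)
ndoc(myCorpus)

# Route 2: build a corpus from a data.frame (hypothetical columns)
myDataFrame <- data.frame(
  doc_id = c("a", "b"),
  text = c("First example document.", "Second example document."),
  stringsAsFactors = FALSE
)
dfCorpus <- corpus(myDataFrame, text_field = "text")
summary(dfCorpus)
```

Naming the vector before calling corpus() means the documents carry meaningful identifiers from the start, which makes later summaries and lookups easier to read.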
