Course: Complete Guide to NLP with R

How to think like tidytext

- [Instructor] Many natural language processing packages work with lists and matrices. These objects are commonly described as corpora, which are collections of documents, and document-term matrices, which record term frequencies across a collection of documents. In contrast, tidytext subscribes to tidyverse concepts. The tidyverse is a collection of packages based around three rules of tidy data: each variable is a column, each observation is a row, and each type of observational unit is a table. Tidytext brings these concepts to text mining, resulting in a document structure with one word, or token, per row. Neither method has an inherent advantage over the other, but if you are used to data in one form, you may find it easier to work with what you already know. For example, if you haven't used the tidyverse in the past, this type of data representation and the associated coding syntax can seem confusing and obscure.…
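To make the one-token-per-row idea concrete, here is a minimal sketch in R. It is not taken from the course: the two sample documents and the names `docs`, `tidy_words`, `word_counts`, and `dtm_like` are made up for illustration. It assumes the dplyr and tidytext packages are installed, and it uses base R's `xtabs()` only as a rough stand-in for a document-term matrix.

```r
library(dplyr)
library(tidytext)

# Two tiny made-up documents (illustrative data, not from the course)
docs <- tibble(
  doc_id = c("doc1", "doc2"),
  text   = c("Tidy data is tidy.", "Text mining with tidy data.")
)

# Tidy form: one token (here, one word) per row.
# By default unnest_tokens() lowercases words and drops punctuation.
tidy_words <- docs %>%
  unnest_tokens(output = word, input = text)

# Term frequency per document, still one row per document-word pair
word_counts <- tidy_words %>%
  count(doc_id, word, sort = TRUE)

# For contrast, a matrix-like view: documents as rows, terms as columns.
# xtabs() is base R; dedicated NLP packages use their own matrix classes.
dtm_like <- xtabs(n ~ doc_id + word, data = word_counts)

print(tidy_words)
print(dtm_like)
```

Both views hold the same counts. The tidy table can be passed straight to other tidyverse verbs such as `filter()` or `group_by()`, while the matrix view is closer to what matrix-based packages expect.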