Course: Complete Guide to NLP with R
How to think like tidytext
- [Instructor] Many natural language processing packages work with lists and matrices. These objects are commonly described as corpora, which are collections of documents, and document-term matrices, which record term frequency across a collection of documents. In contrast, tidytext subscribes to tidyverse concepts. The tidyverse is a collection of packages based around three rules of tidy data: each variable is a column, each observation is a row, and each type of observational unit is a table. Tidytext brings these concepts to text mining, resulting in a document structure with one word, or token, per row. Neither method has an inherent advantage over the other, but if you are used to data in one form, you may find it easier to work with what you already know. For example, if you haven't used the tidyverse in the past, this type of data representation and the associated coding syntax can seem confusing and obscure.…
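As a rough illustration of the one-token-per-row idea, here is a minimal sketch using `unnest_tokens()` from tidytext together with dplyr. The two sample documents and the names `docs`, `doc_id`, `tidy_words`, `word_counts`, and `dtm_like` are made up for this example and do not come from the course. The last step builds a plain base R contingency table as a stand-in for a document-term matrix, only to show how the two layouts differ.

```r
library(tibble)
library(dplyr)
library(tidytext)

# Two tiny made-up documents (illustrative data, not from the course)
docs <- tibble(
  doc_id = c("a", "b"),
  text   = c("The cat sat on the mat.", "The dog sat.")
)

# Tidy layout: one token per row, so each row is a (document, word) observation.
# By default unnest_tokens() lowercases the text and drops punctuation.
tidy_words <- docs %>%
  unnest_tokens(output = word, input = text)

# Term frequency per document is then an ordinary dplyr count
word_counts <- tidy_words %>%
  count(doc_id, word, sort = TRUE)

# Matrix-style layout for comparison: documents as rows, terms as columns.
# This is a base R table, used here only as a stand-in for a document-term matrix.
dtm_like <- table(tidy_words$doc_id, tidy_words$word)

print(word_counts)
print(dtm_like)
```

In the tidy form, filtering, joining, and counting all use the same dplyr verbs as any other data frame, whereas the matrix form is indexed by document and term.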
Contents
- How to think like tidytext (1 min 59 s)
- An example: Calculate the most popular terms in a document (3 min 10 s)
- Tokenizing with unnest_tokens( ) (8 min 19 s)
- Stopwords, punctuation, whitespace, and numbers (6 min 30 s)
- Stemming and lemmatization (5 min 35 s)
- Term frequency with bind_tf_idf( ) (5 min 54 s)
- Sentiment analysis with sentiments( ) (4 min 44 s)
- Parts of speech with parts_of_speech( ) (4 min 32 s)
- Import and export from other NLP packages (2 min 30 s)