Quantico: Unsupervised Learning
Quantico Shiny App

Quantico offers a wide range of operations that assist in the data science and analytics process. One of its major components is unsupervised learning, which is what I'll cover in this article. I classify unsupervised learning as separate from data wrangling and feature engineering, although its methods can be used for both.

So far I have covered the basics of the Shiny App: an overview of plotting, data wrangling, feature engineering, and now unsupervised learning. Truth be told, the most exciting parts are yet to come. The machine learning and forecasting articles are coming soon, so hang in there with me while I go over the basics. The ML and forecasting modules are enhanced versions of what's available in my AutoQuant package, and the methods discussed here and in previous articles play a pivotal role in improving their performance.

Unsupervised Learning Methods:

  • NLP
  • Anomaly Detection
  • Dimensionality Reduction


NLP Functions

The suite of NLP functions includes one word2vec method along with a handful of statistical methods. There is definitely room to beef up this suite, so suggestions are welcome! Each of the methods is intended to do the magic behind the scenes automatically and make the resulting variables available to you without any data engineering effort.

The methods include:

  • Word2Vec: by h2o
  • Text Summary
  • Sentiment
  • Readability
  • Lexical Diversity

Word2Vec: by h2o

This word2vec function will convert any number of text columns to vectors that are useful from a modeling perspective. You can run one column at a time or all of them at once. There are numerous parameters to configure to your liking as well.
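
To make the "text column to vector" idea concrete, here is a minimal Python sketch of one common aggregation step: averaging the word vectors of each text value and writing the result out as numeric columns. This is an illustration only, not Quantico's or h2o's code. The embedding table, the function names, and the `_w2v_` column naming are all hypothetical, and a real word2vec model would learn the embeddings from your data.

```python
def average_vectors(text, embeddings, dim):
    """Average the vectors of all known tokens in `text` (zeros if none are known)."""
    tokens = text.lower().split()
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def vectorize_columns(rows, columns, embeddings, dim):
    """Add `dim` numeric columns per text column to every row (rows are dicts)."""
    out = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            vec = average_vectors(row[col], embeddings, dim)
            for i, value in enumerate(vec):
                # Hypothetical naming scheme: <column>_w2v_<index>
                new_row[f"{col}_w2v_{i + 1}"] = value
        out.append(new_row)
    return out


# Toy, hand-written embeddings (an assumption for illustration only)
toy_embeddings = {"good": [1.0, 0.0], "app": [0.0, 1.0]}
rows = [{"review": "Good app", "title": "app"}]
print(vectorize_columns(rows, ["review", "title"], toy_embeddings, dim=2))
```

Passing one column or several in `columns` mirrors the choice of running one text column at a time or all of them at once.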

Text Summary

Text summary info is useful for a variety of reasons and although it's probably not formally an unsupervised learning method, it is grouped with them for ease of understanding. There are a variety of outputs that come with this function and you can select which ones you'd like to exclude!
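
As a rough idea of what text summary features can look like, here is a small Python sketch. The specific outputs (character, word, and sentence counts plus average word length), the function names, and the `exclude` argument are my assumptions for illustration; the actual outputs in Quantico may differ.

```python
import re


def text_summary(text):
    """Return simple descriptive features for one text value."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "n_chars": len(text),
        "n_words": len(words),
        "n_sentences": len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
    }


def add_text_summary(rows, column, exclude=()):
    """Append summary features for `column` to each row dict, skipping excluded ones."""
    out = []
    for row in rows:
        feats = {
            f"{column}_{name}": value
            for name, value in text_summary(row[column]).items()
            if name not in exclude
        }
        out.append({**row, **feats})
    return out


rows = [{"comment": "Hello world. Bye!"}]
print(add_text_summary(rows, "comment", exclude=("n_chars",)))
```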

Sentiment

Sentiment comes in one of two flavors: binary (positive or negative) or three-class (positive, neutral, or negative). The user selects their preference.
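
The two flavors can be sketched with a toy lexicon-based scorer in Python. The word lists, the `three_class` flag, and the tie-breaking rule are assumptions made for illustration; this is not the method Quantico runs behind the scenes.

```python
import re

# Tiny hand-picked lexicons (hypothetical, for illustration only)
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}


def sentiment(text, three_class=False):
    """Label text by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if three_class and score == 0:
        return "neutral"
    # In binary mode a tie falls back to "positive"; that choice is arbitrary.
    return "positive" if score >= 0 else "negative"


print(sentiment("I love this great app"))
print(sentiment("The table is brown", three_class=True))
```

The only difference between the two modes is how a zero score is handled, which is why the three-class version tends to be the safer default for text with no clear polarity.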

Readability

There are so many possible readability measures to utilize. For modeling purposes you should iterate through all of them, which Quantico makes easy for you. If you have a specific one in mind, then select just that one.
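
For a feel of what a readability measure computes, here is a Python sketch of two well-known ones, Flesch Reading Ease and Flesch-Kincaid Grade Level. The syllable counter is a crude vowel-group heuristic, and the function names and the `measures` argument are hypothetical; I am not claiming these are the exact measures or code paths Quantico uses.

```python
import re


def count_syllables(word):
    """Rough syllable count: vowel groups, minus a likely silent trailing 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def readability(text, measures=None):
    """Compute readability scores; `measures` optionally selects a subset by name."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return {}
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    scores = {
        "flesch_reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        "flesch_kincaid_grade": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }
    if measures is None:
        return scores
    return {name: value for name, value in scores.items() if name in measures}


print(readability("The cat sat."))
print(readability("The cat sat.", measures=["flesch_reading_ease"]))
```

Leaving `measures` unset returns everything, which matches the "iterate through all of them" advice for modeling.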

Lexical Diversity

There are many possibilities here too, and the same advice applies. Choose the ones you don't want and they won't be added to your dataset.
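
Here is a short Python sketch of three standard lexical diversity measures: the type-token ratio (unique words divided by total words), the root TTR (unique words divided by the square root of total words), and Herdan's C (log of unique words divided by log of total words). The function name, the output names, and the `exclude` argument are assumptions for illustration, not Quantico's interface.

```python
import math
import re


def lexical_diversity(text, exclude=()):
    """Return lexical diversity measures for one text value, minus excluded ones."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    if n_tokens == 0:
        return {}
    measures = {
        "ttr": n_types / n_tokens,
        "root_ttr": n_types / math.sqrt(n_tokens),
    }
    if n_tokens > 1:
        # Herdan's C is undefined for a single token because log(1) = 0.
        measures["herdan_c"] = math.log(n_types) / math.log(n_tokens)
    return {name: value for name, value in measures.items() if name not in exclude}


print(lexical_diversity("The cat and the dog"))
print(lexical_diversity("The cat and the dog", exclude=("herdan_c",)))
```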


Anomaly Detection

Anomaly detection currently rests on the isolation forest functionality. There are many others that can be utilized but this is what's available for now. Feel free to request others!
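
If you haven't worked with isolation forests before, the idea is that anomalies are easier to separate from the rest of the data with random splits, so they end up with shorter paths in random trees. Below is a toy pure-Python sketch of that idea, written for illustration only. It is not the implementation Quantico calls, the function names and defaults are my own, and it assumes numeric rows with at least two observations.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup (normalizer)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    return {
        "feature": feature,
        "split": split,
        "left": build_tree([r for r in rows if r[feature] < split], depth + 1, max_depth, rng),
        "right": build_tree([r for r in rows if r[feature] >= split], depth + 1, max_depth, rng),
    }


def path_length(row, node):
    """Depth at which `row` lands in a leaf, adjusted for the leaf's remaining size."""
    depth = 0
    while "size" not in node:
        node = node["left"] if row[node["feature"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build `n_trees` isolation trees, each on a random subsample of the rows."""
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng) for _ in range(n_trees)]
    return {"trees": trees, "psi": psi}


def anomaly_score(row, forest):
    """Score in (0, 1]; values closer to 1 mean the row is easier to isolate."""
    mean_depth = sum(path_length(row, t) for t in forest["trees"]) / len(forest["trees"])
    return 2.0 ** (-mean_depth / c_factor(forest["psi"]))


data_rng = random.Random(1)
data = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)] + [[10.0, 10.0]]
forest = fit_forest(data)
print(anomaly_score([10.0, 10.0], forest), anomaly_score(data[0], forest))
```

The score can be added to your dataset as a new variable or thresholded to flag records, depending on what you need downstream.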

Dimensionality Reduction

Dimensionality reduction is currently done via deep learning autoencoders so as to account for non-linearities in your data. The user can select which layer to return and the number of variables to return as well. This is another area where more methods are welcome. Please reach out with requests!
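
To show the mechanics on a very small scale, here is a pure-Python sketch of an autoencoder with a single tanh hidden layer trained by stochastic gradient descent. It is a teaching toy under my own assumptions (one hidden layer, squared-error loss, hypothetical function names and defaults), not the deep learning model Quantico trains. The hidden layer activations returned by `encode` are the reduced variables, and `n_hidden` plays the role of the number of variables to return.

```python
import math
import random


def train_autoencoder(rows, n_hidden=1, epochs=300, lr=0.05, seed=0):
    """Fit x -> tanh(W1 x + b1) -> W2 h + b2 by minimizing squared reconstruction error."""
    rng = random.Random(seed)
    d = len(rows[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(d)]
    b2 = [0.0] * d
    for _ in range(epochs):
        for x in rows:
            # Forward pass
            h = [math.tanh(sum(w * v for w, v in zip(w1[j], x)) + b1[j]) for j in range(n_hidden)]
            y = [sum(w2[i][j] * h[j] for j in range(n_hidden)) + b2[i] for i in range(d)]
            # Backward pass (gradients computed before any weight is updated)
            dy = [y[i] - x[i] for i in range(d)]
            dz = [sum(w2[i][j] * dy[i] for i in range(d)) * (1.0 - h[j] ** 2) for j in range(n_hidden)]
            for i in range(d):
                for j in range(n_hidden):
                    w2[i][j] -= lr * dy[i] * h[j]
                b2[i] -= lr * dy[i]
            for j in range(n_hidden):
                for m in range(d):
                    w1[j][m] -= lr * dz[j] * x[m]
                b1[j] -= lr * dz[j]
    return {"w1": w1, "b1": b1, "w2": w2, "b2": b2}


def encode(model, x):
    """Return the hidden layer activations: the reduced representation of x."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(model["w1"], model["b1"])]


def decode(model, h):
    """Map a hidden representation back to the original feature space."""
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(model["w2"], model["b2"])]


def reconstruction_error(model, rows):
    """Mean squared reconstruction error per row."""
    total = 0.0
    for x in rows:
        y = decode(model, encode(model, x))
        total += sum((a - b) ** 2 for a, b in zip(y, x))
    return total / len(rows)


# Three correlated features that really live on one dimension
rows = [[t, 0.5 * t, -t] for t in [i / 10 - 1 for i in range(21)]]
model = train_autoencoder(rows, n_hidden=1)
print(encode(model, rows[0]), reconstruction_error(model, rows))
```

A deeper network simply stacks more of these layers, and choosing which layer to return amounts to choosing which layer's activations to use as the new variables.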


