Better Data for Better Machine Learning

Data is usually messy! It can be unbalanced or mislabeled, with missing or incorrect values. The first step in cleaning a dataset is to understand and analyze it.
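
As a rough illustration of that first step, here is a minimal sketch in plain Python (not Facets itself) that reports the fraction of missing values per feature and the label counts. The records, the field names, and the `summarize` helper are all hypothetical and made up for this example.

```python
from collections import Counter

def summarize(rows, label_key):
    """Return (missing fraction per feature, label counts) for a list of dicts."""
    features = sorted({key for row in rows for key in row})
    missing = {
        f: sum(1 for row in rows if row.get(f) is None) / len(rows)
        for f in features
    }
    labels = Counter(row.get(label_key) for row in rows)
    return missing, labels

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "label": "no"},
    {"age": None, "income": 61000, "label": "no"},
    {"age": 45, "income": None, "label": "no"},
    {"age": 29, "income": 38000, "label": "yes"},
]

missing, labels = summarize(rows, "label")
print(missing)  # missing fraction for each feature
print(labels)   # how many records carry each label
```

In this toy data, a quarter of the `age` and `income` values are missing and the labels are skewed three to one, which are the kinds of problems worth spotting before training.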

Facets contains two robust visualizations to aid in understanding and analyzing machine learning datasets.

  1. Facets Overview: Get a sense of the shape of each feature of your dataset. Uncover common and uncommon issues such as unexpected feature values, features with missing values for a large number of observations, training/serving skew, and train/test/validation set skew.
  2. Facets Dive: Explore individual observations and the relationships between data points across all of the features of a dataset. Each item in the visualization represents one data point. Position items by "faceting", or bucketing, them in multiple dimensions by their feature values. This helps with detecting classifier failures, identifying systematic errors, evaluating ground truth, and finding potential new signals for ranking.
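
The idea of faceting can be sketched without the tool. The snippet below is plain Python, not the Facets API: it buckets records by two feature values and computes the classifier error rate in each bucket. The records, the `device` and `region` features, and the `facet_error_rates` helper are assumptions invented for illustration.

```python
from collections import defaultdict

def facet_error_rates(rows, row_key, col_key):
    """Bucket rows by two feature values and return the error rate per bucket."""
    buckets = defaultdict(list)
    for row in rows:
        is_error = row["label"] != row["predicted"]
        buckets[(row[row_key], row[col_key])].append(is_error)
    return {key: sum(errors) / len(errors) for key, errors in buckets.items()}

# Hypothetical records with ground-truth labels and model predictions.
rows = [
    {"device": "mobile", "region": "EU", "label": 1, "predicted": 1},
    {"device": "mobile", "region": "EU", "label": 0, "predicted": 0},
    {"device": "mobile", "region": "US", "label": 1, "predicted": 0},
    {"device": "mobile", "region": "US", "label": 0, "predicted": 1},
    {"device": "desktop", "region": "US", "label": 1, "predicted": 1},
    {"device": "desktop", "region": "EU", "label": 0, "predicted": 0},
]

rates = facet_error_rates(rows, "device", "region")
for bucket, rate in sorted(rates.items()):
    print(bucket, rate)
```

Here every error falls in the (mobile, US) bucket, the sort of systematic failure that bucketing along two dimensions makes visible and that an overall accuracy number would hide.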


Regards

Ibrahim Sobh - PhD