Top Algorithms & Methods used by Data Scientists

Algorithms are a key aspect of Data Science, and several recent KDnuggets posts have looked at popular algorithms, including "The 10 Algorithms Machine Learning Engineers Need to Know" and "10 Algorithm Categories for A.I., Big Data, and Data Science".

But which algorithms are actually used by Data Scientists?

This was the question asked in a recent KDnuggets Poll, and here are the top 10 algorithms:

Fig. 1: Top 10 algorithms used by Data Scientists, and their share of respondents.

See the full table of all algorithms in the KDnuggets post: https://www.kdnuggets.com/2016/09/poll-algorithms-used-data-scientists.html


The average respondent used 8.1 algorithms, a big increase vs a similar poll in 2011.

Comparing with the 2011 poll, "Algorithms for data analysis / data mining", we note that the top methods are still Regression, Clustering, Decision Trees/Rules, and Visualization. The biggest relative increases, measured by (pct2016 / pct2011 - 1), are for:

  • Boosting, up 40% to 32.8% share in 2016 from 23.5% share in 2011
  • Text Mining, up 30% to 35.9% from 27.7%
  • Visualization, up 27% to 48.7% from 38.3%
  • Time series/Sequence analysis, up 25% to 37.0% from 29.6%
  • Anomaly/Deviation detection, up 19% to 19.5% from 16.4%
  • Ensemble methods, up 19% to 33.6% from 28.3%
  • SVM, up 18% to 33.6% from 28.6%
  • Regression, up 16% to 67.1% from 57.9%
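
The relative-increase measure used in the list above, (pct2016 / pct2011 - 1), can be reproduced with a few lines of Python. This is only a sketch: the function name `relative_change` and the `shares` dictionary are illustrative, and the inputs are the rounded shares quoted in the list, so a recomputed figure may differ by about a point from the quoted one (SVM, for example), presumably because the poll worked from unrounded shares.

```python
# Relative change in share between the two polls: pct2016 / pct2011 - 1.
# Inputs are the rounded shares quoted in the list above.
shares = {
    # method: (share in 2011 in %, share in 2016 in %)
    "Boosting": (23.5, 32.8),
    "Text Mining": (27.7, 35.9),
    "Visualization": (38.3, 48.7),
    "Time series/Sequence analysis": (29.6, 37.0),
    "Anomaly/Deviation detection": (16.4, 19.5),
    "Ensemble methods": (28.3, 33.6),
    "SVM": (28.6, 33.6),
    "Regression": (57.9, 67.1),
}


def relative_change(pct_old, pct_new):
    """Return pct_new / pct_old - 1 as a fraction (0.40 means up 40%)."""
    return pct_new / pct_old - 1


for method, (pct_2011, pct_2016) in shares.items():
    change = relative_change(pct_2011, pct_2016)
    print(f"{method}: {change:+.1%} (from {pct_2011}% to {pct_2016}%)")
```

Note that this is a relative change, not a difference in percentage points: Boosting gained 9.3 points of share, which is roughly 40% of its 2011 share.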

Most popular among new options added in 2016 are:

  • K-nearest neighbors, 46% share
  • PCA, 43%
  • Random Forests, 38%
  • Optimization, 24%
  • Neural networks - Deep Learning, 19%
  • Singular Value Decomposition, 16%

The biggest declines are for

  • Association rules, down 47% to 15.3% from 28.6%
  • Uplift modeling, down 36% to 3.1% from 4.8% (that is a surprise, given strong results published)
  • Factor Analysis, down 24% to 14.2% from 18.6%
  • Survival Analysis, down 15% to 7.9% from 9.3%

See the full results, including usage of different algorithm types by employment, algorithm usage bias by employment, and the full table of all 29 algorithms and methods, on KDnuggets:

Top Algorithms Used by Data Scientists

https://www.kdnuggets.com/2016/09/poll-algorithms-used-data-scientists.html


Harikrishnan Rajeev

Senior Data Scientist - Global Markets Gen AI @ Bank of America Merrill Lynch | Gen AI production | Agentic | Micro services

8 years ago

Survival Analysis, down 15% to 7.9% from 9.3%? Why is SA not preferred?

Dan Toader

Senior Dev TL CIB

8 years ago

Interesting however

There is no such thing as "Data Science". The name is just a marketing label for a new pseudo-science.

Robert Leithiser

Information Technology Architect - Data, Infrastructure, Cyber, Software - Views expressed are my own and do not represent my employer.

8 years ago

If data science really works, then it should be possible to automate it so that the right algorithms are chosen automatically, rather than manually by a data scientist. I have some skepticism about the hype over data science, because much of the process needs automation, which is more of a software architecture exercise than a data science exercise. Algorithms are great, but there need to be frameworks that automate the tedious work. Most data scientists spend an inordinate amount of time viewing descriptive statistics, testing out various methods, and doing essentially manual work to import data, cleanse it, determine clustering approaches, and search for appropriate training models. Add to this that many models require extensive modification when feedback is introduced that varies the model's horizontal and vertical (row/column) dynamics. In my view, all of this rightly needs to be automated in a software framework.

Akshay Kher

Advanced Analytics Professional

8 years ago

Surprised that Neural Networks don't find a place!
