R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results

The 17th annual KDnuggets Software Poll got tremendous participation from the analytics and data science community and vendors, attracting 2,895 voters, who chose from a record 102 different tools. Here we give a first overview, leaving a more detailed association analysis for a later post.


R remains the leading tool, with a 49% share (up from 46.9% in 2015), but Python usage grew faster and almost caught up with R, reaching a 45.8% share (up from 30.3%). RapidMiner remains the most popular general platform for data mining/data science, with about a 33% share. Notable tools with the most growth in popularity include Dato, Dataiku, MLlib, H2O, Amazon Machine Learning, scikit-learn, and IBM Watson.

The increased choice of tools is reflected in wider usage. The average number of tools used was 6.0, vs 4.8 in 2015.

The usage of Hadoop/Big Data tools grew to 39%, up from 29% in 2015 (and 17% in 2014), driven by Apache Spark, MLlib (Spark Machine Learning Library) and H2O.

The participation by region was: US/Canada (40%), Europe (39%), Asia (9.4%), Latin America (5.8%), Africa/MidEast (2.9%), Australia/NZ (2.2%).

Top Analytics/Data Science Tools



The next table shows the top 10 most popular tools in the 2016 poll.


In this table, 2016 % share is the percentage of voters who used the tool, % change is the change in share vs the 2015 poll, and % alone is the percentage of a tool's users who reported using only that tool. For example, 4.4% of KNIME voters reported using only KNIME and nothing else. Such lone voting decreased, with only 9 tools having 5% or more lone votes.
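As a rough illustration of how these three columns could be computed, here is a minimal Python sketch. The ballots are made-up toy data, not poll data, and the function names are hypothetical. It also assumes that % change is a relative change in share, which is how the declines later in this post are expressed (for example, a drop from 2.0% to 0.3% reported as "down 85%").

```python
from typing import List, Set


def share(ballots: List[Set[str]], tool: str) -> float:
    """Percent of all voters whose ballot includes `tool`."""
    users = sum(1 for ballot in ballots if tool in ballot)
    return 100.0 * users / len(ballots)


def alone(ballots: List[Set[str]], tool: str) -> float:
    """Percent of a tool's users who selected that tool and nothing else."""
    users = [ballot for ballot in ballots if tool in ballot]
    if not users:
        return 0.0
    lone = sum(1 for ballot in users if len(ballot) == 1)
    return 100.0 * lone / len(users)


def relative_change(share_now: float, share_prev: float) -> float:
    """Relative change in share, in percent (assumed meaning of % change)."""
    return 100.0 * (share_now - share_prev) / share_prev


# Toy ballots (hypothetical): each set is one voter's selected tools.
ballots_2016 = [
    {"R", "Python"},
    {"R"},
    {"Python", "KNIME"},
    {"KNIME"},
    {"R", "KNIME", "Python"},
]

for tool in ("R", "Python", "KNIME"):
    print(tool, round(share(ballots_2016, tool), 1), round(alone(ballots_2016, tool), 1))

# R's shares quoted above: 49% in 2016 vs 46.9% in 2015.
print(round(relative_change(49.0, 46.9), 1))
```

Note that the denominator for % alone is the tool's own users, not all voters, which is why a tool with a small share can still show a high % alone.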

 

Compared to the 2015 KDnuggets Analytics/Data Science Poll results, the only newcomer in the top 10 was scikit-learn, displacing SAS.

Tools with the highest growth (among tools with at least 15 users in 2015) were

 



This year, 86% of voters used commercial software and 75% used free software. About 25% used only commercial software, and 13% used only open source/free software. A majority of 61% used both free and commercial software, similar to 64% in 2015.
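The figures above can be cross-checked with simple arithmetic, since "only commercial", "only free", and "both" are disjoint groups. The sketch below is only a sanity check on the rounded percentages quoted in the text. The variable names and the one-point tolerance are assumptions made for illustration.

```python
# Rounded percentages quoted in the text (disjoint groups of voters).
only_commercial = 25.0
only_free = 13.0
both = 61.0

# Totals implied by the disjoint groups.
commercial_total = only_commercial + both   # voters who used any commercial tool
free_total = only_free + both               # voters who used any free tool
accounted_for = only_commercial + only_free + both


def within_rounding(derived: float, reported: float, tol: float = 1.0) -> bool:
    """True if a derived figure matches a reported one up to `tol` points."""
    return abs(derived - reported) <= tol


print(commercial_total, within_rounding(commercial_total, 86.0))
print(free_total, within_rounding(free_total, 75.0))
print(accounted_for)
```

The derived totals agree with the reported 86% and 75% to within one percentage point, which is consistent with the inputs being rounded ("about 25%").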

Tools new to this poll that received at least a 1% share of votes in 2016 were

  • Anaconda, 16%
  • Microsoft other ML/Data Science tools, 1.6%
  • SAP HANA, 1.2%
  • XLMiner, 1.2%

Among tools with at least 15 votes in 2015, the largest declines in 2016 were for the tools below. The declines probably reflect a combination of falling popularity for free tools like F# and the lack of a voter drive for some commercial tools this year.

  • Ayasdi, down 85%, to 0.3% share from 2.0%
  • Actian, down 83%, to 0.3% share from 2.0%
  • Datameer, down 52%, to 0.4% share from 0.9%
  • SAP Analytics, down 51%, to 1.5% share from 3.0%
  • SAS Enterprise Miner, down 49%, to 5.6% share from 10.9%
  • Alteryx, down 46%, to 3.0% share from 5.6%
  • F#, down 42%, to 0.4% share from 0.7%
  • TIBCO Spotfire, down 36%, to 2.8% share from 4.3%
  • JMP, down 36%, to 2.0% share from 3.1%

 

Hadoop/Big Data Tools

The usage of Hadoop/Big Data tools grew to 39%, up from 29% in 2015 (and 17% in 2014), driven mainly by big growth in Apache Spark, MLlib (Spark Machine Learning Library) and H2O, which we included among Big Data tools.

For more detailed analysis of Big Data and Deep Learning tools and 3-year comparison of popularity, see 

Original post: https://www.kdnuggets.com/2016/06/r-python-top-analytics-data-mining-data-science-software.html

Author: Gregory Piatetsky-Shapiro