Statistics or Data Science

Today we hear a lot about a new discipline, “Data Science”, but do we really know what Data Science is all about? I guess many will say no!

Data scientists are people with some mix of coding (computer science knowledge) and statistical skills, who mainly work on collecting data, analyzing it using visualization tools, and then communicating the results. A data scientist can code well enough to work with data but is not necessarily an expert in statistics.

As Sir R A Fisher famously put it, “all the development on the scale of time is history, all sciences in its basis are mathematics and all decisions are statistics”.
Now tell me, where do data science and statistics stand?

Many such new disciplines have evolved over time: data mining, data analysis, predictive modeling, optimization, simulation. I find it pretty amusing how many of the disciplines being defined today basically represent some type of specialization of statistics. Don't you?

If you pick up any old statistics book, you will find that most of these concepts (sampling, data mining, predictive modeling, optimization, simulation) were already described there.

Maybe there was a need for cooler-sounding names such as Data Science, and hence I will put it this way:

Data Science is “a new bottle with old wine, but without the palate”.

Lastly, a parting note: always remember that a data scientist without statistics will not add any great value, but an individual statistician can.

Aniruddha Deshmukh

CMC Statistician | PAT | Trainer (QbD, DoE, SPC, SQC, LEAN, 6σ ...) | Speaker | Author

8 years ago

Statistics Denial Myths: Unknown to most marketing researchers, there have been tensions between the statistical and computer science communities in recent years... Read more at https://www.dhirubhai.net/pulse/statistics-denial-myths-kevin-gray

Aniruddha Deshmukh

CMC Statistician | PAT | Trainer (QbD, DoE, SPC, SQC, LEAN, 6σ ...) | Speaker | Author

8 years ago

Inputs by Ferris Jumah: You have no idea how many "data scientists" I've interviewed who couldn't answer 1) What is an A/B test? 2) If I rolled a 6, what is the probability of rolling a 6 again? My favorite quote on data science and statistics: Data is a Matrix, Questions are Biased, and Shortcuts are Coin Flips.
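
For readers curious about the second interview question, here is a minimal sketch. It assumes a fair six-sided die and independent rolls, neither of which the comment states explicitly. The function name is made up for illustration. It enumerates every ordered pair of rolls and counts the ones that matter, so the answer comes from counting rather than from a memorized formula.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # assumption: a fair six-sided die


def prob_six_again() -> Fraction:
    """P(second roll is 6 | first roll is 6), by enumerating all ordered pairs."""
    # All 36 equally likely (first, second) outcomes of two independent rolls.
    outcomes = list(product(FACES, repeat=2))
    # Condition on the event that the first roll was a 6.
    first_is_six = [pair for pair in outcomes if pair[0] == 6]
    # Among those, count the pairs where the second roll is also a 6.
    both_six = [pair for pair in first_is_six if pair[1] == 6]
    return Fraction(len(both_six), len(first_is_six))


print(prob_six_again())  # 1/6
```

Under these assumptions the first roll carries no information about the second, so the conditional probability equals the unconditional one.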

Aniruddha Deshmukh

CMC Statistician | PAT | Trainer (QbD, DoE, SPC, SQC, LEAN, 6σ ...) | Speaker | Author

8 years ago

Inputs by Mr. George Stefanopoulos: It was not that long ago that what we now call Machine Learning was commonly referred to by statisticians as Statistical Learning (SL). Induction forms the core of SL. "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman is an excellent reference.

Aniruddha Deshmukh

CMC Statistician | PAT | Trainer (QbD, DoE, SPC, SQC, LEAN, 6σ ...) | Speaker | Author

8 years ago

Inputs by Mr. Kevin Gray: Many people calling themselves data scientists seem to know next to nothing about statistics... Statistical models have been designed to draw powerful inferences from tiny samples (by "big data" standards). You don't need massive amounts of data to know whether there is value in them or not: use a sample for some exploratory modeling before you invest tons of money in data infrastructure or clouds. A second point is that when "trad" stats methods "lose" to newer machine learning tools in competitions, it's usually because they have been used incompetently, as Mr. Frank Harrell has pointed out.
