Statistics or Data Science
Aniruddha Deshmukh
CMC Statistician | PAT | Trainer (QbD, DoE, SPC, SQC, LEAN, 6σ ...) | Speaker | Author
Today we hear a lot about a new discipline called “Data Science”, but do we really know what it is all about? I guess many will say no.
Data scientists are people with some mix of coding (computer science) and statistical skills, who mainly work on collecting data, analyzing it with visualization tools, and communicating the results. A data scientist can code well enough to work with data but is not necessarily an expert in statistics.
As Sir R. A. Fisher is said to have put it, “all the development on the scale of time is history, all sciences in their basis are mathematics and all decisions are statistics”.
Now tell me, where do data science and statistics stand?
Many such new disciplines have emerged over time: data mining, data analysis, predictive modeling, optimization, and simulation. I find it rather amusing how many of the disciplines being defined today basically represent some specialization of statistics, don't you?
If you pick up any old statistics textbook, you will find that most of these concepts (sampling, data mining, predictive modeling, optimization, simulation) were already described there.
Maybe there was a need for cooler-sounding names such as Data Science, and hence I would describe it this way:
Data Science: “A new bottle with old wine, but without palate”
Lastly, a parting note: always remember that a data scientist without statistics will not add much value, but a statistician alone can.
8 years ago: Statistics Denial Myths: Unknown to most marketing researchers, there have been tensions between the statistical and computer science communities in recent years... Read more at https://www.dhirubhai.net/pulse/statistics-denial-myths-kevin-gray
8 years ago: Inputs by Ferris Jumah: You have no idea how many "data scientists" I've interviewed who couldn't answer 1) What is an A/B test? 2) If I rolled a 6, what is the probability of rolling a 6 again? My favorite quote on data science and statistics: Data is a Matrix, Questions are Biased, and Shortcuts are Coin Flips.
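For readers wondering about the second interview question, here is a minimal sketch of the answer in Python. It assumes a fair six-sided die and independent rolls; neither assumption is stated in the comment, and the function name is purely illustrative. It lists every outcome of two rolls and counts how often the second roll is a 6 among those where the first was a 6.

```python
from fractions import Fraction
from itertools import product


def prob_second_six_given_first_six(sides: int = 6) -> Fraction:
    """P(second roll is 6 | first roll is 6), assuming a fair die and independent rolls."""
    # All equally likely (first, second) outcomes of two rolls.
    outcomes = list(product(range(1, sides + 1), repeat=2))
    # Condition on the first roll being a 6.
    first_six = [o for o in outcomes if o[0] == 6]
    # Among those, keep the outcomes where the second roll is also a 6.
    both_six = [o for o in first_six if o[1] == 6]
    return Fraction(len(both_six), len(first_six))


print(prob_second_six_given_first_six())  # 1/6
```

Under these assumptions the first roll tells us nothing about the second, so the answer is the same 1/6 as for any single roll.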
8 years ago: Inputs by Mr. George Stefanopoulos: It was not that long ago that what we now call Machine Learning was commonly referred to by statisticians as Statistical Learning (SL). Induction forms the core of SL. "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman is an excellent reference.
8 years ago: Inputs by Mr. Kevin Gray: Many people calling themselves data scientists seem to know next to nothing about statistics... Statistical models have been designed to draw powerful inferences from tiny samples (by "big data" standards). You don't need massive amounts of data to know whether there is value in them or not: use a sample for some exploratory modeling before you invest tons of money in data infrastructure or clouds. A second point is that when "trad" stats methods "lose" to newer machine learning tools in competitions, it's usually because they have been used incompetently, as Mr. Frank Harrell has pointed out.