Data modeling culture versus algorithmic modeling culture

Leo Breiman (1) wrote an interesting article about the two cultures in the use of statistical modeling to reach conclusions from data: the data modeling culture and the algorithmic modeling culture. The differences between these two cultures have come to the fore again in a discussion between Noam Chomsky (2) and Google's Director of Research, Peter Norvig (3). To recap these two cultures:

  • The data modeling culture argues that nature can be understood as a black box containing a very simple underlying model (one that can be assumed in advance) that translates input variables into output variables.
  • The algorithmic modeling culture's approach is to identify a function able to predict the output from a given input. From this culture's point of view, the inside of the box is unknown.

The difference between these two approaches is that the conclusions drawn by data modeling are about the assumed model, not about the nature of the phenomena.

Usually, simple parametric models (from data modeling culture) imposed on data generated by complex systems result in a loss of accuracy and information as compared to algorithmic models.
Leo Breiman (2001)
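
The claim above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the function `nature` is a hypothetical nonlinear mechanism invented for the illustration (it does not come from Breiman's paper), a straight-line fit stands in for the data modeling culture, and a hand-written k-nearest-neighbours average stands in for the algorithmic culture.

```python
import numpy as np

rng = np.random.default_rng(0)

def nature(x):
    # Hypothetical "black box": a nonlinear mechanism unknown to the analyst.
    return np.sin(3 * x) + 0.5 * x ** 2

# Simulated noisy observations of the black box (assumed setup).
x_train = rng.uniform(-2, 2, 400)
y_train = nature(x_train) + rng.normal(0, 0.2, 400)
x_test = rng.uniform(-2, 2, 200)
y_test = nature(x_test) + rng.normal(0, 0.2, 200)

# Data modeling culture: assume y = a + b * x + noise and estimate a, b
# by ordinary least squares. The result is two interpretable parameters.
design = np.column_stack([np.ones_like(x_train), x_train])
coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
linear_pred = coef[0] + coef[1] * x_test

# Algorithmic modeling culture: make no assumption about the mechanism and
# predict by averaging the outputs of the k closest training inputs.
def knn_predict(x_query, x_ref, y_ref, k=10):
    dist = np.abs(x_query[:, None] - x_ref[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_ref[nearest].mean(axis=1)

knn_pred = knn_predict(x_test, x_train, y_train)

def mse(y_true, y_hat):
    return float(np.mean((y_true - y_hat) ** 2))

linear_mse = mse(y_test, linear_pred)
knn_mse = mse(y_test, knn_pred)

print(f"assumed linear model: a={coef[0]:.2f}, b={coef[1]:.2f}, test MSE={linear_mse:.3f}")
print(f"k-NN predictor:       test MSE={knn_mse:.3f}")
```

In this setup the linear model returns a slope and an intercept that are easy to read, but they describe the assumed line rather than the simulated mechanism, and its test error is far higher than that of the k-NN predictor. The k-NN predictor, in turn, offers no parameters to interpret. The gap depends entirely on how nonlinear the assumed `nature` is; with a truly linear mechanism the simple model would do just as well.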

Breiman argues that the data modeling culture has some limitations: its (sometimes) low accuracy, its inability to present a clear picture of nature’s mechanism when the data are complex, and the reasonable doubt about whether the chosen statistical model is the one that best reflects the nature of the phenomenon. In his discussion, Chomsky opposes the algorithmic approach because the function it produces is difficult to understand, which in his opinion defeats the purpose of modeling. He would rather assume that the model used to explain the data must be relatively simple. Norvig replies that reality is messier and "we shouldn't accept a theoretical framework that places a priority on making the model simple over making it accurately reflect reality."

The majority of data science today is actually based on a culture of data modeling, perhaps as a result of statisticians' influence on the development of machine learning technologies. But there is evidence today that data science is increasingly becoming an empirical science.

"But if a method works, it should not be abandoned nor dismissed just because theorists haven’t yet figured out how to explain it".
Yann LeCun
Director of AI Research at Facebook and Professor at NYU

In conclusion, the data modeling culture can be very useful for a large set of problems. But the evidence that machine learning technologies are becoming more empirical cannot be ignored. It is difficult to understand all of the complex processes that make up nature through the data modeling culture alone, because this approach yields statistical parameters of an assumed model rather than a thorough understanding of the phenomena.

"This web of life, the most complex system we know of in the universe, breaks no law of physics, yet is partially lawless, ceaselessly creative."
Stuart Kauffman
Professor of Biological Sciences, Physics, Astronomy, University of Calgary


References:

  1. Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199–231.
  2. Katz, Y. (2017, December 7). Noam Chomsky: Where Artificial Intelligence Went Wrong. The Atlantic. Retrieved October 13, 2022, from https://www.theatlantic.com/technology/archive/2012/11/noam-chomsky-on-where-artificial-intelligence-went-wrong/261637/
  3. Norvig, P. (2022). On Chomsky and the Two Cultures of Statistical Learning. Retrieved October 13, 2022, from https://norvig.com/chomsky.html
  4. Friedman, D. (2003). Meditations on the Nature of Life, Complexity, Consciousness, the Universe and Everything [Review of Investigations, by S. A. Kauffman]. The American Journal of Psychology, 116(1), 141–144. https://doi.org/10.2307/1423341
