Machine Learning - Old Fish in New Paper

When should you NOT use deep learning? Pablo Cordero’s post, Jeff Leek's Simply Stats blog, and a rebuttal from Andrew Beam all make good points comparing the two approaches, but I’m less interested in the sample-size debate than in understanding what’s happening under the hood.

I choose methods based on understanding and interpretability. I spend more time defining the problem and determining the appropriate questions than on the actual computation. I need statistical models to give me a defensible understanding of the underlying processes: they provide confidence intervals, optimization opportunities, diagnostics, and graphical methods. That outweighs any drive to squeeze out one last 0.5% of performance. Even with the best model in the world, the first question in the meeting is usually how we should change the underlying process to increase, decrease, or otherwise improve or optimize it.

Over twenty years ago, when neural nets were (once again) the latest thing, I found an article, Neural Networks and Statistical Models by Warren S. Sarle of the SAS Institute, showing the relationship between the two approaches. As he puts it, the paper "translates neural network jargon into statistical jargon". It is a great summary of various neural net models alongside their equivalent statistical models. For example, one figure illustrates how a multilayer perceptron (MLP) is equivalent to multivariate multiple nonlinear regression.
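Sarle's equivalence is easy to see in code. The sketch below (my own illustration, not from the paper; the layer sizes and tanh activation are arbitrary choices) writes out a one-hidden-layer MLP as the parametric regression model it is: y = b2 + W2·tanh(b1 + W1·x), i.e., a multivariate multiple nonlinear regression with the weights as regression coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 2  # illustrative sizes only

# "Weights" in neural-net jargon; "regression coefficients" in statistical jargon
W1 = rng.normal(size=(n_hidden, n_in))
b1 = rng.normal(size=n_hidden)   # hidden-layer "biases" = intercepts
W2 = rng.normal(size=(n_out, n_hidden))
b2 = rng.normal(size=n_out)      # output "biases" = intercepts

def mlp(x):
    # One forward pass: a nonlinear regression function of x,
    # producing multiple (n_out) responses from multiple (n_in) predictors.
    return b2 + W2 @ np.tanh(b1 + W1 @ x)

x = rng.normal(size=n_in)
y = mlp(x)
print(y.shape)  # two responses for one observation
```

Fitting the weights by least squares on (x, y) data is exactly nonlinear regression; backpropagation is just one way to compute the gradients for that fit.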

This gave me a foundation for investigating the limitations and capabilities of new techniques.

tl;dr:

Learn the detailed underlying logic and assumptions behind existing techniques before jumping on the latest bandwagon. You might be using an old reliable technique in disguise.

More articles by Alastair Muir, PhD, BSc, BEd, MBB
