Bigger is not better in machine learning

I heard an interview today featuring Jonathan Frankle, the Chief Scientist of MosaicML. One of the things he said is that bigger is not better. He is, of course, talking about machine learning model size (e.g., the number of parameters).

I posit that we can agree that bigger models may not be better, and that when it comes to model size we generally have a good mental model and a set of metrics for the "bigger" part. But what about the "better" part of a machine learning model? It is much harder to define, and we have to peel back that onion a lot more.

One way to define it is that the model delivers superior results. Another is that the model is more explainable. Yet another is that the model is performant enough for the use case at a much lower cost. The list goes on and on. You can visualize this as a step function or a "build."

In my view, bigger can be better for some use cases, but with significantly diminishing returns over time: diminishing returns on time, financial investment, performance gain, and interpretability. On the other hand, there is some minimum model complexity needed for most enterprise use cases. So finding that equilibrium, the balance of size and performance for a specific use case, is a key part of the "art of data science and AI."
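To make that equilibrium concrete, here is a minimal Python sketch of one possible selection rule: pick the cheapest model that clears the quality bar the use case demands. The model names, accuracies, and costs are entirely hypothetical placeholders, not measurements from any benchmark.

```python
# Toy sketch of the size/performance equilibrium described above.
# All figures are hypothetical, purely for illustration.

candidates = [
    # (name, parameter count, task accuracy, cost per 1M tokens in USD)
    ("small", 1e9, 0.81, 0.10),
    ("medium", 7e9, 0.86, 0.50),
    ("large", 70e9, 0.88, 2.50),
    ("xl", 400e9, 0.89, 12.00),
]

REQUIRED_ACCURACY = 0.85  # minimum quality bar for this use case

# Keep only the models that clear the bar, then pick the cheapest one.
viable = [m for m in candidates if m[2] >= REQUIRED_ACCURACY]
best = min(viable, key=lambda m: m[3])

name, params, acc, cost = best
print(f"Chosen model: {name} ({params:.0e} params, "
      f"accuracy {acc:.2f}, ${cost:.2f}/1M tokens)")
```

In this toy example, accuracy climbs only from 0.81 to 0.89 while cost grows more than a hundredfold, which is the diminishing return in miniature; "medium" wins because it is the smallest model that is good enough.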

Also in my view, there is a lot of work to be done, and opportunity, in unpacking this "better" notion: both for enterprises looking to establish and mature their AI strategy and program, and for startups looking to build AI tools for enterprise customers.

Exciting times!

Ranganath Venkataraman

Automation and Innovation | Enterprise-wide value creation | Consulting Director

1y

Thanks for sharing, Joyce J. Shen - agree that model complexity, usability, and ability to interpret results are all factors influencing the bigger/better connection. Also of note are the environmental impacts of AI as computers churn through ever-increasing volumes of data - an MIT Technology Review article found that training just one AI model can emit more than 626,000 pounds of carbon dioxide equivalent.

Janusz (John) J.

(Consultant | PM | PhD) :: (ML | n-D Visualization | Typology & Meta-Languages | Algorithms | Libraries | SOTA & beyond)

1y

Excellent point

CHESTER SWANSON SR.

Realtor Associate @ Next Trend Realty LLC | HAR REALTOR, IRS Tax Preparer

1y

Thanks for Sharing.
