
A Million Monkeys On Typewriters – What Traditional PhDs Missed About Machine Learning

Uri Pomerantz

Applying AI and fintech for good | Entrepreneur | Investor | Dad | Writer

Published: May 29, 2024


Introduction

In this blog post, I want to explore what traditional PhDs got wrong about machine learning over the last few decades.

This involves a bit of throwing academia under the bus, but it's crucial to explain how fundamental research is done in academia and what was missed in terms of the breakthroughs in AI.

This insight is valuable whether you're involved in venture investing or running a startup.

My Experience at Harvard & The Traditional Academic Approach to Research

Years ago, I was a grad student at Harvard studying advanced econometrics.

My studies included statistics, econometrics, and both micro and macro theory, essentially the first two years of a PhD in economics.

In this academic setting, you start with a very careful thesis of the world and then run statistical models to prove whatever you're out to demonstrate.

For example, you might investigate the biggest drivers of attainment from an educational program: is it the quality of the schools, the teacher, the location, the materials, or the availability of a school lunch program?

You then create randomized trials, ideally triple blind, and develop a very careful theory of the world.

This theory involves independent variables that should have some type of connection, ideally a causal connection, with a dependent variable that you're trying to predict.

Pre-baked Regression Superstar

This research process often takes the shape of some type of regression.

You carefully track and measure a set of different variables, trying to create some kind of instrumental variable or connection to a dependent variable.

You predict something based on your theory of the world, carry out a carefully planned experiment, do a whole bunch of statistical cleanup, and arrive at a conclusion.

You present it, get it peer-reviewed, published, and then syndicate.
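To make the regression step concrete, here is a minimal sketch of the simplest case: a one-variable ordinary least squares fit computed in closed form. The variable names and the numbers are made up purely for illustration (tutoring hours and test scores are assumptions, not data from any study), and a real econometric analysis would add controls, standard errors, and significance tests.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: the hypothesis is stated first, then tested against it.
tutoring_hours = [0, 1, 2, 3, 4, 5]      # independent variable (made up)
test_scores = [52, 55, 61, 64, 70, 73]   # dependent variable (made up)

intercept, slope = fit_simple_ols(tutoring_hours, test_scores)
fit_quality = r_squared(tutoring_hours, test_scores, intercept, slope)
print(f"score = {intercept:.2f} + {slope:.2f} * hours, R^2 = {fit_quality:.3f}")
```

The point of the sketch is the order of operations: the model form (a straight line through one chosen variable) is fixed by the researcher's theory before any data is touched.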

The Evolution of Machine Learning

In contrast, the modern approach, particularly in machine learning, involves a lot more guesswork and fine, continuous tweaks at scale.

This process can be likened to having a million monkeys at typewriters miraculously creating a beautifully published Encyclopedia Britannica through a bit of training and sheer brute force.

This is essentially how machine learning works: guesswork and fine continuous tweaks (aimed at minimizing whatever loss function you define) until brilliance emerges.
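The "guess, tweak, keep what lowers the loss" loop can be sketched in a few lines. This is a toy illustration under stated assumptions: the data is made up (points on the line y = 3 + 2x), the model is a two-parameter line, and the function names are hypothetical. It uses purely random tweaks, which is not how production models are trained.

```python
import random


def loss(weights, data):
    """Mean squared error of the linear model y = w0 + w1 * x."""
    w0, w1 = weights
    return sum((w0 + w1 * x - y) ** 2 for x, y in data) / len(data)


def random_tweak_search(data, steps=5000, step_size=0.1, seed=0):
    """Randomly nudge the weights; keep a nudge only if the loss goes down."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    best = loss(weights, data)
    for _ in range(steps):
        candidate = [w + rng.gauss(0.0, step_size) for w in weights]
        candidate_loss = loss(candidate, data)
        if candidate_loss < best:
            weights, best = candidate, candidate_loss
    return weights, best


# Made-up, noise-free data generated from y = 3 + 2x.
data = [(x, 3.0 + 2.0 * x) for x in range(-5, 6)]

start_loss = loss([0.0, 0.0], data)
weights, final_loss = random_tweak_search(data)
print(f"loss went from {start_loss:.2f} to {final_loss:.6f}; weights = {weights}")
```

No theory of the world goes in: the loop never "knows" the data came from a line with intercept 3 and slope 2. It only knows which guesses scored better.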


Why Does This Work?

Machine learning involves predicting some kind of outcome using features or variables.

The process, at a high level for supervised learning, involves five steps:

1. Great Data Set: Quality data is fundamental. You need great data to get anything of value out.
2. Standard Splitting: Split the data into a training set and a testing set.
3. Running the Model: Use the algorithm of your choice. Pull an XGBoost, Random Forest, or standard neural network off your proverbial shelf from many great open-source libraries.
4. Adjusting the Model: Make small tweaks to hyperparameters (or better yet, use something like GridSearchCV and have the computer do it for you).
5. Repeat and Tune: Iterate until you're happy with the results.
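The steps above can be sketched end to end using only the Python standard library. Several things here are assumptions made for illustration: the dataset is synthetic, a hand-written k-nearest-neighbours classifier stands in for XGBoost or a Random Forest, and a plain loop over candidate values stands in for GridSearchCV. The sketch also adds a validation split so that the test set stays untouched during tuning, which is a choice of this example rather than something the steps above specify.

```python
import random


def make_dataset(n=300, seed=0):
    """Step 1 (made-up data): label is 1 when a point lies above the line x2 = x1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append(((x1, x2), 1 if x2 > x1 else 0))
    return rows


def split(rows, seed=0):
    """Step 2: shuffle, then split 60/20/20 into train, validation, and test."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * 6 // 10
    n_val = len(rows) * 2 // 10
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def knn_predict(train, point, k):
    """Step 3: majority vote among the k training points closest to `point`."""
    def squared_distance(row):
        (a, b), _ = row
        return (a - point[0]) ** 2 + (b - point[1]) ** 2

    nearest = sorted(train, key=squared_distance)[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, rows, k):
    """Fraction of `rows` whose label the k-NN model predicts correctly."""
    return sum(knn_predict(train, p, k) == label for p, label in rows) / len(rows)


rows = make_dataset()
train, val, test = split(rows)

# Steps 4 and 5: try each candidate hyperparameter, keep the best on validation.
grid = [1, 3, 5, 7, 9, 15]
scores = {k: accuracy(train, val, k) for k in grid}
best_k = max(scores, key=scores.get)

# Final check on data the tuning loop never saw.
test_accuracy = accuracy(train, test, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

With a library such as scikit-learn, the same shape applies: the hand-written pieces would be replaced by its splitting, model, and grid-search utilities.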

Implications for Venture Investors and Executives

The Importance of Data

First, it's all about the data.

With bigger models and larger context windows, what you feed into the model matters immensely. Whether it's your code base, every Slack message from your company, emails, customer interactions, or videos, the quality of data determines the quality of your result.

This is why, for example, the Reddit IPO popped: investors were reacting in part to a significant data-licensing contract with Google, which lets Google use Reddit content to train its AI models.

The Future: Synthetic Data

Looking ahead, venture investors should consider the opportunities in synthetic data, where we manufacture data and build models that can more effectively train themselves. This can speed up training and improve results further.

Broader Societal Implications

For broader society, it's all about encouraging access. We're starting to see the first performant open-source models, which hopefully will become the norm. With this technology becoming free to access for standard consumer use cases, countries and companies can gain significant advantages.

The Role of Smart Government

Countries thinking about building sovereign clouds can look to modern equivalents of Singapore, which transformed from a poor country in the 1960s to one of the wealthiest per capita through a focus on services and the knowledge economy.

There's a significant play for smart government and companies training their employees, building the right incentives, and encouraging open-source software and accessibility (just ask Larry Ellison and some of the leadership at Oracle about how they see the future unfolding).

Conclusion

We are living in a world where proverbial million monkeys are producing brilliance.

Now, in practice, you'll be using things like an optimized backpropagation algorithm (using calculated gradients rather than random values) to speed up the training process and get better results with less compute and time.

But in essence, you could in principle reach similar results by randomly tweaking the parameters yourself and keeping only the tweaks that lower the loss (again, this is impractically slow if you've got hundreds of billions of parameters).
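The gap between calculated gradients and random tweaks shows up even on a tiny problem. The sketch below is an illustration under stated assumptions: a made-up 10-parameter linear model, synthetic noise-free data, hand-picked step sizes, and hypothetical function names. It runs plain gradient descent rather than backpropagation through a neural network, and gives both methods the same number of steps. Each gradient step costs more than a single loss evaluation, so equal step counts are not a strict compute comparison.

```python
import random


def make_data(n_features=10, n_rows=50, seed=0):
    """Made-up regression data: targets are an exact linear function of the inputs."""
    rng = random.Random(seed)
    true_w = [rng.uniform(-1, 1) for _ in range(n_features)]
    xs = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    ys = [sum(w * x for w, x in zip(true_w, row)) for row in xs]
    return xs, ys


def predict(w, row):
    return sum(wi * xi for wi, xi in zip(w, row))


def mse(w, xs, ys):
    """Mean squared error: the loss both methods try to minimize."""
    return sum((predict(w, row) - y) ** 2 for row, y in zip(xs, ys)) / len(ys)


def gradient(w, xs, ys):
    """Exact gradient of the mean squared error with respect to each weight."""
    grad = [0.0] * len(w)
    for row, y in zip(xs, ys):
        err = predict(w, row) - y
        for j, xj in enumerate(row):
            grad[j] += 2 * err * xj / len(ys)
    return grad


def gradient_descent(xs, ys, steps, lr=0.1):
    """Move every weight a small step against its calculated gradient."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        g = gradient(w, xs, ys)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return mse(w, xs, ys)


def random_tweaks(xs, ys, steps, step_size=0.05, seed=1):
    """Nudge every weight at random; keep the nudge only if the loss drops."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    best = mse(w, xs, ys)
    for _ in range(steps):
        candidate = [wi + rng.gauss(0, step_size) for wi in w]
        candidate_loss = mse(candidate, xs, ys)
        if candidate_loss < best:
            w, best = candidate, candidate_loss
    return best


xs, ys = make_data()
start_loss = mse([0.0] * len(xs[0]), xs, ys)
gd_loss = gradient_descent(xs, ys, steps=200)
random_loss = random_tweaks(xs, ys, steps=200)
print(f"start={start_loss:.4f} gradient descent={gd_loss:.2e} random tweaks={random_loss:.4f}")
```

Both methods reduce the loss, but the random version wastes most of its guesses, and that waste grows with the number of parameters. That is the practical reason gradients are used at scale.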

Overall, with quality data and adequate compute power, you can achieve strong results, and you don't need a PhD or a carefully pre-crafted theory of the world for your machine learning work to bear fruit.


