How to survive ML research
Image from: https://twitter.com/AnnalsofIM/status/704471427609718784


How (and why?) to stay ahead.

I’ve seen numerous articles about how to “stay ahead” in ML research in the last two years. Notably, the pandemic seems to have had an accelerating effect on the pace of machine learning research, as it has on so many other things. To put this in perspective, in the five years between 2016 and 2021, the number of papers at NeurIPS quadrupled, from a little over 500 in 2016 to more than 2,000 in 2021. More than ever, I find I need the excellent arxiv sanity preserver www.arxiv-sanity-lite.com maintained by Andrej Karpathy just to keep abreast of what is going on across entire fields.

So, how did we get here, and what do I suggest you do to keep up but stay sane? Spoiler alert: the answer depends on what you’re doing with the research!

On reproducibility and perverse incentives

For those of you who might be avid readers of Eliezer Yudkowsky (like me), my central thesis is that the ML research arxiv is not a great map for the territory of ML research. I actually think this problem persists across a lot of science too; I’m not singling out ML as a supervillain.

So, let’s brave the bad bits. Type the words “reproducibility crisis” into Google, and there’s a good chance that if you have a similar search history to mine, Google will autosuggest you complete it with “machine learning”. There’s a recent Nature paper here which provides a handy tour of why ML is particularly susceptible to this, but it’s hardly a problem confined to ML, or even to computer science. In my prior research field of physics, there is a well-known correlation between the impact factor of a given journal and the likelihood of having to withdraw an article (another reason to be relieved that my most impactful articles only made it into Phys. Rev. Letters!). Even a side interest of mine, environmental hydrology, isn’t immune!

It’s worth taking a moment to dig into the reasons for this, because it’s not evil scientists doing things they know to be false to fraudulently claim grants and prizes; it’s much more of an ethical creep and “people respond to incentives” kind of problem. Of course, part of the reason for this reproducibility crisis is that there are rarely any academic plaudits for checking someone else’s results - all the accolades go to the original discovery, and the meticulous, painstaking work of validating that discovery is far from glamorous. Add to that the fact that almost no grants are awarded for this kind of validation work, and you can understand why people might rush to publish before they are sure. Nobody wants to be the person who got scooped.

On top of that, there’s also a tendency to fall into traps that are perhaps better known and better avoided in the biological sciences: p-hacking and cherry picking. There is a plethora of possible datasets on which to test your shiny new model these days, making it hard to know what to do. Should you try out your new idea for a vision transformer on ImageNet or VQA first? Increasingly, I’m afraid to say, the approach of some research groups has been to try everything and see what works, which is akin to designing a new biologically-active molecule that you know must work for something, and trying it as a treatment for all disease. Notice how that sudden change of context made you flinch? Yeah, that’s because in medicine we all know that’s cheating!
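
To make that concrete, here is a minimal, purely illustrative simulation (mine, not from any paper): twenty “models” with identical true accuracy are each scored once on a finite test set, and reporting only the best of the twenty makes pure sampling noise look like progress. Every number in it is made up.

```python
# Illustrative only: why "try everything and report the winner"
# inflates results. We simulate 20 candidate "models" that all have
# IDENTICAL true accuracy (70%), evaluate each on a finite test set,
# and compare an honest single run with the cherry-picked best.
import numpy as np

rng = np.random.default_rng(0)

TRUE_ACCURACY = 0.70   # every "model" is secretly the same
TEST_SET_SIZE = 1000   # finite test sets mean noisy estimates
N_CANDIDATES = 20      # datasets/seeds/architectures we "tried"

# Measured accuracy of each candidate = true skill + sampling noise.
correct = rng.binomial(TEST_SET_SIZE, TRUE_ACCURACY, size=N_CANDIDATES)
measured = correct / TEST_SET_SIZE

print(f"honest single run:  {measured[0]:.3f}")
print(f"cherry-picked best: {measured.max():.3f}")  # reliably higher - "SOTA!"
```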

Bonus - how can we make it better?

As an aside - if we want to improve the reproducibility of ML research, what should we do? Well, for one thing, we should encourage people (and, importantly, pay them) to validate results. The Papers with Code publication approach is not a bad one, but I suspect that papers outside the mainstream of a field published there are unlikely to be checked by anyone. There’s also an issue that many models, especially those published by large research companies, are still inaccessible to most - they tend to rely on very large, private datasets and massive compute resources few can afford (or should use, given the climate crisis!). A move towards data-centric ML can help here, or perhaps a soft limit on the amount of compute organisations can use to train models.

Advice from the front lines.

So, how does this affect you, aspiring or current ML person? Well, it really depends, but it should be clear to you now that (a) you should be wary about the veracity of most claims about machine learning models, and (b) it’s very tough to be sure whether or not something is reliable just from looking at the paper. So, what to do? Well, it depends on the nature of your work.

1) Academic ML researcher.

As an academic researcher, your aim should be to push the boundaries of one, or at most a couple of, fields. As such, you should pick a field that you find particularly interesting (or, in the most pragmatic case, one where there are opportunities and where the thought of spending 8hrs a day on it doesn’t make you want to cry), and focus on that. Don’t get distracted by what is going on in named entity recognition if you’re an image segmentation researcher - if it’s important, you’ll find it when you look into your own area. Don’t feel pressured to keep up with everything; it’s not possible in any area of science now (it’s often said that the last person who knew all of the mathematics of their day was Henri Poincaré, and he died in 1912!). Instead, focus your efforts on getting really good at a specific field, understanding not only the technical solutions but also the data and the human side of problems. By all means read the arxiv most mornings with a coffee to get a sense of what’s out there and where the new frontiers are, but when your coffee is finished, close the tab and get on with your work.

2) Commercial ML researcher.

Here is where I think most of us are. It also seems to be who these “How to stay on top of ML in 2022” articles are written for. But to be totally honest, you don’t need to be on top of ML. Your role is to take ideas from research and turn them into business value. This will probably involve using data from operational databases (warning: unless your business is one of the vanishingly few who take this seriously as the source of ML’s business value, this will likely be a hot mess) to train models. But whatever you do, don’t start with the SOTA 1.3Bn parameter model you read about on the arxiv last week. Your cloud devops team will kill you for blowing the budget, you are pretty unlikely to have enough data for it, and if it goes wrong it’s going to be very difficult to understand why. Start simple. If you can, use linear regression, as in the sketch below. If you can’t, seriously ask why this is vital for the business.
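
As a sketch of what “start simple” might look like in practice (assuming scikit-learn; the CSV path and column names are hypothetical stand-ins for whatever your operational data actually contains):

```python
# A hedged sketch of "start simple": fit a trivial mean predictor and
# a linear baseline before anything fancy. The file "orders.csv" and
# the columns below are hypothetical, not from the original article.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("orders.csv")                        # hypothetical export
X = df[["basket_size", "tenure_days"]]                # hypothetical features
y = df["ltv"]                                         # hypothetical target
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, model in [("mean baseline", DummyRegressor()),
                    ("linear regression", LinearRegression())]:
    model.fit(X_tr, y_tr)
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(f"{name}: MAE = {mae:.2f}")

# If linear regression barely beats predicting the mean, a 1.3Bn
# parameter model is unlikely to be what the business is missing.
```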

In terms of keeping up with the latest developments, my advice would be this - “don’t”. Or at least, don’t do it because it will make you better at your job; it won’t. What will make you better at your job is looking at where the value is in solving a particular problem with ML, understanding a simple approach, and then methodically iterating on that to make it better. Oh, and making sure that what you put out there is well enough monitored that you can say whether a new variation is in fact better.
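
A hedged sketch of what “is the new variation in fact better?” can look like: a paired bootstrap over per-example errors from two model variants on the same held-out data. The error arrays below are fabricated purely for illustration; in practice you would plug in your own monitoring data.

```python
# Minimal sketch: compare two model variants on the SAME held-out
# examples with a paired bootstrap, instead of eyeballing one number.
# The per-example losses below are fabricated for illustration.
import numpy as np

rng = np.random.default_rng(1)
errors_old = rng.normal(1.00, 0.5, size=500)   # stand-in per-example losses
errors_new = rng.normal(0.97, 0.5, size=500)   # looks ~3% better... is it?

diff = errors_old - errors_new                 # paired on the same examples
boot = [rng.choice(diff, size=diff.size, replace=True).mean()
        for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"mean improvement: {diff.mean():.3f}")
print(f"95% bootstrap CI: [{lo:.3f}, {hi:.3f}]")
# If the interval straddles zero, you haven't shown the new variant
# is better - ship the simpler one.
```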

3) Software developer working with ML.

Most software developers working with ML solutions are more than capable of understanding how they work, but again, to do a good job they don’t need to. For the most part, we are talking about large, pre-trained models that have been open-sourced by Google, Facebook, etc., and which are very hard to fine-tune or train yourself. Let’s not get too excited about AGI: a machine learning model, even a massive large language model like GPT-3, is just a function from one complicated space to another. They have a bunch of parameters which you can tune, but ultimately they are just functions. Understanding how the function behaves operationally (e.g. changing the temperature for a large language model) is the most valuable thing you can know. Learn by experience and by “directed play” and you will be an expert in no time. But don’t beat yourself up about not reading papers.
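
To illustrate, temperature is nothing more mysterious than rescaling the model’s logits before the softmax. Here is a minimal sketch; the logit values are made up, and real LLM APIs simply expose this arithmetic as a parameter.

```python
# Minimal sketch of what "temperature" does operationally: it divides
# the model's logits by a constant before the softmax. The logits
# below are fabricated next-token scores for illustration.
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to token probabilities at a given temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()            # subtract max for numerical stability
    exps = np.exp(scaled)
    return exps / exps.sum()

logits = [2.0, 1.0, 0.5, -1.0]        # made-up next-token scores
for t in (0.2, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
# Low temperature -> near-deterministic (mass piles onto the top token);
# high temperature -> flatter distribution, more "creative" sampling.
```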

Outro - A hiring rantifesto.

As an addendum to this - I often see job adverts asking for people who “have publications in NeurIPS” as a requirement or a “nice to have”. To me, a business asking for this is a strong red flag - it says loudly “we want researchers”. But the problem is that researchers are very focussed and knowledgeable in one area - pushing the state of the art in some field of image recognition or natural language processing. If you’re a giant like Google, who have many research groups working on this kind of thing, it makes sense - you have a pre-built community into which academic researchers can slot. However, if you’re not Google or similar-sized, you’re likely artificially narrowing your search and directly competing with the FAANGs for talent you don’t need. In this situation, you’re likely better off taking a punt on a business analyst wanting to try out ML for the first time. They will know where to focus their efforts to best reward the business.

Dr Markus Bernhardt

Leading AI Strategist and Tech Visionary | Advisory | Operations & Change Management | Board Member | International Keynote Speaker & Author

2y

Excellent piece! Thanks for sharing.

Richard Maunder

CTO | CIO | Energy & Climate Tech

2y

Interesting read, makes me feel more relaxed about not following any of this in great detail. The points on Software Dev working with ML are very relevant. One point I'd take issue with: "which is akin to designing a new biologically-active molecule that you know must work for something, and trying it as a treatment for all disease" -> "that’s cheating". For sure, if you cherry-pick a narrow, non-representative trial group etc. then your results are questionable. However, testing medicine randomly is mainly objectionable because it is hugely costly, wastes people's time, and most seriously exposes patients to lots of side effects. Drug repurposing seems to be quite an active field. That is slightly different, as you have something you know is (somewhat) useful in one case and has acceptable side effects, and go looking for other applications. I assume ML is similar, and random application is likely to be fruitless. However, repurposing might be similarly useful? The main issue, as you point out, is that people tend to only report their successes. There is little incentive to say we used this ML algo. on 8 datasets and only one produced promising results. The tendency is to pretend the other 7 never happened.

Robert Hardman

Experienced CTO/CDIO & AI Leader | Driving Innovation | Challenging The Conventional

2y

Challenge the orthodox, look outside our own bubbles of comfort, attack dogma and respect the "work", not the reputation.

Julian - Thomas Erdoedy

there is beauty in methodical thought

2y

great read!
