Reevaluating Google’s Reinforcement Learning for IC Macro Placement

A Nature paper from Google making revolutionary claims about AI-enabled chip design was heralded as a breakthrough in the popular press, but it was met with skepticism from domain experts for being too good to be true and for lacking reproducible evidence.

Now, crosschecked data indicate that the integrity of the Nature paper is substantially undermined owing to errors in conduct, analysis, and reporting. Independently, detailed allegations of fraud and research misconduct in the Google Nature paper have been filed under oath in California.

In this edition of "Advances in Computing," computer scientist Igor Markov discusses the reproduction and evaluation of results in the 2021 paper, as well as the validity of methods, results, and claims.

Also featured in this issue: two research articles on regulatable AI systems and human-centered cybersecurity, as well as selected stories from the ACM magazines Interactions and XRDS.

Enjoy!


Reevaluating Google’s Reinforcement Learning for IC Macro Placement

A 2021 paper in Nature by Mirhoseini et al. about the use of reinforcement learning (RL) in the physical design of silicon chips raised eyebrows, drew critical media coverage, and stirred up controversy due to poorly documented claims. The paper, authored by Google researchers, withheld critical methodological steps and most of the inputs needed to reproduce its results. Our meta-analysis shows how two separate evaluations filled in the gaps and demonstrated that Google RL lags behind human chip designers, a well-known algorithm (simulated annealing), and generally available commercial software, while also being slower. Crosschecked data indicate that the integrity of the Nature paper is substantially undermined, owing to errors in conduct, analysis, and reporting. Before publishing, Google rebuffed internal allegations of fraud, which still stand. We note policy implications.

As AI applications demand greater compute power, efficiency may be improved via better chip design. The Nature paper was advertised as a chip-design breakthrough using machine learning (ML). It addressed a challenging problem: optimizing the locations of circuit components on a chip. It described applications to five tensor processing unit (TPU) chip blocks, implying that no better methods were available at the time in academia or industry, and it generalized the claims beyond chip design to suggest that RL outperforms the state of the art in combinatorial optimization.

“Extraordinary claims require extraordinary evidence” (per Carl Sagan), but the paper lacked results on public test examples (benchmarks) and did not share the proprietary TPU chip blocks used. Source code—released seven months after publication to support the paper’s findings after the initial controversy—was missing key parts needed to reproduce the methods and results (as explained in Cheng et al. and Goth). More than a dozen researchers from Google and academia questioned the claims of Mirhoseini et al., performed experiments, and raised concerns about the reported research. Google engineers have updated their open-source code many times since, filling in some missing pieces but not all. The single open-source chip-design example in the Google repository does not clearly show strong performance of Google’s RL code.

Apparently, the only openly claimed reproduction of the techniques in Mirhoseini et al. independent of Google was developed in Fall 2022 by UCSD researchers. They reverse-engineered key components missing from Google’s open-source code and fully reimplemented the simulated annealing (SA) baseline absent from that code. Google released none of the proprietary TPU chip-design blocks used in Mirhoseini et al. (nor sanitized equivalents), ruling out full external reproduction of results. So the UCSD team shared their experiments on modern, public chip designs: both SA and commercial electronic design automation (EDA) tools outperformed Google’s RL code.
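For readers unfamiliar with the baselines at stake, the sketch below illustrates how a simulated-annealing placer works in principle. It is a toy illustration, not code from Google or UCSD: the two-pin nets, the single-macro move set, the geometric cooling schedule, and the half-perimeter wirelength (HPWL) objective are all simplifying assumptions, whereas real macro placers optimize routed wirelength, density, and congestion under legality constraints.

```python
# A minimal sketch of simulated annealing (SA) for placement. Illustrative
# assumptions throughout: three movable macros on a grid, hypothetical
# two-pin nets, and HPWL as the cost (a standard proxy for wirelength).
import math
import random

nets = [(0, 1), (1, 2), (0, 2)]  # hypothetical two-pin nets between 3 macros

def hpwl(placement):
    """Total half-perimeter wirelength over all nets."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in nets)

def neighbor(placement):
    """Perturb one randomly chosen macro by at most one grid step per axis."""
    new = dict(placement)
    m = random.choice(list(new))
    x, y = new[m]
    new[m] = (x + random.randint(-1, 1), y + random.randint(-1, 1))
    return new

def anneal(state, cost, steps=20000, t_start=5.0, t_end=1e-3):
    """Generic SA loop: accept worsening moves with probability exp(-delta/T)."""
    cur_cost = cost(state)
    best, best_cost = state, cur_cost
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / steps)  # geometric cooling
        cand = neighbor(state)
        delta = cost(cand) - cur_cost
        if delta <= 0 or random.random() < math.exp(-delta / t):
            state, cur_cost = cand, cur_cost + delta
            if cur_cost < best_cost:
                best, best_cost = state, cur_cost
    return best, best_cost

print(anneal({0: (0, 0), 1: (8, 5), 2: (3, 9)}, hpwl))
```

Even this toy shows why SA is a natural baseline: it requires no training data and directly optimizes the placement objective, which is why its omission from the released code mattered to evaluators.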

Reporters from The New York Times and Reuters covered this controversy in 2022 and found that, well before the Nature submission, several Google researchers who had been tasked with checking the paper’s claims disputed them. The paper’s two lead authors complained of persistent allegations of fraud in their research. In 2022, Google fired the internal whistleblower and denied publication approval for a paper, written by Google researchers, critical of Mirhoseini et al. The whistleblower sued Google for wrongful termination under California whistleblower-protection laws: court documents, filed under penalty of perjury, detail allegations of fraud and scientific misconduct related to the research in Mirhoseini et al. The 2021 Nature News & Views article introducing the paper in the same issue urged replication of the paper’s results. Given the obstacles to replication and the results of replication attempts [11], the author of the News & Views article retracted it. On Sept. 20, 2023, Nature added an online Editor’s Note to the paper:

“Editor’s Note: Readers are alerted that the performance claims in this article have been called into question. The Editors are investigating these concerns, and, if appropriate, editorial action will be taken once this investigation is complete.”

A year later (late September 2024), as this article goes to print, the Editor’s Note has been removed from the Nature article, but an authors’ addendum has appeared. This addendum largely repeats the arguments from an earlier statement discussed in the section on the authors’ response to critiques. There is little for us to modify in this article: none of the major concerns about the Nature paper have been addressed. In particular, “results” on one additional proprietary TPU block with undisclosed statistics do not support any substantiated conclusions; this only aggravates concerns about cherry-picking and misreporting. The release of a pre-trained model without information about the pre-training data aggravates concerns about data contamination: any circuit could have been used in pre-training and then in testing. We do not comment on the recent Google blog post, except to note that it repeats the demonstrably false claim of a full source-code release that allows one to reproduce the results in the Nature paper. Among other pieces, the source code for SA is missing; additionally, the Nature results cannot be reproduced without the proprietary training and test data.

This article first covers the background and the chip-design task solved in the Nature paper and then introduces the secondary sources used. Next, the article lists initial suspicions about the paper and shows that many of them were later confirmed. The article then checks whether Mirhoseini et al. improved the state of the art, outlines how the authors responded, and discusses possible uses of the work in practice. Finally, the article draws conclusions and notes policy implications.

Read the rest of the article here.



More like this:

Directions of Technical Innovation for Regulatable AI Systems

In reviewing the technical criteria in two regulatory frameworks—the Canadian Directive on Automated Decision-Making and the World Economic Forum’s AI Procurement in a Box—a group of researchers from the National University of Singapore found that while many metrics already exist for evaluating data and model quality, many technical gaps remain.

Human-Centered Cybersecurity Revisited: From Enemies to Partners

A group of researchers from ETH Zurich and the ZHAW School of Management and Law propose viewing humans as partners, not only focusing on errors and incidents but also holistically analyzing and supporting human contributions to cybersecurity.


All About HCI:

Why the HCI Community Should Care About AR Journalism

Read why Vicky McArthur thinks the interdisciplinary perspectives provided by HCI can be very useful in empowering journalists to define AR journalism on their own terms.

Adventures in AI Wonderland: How Children Are Shaping the Future of AI

What do children think about AI? Should they, or even we, be afraid? Learn more from this piece by Eliza Kosoy, Emily Rose Reagan, and Soojin Jeong.


Computing for Students:

What Can AI Ethics Learn from Anarchism?

Five anarchist principles that can benefit AI Ethics.

On the Promising Path of Making Education Effective for Every Student

Modern machine learning algorithms promise to adapt to and personalize learning for each student. In this piece, Allen Nie tries to answer: Can we optimize student learning via AI, and what are the harms and benefits of introducing it into the classroom?


Unlock our past editions:

AI Should Challenge, Not Obey

The Myth of the Coder

Summer Special Issue: Research and Opinions from Latin America

New Metrics for a Changing World

Do Know Harm: Considering the Ethics of Online Community Research

Now, Later, and Lasting

The Science of Detecting LLM-Generated Text

Can Machines Be in Language?


Enjoyed this newsletter?

Subscribe now to keep up to speed with what's happening in computing. The articles featured in this edition are from CACM, ACM's flagship magazine on computing and information technology; Interactions, a home for research related to human-computer interaction (HCI), user experience (UX), and all related disciplines; and XRDS, ACM's magazine for students. If you are new to ACM, consider following us on X | IG | FB | YT | Mastodon. See you next time!
