Compiler Limitations #3/3
Some examples, before the main point
This list can go on, but hopefully the pattern is clear.
The main point, after some examples
I wholeheartedly recommend this brilliant 2021 keynote by Chris Lattner, the mastermind behind LLVM: The Golden Age of Compilers. He makes a point of repeating the sentence:
Larger center of gravity concentrated scarce compiler engineering effort. Enables innovations in languages, frontends and backends.
(in fact it appears 4 times, check the slides). It is a noble goal and might have been true for a while, but in my experience this definitely isn't the case today. Of all compilers around today, the LLVM suite is by far the most friendly and approachable, and yet, as the examples show, even in LLVM it is extremely hard to push innovations forward. Not just for me, but also for people whose day job is LLVM.
Why is that? I don't know, obviously, but here are some thoughts.
For a while I believed the main issue was one of scale. Concentrating so many engineers around a project risks getting it to the 'mythical man-month' point: not only does it not 'Enable innovations in languages, frontends and backends', it hinders them.
I no longer think that. Today I feel the main factor limiting LLVM progress (and maybe GCC's too) is ownership, specifically in the mid-end. 'Mid-end' is a term for the passes that optimize IR into better IR, sometimes also called the 'Optimizer'. Note that the examples above mostly land there.
Backends have obvious owners (Intel owns the Intel backend, etc.). Somehow front-ends have strong ownership too: Google gives the Clang front-end strong backing, Apple backs Swift, Flang seems to be headed by representatives from ARM/AMD/Huawei, etc.
Not so for the mid-end. As far as I can see, only a handful of people really know their way around the mid-end, and they're spread across academia and US national laboratories. Not only do mid-end patches struggle to get timely reviews, reported issues are largely undiscussed and unassigned (there are 20K+ as of this writing). The few mavens holding the mid-end fort are simply overwhelmed. In one conversation I had with a key figure in this domain about a missed optimization I was trying to draw attention to, he told me outright: fixing this would not result in an academic paper, so he can't assign anyone to it.
I honestly don't understand this.
How is it that Google (for example) has entire teams working on the Clang front-end and is heavily involved in C++ language design, but has relatively little investment in the actual optimizing parts of the compiler? I would think investment in optimization would have a larger, across-the-board impact on their developer productivity, infrastructure utilization and customer experience.
Can you see something I'm missing? Any thoughts are welcome.
Developer Support Engineer @ EngFlow | Bazel Bandit, C++ Connoisseur
1 year ago
This is a brilliant write-up. Thank you for your insights. Very thought-provoking. I think you've nailed some larger issues with large, long-standing open source projects generally.