Better than AlphaCode?
Dr. Christian Peters
Director, Global Head of Smart Sensors & Hardware Systems at Bosch Research
DeepMind’s latest publication, “Competition-Level Code Generation with AlphaCode”, really got my attention. Recent large-scale models have demonstrated the ability to generate code, but they usually perform poorly on unseen and complex problems. AlphaCode is on the next level: it can generate code for unseen (untrained) problems based only on their natural language descriptions. The authors claim AlphaCode achieved an average ranking in the top 54% in programming competitions on www.Codeforces.com. The natural question is: how would I rank in such a competition?
But first, let’s have a brief look at AlphaCode. The full paper can be found here; it is worth reading, but a bit lengthy. I will only give a brief summary of what I think are the key elements. Behind AlphaCode is a transformer-based language model. The model is pre-trained on GitHub code (approx. 715 GB) and fine-tuned on a dataset of competitive programming problems and solutions. The key is having access to an extensive and clean corpus of competitive programming code. The evaluation was done on truly unseen problems, so a problem cannot be solved by copying code from the training set. Instead, AlphaCode relies on correctly interpreting the natural language description of the problem. Going from “what is the problem” to “how to solve the problem” is a great leap. The problem description is fed into the model, and a large set of candidate solutions is generated. This set is then filtered and clustered to reduce it to fewer than 10 submissions; this technique is one of the core elements of AlphaCode. Tested on Codeforces, AlphaCode ranked within the top 28% of users participating in the contests. Very impressive! The authors showed that AlphaCode is indeed not copying code from the training set: the longest common substrings between the generated code and the training dataset are comparable to those of human solutions. Also, the amount of dead code is similar to what a human programmer would generate.
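The filter-and-cluster step can be illustrated with a small sketch. This is not DeepMind’s implementation; it is a hypothetical Python illustration in which candidate “programs” are stand-ins (plain callables instead of generated source code) and all test data is made up. Filtering keeps only samples that pass the problem’s example tests; clustering groups the survivors by their behavior on extra probe inputs, so only one submission per behavioral cluster is needed.

```python
# Hypothetical sketch of AlphaCode-style candidate selection:
# filter by example tests, then cluster by behavior on probe inputs.

def filter_candidates(candidates, example_tests):
    """Keep only candidates that pass all example input/output pairs."""
    return [p for p in candidates
            if all(p(inp) == out for inp, out in example_tests)]

def cluster_by_behavior(candidates, probe_inputs):
    """Group candidates by their outputs on probe inputs; behaviorally
    equivalent programs land in the same cluster."""
    clusters = {}
    for program in candidates:
        signature = tuple(program(inp) for inp in probe_inputs)
        clusters.setdefault(signature, []).append(program)
    return clusters

def select_submissions(candidates, example_tests, probe_inputs, k=10):
    """Filter, cluster, and pick one representative per cluster (up to k)."""
    survivors = filter_candidates(candidates, example_tests)
    clusters = cluster_by_behavior(survivors, probe_inputs)
    # Prefer larger clusters: agreement among independent samples
    # is weak evidence of correctness.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [group[0] for group in ranked[:k]]

# Toy demo: the "problem" is to double an integer.
candidates = [
    lambda x: x * 2,   # correct
    lambda x: x + x,   # correct, behaviorally identical
    lambda x: x ** 2,  # wrong in general, but passes x=2
    lambda x: 0,       # wrong, filtered out
]
example_tests = [(2, 4)]   # both x*2 and x**2 pass this
probe_inputs = [3, 5]      # separates x*2 from x**2
picked = select_submissions(candidates, example_tests, probe_inputs)
print(len(picked))  # 2 behavioral clusters survive filtering
```

In the paper, the probe inputs are themselves produced by a separate test-input generation model, and larger clusters are submitted first, since many independent samples agreeing on behavior hints at a correct solution.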
What are possible future applications of such a model? I don’t think it will replace humans, but I believe it will make human programmers more efficient. Developers can move further up to higher abstraction levels (just as most of us no longer write assembly). The model can also help optimize code, making it more robust and efficient. Having a model supporting you 24/7 would be quite lovely. Of course, there are also risks. How readable is the generated code? Does it generalize, and would it pass out-of-distribution tests? Another aspect is the environmental impact. Training such an extensive network consumes a tremendous amount of energy. Some models can be executed very efficiently; this is not the case with AlphaCode. Sampling and training require petaFLOPS-days of compute.
Coming back to the question of how good my coding is compared to AlphaCode: you can find all problems from the competition on Codeforces (e.g. 1567E) and upload your own code for evaluation. It is enjoyable to upload your own code, get the results, and compare your solution against AlphaCode’s. Am I better? No, but that was not to be expected either.