DeepMind has created an AI system called AlphaCode that it says “writes computer programs at a competitive level.” The Alphabet subsidiary tested its system against coding challenges used in human competitions and found that its program achieved an “approximate rank” placing it within the top 54 percent of participants. The result is a significant step forward for autonomous coding, says DeepMind, though AlphaCode’s skills aren’t necessarily representative of the sort of programming tasks the average coder faces.
Oriol Vinyals, lead researcher at DeepMind, told The Verge over email that the research is still in its early stages, but that the results bring the company closer to creating a flexible problem-solving AI – a program that can autonomously tackle coding challenges that are currently the domain of humans only. “In the long run, we’re excited by [AlphaCode’s] potential for helping programmers and non-programmers write code, improving productivity, or creating new ways of making software,” said Vinyals.
AlphaCode was tested against challenges curated by Codeforces, a competitive coding platform that shares weekly problems and issues rankings for coders similar to the Elo rating system used in chess. These challenges are different from the sort of tasks a coder might face while building, say, a commercial app. They’re more self-contained and require a wider knowledge of both algorithms and theoretical concepts in computer science. Think of them as very specialized puzzles that combine logic, math, and coding expertise.
In one example challenge that AlphaCode was tested on, competitors are asked to find a way to convert one string of random, repeated s and t letters into another string of the same letters using a limited set of inputs. Competitors cannot, for example, simply type in new letters, but instead have to use a backspace command that deletes several letters in the original string. You can read the full description of the challenge below:
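Problems of this shape tend to have short greedy solutions once the right insight is found. Purely as an illustration (the exact rules come from the full problem statement, so the semantics below are an assumption), suppose that instead of typing a character you may press backspace, which skips that character and erases the previously typed one. Whether a target string is reachable can then be decided with a two-pointer scan from the end:

```python
def reachable(s: str, t: str) -> bool:
    """Can typing s, with any keypress optionally replaced by backspace,
    produce t? (Assumed rule: backspace skips the current character and
    deletes the most recently typed one.)"""
    i, j = len(s) - 1, len(t) - 1
    while i >= 0:
        if j >= 0 and s[i] == t[j]:
            # Type s[i]; it supplies the next needed character of t.
            i -= 1
            j -= 1
        else:
            # Press backspace at position i: s[i] is never typed and the
            # previously typed s[i-1] is erased, consuming two characters.
            i -= 2
    return j < 0  # reachable iff every character of t was matched
```

For instance, `reachable("ababa", "ba")` is true (delete the first two characters with one backspace), while `reachable("ababa", "bb")` is false. The point is less the specific puzzle than the flavor: the coding effort is trivial, but spotting the backward-scan argument is the hard part.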
Ten of these challenges were fed into AlphaCode in exactly the same format they’re given to humans. AlphaCode then generated a large number of possible answers and winnowed them down by running the code and checking the output, just as a human competitor might. “The whole process is automatic, without human selection of the best samples,” Eugene Lee and David Choi, co-leads of the AlphaCode project, told The Verge over email.
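The filtering step DeepMind describes – generate many candidate programs, then keep only those whose output matches the problem’s example cases – can be illustrated with a minimal sketch. The `solve(inp)` convention and function names here are hypothetical, not AlphaCode’s actual interface:

```python
def filter_candidates(candidates, examples):
    """Keep only candidate programs that pass every example test.

    candidates: list of Python source strings, each expected to define
                a function solve(inp) -> str (hypothetical convention).
    examples:   list of (input_text, expected_output) pairs.
    """
    survivors = []
    for src in candidates:
        namespace = {}
        try:
            exec(src, namespace)            # "compile" the candidate
            solve = namespace["solve"]
            if all(solve(inp) == out for inp, out in examples):
                survivors.append(src)
        except Exception:
            pass                            # crashing candidates are discarded
    return survivors


good = "def solve(inp):\n    return str(sum(map(int, inp.split())))"
bad = "def solve(inp):\n    return '0'"
print(filter_candidates([good, bad], [("1 2", "3"), ("4 5", "9")]) == [good])
```

A real pipeline would sandbox execution and enforce time limits, but the shape is the same: no human picks the best samples, the example tests do.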
AlphaCode was tested on 10 of these challenges, which were tackled by 5,000 users on the Codeforces site. On average, it ranked within the top 54.3 percent of responses, and DeepMind estimates that this gives it a Codeforces Elo of 1238, placing it within the top 28 percent of users who have competed on the site in the last six months.
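To put that rating in context, Elo ratings translate into expected scores via the standard logistic formula. By that formula, a 1238-rated player facing a 1500-rated opponent would be expected to score roughly 18 percent of the available points:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for a player rated r_a
    against an opponent rated r_b (0.5 means even odds)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


print(round(expected_score(1238, 1500), 2))  # roughly 0.18
```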
“I can safely say the results of AlphaCode exceeded my expectations,” said Codeforces founder Mike Mirzayanov in a statement shared by DeepMind. “I was skeptical because even in simple competitive problems it is often required not only to implement the algorithm, but also (and this is the most difficult part) to invent it. AlphaCode managed to perform at the level of a promising new competitor.”
DeepMind notes that AlphaCode’s current skill set is only currently applicable within the domain of competitive programming, but that its abilities open the door to creating future tools that make programming more accessible and, one day, fully automated.
Many other companies are working on similar applications. Microsoft and the AI lab OpenAI, for example, have adapted the latter’s language-generating program GPT-3 to function as an autocomplete tool that finishes strings of code. (Like GPT-3, AlphaCode is also based on an AI architecture known as a transformer, which is particularly adept at parsing sequential text, both natural language and code.) For the end user, these systems work much like Gmail’s Smart Compose feature, suggesting ways to finish whatever you’re writing.
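The autocomplete behavior can be sketched with a toy frequency model. This is emphatically not how GPT-3 or AlphaCode work internally – they use learned transformer weights, not count tables – but it shows the same interface of suggesting likely continuations:

```python
from collections import Counter, defaultdict


class ToyAutocomplete:
    """Suggest next tokens from bigram counts over a training corpus.
    A toy stand-in for the interface of code-completion tools; real
    systems use large neural models, not frequency tables."""

    def __init__(self, corpus_tokens):
        self.bigrams = defaultdict(Counter)
        for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
            self.bigrams[prev][nxt] += 1

    def suggest(self, token, k=3):
        # Return up to k most frequent continuations of `token`.
        return [t for t, _ in self.bigrams[token].most_common(k)]


ac = ToyAutocomplete("for i in range ( 10 ) :".split())
print(ac.suggest("in"))  # ['range']
```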
A lot of progress has been made developing AI coding systems in recent years, but these systems are far from ready to simply take over the work of human programmers. The code they produce is often buggy, and because the systems are usually trained on libraries of public code, they sometimes reproduce copyrighted material.
In one study of an AI programming tool named Copilot, developed by the code-hosting platform GitHub, researchers found that around 40 percent of its output contained security vulnerabilities. Security analysts have even suggested that bad actors could intentionally write and share code with hidden backdoors online, which could then be used to train AI programs that would insert these errors into future programs.
Challenges like these mean that AI coding systems will likely be integrated slowly into the work of programmers – starting as assistants whose suggestions are treated with suspicion before they’re trusted to carry out work on their own. In other words: they have an apprenticeship to serve. But so far, these programs are learning fast.