Coding Made AI—Now How Will AI Unmake Coding?

Are coders doomed? That question has been bouncing around computer programming communities ever since OpenAI’s large language model, GPT-3, surprised everyone with its ability to create HTML websites from simple written instructions.

In the months since, rapid-fire advances have produced systems that can write complete, albeit simple, computer programs from natural language descriptions (spoken or written human language), as well as automated coding assistants that speed the work of computer programmers. How far will artificial intelligence go in replacing or augmenting the work of human coders?

According to the experts IEEE Spectrum consulted, the bad news is that coding as we know it may indeed be doomed. But the good news is that computer programming and software development appear poised to remain a very human endeavor for the foreseeable future. In the meantime, AI-powered automated code generation will increasingly speed software development by allowing more code to be written in less time.

“I don’t believe AI is anywhere near replacing human developers,” said Vasi Philomin, Amazon’s VP for AI services, adding that AI tools will free coders from routine tasks, but the creative work of computer programming will remain.

If someone wants to become a developer, say 10 years down the line, they won’t necessarily need to learn a programming language. Instead, they will need to understand the semantics, concepts, and logical sequences of building a computer program. That will open software development to a much broader population.

When the programming of electronic computers began in the 1940s, programmers wrote in numerical machine code. It wasn’t until the mid-1950s that Grace Hopper and her team at the computer company Remington Rand developed FLOW-MATIC, which allowed programmers to use a limited English vocabulary to write programs.

Since then, programming has climbed a ladder of increasingly efficient languages that allow programmers to be more productive.

AI-written code is the cutting edge of a broader movement to allow people to write software without having to code at all. Already, with platforms like Akkio, people can build machine-learning models with simple drag, drop, and button-click features. Users of Microsoft’s Power Platform, which includes a family of low-code products, can generate simple applications by just describing them.

In June, Amazon released CodeWhisperer, a coding assistant for programmers, like GitHub’s Copilot, which was first released in limited preview in June 2021. Both tools are based on large language models that have been trained on massive code repositories. Both offer autocomplete suggestions as a programmer writes code, or suggest executable instructions from simple natural language phrases.

A GitHub survey of 2,000 developers found that Copilot cuts in half the time it takes to complete certain coding tasks and raises developers’ overall satisfaction with their work.

The problem is teaching the intent to the computer. Software requirements are usually vague, while natural language is notoriously imprecise.

“To resolve all these ambiguities in English written specification, there needs to be some incremental refinement, some conversation between the human and the machine,” said Peter Schrammel, cofounder of Diffblue, which automates the writing of unit tests for Java.

To address these problems, researchers at Microsoft have recently proposed adding a feedback mechanism to LLM-based code generation so that the computer asks the programmer for clarification of any ambiguities before generating code.

The interactive system, called TiCoder, follows a workflow the researchers call “test-driven user-intent formalization,” which uses iterative feedback to pin down the programmer’s algorithmic intent and then generates code that is consistent with the expressed intentions.

According to their paper, TiCoder raises the accuracy of automatically generated code from 48 percent to as much as 85 percent when evaluated on the Mostly Basic Programming Problems (MBPP) benchmark. MBPP, meant to evaluate machine-generated code, consists of around 1,000 crowd-sourced Python programming problems designed to be solvable by entry-level programmers.
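The feedback loop described above can be sketched roughly as follows. This is a toy illustration in Python, not TiCoder's actual implementation: the three candidate functions, the proposed tests, and the `ask_user` callback are all hypothetical stand-ins for what a language model and a real programmer would supply. The idea shown is only that each answer from the user about a proposed test prunes the candidates that disagree with it.

```python
# Toy sketch of test-driven intent refinement (all names are hypothetical).

# Candidates a model might return for the vague prompt
# "remove duplicates from a list".
def dedupe_keep_order(xs):
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def dedupe_sorted(xs):
    return sorted(set(xs))


def drop_repeated_values(xs):
    # Removes every value that occurs more than once.
    return [x for x in xs if xs.count(x) == 1]


def passes(candidate, test):
    """Return True if the candidate produces the test's expected output."""
    test_input, expected = test
    try:
        return candidate(list(test_input)) == expected
    except Exception:
        return False


def refine(candidates, proposed_tests, ask_user):
    """Ask the user about each proposed test and prune candidates accordingly."""
    remaining = list(candidates)
    for test in proposed_tests:
        if len(remaining) <= 1:
            break
        if ask_user(test):
            # The test matches the user's intent: keep candidates that pass it.
            remaining = [c for c in remaining if passes(c, test)]
        else:
            # The test contradicts the intent: drop candidates that pass it.
            remaining = [c for c in remaining if not passes(c, test)]
    return remaining


# Simulated user whose intent is "keep first occurrences, preserve order".
def simulated_user(test):
    test_input, expected = test
    return dedupe_keep_order(list(test_input)) == expected


proposed_tests = [
    ([3, 1, 3, 2], [1, 2, 3]),  # describes sorted output
    ([3, 1, 3, 2], [3, 1, 2]),  # describes order-preserving output
]
survivors = refine(
    [dedupe_keep_order, dedupe_sorted, drop_repeated_values],
    proposed_tests,
    simulated_user,
)
print([c.__name__ for c in survivors])
```

In this sketch, two yes-or-no answers are enough to separate three readings of the same English sentence, which is the kind of ambiguity the clarifying questions are meant to resolve.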

A unit of code, which can be hundreds of lines long, is the smallest part of a program that can be maintained and executed independently. A suite of unit tests, typically consisting of dozens of unit tests, each of them between 10 and 20 lines of code, checks that the unit executes as intended, so that when you stack the units together, the program works as intended.

Unit tests are useful for debugging individual functions and for detecting errors when code is manually changed. But a unit test can also serve as the specification for the unit of code and can guide programmers toward clean, bug-free code. While not many programmers pursue true test-driven development, in which the unit tests are written first, unit tests and units are generally written together.
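To make the idea of a test doubling as a specification concrete, here is a small sketch using Python's standard `unittest` module. The `slugify` function and its expected behavior are invented for illustration; the three tests state what the unit should do, and the implementation is simply one unit that satisfies them.

```python
import re
import unittest


def slugify(title: str) -> str:
    """Hypothetical unit under test: turn a title into a URL-friendly slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifySpec(unittest.TestCase):
    """Each test names one requirement the unit has to meet."""

    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_drops_punctuation(self):
        self.assertEqual(slugify("AI: Friend or Foe?"), "ai-friend-or-foe")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")
```

Saved in a file, the suite can be run with `python -m unittest`. If someone later changes `slugify` by hand and breaks one of these behaviors, the corresponding test fails and points to the requirement that was violated.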

According to a survey by Diffblue, developers spend roughly 35 percent of their time writing quality-control tests (as opposed to writing code destined for production use), so there are significant productivity gains to be made just by automating a part of this.

Meanwhile, GitHub’s Copilot, Amazon’s CodeWhisperer, and other AI programming assistant packages can be used as interactive autocompletion tools for writing unit tests. The programmer is given suggestions and picks the one that they think will work best. Diffblue’s system, called Diffblue Cover, by contrast uses reinforcement learning to write unit tests automatically, with no human intervention.

Earlier this year, Google’s UK-based artificial intelligence lab, DeepMind, went further in fully automatic code generation with AlphaCode, a large language model that can write simple computer programs from natural-language instructions.

AlphaCode uses an encoder-decoder Transformer architecture, first encoding the natural language description of the problem and then decoding the resulting vector into code for a solution.

The model was first trained on code from public GitHub repositories until it was able to produce reasonable-looking code.

To fine-tune the model, DeepMind used 15,000 pairs of natural-language problem descriptions and successful code solutions from past coding competitions to create a specialized data set of input-output examples.

Once AlphaCode was trained and tuned, it was tested against problems it hadn’t seen before.

The final step was to generate many solutions and then use a filtering algorithm to select the best one. “We created many different program possibilities by essentially sampling the language model almost a million times,” said Oriol Vinyals, who leads DeepMind’s deep learning team.

To optimize the sample-selection process, DeepMind uses a clustering algorithm to divide the solutions into groups. The clustering process tends to group the working solutions together, making it easier to find a small set of candidates that are likely to work as well as those written by human programmers.
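The sample, filter, and cluster steps described above might look something like this in miniature. It is a sketch under heavy assumptions, not DeepMind's code: four hand-written functions stand in for the huge number of programs sampled from the model, the example test and probe inputs are made up, and grouping programs by identical outputs on the probe inputs is just one plausible way to cluster them.

```python
from collections import defaultdict

# Stand-ins for sampled programs that all attempt "sum the integers 1..n".
def closed_form(n):
    return n * (n + 1) // 2


def loop_sum(n):
    return sum(range(1, n + 1))


def off_by_one(n):
    return sum(range(1, n))  # bug: leaves out n


def square(n):
    return n * n  # wrong in general, but agrees with the example for n = 1


def filter_by_examples(programs, examples):
    """Keep only programs that reproduce every example from the problem statement."""
    return [p for p in programs if all(p(x) == y for x, y in examples)]


def cluster_by_behavior(programs, probe_inputs):
    """Group programs that give identical outputs on all probe inputs."""
    clusters = defaultdict(list)
    for p in programs:
        signature = tuple(p(x) for x in probe_inputs)
        clusters[signature].append(p)
    # Largest clusters first.
    return sorted(clusters.values(), key=len, reverse=True)


def select(programs, examples, probe_inputs, k):
    """Filter, cluster, then take one representative from each of the k largest clusters."""
    survivors = filter_by_examples(programs, examples)
    clusters = cluster_by_behavior(survivors, probe_inputs)
    return [cluster[0] for cluster in clusters[:k]]


samples = [closed_form, loop_sum, off_by_one, square]
examples = [(1, 1)]          # the single worked example given with the problem
probe_inputs = [2, 5, 10]    # extra inputs used only to compare behavior

chosen = select(samples, examples, probe_inputs, k=1)
print([p.__name__ for p in chosen])
```

Here the example test removes `off_by_one`, and the two correct programs land in the same cluster because they behave identically, so the largest cluster is the one worth submitting from. The wrong `square` program survives filtering but ends up alone in a smaller cluster.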

To test the system, DeepMind submitted 10 AlphaCode-written programs to a human coding competition on the popular Codeforces platform, where its solutions ranked among the top 54 percent.

“To generate a program, will you just write it in natural language, no coding required, and then the solution comes out at the other end?” Oriol Vinyals, who leads DeepMind’s deep learning team, asked rhetorically in a recent interview. “I believe so.”

Vinyals and others caution that it will take time, possibly decades, to reach that goal. “We are still very far away from when a person would be able to tell a computer about the requirements for an arbitrarily complex computer program, and have that automatically get coded,” said Andrew Ng, a founder and CEO of Landing AI who is an AI pioneer and founding lead of Google Brain.

But given the speed at which AI-code generation has advanced in a few short years, it seems inevitable that AI systems will eventually be able to write code from natural language instructions. Hand-coding software programs will increasingly be like hand-knitting sweaters.

To give natural language instructions to a computer, developers will still need to understand some concepts of logic and functions and how to structure things. They will still need to study foundational programming, even if they don’t learn specific programming languages or write in computer code. That will, in turn, enable a wider range of programmers to create more and more varied kinds of software.

“I don’t believe AI is anywhere near replacing human developers,” Amazon’s Philomin said. “It will remove the mundane, boilerplate stuff that people have to do, and they can focus on higher-value things.”

Diffblue’s Schrammel agrees that AI-automated code generation will allow software developers to focus on more difficult and creative tasks. But, he adds, there will need to be at least one interaction with a human to confirm that what the machine has understood is what the human intended.

“Software developers will not lose their jobs because an automation tool replaces them,” he said. “There always will be more software that needs to be written.”

Source: IEEE Spectrum Computing