Industry Out of Phase With Supercomputers

Technical and economic changes in the semiconductor industry threaten to stifle U.S. development of the next generation of high-performance computers, warns a new report from the National Research Council.

With Moore’s Law waning and transistor scaling slowing, the industry is turning to chip designs that don’t suit the supercomputing behind massive simulations. The report focuses on defense uses, modeling the physics of nuclear weapons, but the changes would also affect other simulations, including those used for climate modeling and weather forecasting.

The National Nuclear Security Administration, responsible for the U.S. nuclear stockpile, “needs to fundamentally rethink its advanced computing research, engineering, acquisition, deployment, and partnership strategy,” warns the report.

NNSA has developed massive, sophisticated codes that run on supercomputers to verify the continued security and performance of nuclear weapons designed decades ago. Keeping those assessments up to date requires new generations of supercomputers that can run more complex models faster than the months such runs take on today’s machines. But industry, which has shelled out big bucks for state-of-the-art fabs, is targeting big, profitable markets like cloud computing.

Nuclear weapon designers used computers to understand the physics of nuclear weapons long before the U.S. stopped underground nuclear testing in 1992. Since then, powerful computer models have been their primary tools for maintaining the country’s nuclear capability via NNSA’s Stockpile Stewardship program.

Federal spending on supercomputers for the weapons program complemented industry investment in chip production for decades. The most powerful machine currently in operation is Frontier, which began running last year at Oak Ridge National Laboratory in Tennessee. It can perform 10¹⁸ (a quintillion) floating-point operations per second (flops), making it the first “exascale” computer; custom-built by Cray, it can in theory reach 2 exaflops. Cray is building another exascale computer for deployment at Lawrence Livermore National Laboratory in California.
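
To put those speeds in perspective, here is a back-of-the-envelope sketch in Python of how run time scales with machine speed. The workload total is a hypothetical number chosen purely for illustration; it is not a figure from the report.

```python
# Back-of-the-envelope sketch: how long a fixed simulation workload takes
# at different sustained machine speeds. The workload total is hypothetical,
# chosen only to illustrate the scaling, not a figure from the report.
PETAFLOPS = 1e15  # floating-point operations per second
EXAFLOPS = 1e18

work = 3e24  # hypothetical total operations for one simulation campaign

for name, rate in [
    ("petascale (1 petaflop/s)", PETAFLOPS),
    ("exascale (1 exaflop/s)", EXAFLOPS),
    ("Frontier peak (2 exaflop/s)", 2 * EXAFLOPS),
]:
    days = work / rate / 86_400  # 86,400 seconds per day
    print(f"{name}: {days:,.1f} days")
```

Under those assumed numbers, the same campaign shrinks from roughly 95 years at petascale to about a month at exascale, the kind of gap that separates a usable design tool from an unusable one.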

But those easy days are over, says Kathy Yelick of the University of California, Berkeley, who chaired the panel that wrote the NRC report. “The NNSA has had a really successful run over the last 30 years with a combination of high-end computing facilities and expertise in computational science that make its labs a critical national resource,” she said at a 14 April online press conference. Beyond the technology challenges, she added, “the rapidly evolving geopolitical situation… reinforces the need for computing leadership as an element of deterrence.”

Industry trends are worrying. Most semiconductor manufacturing has moved outside the United States, and only a single domestic supercomputer developer remains following Hewlett Packard Enterprise’s 2019 purchase of Cray. Industry is developing technology for high-volume markets like cloud computing, which won’t transfer easily to the much smaller supercomputing market. Meanwhile, the hot technology frontiers are artificial intelligence and quantum computing.

“Business as usual will not be adequate” for NNSA, the report says. The agency needs an aggressive roadmap for developing new computing technology, and the report urges “high-risk, high-reward research” in math and computer science “to cultivate radical innovation.” It also says both artificial intelligence and quantum computing hold promise and deserve serious investigation, but warns that neither is likely to replace the massive computation essential to traditional simulations.

NNSA now plans to replace its new exascale computers with a higher-capacity system based at Los Alamos in four to five years, says Rob Neely, program director for advanced simulation and computing at Lawrence Livermore National Laboratory in California. A second such system will follow around 2030 at Livermore. “Early discussions with vendors about their roadmaps have begun,” Neely says. “We are also already well underway in implementing some of the NAS recommendations at LLNL, in particular by increasing our partnerships with cloud providers.” Livermore and Amazon Web Services are exploring common interests in cloud and high-performance-computing technology.

What happens next “will depend a lot on where overall technology trends are headed in that timeframe, and how well we can adapt our codes to those changes without sacrificing mission needs,” says Neely. He expects AI and the cloud to influence the post-exascale systems, provided NNSA can adapt its codes to the new technology. That’s a big if. Having just spent a decade adapting its codes to GPUs, the NNSA brain trust is “not anxious to divest from the GPU-accelerated approach just yet.”
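
The porting effort Neely describes is, at heart, about rewriting simulation kernels so the same physics can run on whatever hardware vendors ship. As a minimal illustrative sketch, not NNSA code, the toy stencil kernel below is written against a swappable array module, so one function runs on CPUs via NumPy or on GPUs via CuPy’s NumPy-compatible API (the GPU path assumes CuPy is installed and a CUDA-capable GPU is present).

```python
# Toy portability sketch (not NNSA code): one stencil kernel that runs on a
# CPU via NumPy or on a GPU via CuPy's NumPy-compatible API.
import numpy as np

try:
    import cupy as cp  # optional GPU backend; assumes a CUDA-capable GPU
    xp = cp
except ImportError:
    xp = np  # fall back to the CPU

def jacobi_step(u):
    """One relaxation sweep: replace each interior cell with the mean of
    its four neighbors (a toy stand-in for a real simulation kernel)."""
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v

u = xp.zeros((1024, 1024), dtype=xp.float64)
u[0, :] = 100.0  # fixed hot boundary
for _ in range(200):
    u = jacobi_step(u)
print(float(u[1, 512]))  # identical code path on either backend
```

Real weapons codes are orders of magnitude more complex than this, which is why the decade-long GPU port Neely mentions was such a large investment, and why the labs are wary of another wholesale architecture change.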

Both NNSA and the report’s authors think quantum computing is further off. “They will not replace classical computers for our primary mission of large, complex and integrated weapons design codes anytime in the next 10-15 years,” says Neely.

The concerns extend beyond huge, highly specialized weapons codes. A government program identified more than 20 applications requiring exascale computing, many of which would benefit from even larger scales.

Source: IEEE Spectrum Computing