The press meeting

At a press meeting, the Tokyo Institute of Technology announced the details of "Tsubame 2.0," the university's next-generation supercomputer system, which will start operation in the fall of 2010.

The computation capacity of the system is 2.39 PFLOPS (petaflops, double-precision value), which would rank second in the "Top500," a ranking of supercomputers, as of June 2010.

"It will be the first petaflops computer in Japan," said Satoshi Matsuoka, professor at the Global Scientific Information and Computing Center (GSIC) of the university. "And it will be the first world-class supercomputer system for our university."

However, the actual construction of the system, which will be conducted by NEC Corp and Hewlett-Packard Co, has yet to be done.

The system has a "vector-scalar mixed architecture," Matsuoka said. But its graphics processing units (GPUs) account for 90% of the total computation capacity, making the system behave more like a vector computer.

Therefore, the performance of the system differs depending on the type of calculation. Specifically, the performance target in terms of the Linpack benchmark is 1-1.4 PFLOPS (double-precision value), which would rank third or fourth in the Top500 as of June 2010. On the other hand, for calculations that are suited to vector computers, such as weather prediction, the performance can be more than 150 TFLOPS (teraflops), which is much higher than the world record (50 TFLOPS).
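
As a rough illustration of how far the Linpack target sits below the quoted capacity, the ratio of the two can be worked out directly. This is only a sketch: it assumes the 2.39 PFLOPS figure is a theoretical peak, which the announcement does not state explicitly, and the names `PEAK_PFLOPS`, `LINPACK_TARGET_PFLOPS` and `efficiency` are made up for this example.

```python
# Figures quoted in the article (PFLOPS and TFLOPS, double precision).
PEAK_PFLOPS = 2.39                    # assumed here to be the theoretical peak
LINPACK_TARGET_PFLOPS = (1.0, 1.4)    # low and high end of the Linpack target
VECTOR_TARGET_TFLOPS = 150.0          # target for vector-friendly workloads
VECTOR_RECORD_TFLOPS = 50.0           # world record quoted in the article


def efficiency(achieved, peak):
    """Return achieved performance as a fraction of peak."""
    return achieved / peak


low = efficiency(LINPACK_TARGET_PFLOPS[0], PEAK_PFLOPS)
high = efficiency(LINPACK_TARGET_PFLOPS[1], PEAK_PFLOPS)
vector_ratio = VECTOR_TARGET_TFLOPS / VECTOR_RECORD_TFLOPS

print(f"Linpack target as a share of quoted capacity: {low:.0%} to {high:.0%}")
print(f"Vector-workload target relative to the quoted record: {vector_ratio:.0f}x")
```

Under that assumption, the Linpack target corresponds to roughly 42% to 59% of the quoted capacity, while the target for vector-friendly calculations is three times the quoted record.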

Outperforming "Earth Simulator" with one rack

The backbone of the supercomputer system consists of 2,816 units of Intel Corp's "Xeon 5600" microprocessor (development code name: Westmere-EP), which has six cores and operates at a frequency of 2.93GHz, and 4,224 units of Nvidia Corp's "Tesla M2050" GPU.

The double-precision arithmetic performance of the Tesla M2050 is much higher than that of existing Tesla GPUs, which were developed mainly for single-precision arithmetic. Each Tesla M2050 delivers 515 GFLOPS (gigaflops, double-precision value).

"The performance per node (two microprocessors and three GPUs) is 1.6 TFLOPS," Matsuoka said. "The performance per rack is 51.2 TFLOPS, which is higher than that of the early Earth Simulator."

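The quoted figures can be cross-checked with a little arithmetic. The sketch below is not from the announcement; it assumes each Xeon core can retire four double-precision floating-point operations per clock cycle (a value not given in the article), and all constant and function names are invented for the example.

```python
# Figures quoted in the article.
NUM_CPUS = 2816            # Xeon 5600 processors
NUM_GPUS = 4224            # Tesla M2050 GPUs
GPU_GFLOPS = 515.0         # double-precision GFLOPS per GPU
CPU_CORES = 6
CPU_GHZ = 2.93
CPUS_PER_NODE = 2
GPUS_PER_NODE = 3
NODE_TFLOPS_QUOTED = 1.6
RACK_TFLOPS_QUOTED = 51.2

# Assumption, not stated in the article: double-precision FLOPs per core per cycle.
FLOPS_PER_CYCLE = 4


def cpu_gflops():
    """Estimated double-precision GFLOPS of one processor."""
    return CPU_CORES * CPU_GHZ * FLOPS_PER_CYCLE


def node_tflops():
    """Estimated TFLOPS of one node (two CPUs plus three GPUs)."""
    return (CPUS_PER_NODE * cpu_gflops() + GPUS_PER_NODE * GPU_GFLOPS) / 1000


def system_pflops():
    """Estimated PFLOPS of all CPUs and GPUs combined."""
    return (NUM_CPUS * cpu_gflops() + NUM_GPUS * GPU_GFLOPS) / 1e6


def gpu_share():
    """Fraction of the estimated total contributed by the GPUs."""
    gpu_total = NUM_GPUS * GPU_GFLOPS
    return gpu_total / (gpu_total + NUM_CPUS * cpu_gflops())


def nodes_per_rack():
    """Nodes per rack implied by the quoted per-rack and per-node figures."""
    return RACK_TFLOPS_QUOTED / NODE_TFLOPS_QUOTED


print(f"nodes:          {NUM_CPUS // CPUS_PER_NODE}")
print(f"per node:       {node_tflops():.2f} TFLOPS")
print(f"whole system:   {system_pflops():.2f} PFLOPS")
print(f"GPU share:      {gpu_share():.0%}")
print(f"nodes per rack: {nodes_per_rack():.0f}")
```

Under this assumption the estimate comes to about 1.69 TFLOPS per node and about 2.37 PFLOPS for the system, with the GPUs contributing roughly 92%, which is close to the quoted 1.6 TFLOPS, 2.39 PFLOPS and 90%. The quoted per-rack and per-node figures imply 32 nodes per rack.
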
World's 1st SSD-based supercomputer?
