by Heather Mackey

Professor Takayuki Aoki from the Tokyo Institute of Technology just finished giving a fascinating look under the hood of the university’s Tsubame supercomputer, which has been used (among other things) in collaboration with a number of agencies in Japan to provide complex computational fluid dynamics modeling.

Tsubame is notable because it leverages GPU clusters, and its success is one of the milestones for GPUs in supercomputing. Tsubame 1.0 uses Tesla S1070 systems in a 680-GPU cluster. With it, scientists have seen speedups of up to 80x in problems like weather modeling. Coming in December, the next-generation Tsubame 2.0 will use the Tesla 2050 with 4224 GPUs and provide performance of more than 3 PFLOPS.

Performance metrics are an important scorecard for supercomputers. But Professor Aoki explained that the performance actually achieved depends on the application as well as on tuning and optimization, and he gave some insights into how his group has been able to improve results. With a full GPU implementation, his group saw speedups of 10x to 100x over CPU-only performance on Tsubame, though he noted that the numbers depend on the application.

With such intensive computation, any performance increase makes a huge difference, as does any bottleneck. Communication between GPUs across cluster nodes can add overhead, and Professor Aoki explained an overlapping technique they’ve developed at Tokyo Tech to deal with this.

While the session was highly technical, audience members of all comfort levels could appreciate the incredible demos of CFD simulations, including water splashing into a container, a milk crown, and dendrite solidification in metal.