by Joe Greco

Kepler’s impressive efficiency has been considered a major achievement by press and PC gamers, but a key part of the story has never been told.

As the leader of the engineering team that worked with TSMC for three years to manufacture Kepler, I’d like to shed some light on the impact of 28nm process technology on Kepler’s efficiency.

The 28nm GeForce GTX 680 die is 294 mm².

Kepler was an ambitious project because it introduced a new architecture at the same time as a new silicon process technology node. This is a bit like designing a new jet engine using exotic materials that are still in development. Much like the engineers at Pratt & Whitney, we focused intensely on power efficiency, on delivering the best performance per unit consumed (watts in our case, gallons of jet fuel in theirs).

The advancement that TSMC offered was a new, optimized process technology. Kepler is manufactured using TSMC’s 28nm high performance (HP) process, the foundry’s most advanced 28nm process, which uses its first-generation high-K metal gate (HKMG) technology and second-generation SiGe (silicon germanium) straining. HKMG uses a gate insulator film with a high dielectric constant, which reduces power by cutting gate leakage compared to the previous-generation SiON gate. SiGe straining is a process that stresses the silicon lattice to improve carrier mobility, and with it the effective frequency of the transistor. Both advances improve the transistor’s performance per watt, which translates to a more power-efficient system.

Using TSMC’s 28nm HP process enabled us to reduce active power by about 15 percent and leakage by about 50 percent compared to 40nm, resulting in an overall improvement in power efficiency of about 35 percent (see chart). Let me explain why this is so critical.
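The arithmetic behind these figures can be sketched in a few lines. The split between active and leakage power at 40nm is not stated here, so `leakage_share` below is a hypothetical parameter. The sketch also assumes performance is held constant, so that the efficiency gain is simply the inverse of the relative power.

```python
def relative_power(leakage_share, active_cut=0.15, leakage_cut=0.50):
    """Total 28nm power relative to 40nm (1.0 means unchanged).

    leakage_share is the assumed fraction of 40nm power lost to leakage;
    the two cuts are the approximate reductions quoted in the text.
    """
    active_share = 1.0 - leakage_share
    return active_share * (1.0 - active_cut) + leakage_share * (1.0 - leakage_cut)


def efficiency_gain(leakage_share):
    """Performance-per-watt gain, assuming equal performance on both nodes."""
    return 1.0 / relative_power(leakage_share) - 1.0


# Hypothetical leakage shares, chosen only to show the trend.
for share in (0.2, 0.3, 0.4):
    print(f"leakage share {share:.0%}: "
          f"relative power {relative_power(share):.3f}, "
          f"efficiency gain {efficiency_gain(share):.1%}")
```

Under these assumptions, a leakage share of roughly 30 percent gives a gain close to the quoted 35 percent. That share is inferred from the sketch, not a measured figure.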

Today, the primary constraint on processor performance is the power consumption budget. So our goal is always to develop solutions that deliver the highest performance within a fixed power budget. Having a more efficient process enabled us to add more processing cores, thus increasing performance. Put simply, greater efficiency equals greater performance and optimal performance per watt.
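To make the fixed-budget reasoning concrete, here is a minimal sketch. Every number in it is hypothetical (the budget and the per-core power figures are illustrative, not Kepler specifications), and it assumes that performance scales linearly with core count and that all power is consumed by the cores.

```python
def cores_within_budget(budget_mw, core_power_mw):
    """Largest whole number of cores whose combined power fits the budget."""
    return budget_mw // core_power_mw


# Hypothetical values in milliwatts, chosen only for illustration.
BUDGET_MW = 150_000      # fixed power budget
CORE_OLD_MW = 300        # per-core power on the older process
CORE_NEW_MW = 222        # about 1/1.35 of the old figure, i.e. ~35% more efficient

old_cores = cores_within_budget(BUDGET_MW, CORE_OLD_MW)
new_cores = cores_within_budget(BUDGET_MW, CORE_NEW_MW)

print(f"older process: {old_cores} cores")
print(f"newer process: {new_cores} cores "
      f"({new_cores / old_cores:.2f}x within the same budget)")
```

With the budget held constant, the efficiency gain shows up directly as additional cores, which is the trade described above.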

Maximizing the efficiency of 28nm while developing a new architecture required us to change our silicon process development model with TSMC. On previous process nodes we had worked independently, with TSMC preparing the process and NVIDIA working on the design. TSMC engineers would build the best volume process platform they could, and we would implement our designs following the process design rules and electrical performance guidelines.

For Kepler, we began working with TSMC three years before our product tape-out (when the processor design is complete and ready for manufacturing). Together we created a Production Qualification Vehicle (PQV) to allow the TSMC process engineers and our internal design engineers to optimize the process before the product tape-out. Through repeated prototyping, we were able to optimize both the process and design, creating a more efficient Kepler design rather than simply a chip in a standard 28nm process.

TSMC’s 28nm HP process, seen here under an electron microscope, is 30 percent smaller than 40nm and about 35 percent more energy efficient.

We’re extremely proud of what we accomplished with Kepler. It combines NVIDIA’s world-class GPU engineering with TSMC’s very best 28nm process. But while Kepler was a key milestone, it is one point in a continuum. We continue to improve on what we developed and continue our collaboration with TSMC. In fact, we recently received our first version of an enhanced PQV for 20nm from TSMC. That process will yield even greater efficiency for NVIDIA’s next next-generation GPUs.