WORLD, MEET OPTIMUS

by Sasha Ostojic

I am very proud to be part of the NVIDIA team that helped bring NVIDIA Optimus technology to market today. It is quite gratifying to see an idea we had several years ago finally come to fruition. The concept of Optimus is deceptively simple: use the best graphics device for the job, automatically, transparently, on the fly, and so fast the end user cannot tell anything is happening under the hood.

I was also part of the team that helped bring switchable graphics to market in notebook computers when it first appeared in the Sony VAIO SZ back in 2006. At the time, switchable graphics was very innovative because it allowed a notebook computer to use both a discrete GPU and Intel integrated graphics. That meant you had access to the power of a discrete GPU when you needed it, but you could also use integrated graphics when it was sufficient for the task.

Switchable was well received and well reviewed when we launched it, and we knew we were on to something. But for all the praise we got for switchable graphics, when our team actually used the computers that had the feature, we found several flaws. Most of the flaws of switchable graphics had to do with the user experience. Namely:

  • You had to actively switch between the discrete GPU and integrated graphics, and many times people forgot to do so. Ugh, which mode am I in?
  • You had to close all your applications to allow the computer to make the switch… kind of a pain.
  • You had to reboot your system to make the switch. We solved that issue in the current generation of switchable graphics, but switching is still a pain.
  • Even state-of-the-art switchable graphics takes several seconds to switch, and the screen goes blank and flickers… not a deal breaker, but far from perfect.

We knew we could do better. The user experience must be one where things simply work. You don’t want to know which graphics device is in use at any one time. You don’t want to have to decide which device to use for a given application. You want Optimus. There, problem solved.

To make Optimus so simple, we had to do a lot of very challenging, innovative, and forward-thinking engineering work: in hardware, in software, and at what is known in the industry as the back end.

We needed hardware support to quickly move the graphics data around in the system, so we created a fast copy engine. The Optimus Copy Engine is a new alternative to traditional DMA (Direct Memory Access) transfers between the GPU frame buffer memory and the system memory used by the IGP. With Optimus we also removed the multiplexers (MUXes) and instead use the integrated graphics as a display adapter, or pass-through. The discrete GPU does the heavy lifting and passes the results through to the integrated graphics chip to be displayed. By doing this, Optimus eliminates the need for a hardware multiplexer and completely removes the glitches associated with switching the display from IGP to GPU. Optimus transfers the display surface from the GPU frame buffer over the PCI Express bus to the system memory-based frame buffer used by the IGP. The key to performing the display transfer without negatively impacting 3D performance is the Optimus Copy Engine.
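To get a feel for how much data the display transfer involves, here is a rough back-of-the-envelope sketch in Python. The function name, resolution, pixel format, and refresh rate are all illustrative assumptions, not Optimus specifications; the point is only that a full display surface copied every frame adds up quickly.

```python
def display_copy_rate(width: int, height: int,
                      bytes_per_pixel: int, refresh_hz: int) -> int:
    """Bytes per second needed to copy one full display surface per refresh.

    Hypothetical helper for illustration; it ignores any compression,
    partial updates, or bus overhead.
    """
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * refresh_hz


# Assumed example panel: 1920x1080, 32-bit color, 60 Hz refresh.
rate = display_copy_rate(1920, 1080, 4, 60)
print(f"{rate / 1e6:.1f} MB/s")  # 497.7 MB/s
```

Under those assumptions the copy moves roughly half a gigabyte every second, which is why doing it without stealing time from 3D rendering matters.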

To make Optimus a reality, we added this fast copy engine to the GPU when we moved to the 40nm process. Hence, the fast copy engine is present in the 200M series (40nm), GeForce 300M series, next-gen GeForce M, and next-gen ION GPUs.

On the software side of things, we needed a way for the system to know when to use the NVIDIA GPU and when to use Intel integrated graphics. The solution is Optimus profiles. We first introduced profiles with NVIDIA SLI technology so the system would know how to handle the work associated with graphics. NVIDIA has invested significant resources in profiling applications based on whether they can benefit from using the GPU. When an application is launched, its profile will tell the system which graphics subsystem to use.

Most types of applications are automatically detected by Optimus and do not need a profile. CUDA and other compute applications, along with many video applications, fall into this category because Optimus knows when they make their calls to the GPU, such as:

  • DX Calls: Any 3D game engine or DirectX application will trigger these calls
  • DXVA Calls: Video playback will trigger these calls (DXVA = DirectX Video Acceleration)
  • CUDA Calls: CUDA applications will trigger these calls
  • OpenGL: Like DirectX, OGL applications will trigger these calls as well

However, certain applications need a profile to help determine which graphics system they should use. Most games and certain multimedia applications fall into this category.
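The two mechanisms described above, profiles and call detection, can be sketched as a small decision function. This is only an illustration, not the actual driver logic: the names `Device`, `GPU_TRIGGER_CALLS`, and `choose_device`, the application names, and the choice to consult the profile before the detected calls are all assumptions made for the example.

```python
from enum import Enum
from typing import Dict, Iterable


class Device(Enum):
    INTEGRATED = "integrated"
    DISCRETE = "discrete"


# Call types listed in the post as triggering the discrete GPU.
GPU_TRIGGER_CALLS = {"DX", "DXVA", "CUDA", "OpenGL"}


def choose_device(app_name: str,
                  calls_made: Iterable[str],
                  profiles: Dict[str, Device]) -> Device:
    """Pick a graphics device for an application (hypothetical sketch)."""
    # Assumption: an explicit profile, if present, takes precedence.
    profile = profiles.get(app_name)
    if profile is not None:
        return profile
    # Otherwise fall back to detecting GPU-related calls.
    if GPU_TRIGGER_CALLS & set(calls_made):
        return Device.DISCRETE
    # Nothing suggests the discrete GPU is needed.
    return Device.INTEGRATED


# Made-up profile table and applications, for illustration only.
profiles = {"some_game.exe": Device.DISCRETE}
print(choose_device("some_game.exe", [], profiles))        # Device.DISCRETE
print(choose_device("video_player.exe", ["DXVA"], {}))     # Device.DISCRETE
print(choose_device("text_editor.exe", [], {}))            # Device.INTEGRATED
```

Keeping the profile table as plain data separate from the decision logic mirrors the idea that profiles can be updated without changing the driver itself.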

But to really make Optimus automatic and future-proof, we needed a way to get new profiles delivered to end-user systems easily. We are proud of our NVIDIA Verde driver program and proud of the fact that we are the only GPU maker that supports end users with driver updates for notebook computers. But we also know that updating a driver every time a new application comes out is not a great user experience. This is where the back-end part of Optimus comes in.

We now have the ability to push Optimus profiles to users, in much the same way that virus protection software continually updates virus definitions in the background. Once validated, the updated profiles are hosted on an NVIDIA web server and then automatically pushed out to the end user. Profiles are tiny and the process is painless. Even better, sophisticated end users can edit and create their own profiles. The Optimus profile update process has been verified and audited by a third-party security firm to ensure that the best coding practices have been used in implementing and deploying this automatic update feature. The updates are encrypted to reduce the risk of security breaches.

Oh, one more thing. When the discrete GPU is not in use, it automatically powers off. It powers back on as soon as an application that requires it is launched. The power-on and power-off are imperceptible to the end user, and of course automatic and glitch-free. When you don’t need the power of the GPU, it sits there waiting to be launched into action, using zero watts of your battery. Green is good.

We briefed some very technical press on Optimus a few weeks ago. Their reactions were always the same. “This is how switchable graphics should be done. Why didn’t you do this from the beginning?”

If they only knew how much hard work it was to make it so simple. :)