
As you may have seen, NVIDIA announced today that it is developing high-performance ARM-based CPUs designed to power future products ranging from personal computers to servers and supercomputers.

Known under the internal codename “Project Denver,” this initiative features an NVIDIA CPU running the ARM instruction set, which will be fully integrated on the same chip as the NVIDIA GPU. This initiative is extremely important for NVIDIA and the computing industry for several reasons.

NVIDIA’s Project Denver will usher in a new era for computing by extending the performance range of the ARM instruction-set architecture, enabling the ARM architecture to cover a larger portion of the computing space. Coupled with an NVIDIA GPU, it will provide the heterogeneous computing platform of the future by combining a standard architecture with awesome performance and energy efficiency.

ARM is already the standard architecture for mobile devices. Project Denver extends the range of ARM systems upward to PCs, data center servers, and supercomputers. ARM’s modern architecture, open business model, and vibrant ecosystem have led to its pervasiveness in cell phones, tablets, and other embedded devices. Denver is the catalyst that will enable these same factors to propel ARM to become pervasive in higher-end systems.

Denver frees PCs, workstations and servers from the hegemony and inefficiency of the x86 architecture.  For several years, makers of high-end computing platforms have had no choice about instruction-set architecture.  The only option was the x86 instruction set with variable-length instructions, a small register set, and other features that interfered with modern compiler optimizations, required a larger area for instruction decoding, and substantially reduced energy efficiency.

Denver provides a choice. System builders can now choose a high-performance processor based on a RISC instruction set with modern features such as fixed-width instructions, predication, and a large general register file. These features enable advanced compiler techniques and simplify implementation, ultimately leading to higher performance and a more energy-efficient processor.
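To illustrate the kind of optimization these features enable, here is a hedged toy example of our own (not NVIDIA or ARM code). On an ISA with predication, a compiler can turn a data-dependent branch into straight-line code with no branch at all; the C below emulates in source what such a compiler can do automatically:

```c
#include <stdint.h>

/* Branchy absolute value: on an ISA without predication, the compiler
   must emit a conditional branch, which can stall the pipeline on a
   misprediction. */
int32_t abs_branchy(int32_t x) {
    if (x < 0)
        return -x;
    return x;
}

/* Branch-free absolute value: the same computation as straight-line
   code. A compiler targeting a predicated ISA can produce code of this
   shape (a few conditionally executed instructions) directly from the
   branchy source above. */
int32_t abs_branchless(int32_t x) {
    int32_t mask = x >> 31;   /* all ones if x is negative, else zero
                                 (arithmetic shift on common compilers) */
    return (x ^ mask) - mask; /* conditionally negate, no branch */
}
```

Branch-free sequences like this sidestep branch-misprediction penalties entirely, which is one reason predication helps both performance and energy efficiency.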

Microsoft’s announcement that it is bringing Windows to ultra-low power processors like ARM-based CPUs provides the final ingredient needed to enable ARM-based PCs based on Denver.   Along with software stacks based on Android, Symbian, and iOS, Windows for ultra-low power processors demonstrates the huge momentum behind low-power solutions that will ultimately propel the ARM architecture to dominance.

An ARM processor coupled with an NVIDIA GPU represents the computing platform of the future.  A high-performance CPU with a standard instruction set will run the serial parts of applications and provide compatibility while a highly-parallel, highly-efficient GPU will run the parallel portions of programs.
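As a minimal sketch of that division of labor (our own illustration; no NVIDIA-specific API is assumed), consider SAXPY. Each loop iteration is independent, so this is exactly the kind of code a GPU would run as a kernel with one thread per element, while the serial setup and control flow stay on the CPU:

```c
#include <stddef.h>

/* SAXPY: y[i] = a * x[i] + y[i]. The loop body has no cross-iteration
   dependence, so on a heterogeneous CPU+GPU chip it maps naturally onto
   GPU threads (conceptually one thread per element), while the serial
   parts of the program run on the CPU. */
void saxpy(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

On the platform described above, the CPU would launch this loop as a parallel kernel and handle everything around it: I/O, control flow, and the serial fraction of the program that ultimately bounds overall speedup.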

The result is that future systems – from the thinnest laptops to the biggest data centers, and everything in between – will deliver an outstanding combination of performance and power efficiency. Their processors will provide the best of both worlds, while enabling increased battery life for mobile solutions. We’re really excited to help engineer smarter brains for the next major era in computing.

  • moneyman10k

    As an independent observer I completely agree with Bill Dally. I have been waiting for this day for many years. Parallelism is the most exciting field of computer science, and the practical implementation of ARM and GPU technology will enable the next age of parallelism research and development. Intel Larrabee and AMD Fusion are interesting. Nvidia Denver is what I would do if I were in charge.

    Nvidia’s current 5-year plan must look very exciting from the inside.

  • Rish

    Professor Dally you are an inspiration to me along with Jen-Hsun Huang and David Kirk.

    I have been longing for an Nvidia CPU/SoC solution to go with my Nvidia GPUs. Finally Microsoft has smashed the x86 monopoly.

    Computing has changed forever today thanks to Nvidia and Microsoft. AMD and Intel can only decline from here due to ARM competition from Nvidia, Qualcomm, Marvell and Samsung.

    Keep up the good work and I hope to work at Nvidia one day!

  • http://www.jfwhome.com John Wells

    Please just make sure it is open. I’ve been burned before by unstable nVidia closed-source drivers for your mobile video cards under Linux.

    If you are moving into the CPU space, you need to ensure that you are completely open about how it works, and allow / contribute open source drivers for *everything* on the chip.

    Otherwise, you will simply be cementing the Microsoft monopoly.

    You have a real opportunity to change the market, but you have to be sure to nurture the entire ecosystem.

  • Steven

    I’m extremely excited about this news!

  • http://www.303plumber.com Brad Plumber

    Agree with John Wells about making sure it’s open. What good is wicked-fast processing power if it doesn’t come with the wide-open ability to run wild with it?

    I wonder why they called it “Project Denver”?

  • http://www.hosca.com Erhan Hosca

    everything old is new again …. remember the 88000 ?

  • Max

    Viva Nvidia !!!

    I can’t wait to have a system with all parts coming from Nvidia !!!

  • Sushan

    It’s better to use the parallel Nvidia GPU processors coupled with the ARM processors; it’s also convenient for minimizing power use, and it helps reduce the extra overhead of the architecture’s complexity.

  • Dogby

    Why not make the GPU a CPU – make everything run on the graphics card (including the OS).

  • TechU

    “NVIDIA announced today that it is developing high-performance ARM-based CPUs designed to power future products ranging from personal computers to servers and supercomputers.
    Known under the internal codename “Project Denver,” this initiative features an NVIDIA CPU running the ARM instruction set, which will be fully integrated on the same chip as the NVIDIA GPU.”

    Great, but are you also aware, BILL, that Freescale actually introduced a few days ago what people want to actually buy THIS year: an ARM QUAD-core Cortex-A9/NEON 128-bit SIMD powered mobile device with many hours of battery use AT FULL LOAD?

    http://www.linuxfordevices.com/c/a/News/Freescale-iMX-6/

    “The i.MX 6 series is Freescale’s first ARM-based multicore SoC and first Cortex-A9 model. The processor advances the i.MX family with dual-stream 1080p video playback at 60 frames per second (fps), 3D video playback at 50Mbps, desktop-quality gaming, augmented reality applications, and novel content creation capabilities, says Freescale.

    The SoC is also touted for being one of the first applications processors to offer hardware support for the open source VP8 codec.

    VP8 drives the related WebM (MKV) open container format, both of which are supported in the most recent Android 2.3 release….”

    “the SoC is claimed to enable 1080p video (single stream) with only 350mW consumption. As a result, the i.MX 6 series can deliver up to 24 hours of HD video playback and 30-plus days of device standby time, claims the company.”

    “All three i.MX 6 models are clocked to 1.2GHz, and offer the ARMv7 instruction set with Neon multimedia extensions, VFPvd16 (vector floating point graphics), and Trustzone support, says Freescale. While the single-core i.MX 6Solo offers 256KB of L2 cache, the dual and quad versions are each said to offer 1MB of L2 cache.”

    You also mention:
    “NVIDIA’s project Denver will usher in a new era for computing by extending the performance range of the ARM instruction-set architecture, enabling the ARM architecture to cover a larger portion of the computing space. Coupled with an NVIDIA GPU”

    Are we to take that to mean the NEON 128-bit SIMD is also going to get even more microcode extensions, to make things like the OSS x264 AVC/H.264 encoder even better and faster on ARM/NEON with your help? For instance:

    My advice to you would be to pop over to the IRC channel #x264dev and introduce yourself to the devs there, where assembly development takes place every day :D

  • http://www.technicstoday.com/ Anish K.S

    Thanks for Project Denver

  • Datsun

    “Denver frees PCs, workstations and servers from the hegemony and inefficiency of the x86 architecture. For several years, makers of high-end computing platforms have had no choice about instruction-set architecture. The only option was the x86 instruction set with variable-length instructions, a small register set, and other features that interfered with modern compiler optimizations, required a larger area for instruction decoding, and substantially reduced energy efficiency.”

    Wrong. AMD has a solution for this. x86 is very scalable because of its variable-length instruction words. Every software function can be mapped to hardware efficiently by using the x86 ISA.

  • TechU

    That’s interesting, DATSUN, until you realise that AMD CPUs are slower at assembly than Intel chips, as can be seen quite simply if you compile x264 from git.

    A simple

    make checkasm; ./checkasm
    ./checkasm --bench

    getting real-life results on each CPU you run tests on, would probably be a good start to prove the point.

    True, they need more ARM Cortex NEON SIMD assembly written, but that just means any good NEON assembly dev popping over to the #x264dev IRC channel and spending a few hours or days there, helping the Google Code-in students with the porting of their 10-bit SIMD code to fill out and speed up that OSS code base, and learning a few things about all things video along the way.

    Clearly Nvidia could see the train wreck coming and started to invest in ARM Cortex. Intel is armoured to some degree with their new Sandy Bridge internal video encode/decode engine ASIC taking off, and can just as easily buy into any ARM vendor’s chip licence if the need arises.

    AMD are screwed: they have neither a graphics chip presence in that ARM mobile space, with the new ARM Mali-T604 coming there soon (it needs work to fully open it to Linux source though, OC), nor any Windows/Linux encode ASIC strategy. Hell, even their UVD does not function for Linux video today, other than the odd bit of H.264 if you happen to have the right buggy driver and UVD2/3.

    And they couldn’t even be bothered to release their OpenCL/Open Decode .so library in their latest holiday-season SDK to the Linux developers desperately and clearly in need of something official to decode video on Linux.

    All things considered to date, ARM/NEON 128-bit SIMD and a new SoC with two or more 1-gigabit Ethernet ports as standard are looking very attractive to many a vendor of mobile devices for 2011, and OC x264 plus Intel video encode/decode for the rest of your needs.

  • Datsun

    “That’s interesting, DATSUN, until you realise that AMD CPUs are slower at assembly than Intel chips, as can be seen quite simply if you compile x264 from git.

    A simple

    make checkasm; ./checkasm
    ./checkasm --bench

    getting real-life results on each CPU you run tests on, would probably be a good start to prove the point.”

    Yeah, the current AMD processor core is too old to be competitive against Intel. AMD has not yet fully implemented Intel’s SSE4 instructions, except in its upcoming Bulldozer-based core.

  • kpap

    I think that Intel recognizes the industry’s paradigm shift to SoC and highly integrated products.

    http://www.eetimes.com/electronics-news/4210937/Intel-rolls-six-merged-Atom-FPGA-chips

    Whether they will be successful in integrating their processors into consumer or other devices, apart from ones oriented toward industrial applications, remains to be seen.

  • http://www.highlinemotorssouth.com Hugh Mac

    This is especially exciting considering Microsoft’s announcement of the next generation of Windows devices supporting ARM-based processors.

  • http://Xara.com CMO

    I think the comment about the ARM architecture being modern is a bit funny. This 32-bit architecture is 25 years old and has hardly changed at all in that time. Advanced for its time it certainly was, but that was 25 years ago!
    (I published the first book on ARM assembly language programming in 1987.)

  • SH

    At least nVidia is continuing the roadmap… It seems like Tegra’s seemingly lackluster market adoption hasn’t put a damper on nVidia’s enthusiasm and confidence.

    Quad-core ARM w/ NEON (OpenMAX)? When was the last time a CPU coprocessor’s SIMD instructions actually made a difference? MMX, Wireless MMX… Also, force-feeding all the data through the ARM bus interface is just so 1990s, despite what ARM/Freescale want you to believe… OpenCL will be more widely adopted in the next few years… GPU is the way to go. Even more important, nVidia seems to have a lot more expertise in offering a complete SW driver/stack for developers/OEMs to use. If nVidia can fully address the SW requirements from MS and Android, they will have their fair share of the market (except the Apple sockets).

    Now when the hell will nVidia stock get to $30/share? And they have to sell millions and millions of these ARM/nVidia SoCs… Wonder which major OEM they’re bedding with…

  • TechU

    ?
    “Quad-core ARM w/ NEON (OpenMAX)? When was the last time a CPU coprocessor’s SIMD instructions actually made a difference? MMX, Wireless MMX…” And NEON :D

    If you need to ask that question, then you Obviously Have Not tried simply compiling x264 with assembly SIMD turned off, or it would be perfectly clear that SIMD helps a hell of a lot with speed if you actually write different SIMD code for each C routine, as is clear with the very clean x264 code base.

  • PaTrond

    Does this mean I will never have to buy Intel again? =D Hurray! They make some quite good SSDs, though…

    For a few days (during CES) I thought Intel would be able to continue roaming on with their “well”-performing CPU monopoly (AMD is not “that” bad, but Intel beats it). I’m doing high-end 3D rendering, and Intel has so far been the holy grail for a program called Maxwell Render. Hope this will more or less revolutionize it.

  • Bill Dally

    Thanks everyone for your comments on Project Denver. It has certainly generated a great deal of interest among people at CES and throughout the industry. The industry is excited about seeing more energy-efficient solutions using powerful yet efficient CPU cores combined with a highly parallel GPU for the heavy lifting, and seems particularly enthusiastic about innovative alternatives to the x86 architecture, which has hit the power wall (see my OpEd in Forbes).

    We’re on the verge of an exciting new era in computing. Stay tuned!

  • TechU

    “Thanks everyone for your comments on Project Denver. It has certainly generated a great deal of interest among people at CES and throughout the industry.”

    …. WHAT…, is that it, Bill? … A canned PR response to the blog entries? That’s just… sad.

    You can step off your soapbox now, mate.

    It’s time to put away your PR hat and put on your tech cap, and fill in the many missing bits of data people reading your blog are wanting to see here; or at least put on your Prof top hat and give us some background lecture as to where YOU want to push and improve ARM in general.

  • TechU

    For instance, with your flat cap on, a very basic question to start with:
    Will you be standardising on the ARM AMBA 4 AXI4 interconnect protocol?

  • Dusan

    I’ve been using only Linux for 6 years.

    I’ve been waiting a very long time for an ARM processor (low power consumption) that is useful for a desktop/laptop. But it must work out of the box: the drivers for graphics (OpenGL, OpenCL) must be FOSS.

  • http://www.softmachinery.com Mel Pullen

    Nice. When can we have a dev board, please?

  • gopiu

    I really hope Denver can run Crysis 2 and be a high-end desktop/server GPGPU chip. Limiting it to mobile/netbook devices won’t be good (we already have Tegra for that).

    A dev board or SDK would be fantastic, pls!

  • mike3

    Will the GPU/CPU combo be premiered on a graphics card, or otherwise put _alongside_ a conventional x86 processor? That’s the only way I could see this being viable for taking over the PC. All other attempts to torpedo the x86 beast have failed because nobody will purchase anything else unless there’s software they can run on it, and nobody will make software unless there are machines out there to run it on. A huge chicken-and-egg problem. The *only* way I can see of resolving this is to premiere this first as a graphics upgrade, with the usual increased performance. Then people will buy it (for the increased performance and features, as with every new graphics upgrade), and that ARM processor will come in as part of the integrated package. And it would be possible to start building programs. So initially it would be alongside the x86; then a shift-over could be done until everything runs on it alone and the x86 can be pitched.

  • Malokedy

    Very nice move, actually. I have been an NVIDIA fan since RIVA 128 times. Please, make something brand new and forget the current CPU kings and their obsolete instruction sets as well. Take a look at the computing power of nature. Some C(G)PU features were introduced a very long time ago. No? What about the brain: left hemisphere for logic and the right side for imagination. Sounds familiar now? About 100 million MIPS are needed to match human brain power. In order to reach this, your strong ARMs and our wallets will be needed as well. Put your CPU into some AI thinking gadgets; I will definitely buy and benchmark some.

    Your friendly alien.

  • TechU

    MALOKEDY, funnily enough, to take full advantage of that you’re probably going to need an updated low-power “transputer” mesh (another UK innovation, Bristol this time: http://www.cs.bris.ac.uk/~dave/transputer.html) and probably integrate that into a nice 8-nm FPGA part, AND even :D cheap 22-nm devices operated at a peak performance of 1.5 GHz in the near future. That’s another open question in the queue for BILL to answer, BTW :D

  • TechU

    Oops, that’s 28nm now, and 22nm parts soon. An edit button might be nice here for simple corrections, BTW. Please, LD!

  • TechU

    You have to admit that transputer architect David May’s Law (“Software efficiency halves every 18 months, compensating Moore’s Law”) turned out to be very true today too, unfortunately :D

    XMOS is where David May (the transputer architect) now works on his XCore processor, incorporating Software Defined Silicon (SDS). XMOS sell an XC-1 development kit for $100. It’s a cheap way to have a play with this new technology. The XS1 is a modern take on the transputer with parallel support and programmable software flexibility – http://www.xmos.com/

    A meeting of minds and VC could bring something even more innovative (and I use that word very sparingly, unlike PR “innovators”) with Nvidia, ARM, and Freescale, plus a general-purpose, end-developer-programmable FPGA block or two to build up the so-called ‘reconfigurable computing’ that’s now back in vogue, with both http://www.staho.com/quad-core-to-ki…cessor/208227/ and http://www.gla.ac.uk/news/headline_183814_en.html bringing some basic hints and updated advances since the Rapport Kilocore 256 and 1024 FPGAs appeared back in 2006 (always give them credit where it’s due). The commercial Rapport Kilocore 256 and 1024 CPUs on a single FPGA (Field Programmable Gate Array) were seeing 30 frames a second while consuming only 100 milliwatts in 2006, where ARM at the time were getting 3.3 frames a second while consuming half a watt of power. Could be just the ticket.

  • TechU

    Oops again… also including David May, the transputer architect, OC.
    (Makes sense if my other post gets moderated? “Your comment is awaiting moderation.”)

  • Malokedy

    …to TECHU… BILL and/or bills…. who cares anymore :-) …socket/pin number not important… OK, straight to the point – the production technology limit will be a bottleneck very soon… TICK TOCK as we know it will come to an end… So what… the power consumption vs processing power for next year, hmm… let me say an equation and balance issue :-) …you humans almost solved your own gravity overcome (84%), but nobody asked for help since 1937… just received kind of a warning, so it’s my last post here… see/gather/meet some individuals in 2013 then… please dig deep enough & survive the incoming heat… oh, almost forgot – NVIDIA still rules… do not ask why.

  • http://none Josue

    Great news!!! But just running Windows is not all. We use CAD/CAM/CAE software that is optimized for the SSE instruction set, HT, and MMX (that one). We have a world of x86, and when Nvidia shows us the final product, how can we migrate to this step beyond if we have this software? Can anyone explain it to me? I’m not an expert in this field. Thank you.

    PS: Software: Siemens Solid Edge, Siemens NX, CoreTech Moldex (about to purchase).


  • TechU

    Bill, mate, I know you have been reading, as you modded my post, but it’s been a week now since you even bothered to post that PR thanks. That’s not the way you’re supposed to run a successful BLOG, mate; at least pop in and chat now and then, or get one of your minions to give feedback if you’re too scared to make the time to talk to ordinary people :D

    You need to get out of your bat cave more often and INFORM and entertain your readers…

    By the way, did you get your mate Henri Richard on the blower and have a chat?
    http://blogs.freescale.com/2011/01/11/computing-redefined-lean-back-and-enjoy-the-competition/?tid=NL_0211

    Henri’s PR speak is even drier than yours, so you’re getting better though.

  • ICET

    TECHU, you talk too much.

  • witek

    Why is Microsoft Windows important on ARM? You cannot run any application from PC Windows on it anyway (except using some form of binary translation/emulation, which would be quite slow). Linux (and Android) and other operating systems are more important than Windows on ARM, IMHO.

  • Jack

    I totally agree with John Wells. The drivers must be fully open. Lack of Linux compatibility is a no-go for me.