How the Booming GPU Computing Market Helped Turn an Immigrant’s Life Around

by Donal Murphy

You’d never know from looking at him that Martin Peniak is an accomplished parallel programmer.

To a standing ovation after a recent TEDx talk, Peniak walked off stage in baggy shorts, a t-shirt and an off-center baseball cap. He looked more like a hip-hop artist than a Ph.D. in cognitive robotics.

He’s come a long way from 10 years ago, when he arrived in England from his native Slovakia with little English and less money. He had to sleep on a park bench. Today, he’s pushing the boundaries of image recognition as a parallel software engineer at visual search specialist Cortexica.

Peniak isn’t alone in discovering that the NVIDIA accelerated computing platform – GPU accelerators, the CUDA parallel programming model, enabling software, development tools, and a vast supporting hardware and software ecosystem – is a powerful solution for solving the most complex problems.

More than 34,000 programmers from 8,000 institutions worldwide are part of NVIDIA’s CUDA Registered Developer program, which provides them with tools and training that help them use GPU accelerators for innovation and breakthroughs in science, engineering and many other fields. They’re designing safer cars, life-saving medical imaging technologies and breathtaking cinematic special effects.


Demand for parallel programming expertise is growing so fast that it’s hard to determine how many jobs need it. But Fortune 500 companies, including Chevron, Amazon, Lockheed Martin and JP Morgan Chase, use NVIDIA’s accelerated computing platform every day. Log in to LinkedIn and you’ll see CUDA cited in the profiles of more than 23,000 members.

Those opportunities reflect the power young programmers are unleashing with GPUs. Over the past decade, Peniak, 30, has gone from a penniless immigrant to an undergraduate – and later a graduate student – at the University of Plymouth. Ph.D. in hand, Peniak is now an authority in the booming field of autonomous robotics.

And one of Peniak’s most powerful tools is a technology that didn’t even exist a decade ago. Ian Buck, NVIDIA’s VP of Accelerated Computing, began the work of creating CUDA when he joined NVIDIA in 2004. He wanted programmers to be able to use simple language extensions to unleash GPU accelerators for general computing tasks.

Now available in version 6.5, CUDA is a key component of the NVIDIA accelerated computing platform, which allows programmers all over the world to accelerate their applications using familiar programming languages such as C/C++, Fortran, Java, .NET, Python and MATLAB.

In the intervening eight years, NVIDIA has shipped more than 430 million CUDA GPUs. Close to 800 universities now prepare students for careers in modern parallel computing by teaching CUDA. Put “CUDA” in the search box of Google Scholar and you’ll get about 63,500 results. A similar search on Stack Overflow yields close to 19,000 results. Downloads of the CUDA Toolkit top 2.5 million to date.

So, what compelled Peniak to put his faith in parallel programming to propel his research? When Peniak started his work, he needed computing technology that – like neural networks themselves – was parallel.

“As a result, I learned CUDA to develop all my code for the GPU to be able to do side-by-side comparisons,” he said. “I saw speedups of around 50x, which meant a lot to me because this allowed me to increase the complexity of the neural network controllers.”
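To see why neural networks map so naturally onto parallel hardware, consider a single layer: each neuron's output depends only on the shared inputs and that neuron's own weights, so every neuron can be evaluated independently. The short Python sketch below is purely illustrative and is not Peniak's code. The function names, weights and inputs are made up for the example, and a thread pool stands in for the thousands of GPU threads that CUDA would provide. It shows only that the serial and parallel versions compute the same result, not that Python threads deliver a GPU-like speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer_serial(weight_rows, biases, inputs):
    """Evaluate the neurons one after another, as a single CPU core would."""
    return [neuron_output(w, b, inputs) for w, b in zip(weight_rows, biases)]


def layer_parallel(weight_rows, biases, inputs, workers=4):
    """Evaluate the neurons as independent tasks.

    No neuron reads another neuron's result, so the tasks can run in any
    order. pool.map still returns the outputs in neuron order.
    """
    shared_inputs = [inputs] * len(biases)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(neuron_output, weight_rows, biases, shared_inputs))


# Hypothetical layer: three neurons, each reading the same two inputs.
WEIGHTS = [[0.5, -0.2], [0.1, 0.9], [-0.7, 0.3]]
BIASES = [0.0, 0.1, -0.1]
INPUTS = [1.0, 2.0]

if __name__ == "__main__":
    print(layer_serial(WEIGHTS, BIASES, INPUTS))
    print(layer_parallel(WEIGHTS, BIASES, INPUTS))
```

On a GPU, the same independence lets each neuron, or even each multiply-add inside a neuron, run on its own thread, which is the kind of structure behind speedups like the one Peniak describes.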

Now he says he’s using parallel programming every day. “I’m really fascinated by how we’re now able to program these massively parallel processors and redefine what computing means,” he said, referring to the first CUDA-programmable mobile processor, the Tegra K1.

Power and Portability: Our mobile Tegra K1 offers researchers like Martin Peniak a CUDA-programmable mobile processor.

At Cortexica, Peniak is helping develop one of the world’s leading visual search systems. Think Shazam – the app that identifies songs – for pictures. The challenge is delivering results to users in an instant.

“This would not be possible if we used standard CPUs to do the work,” Peniak says. “It would take several seconds for every query, which is unacceptable. My job is to develop and fine-tune GPU code, so we can continue to make our algorithms perform better while minimizing the processing time.”

Speaking about his TEDx talk, Peniak said one of the things he tried to emphasize was the importance of dreaming big. “I have no special skills other than being a dreamer and I don’t give up easily,” he told the crowd.