Reading the Signs: NVIDIAN Wins Kaggle AI Contest for Sign Language Recognition

by Isha Salian

NVIDIAN Christof Henkel recently clinched the top spot in an American Sign Language recognition challenge on Kaggle, the online Olympics of data science — earning a $50,000 prize that he donated for disaster and crisis response through the NVIDIA Foundation.

Currently ranked the No. 2 Kaggle competitor in the world, Munich-based Henkel is a member of the Kaggle Grandmasters of NVIDIA. Known as KGMoN, this global team of nine all-star data scientists and engineers uses NVIDIA-accelerated techniques to compete in Kaggle challenges.

This Google-sponsored competition challenged participants to detect American Sign Language fingerspelled characters and translate them into text.

“It was challenging due to the novelty of the problem — less research has been done compared to other areas,” said Henkel, who worked on the competition with Darragh Hanley, senior researcher at AI solutions company DoubleYard. “While normally you’d be working in computer vision, speech recognition or speech-to-text, this falls somewhere in between.”

Their work could help provide deaf and hard-of-hearing users of web search, map directions and texting apps the option to fingerspell words instead of using a keyboard. It could also power a dedicated sign-language-to-speech app, enabling faster, smoother communication between signers and non-signers.

Adapting Speech Recognition AI for Fingerspelling

Instead of word-based signs, the challenge focused on digits, special characters and phrases that are commonly spelled out — such as phone numbers, addresses and URLs.

Henkel and his teammate’s winning solution used an end-to-end model adapted from speech recognition AI. The pair also used data augmentation to bolster a training dataset of over 3 million fingerspelled characters, which were captured via smartphone videos and then converted into x, y and z coordinates that corresponded to the position of the signer’s face, hand and pose in each frame of the video.

“Data augmentation was essential, because an AI for fingerspelling needs to generalize well across different signers,” Henkel said. “People have different styles, dialects and speeds of fingerspelling, and they may be signing at different distances from a selfie camera.”
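
To make the idea concrete, here is a minimal sketch of how augmentation on landmark sequences might look. It is not the team's actual pipeline. It assumes each video is a list of frames, and each frame is a list of (x, y, z) landmark tuples. The function names, the speed range and the scale range are all hypothetical choices for illustration. Time resampling imitates faster or slower signers, and scaling about the centroid imitates a signer sitting closer to or farther from the camera.

```python
import random


def resample_frames(frames, speed):
    """Linearly interpolate a landmark sequence in time.

    speed > 1 yields fewer frames (faster signing); speed < 1 yields more.
    """
    n = len(frames)
    if n < 2:
        return [list(f) for f in frames]
    new_len = max(2, round(n / speed))
    out = []
    for i in range(new_len):
        pos = i * (n - 1) / (new_len - 1)  # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        t = pos - lo
        out.append([
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(frames[lo], frames[hi])
        ])
    return out


def scale_about_center(frames, factor):
    """Scale landmarks about the sequence centroid in x and y.

    Assumption: z is a relative depth, so it is scaled by the same factor.
    """
    points = [p for f in frames for p in f]
    if not points:
        return [list(f) for f in frames]
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return [
        [(cx + factor * (x - cx), cy + factor * (y - cy), factor * z)
         for x, y, z in f]
        for f in frames
    ]


def augment(frames, rng):
    """Apply one random speed change and one random scale change."""
    speed = rng.uniform(0.7, 1.3)   # hypothetical range
    factor = rng.uniform(0.8, 1.2)  # hypothetical range
    return scale_about_center(resample_frames(frames, speed), factor)


if __name__ == "__main__":
    # Toy sequence: 3 frames, 2 landmarks per frame.
    video = [
        [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
        [(2.0, 2.0, 0.0), (3.0, 3.0, 0.0)],
        [(4.0, 4.0, 0.0), (5.0, 5.0, 0.0)],
    ]
    augmented = augment(video, random.Random(0))
    print(len(augmented), "frames after augmentation")
```

Each training pass could call `augment` with fresh random draws, so the model sees a slightly different version of every clip. A real pipeline would likely add further transforms, such as rotation or dropping frames, which are omitted here.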

The duo developed their solution using a PyTorch container in NGC — NVIDIA’s hub of software, services and tools — and trained the model on NVIDIA Tensor Core GPUs. The main model architecture they used, called Squeezeformer, is available in the NVIDIA NeMo generative AI framework.

From AI Enthusiast to Kaggle Grandmaster

Henkel first joined Kaggle six years ago after completing his doctorate in math at Ludwig Maximilian University of Munich, seeing it as an effective forum to learn about data science, AI and neural networks. Since joining the Kaggle community, first while leading a deep learning startup and then as an NVIDIAN, he’s participated in about 60 competitions, working with fellow KGMoN members as well as external collaborators.

“After my first Kaggle competition, I was amazed by how much you can learn by tackling a hands-on, real-world project in an environment where people openly share ideas, approaches, data and code,” Henkel said. “You can see examples of solid and robust experimentation, model design and validation — it’s a really dense and effective way of learning.”

He recommends that beginners jump in to tackle Kaggle challenges they find interesting, no matter their skill level.

“Don’t be scared of a complicated problem, as long as you’re interested in the topic,” he said. “Even if you don’t have a high chance of winning a big prize, you get the most valuable learning experience by observing what others do and discuss on Kaggle.”

Since it was founded in 2020, the KGMoN team has won dozens of Kaggle competitions. All KGMoN members have earned the exclusive rank of grandmaster in Kaggle’s competition category — achieved by just 300 of the community’s 14 million members. In addition to participating in competitions, KGMoN members also publish open-source code and contribute to discussions on the platform.

Learn more about the KGMoN team and about NVIDIA life, culture and careers.