Search and you might find.
Spend enough time online, however, and what you want will start finding you just when you need it.
This is what’s driving the internet right now.
The tools behind it are called recommender systems, and they’re among the most important applications today.
That’s because there is an explosion of choice and it’s impossible to explore the large number of available options.
If a shopper spent just one second on each of the two billion products available on one prominent ecommerce site, swiping through them on a mobile app, it would take more than 63 years, almost an entire lifetime, to go through the entire catalog.
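As a rough check on that estimate, the short Python sketch below converts one second per product into years. The function name and the 365.25-day year are illustrative assumptions, not details from the site in question; under them, two billion products work out to a little over six decades of nonstop swiping.

```python
# Back-of-the-envelope check: how long does it take to view every product
# at a fixed pace, with no breaks? All names here are illustrative.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumes an average 365.25-day year


def years_to_browse(num_products, seconds_per_product=1.0):
    """Return the years needed to view every product once."""
    return num_products * seconds_per_product / SECONDS_PER_YEAR


years = years_to_browse(2_000_000_000)
print(f"{years:.1f} years")  # prints "63.4 years"
```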
This is one of the major reasons why the internet is now so personalized. Otherwise, it would be simply impossible for the billions of internet users in the world to connect with the products, services and even expertise, among hundreds of billions of things, that matter to them.
Recommender systems might be the most human of applications, too. After all, what are you doing when you go to someone for advice? When you’re looking for feedback? You’re asking for a recommendation.
Now, driven by vast quantities of data about the preferences of hundreds of millions of individual users, recommender systems are racing to get better at doing just that.
The internet, of course, already knows a lot of facts: your name, your address, maybe your birthplace. But what recommender systems seek to learn, perhaps better than the people who know you, is your preferences.
Key to Success of Web’s Most Successful Companies
Recommender systems aren’t a new idea. Jussi Karlgren formulated the idea of a recommender system, or a “digital bookshelf,” in 1990. Over the next two decades researchers at MIT and Bellcore steadily advanced the technique.
The technology really caught the popular imagination starting in 2006, when Netflix, then in the business of renting out DVDs through the mail, kicked off an open competition with a $1 million prize for a collaborative filtering algorithm that could improve on the accuracy of Netflix’s own system by at least 10 percent, a prize that was claimed in 2009.
Over the following decade, such recommender systems would become critical to the success of internet companies such as Netflix, Amazon, Facebook, Baidu and Alibaba.
Virtuous Data Cycle
The latest generation of deep-learning-powered recommender systems provides marketing magic, giving companies the ability to boost click-through rates by better targeting users who will be interested in what they have to offer.
Now the ability to collect this data, process it, use it to train AI models and deploy those models to help you and others find what you want is among the largest competitive advantages possessed by the biggest internet companies.
It’s driving a virtuous cycle — with the best technology driving better recommendations, recommendations which draw more customers and, ultimately, let these companies afford even better technology.
That’s the business model. So how does this technology work?
Collecting Information
Recommenders work by collecting information, noting what you ask for: the movies you tell your video streaming app you want to see, ratings and reviews you’ve submitted, purchases you’ve made and other actions you’ve taken in the past.
Perhaps more importantly, they can keep track of choices you’ve made: what you click on and how you navigate. How long you watch a particular movie, for example. Or which ads you click on or which friends you interact with.
All this information is streamed into vast data centers and compiled into complex, multidimensional tables that quickly balloon in size.
They can be hundreds of terabytes large — and they’re growing all the time.
That’s not so much because vast amounts of data are collected from any one individual, but because a little bit of data is collected from so many.
In other words, these tables are sparse — most of the information most of these services have on most of us for most of these categories is zero.
But collectively, these tables contain a great deal of information on the preferences of a large number of people.
And that helps companies make intelligent decisions about what certain types of users might like.
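To make the idea of a sparse table concrete, here is a minimal Python sketch. The users, items and ratings are invented for illustration, and the dictionary-of-dictionaries layout is just one simple way to store only the cells that hold data; production systems typically rely on specialized sparse formats.

```python
# Hypothetical interaction log: (user, item, rating). All values are made up.
interactions = [
    ("ana", "movie_a", 5.0),
    ("ana", "movie_c", 3.0),
    ("ben", "movie_b", 4.0),
    ("cho", "movie_a", 4.0),
    ("cho", "movie_d", 2.0),
]


def build_sparse_table(rows):
    """Store only observed cells as {user: {item: rating}}."""
    table = {}
    for user, item, rating in rows:
        table.setdefault(user, {})[item] = rating
    return table


def density(table):
    """Fraction of the full user-by-item grid that actually holds a value."""
    items = {item for ratings in table.values() for item in ratings}
    filled = sum(len(ratings) for ratings in table.values())
    return filled / (len(table) * len(items))


table = build_sparse_table(interactions)
fill = density(table)
print(f"{fill:.0%} of cells are filled; the rest are empty")
```

Even in this toy table, fewer than half the cells hold a value. With millions of users and products, the filled fraction usually shrinks to a tiny sliver of the grid.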
Content Filtering, Collaborative Filtering
While there are a vast number of recommender algorithms and techniques, most fall into one of two broad categories: collaborative filtering and content filtering.
Collaborative filtering helps you find what you like by looking for users who are similar to you.
So while the recommender system may not know anything about your taste in music, if it knows you and another user share similar taste in books, it might recommend a song to you that it knows this other user already likes.
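The book-and-song scenario above can be sketched in a few lines of Python. This is a minimal user-based approach using cosine similarity, one common choice among many; the user names, items and ratings are hypothetical, and real systems work with far larger tables and more refined scoring.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale. "you" has only rated books.
ratings = {
    "you": {"book_1": 5, "book_2": 4},
    "user_b": {"book_1": 5, "book_2": 5, "song_x": 5},
    "user_c": {"book_1": 1, "book_2": 2, "song_y": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=3):
    """Rank items the target has not rated, weighted by user similarity."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine_similarity(ratings[target], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("you", ratings))  # song_x ranks first
```

Because "you" and "user_b" rate the same books similarly, the song that "user_b" likes ranks ahead of the one liked by "user_c", whose book ratings differ.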
Content filtering, by contrast, works by understanding the underlying features of each product.
So if a recommender sees you liked the movies “You’ve Got Mail” and “Sleepless in Seattle,” it might recommend another movie to you starring Tom Hanks and Meg Ryan, such as “Joe Versus the Volcano.”
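A minimal Python sketch of that idea follows. It describes each title with a small set of hand-written feature tags and scores unseen titles by their Jaccard overlap with the features of titles you liked. The genre tags, the placeholder thriller and the choice of Jaccard similarity are all assumptions made for illustration.

```python
# Hand-written feature tags, for illustration only.
catalog = {
    "You've Got Mail": {"Tom Hanks", "Meg Ryan", "romantic comedy"},
    "Sleepless in Seattle": {"Tom Hanks", "Meg Ryan", "romantic comedy"},
    "Joe Versus the Volcano": {"Tom Hanks", "Meg Ryan", "comedy"},
    "Hypothetical Space Thriller": {"Hypothetical Actor", "thriller"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_content(liked, catalog, top_n=1):
    """Rank unseen titles by feature overlap with the liked titles."""
    profile = set()
    for title in liked:
        profile |= catalog[title]
    candidates = [title for title in catalog if title not in liked]
    candidates.sort(key=lambda t: jaccard(profile, catalog[t]), reverse=True)
    return candidates[:top_n]


liked = ["You've Got Mail", "Sleepless in Seattle"]
print(recommend_by_content(liked, catalog))  # ['Joe Versus the Volcano']
```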
Those are extremely simplistic examples, to be sure.
Data as a Competitive Advantage
In reality, because these systems capture so much data, from so many people, and are deployed at such an enormous scale, they’re able to drive tens or hundreds of millions of dollars of business with even a small improvement in the system’s recommendations.
A business may not know what any one individual will do, but thanks to the law of large numbers, it can be confident that if an offer with, say, a 1 percent acceptance rate is presented to 1 million people, close to 10,000 of them will take it.
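The law of large numbers can be illustrated with a small simulation. The Python sketch below assumes each person independently accepts an offer with a fixed 1 percent probability, which is a simplification, and the function name and seed are arbitrary. The observed rate tends to wander for small audiences and settle near 1 percent as the audience grows.

```python
import random


def simulate_take_rate(num_people, take_probability, seed=0):
    """Fraction of people who accept when each decides independently."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    takers = sum(1 for _ in range(num_people) if rng.random() < take_probability)
    return takers / num_people


for audience in (100, 10_000, 1_000_000):
    rate = simulate_take_rate(audience, 0.01)
    print(f"{audience:>9,} people: observed take rate {rate:.4%}")
```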
But while the potential benefits from better recommendation systems are big, so are the challenges.
Successful internet companies, for example, need to process ever more queries, faster, spending vast sums on infrastructure to keep up as the amount of data they process continues to swell.
Companies outside of technology, by contrast, need access to ready-made tools so they don’t have to hire whole teams of data scientists.
If recommenders are going to be used in industries ranging from healthcare to financial services, they’ll need to become more accessible.
GPU Acceleration
This is where GPUs come in.
NVIDIA GPUs, of course, have long been used to speed up the training of neural networks, sparking the modern AI boom, since their parallel processing capabilities let them blast through data-intensive tasks.
But now, as the amount of data being moved continues to grow, GPUs are being harnessed more extensively. Tools such as RAPIDS, a suite of software libraries, accelerate data science and analytics pipelines so data scientists can get more work done much faster.
And NVIDIA’s just-announced Merlin recommender application framework promises to make GPU-accelerated recommender systems more accessible still, with an end-to-end pipeline for ingesting data, training models and deploying them.
These systems will be able to take advantage of the new NVIDIA A100 GPU, built on our NVIDIA Ampere architecture, so companies can build recommender systems more quickly and economically than ever.
Our Recommendation? Learn How to Build Intelligent Recommendation Systems
The NVIDIA Deep Learning Institute offers instructor-led, hands-on training on the fundamental tools and techniques for building highly effective recommender systems. Taught by an expert, this in-depth, 8-hour-long workshop instructs participants in how to:
- Build a content-based recommender system using the open-source cuDF library and Apache Arrow
- Construct a collaborative filtering recommender system using alternating least squares and CuPy
- Design a wide and deep neural network using TensorFlow 2 to create a hybrid recommender system
- Optimize performance for training and inference using large, sparse datasets
- Deploy a recommender model as a high-performance web service
Earn a DLI certificate to demonstrate subject-matter competency and accelerate your career growth. Take this workshop at GTC or request a workshop for your organization.
Read more about NVIDIA Merlin, NVIDIA’s application framework for deep recommender systems.
Featured image credit: © Monkey Business – stock.adobe.com.