17 Predictions for 2024: From RAG to Riches to Beatlemania and National Treasures

Move over, Merriam-Webster: Enterprises this year found plenty of candidates to add for word of the year. “Generative AI” and “generative pretrained transformer” were followed by terms such as “large language models” and “retrieval-augmented generation” (RAG) as whole industries turned their attention to transformative new technologies.

Generative AI started the year as a blip on the radar but ended with a splash. Many companies are sprinting to harness its ability to ingest text, voice and video to churn out new content that can revolutionize productivity, innovation and creativity.

Enterprises are riding the trend. Deep learning algorithms like OpenAI’s ChatGPT, further trained with corporate data, could add the equivalent of $2.6 trillion to $4.4 trillion annually across 63 business use cases, according to McKinsey & Company.

Yet managing massive amounts of internal data often has been cited as the biggest obstacle to scaling AI. Some NVIDIA experts in AI predict that 2024 will be all about phoning a friend — creating partnerships and collaborations with cloud service providers, data storage and analytical companies, and others with the know-how to handle, fine-tune and deploy big data efficiently.

Large language models are at the center of it all. NVIDIA experts say advancements in LLM research will increasingly be applied in business and enterprise applications. AI capabilities like RAG, autonomous intelligent agents and multimodal interactions will become more accessible and more easily deployed via virtually any platform.

Hear from NVIDIA experts on what to expect in the year ahead:

MANUVIR DAS
Vice President of Enterprise Computing

One size doesn’t fit all: Customization is coming to enterprises. Companies won’t have one or two generative AI applications — many will have hundreds of customized applications using proprietary data that is suited to various parts of their business.

Once running in production, these custom LLMs will feature RAG capabilities to connect data sources to generative AI models for more accurate, informed responses. Leading companies like Amdocs, Dropbox, Genentech, SAP, ServiceNow and Snowflake are already building new generative AI services using RAG and LLMs.

Open-source software leads the charge: Thanks to open-source pretrained models, generative AI applications that solve specific domain challenges will become part of businesses’ operational strategies.

Once companies combine these headstart models with private or real-time data, they can begin to see accelerated productivity and cost benefits across the organization. AI computing and software are set to become more accessible on virtually any platform, from cloud-based computing and AI model foundry services to the data center, edge and desktop.

Off-the-shelf AI and microservices: Generative AI has spurred the adoption of application programming interface (API) endpoints, which make it easier for developers to build complex applications.

In 2024, software development kits and APIs will level up as developers customize off-the-shelf AI models using AI microservices such as RAG as a service. This will help enterprises harness the full potential of AI-driven productivity with intelligent assistants and summarization tools that can access up-to-date business information.

Developers will be able to embed these API endpoints directly into their applications without having to worry about maintaining the necessary infrastructure to support the models and frameworks. End users can in turn experience more intuitive, responsive and tailored applications that adapt to their needs.

IAN BUCK
Vice President of Hyperscale and HPC

National treasure: AI is set to become the new space race, with every country looking to create its own center of excellence for driving significant advances in research and science and improving GDP.

With just a few hundred nodes of accelerated computing, countries will be able to quickly build highly efficient, massively performant, exascale AI supercomputers. Government-funded generative AI centers of excellence will boost countries’ economic growth by creating new jobs and building stronger university programs to create the next generation of scientists, researchers and engineers.

Quantum leaps and bounds: Enterprise leaders will launch quantum computing research initiatives based on two key drivers: the ability to use traditional AI supercomputers to simulate quantum processors and the availability of an open, unified development platform for hybrid-classical quantum computing. This enables developers to use standard programming languages instead of needing custom, specialized knowledge to build quantum algorithms.
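To make the idea of simulating a quantum processor on classical hardware concrete, here is a minimal state-vector sketch in plain Python. It is an illustration under simplifying assumptions, not any vendor's platform: the helper names `apply_single_qubit_gate` and `apply_cnot` are hypothetical, and the example covers only two qubits, whereas realistic simulations need far more memory and compute.

```python
import math

def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to the `target` qubit of a state vector."""
    new_state = [0j] * len(state)
    for i, amp in enumerate(state):
        bit = (i >> target) & 1
        for new_bit in (0, 1):
            # Index with the target bit replaced by new_bit.
            j = (i & ~(1 << target)) | (new_bit << target)
            new_state[j] += gate[new_bit][bit] * amp
    return new_state

def apply_cnot(state, control, target):
    """Flip the `target` qubit wherever the `control` qubit is 1."""
    new_state = [0j] * len(state)
    for i, amp in enumerate(state):
        j = i ^ (1 << target) if (i >> control) & 1 else i
        new_state[j] += amp
    return new_state

s = 1 / math.sqrt(2)
HADAMARD = [[s, s], [s, -s]]

# Two qubits, starting in |00>. Index bit 0 is qubit 0, bit 1 is qubit 1.
state = [1 + 0j, 0j, 0j, 0j]
state = apply_single_qubit_gate(state, HADAMARD, target=0)
state = apply_cnot(state, control=0, target=1)

# Measurement probabilities for the basis states 00, 01, 10, 11.
probabilities = [abs(a) ** 2 for a in state]
print(probabilities)
```

The circuit prepares an entangled Bell state, so only the outcomes 00 and 11 carry probability. The state vector doubles in length with every added qubit, which is why larger simulations call for accelerated computing.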

Once considered an obscure niche in computer science, quantum computing exploration will become more mainstream as enterprises join academia and national labs in pursuing rapid advances in materials science, pharmaceutical research, subatomic physics and logistics.

KARI BRISKI
Vice President of AI Software

From RAG to riches: Expect to hear a lot more about retrieval-augmented generation as enterprises embrace these AI frameworks in 2024.

As companies train LLMs to build generative AI applications and services, RAG is widely seen as an answer to the inaccuracies or nonsensical replies that sometimes occur when the models don’t have access to enough accurate, relevant information for a given use case.

Using semantic retrieval, enterprises will take open-source foundation models and ingest their own data into an index, so that a user query can retrieve the relevant data from that index and pass it to the model at run time.
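The retrieval flow can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the word-count `embed` function is a toy stand-in for a real embedding model, the three documents are invented, and `build_index`, `retrieve` and `build_prompt` are hypothetical helper names. The resulting prompt is what would be passed to an LLM at run time; no model is called here.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(documents):
    """Ingest step: store each document next to its vector."""
    return [(doc, embed(doc)) for doc in documents]

def retrieve(index, query, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query, passages):
    """Combine retrieved passages and the user query into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Invented example data standing in for enterprise documents.
documents = [
    "The returns window for online orders is 30 days.",
    "Store hours are 9am to 9pm on weekdays.",
    "Gift cards cannot be redeemed for cash.",
]
index = build_index(documents)
query = "How many days is the returns window?"
passages = retrieve(index, query, k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

A production system would typically swap the word counts for learned embeddings and the list for a vector database, but the flow (ingest, retrieve, pass to the model) stays the same.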

The upshot is that enterprises can use fewer resources to achieve more accurate generative AI applications in sectors such as healthcare, finance, retail and manufacturing. End users should expect to see more sophisticated, context-sensitive and multimodal chatbots and personalized content recommendation systems that allow them to talk to their data naturally and intuitively.

Multimodality makes its mark: Text-based generative AI is set to become a thing of the past. Even as generative AI remains in its infancy, expect to see many industries embrace multimodal LLMs that let consumers use a combination of text, speech and images to get more contextually relevant responses to a query about tables, charts or schematics.

Companies such as Meta and OpenAI will look to push the boundaries of multimodal generative AI by adding greater support for the senses, which will lead to advancements in the physical sciences, biological sciences and society at large. Enterprises will be able to understand their data not just in text format but also in PDFs, graphs, charts, slides and more.

NIKKI POPE
Head of AI and Legal Ethics

Target lock on AI safety: Collaboration among leading AI organizations will accelerate the research and development of robust, safe AI systems. Expect to see emerging standardized safety protocols and best practices that will be adopted across industries, ensuring a consistent and high level of safety across generative AI models.

Companies will heighten their focus on transparency and interpretability in AI systems — and use new tools and methodologies to shed light on the decision-making processes of complex AI models. As the generative AI ecosystem rallies around safety, anticipate AI technologies becoming more reliable, trustworthy and aligned with human values.

RICHARD KERRIS
Vice President of Developer Relations, Head of Media and Entertainment

The democratization of development: Virtually anyone, anywhere will soon be set to become a developer. Traditionally, one had to know and be proficient at using a specific development language to develop applications or services. As computing infrastructure becomes increasingly trained on the languages of software development, anyone will be able to prompt the machine to create applications, services, device support and more.

While companies will continue to hire developers to build and train AI models and other professional applications, expect to see significantly broader opportunities for anyone with the right skill set to build custom products and services. They’ll be helped by text inputs or voice prompts, making interactions with computers as simple as verbally instructing them.

“Now and Then” in film and song: Just as the “new” AI-augmented song by the Fab Four spurred a fresh round of Beatlemania, the dawn of the first feature-length generative AI movie will send shockwaves through the film industry.

Take a filmmaker who shoots using a 35mm film camera. The same content can soon be transformed into a 70mm production using generative AI, reducing the significant costs involved in film production in the IMAX format and allowing a broader set of directors to participate.

Creators will transform beautiful images and videos into new types and forms of entertainment by prompting a computer with text, images or videos. Some professionals worry their craft will be replaced, but those issues will fade as generative AI gets better at being trained on specific tasks. This, in turn, will free up hands to tackle other tasks and provide new tools with artist-friendly interfaces.

KIMBERLY POWELL
Vice President of Healthcare

AI surgical assistants: The day has come when surgeons can use voice to augment what they see and understand inside and outside the surgical suite.

Combining instruments, imaging, robotics and real-time patient data with AI will lead to better surgeon training, more personalization during surgery and better safety with real-time feedback and guidance even during remote surgery. This will help close the gap on the 150 million surgeries that are needed yet do not occur, particularly in low- and middle-income countries.

Generative AI drug discovery factories: A new drug discovery process is emerging, where generative AI molecule generation, property prediction and complex modeling will drive an intelligent lab-in-the-loop flywheel, shortening the time to discover and improving the quality of clinically viable drug candidates.

These AI drug discovery factories employ massive healthcare datasets using whole genomes, atomic-resolution instruments and robotic lab automation capable of running 24/7. For the first time, computers can learn patterns and relationships within enormous and complex datasets and generate, predict and model complex biological relationships that were only previously discoverable through time-consuming experimental observation and human synthesis.

CHARLIE BOYLE
Vice President of DGX Platforms

Enterprises lift bespoke LLMs into the cloud: One thing enterprises learned from 2023 is that building LLMs from scratch isn’t easy. Companies taking this route are often daunted by the need to invest in new infrastructure and technology, and they struggle to figure out how and when to prioritize other company initiatives.

Cloud service providers, colocation providers and other businesses that handle and process data for other businesses will help enterprises with full-stack AI supercomputing and software. This will make customizing pretrained models and deploying them easier for companies across industries.

Fishing for LLM gold in enterprise data lakes: There’s no shortage of statistics on how much information the average enterprise stores — it can be anywhere in the high hundreds of petabytes for large corporations. Yet many companies report that they’re mining less than half that information for actionable insights.

In 2024, businesses will begin using generative AI to make use of that untamed data by putting it to work building and customizing LLMs. With AI-powered supercomputing, businesses will begin mining their unstructured data, including chats, videos and code, to expand their generative AI development into training multimodal models. This leap beyond the ability to mine tables and other structured data will let companies deliver more specific answers to questions and find new opportunities. That includes helping detect anomalies on health scans, uncovering emerging trends in retail and making business operations safer.

AZITA MARTIN
Vice President of Retail, Consumer-Packaged Goods and Quick-Service Restaurants

Generative AI shopping advisors: Retailers grapple with the dual demands of connecting customers to the products they desire while delivering elevated, human-like, omnichannel shopping experiences that align with their individual needs and preferences.

To meet these goals, retailers are gearing up to introduce cutting-edge, generative AI-powered shopping advisors, which will undergo meticulous training on the retailers’ distinct brand, products and customer data to ensure a brand-appropriate, guided, personalized shopping journey that mimics the nuanced expertise of a human assistant. This innovative approach will help set brands apart and increase customer loyalty by providing personalized help.

Setting up for safety: Retailers across the globe are facing a mounting challenge as organized retail crime grows increasingly sophisticated and coordinated. The National Retail Federation reported that retailers have experienced a staggering 26.5% surge in such incidents since the post-pandemic uptick in retail theft.

To enhance the safety and security of in-store experiences for both customers and employees, retailers will begin using computer vision and physical security information management software to collect and correlate events from disparate security systems. This will enable AI to detect weapons and unusual behavior like the large-scale grabbing of items from shelves. It will also help retailers proactively thwart criminal activities and maintain a safer shopping environment.

REV LEBAREDIAN
Vice President of Omniverse and Simulation Technology

Industrial digitalization meets generative AI: The fusion of industrial digitalization with generative AI is poised to catalyze industrial transformation.

Generative AI will make it easier to turn aspects of the physical world, such as geometry, light, physics, matter and behavior, into digital data. Democratizing the digitalization of the physical world will accelerate industrial enterprises, enabling them to design, optimize, manufacture and sell products more efficiently. It also enables them to more easily create virtual training grounds and synthetic data to train a new generation of AIs that will interact and operate within the physical world, such as autonomous robots and self-driving cars.

3D interoperability takes off: From the drawing board to the factory floor, data for the first time will be interoperable.

The world’s most influential software and practitioner companies from the manufacturing, product design, retail, e-commerce and robotics industries are committing to the newly established Alliance for OpenUSD. OpenUSD, the universal language between 3D tools and data, will break down data silos, enabling industrial enterprises to collaborate across data lakes, tool systems and specialized teams more easily and quickly than ever to accelerate the digitalization of previously cumbersome, manual industrial processes.

XINZHOU WU
Vice President of Automotive

Modernizing the vehicle production lifecycle: The automotive industry will further embrace generative AI to deliver physically accurate, photorealistic renderings that show exactly how a vehicle will look inside and out — while speeding design reviews, saving costs and improving efficiencies.

More automakers will embrace this technology within their smart factories, connecting design and engineering tools to build digital twins of production facilities. This will reduce costs and streamline operations without the need to shut down factory lines.

Generative AI will make consumer research and purchasing more interactive. From car configurators and 3D visualizations to augmented reality demonstrations and virtual test drives, consumers will be able to have a more engaging and enjoyable shopping experience.

Safety is no accident: Beyond the automotive product lifecycle, generative AI will also enable breakthroughs in autonomous vehicle (AV) development, including turning recorded sensor data into fully interactive 3D simulations. These digital twin environments, as well as synthetic data generation, will be used to safely develop, test and validate AVs at scale virtually before they’re deployed in the real world.

Generative AI foundational models will also support a vehicle’s AI systems to enable new personalized user experiences, capabilities and safety features inside and outside the car.

The behind-the-wheel experience is set to become safer, smarter and more enjoyable.

BOB PETTE
Vice President of Enterprise Platforms

Building anew with generative AI: Generative AI will allow organizations to design cars by simply speaking to a large language model or create cities from scratch using new techniques and design principles.

The architecture, engineering, construction and operations (AECO) industry is building the future using generative AI as its guidepost. Hundreds of generative AI startups and customers in AECO and manufacturing will focus on creating solutions for virtually any use case, including design optimization, market intelligence, construction management and physics prediction. AI will accelerate a manufacturing evolution that promises increased efficiency, reduced waste and entirely new approaches to production and sustainability.

Developers and enterprises are focusing in particular on point cloud data analysis, which uses lidar to generate representations of built and natural environments with precise details. This could lead to high-fidelity insights and analysis through generative AI-accelerated workflows.

GILAD SHAINER
Vice President of Networking

AI influx ignites connectivity demand: A renewed focus on networking efficiency and performance will take off as enterprises seek the necessary network bandwidth for accelerated computing using GPUs and GPU-based systems.

Trillion-parameter LLMs will expose the need for faster transmission speeds and higher coverage. Enterprises that want to quickly roll out generative AI applications will need to invest in accelerated networking technology or choose a cloud service provider that does. The key to optimal connectivity is baking it into full-stack systems coupled with next-generation hardware and software.

The defining element of data center design: Enterprises will learn that not all data centers need to be alike. Determining the purpose of a data center is the first step toward choosing the appropriate networking to use within it. Traditional data centers are limited in terms of bandwidth, while those capable of running large AI workloads require thousands of GPUs to work at very deterministic, low-tail latency.

What the network is capable of when under a full load at scale is the best determinant of performance. The future of enterprise data center connectivity requires separate management (aka north-south) and AI (aka east-west) networks, where the AI network includes in-network computing specifically designed for high performance computing, AI and hyperscale cloud infrastructures.

DAVID REBER JR.
Chief Security Officer

Clarity in adapting the security model to AI: The pivot from app-centric to data-centric security is in full swing. Data is the fundamental supply chain for LLMs and the future of generative AI. Enterprises are just now seeing the problem unfold at scale. Companies will need to reevaluate people, processes and technologies to redefine the secure development lifecycle (SDLC). The industry at large will redefine its approach to trust and clarify what transparency means.

A new generation of cyber tools will be born. The SDLC of AI will be defined with new market leaders of tools and expectations to address the transition from the command line interface to the human language interface. The need will be especially important as more enterprises shift toward using open-source LLMs like Meta’s Llama 2 to accelerate generative AI output.

Scaling security with AI: Applications of AI to the cybersecurity deficit will detect never-before-seen threats. Currently, a fraction of global data is used for cyber defense. Meanwhile, attackers continue to take advantage of every misconfiguration.

Experimentation will help enterprises realize the potential of AI in identifying emergent threats and risks. Cyber copilots will help enterprise users navigate phishing and configuration. For the technology to be effective, companies will need to tackle privacy issues inherent in the intersection of work and personal life to enable collective defense in data-centric environments.

Along with democratizing access to technology, AI will also enable a new generation of cyber defenders as threats continue to grow. As soon as companies gain clarity on each threat, AI will be used to generate massive amounts of data that train downstream detectors to detect and defend against these threats.

RONNIE VASISHTA
Senior Vice President of Telecoms

Running to or from RAN: Expect to see a major reassessment of investment cases for 5G.

After five years of 5G, network coverage and capacity have boomed, but revenue growth is sluggish and costs for largely proprietary and inflexible infrastructure have risen. Meanwhile, utilization for 5G RAN is stuck below 40%.

The new year will be about aggressively pursuing new revenue sources on existing spectrum to uncover new monetizable applications. Telecoms also will rethink the capex structure, focusing more on a flexible, high-utilization infrastructure built on general-purpose components. And expect to see a holistic reduction of operating expenses as companies leverage AI tools to increase performance, improve efficiency and eliminate costs. The outcome of these initiatives will determine how much carriers will invest in 6G technology.

From chatbots to network management: Telcos are already using generative AI for chatbots and virtual assistants to improve customer service and support. In the new year they’ll double down, ramping up their use of generative AI for operational improvements in areas such as network planning and optimization, fault and fraud detection, predictive analytics and maintenance, cybersecurity operations and energy optimization.

Given how pervasive and strategic generative AI is becoming, building a new type of AI factory infrastructure to support its growth also will become a key imperative. More and more telcos will build AI factories for internal use, as well as deploy these factories as a platform as a service for developers. That same infrastructure will be able to support RAN as an additional tenant.

MALCOLM DEMAYO
Vice President of Financial Services

AI-first financial services: With AI advancements growing exponentially, financial services firms will bring the compute power to the data, rather than the other way around.

Firms will undergo a strategic shift toward a highly scalable, hybrid combination of on-premises infrastructure and cloud-based computing, driven by the need to mitigate concentration risk and maintain agility amid rapid technological advancements. Firms that handle their most mission-critical workloads, including AI-powered customer service assistants, fraud detection, risk management and more, will lead.

MARC SPIELER
Senior Director of Energy

Physics-ML for faster simulation: Energy companies will increasingly turn to physics-informed machine learning (physics-ML) to accelerate simulations, optimize industrial processes and enhance decision-making.

Physics-ML integrates traditional physics-based models with advanced machine learning algorithms, offering a powerful tool for the rapid, accurate simulation of complex physical phenomena. For instance, in energy exploration and production, physics-ML can quickly model subsurface geologies to aid in identification of potential exploration sites and assessment of operational and environmental risks.
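One common way to combine a physics model with machine learning is to add a physics term to the training loss, so that a model is penalized both for missing measurements and for violating a known equation. The sketch below assumes a simple decay law, dy/dt = -k*y, chosen purely for illustration. The function name `physics_informed_loss`, the rate `k` and the two sample measurements are hypothetical and do not come from any real energy workflow.

```python
import math

def physics_informed_loss(predict, data, k, t_grid, weight=1.0, h=1e-4):
    """Data misfit plus a penalty for violating dy/dt + k*y = 0."""
    # Data term: mean squared error against the measurements.
    data_loss = sum((predict(t) - y) ** 2 for t, y in data) / len(data)
    # Physics term: mean squared residual of the assumed equation,
    # with the derivative estimated by central differences.
    physics_loss = 0.0
    for t in t_grid:
        dydt = (predict(t + h) - predict(t - h)) / (2 * h)
        physics_loss += (dydt + k * predict(t)) ** 2
    physics_loss /= len(t_grid)
    return data_loss + weight * physics_loss

k = 0.5  # assumed decay rate
data = [(0.0, 1.0), (2.0, math.exp(-1.0))]  # two invented measurements
t_grid = [0.25 * i for i in range(9)]       # check points from 0.0 to 2.0

def exact(t):
    """Candidate that obeys the assumed physics."""
    return math.exp(-k * t)

slope = (math.exp(-1.0) - 1.0) / 2.0

def straight_line(t):
    """Candidate that fits both measurements but ignores the physics."""
    return 1.0 + slope * t

loss_exact = physics_informed_loss(exact, data, k, t_grid)
loss_line = physics_informed_loss(straight_line, data, k, t_grid)
print(loss_exact, loss_line)
```

Both candidates pass through the two measurements, so the data term alone cannot tell them apart. The physics term does: the straight line scores a clearly higher loss, which is how sparse data can still steer a model toward physically plausible behavior.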

In renewable energy sectors, such as wind and solar, physics-ML will play a crucial role in predictive maintenance, enabling energy companies to foresee equipment failures and schedule maintenance proactively to reduce downtimes and costs. As computational power and data availability continue to grow, physics-ML is poised to transform how energy companies approach simulation and modeling tasks, leading to more efficient and sustainable energy production.

LLMs — the fix for better operational outcomes: Coupled with physics-ML, LLMs will analyze extensive historical data and real-time sensor inputs from energy equipment to predict potential failures and maintenance needs before they occur. This proactive approach will reduce unexpected downtime and extend the lifespan of turbines, generators, solar panels and other critical infrastructure. LLMs will also help optimize maintenance schedules and resource allocation, ensuring that repairs and inspections are efficiently carried out. Ultimately, LLM use in predictive maintenance will save costs for energy companies and contribute to a more stable energy supply for consumers.

DEEPU TALLA
Vice President of Embedded and Edge Computing

The rise of robotics programmers: LLMs will lead to rapid improvements for robotics engineers. Generative AI will develop code for robots and create new simulations to test and train them.

LLMs will accelerate simulation development by automatically building 3D scenes, constructing environments and generating assets from inputs. The resulting simulation assets will be critical for workflows like synthetic data generation, robot skills training and robotics application testing.

In addition to helping robotics engineers, transformer AI models, the engines behind LLMs, will make robots themselves smarter so that they better understand complex environments and more effectively execute a breadth of skills within them.

For the robotics industry to scale, robots have to become more generalizable — that is, they need to acquire skills more quickly or bring them to new environments. Generative AI models — trained and tested in simulation — will be a key enabler in the drive toward more powerful, flexible and easier-to-use robots.

Explore generative AI sessions and experiences at NVIDIA GTC, the global conference on AI and accelerated computing, running March 18-21 in San Jose, Calif., and online.