
Research Galore From 2024: Recapping AI Advancements in 3D Simulation, Climate Science and Audio Engineering

by Bill Dally

Editor’s note: As of June 6, 2025, NVIDIA Edify is no longer available as an NVIDIA NIM microservice preview. To explore available visual AI models, visit build.nvidia.com.

The pace of technology innovation has accelerated in the past year, most dramatically in AI. And in 2024, there was no better place to be a part of creating those breakthroughs than NVIDIA Research.

NVIDIA Research comprises hundreds of extremely bright people pushing the frontiers of knowledge, not just in AI but across many areas of technology.

In the past year, NVIDIA Research laid the groundwork for future improvements in GPU performance with major research discoveries in circuits, memory architecture and sparse arithmetic. The team’s invention of novel graphics techniques continues to raise the bar for real-time rendering. And we developed new methods for improving the efficiency of AI — requiring less energy, taking fewer GPU cycles and delivering even better results.

But the most exciting developments of the year have been in generative AI.

We’re now able to generate not just images and text, but also 3D models, music and sounds. We’re also developing better control over what is generated: for example, producing realistic humanoid motion and sequences of images with consistent subjects.

The application of generative AI to science has resulted in high-resolution weather forecasts that are more accurate than conventional numerical weather models. AI models have given us the ability to accurately predict how blood glucose levels respond to different foods. Embodied generative AI is being used to develop autonomous vehicles and robots.

And that was just this year. What follows is a deeper dive into some of NVIDIA Research’s greatest generative AI work in 2024. Of course, we continue to develop new models and methods for AI, and expect even more exciting results next year.

ConsiStory: AI-Generated Images With Main Character Energy

ConsiStory, a collaboration between researchers at NVIDIA and Tel Aviv University, makes it easier to generate multiple images with a consistent main character — an essential capability for storytelling use cases such as illustrating a comic strip or developing a storyboard.

The researchers’ approach introduced a technique called subject-driven shared attention, which reduces the time it takes to generate consistent imagery from 13 minutes to around 30 seconds.
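For the technically curious, the core trick can be pictured in a few lines: during generation, each image’s self-attention is extended so its tokens can also attend to subject tokens gathered from the other images in the batch, nudging every image toward the same character. Below is a simplified, single-head toy version of that idea; the tensor shapes, the subject_mask input and the omitted projections are illustrative assumptions, not ConsiStory’s actual implementation.

```python
# Toy, single-head version of subject-driven shared attention: each image
# attends to its own tokens plus subject tokens from the other images.
# Shapes and the subject_mask input are illustrative, not ConsiStory's API.
import torch
import torch.nn.functional as F

def shared_attention(tokens, subject_mask, d=64):
    """tokens: (B, N, d) per-image features; subject_mask: (B, N) bool,
    True where a token belongs to the shared subject."""
    B, N, _ = tokens.shape
    q = k = v = tokens  # real models apply learned projections here
    outs = []
    for i in range(B):
        # Gather subject tokens from all *other* images in the batch...
        extra = torch.cat([tokens[j][subject_mask[j]] for j in range(B) if j != i])
        # ...and let image i attend to its own tokens plus those subject tokens.
        keys = torch.cat([k[i], extra])
        vals = torch.cat([v[i], extra])
        attn = F.softmax(q[i] @ keys.T / d**0.5, dim=-1)
        outs.append(attn @ vals)
    return torch.stack(outs)  # (B, N, d) features shared across the batch

# Toy usage: 2 images, 16 tokens each, first 4 tokens marked as the subject.
t = torch.randn(2, 16, 64)
m = torch.zeros(2, 16, dtype=torch.bool)
m[:, :4] = True
print(shared_attention(t, m).shape)  # torch.Size([2, 16, 64])
```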

Read the ConsiStory paper.

ConsiStory is capable of generating a series of images featuring the same character.

Edify 3D: Generative AI Enters a New Dimension

NVIDIA Edify 3D is a foundation model that enables developers and content creators to quickly generate 3D objects that can be used to prototype ideas and populate virtual worlds.

Edify 3D helps creators quickly ideate, lay out and conceptualize immersive environments with AI-generated assets. Both novice and experienced content creators can steer the model with text and image prompts; it is now part of the NVIDIA Edify multimodal architecture for developing visual generative AI.

Read the Edify 3D paper and watch the video on YouTube.

GluFormer: AI Predicts Blood Sugar Levels Four Years Out

Researchers from the Weizmann Institute of Science, Tel Aviv-based startup Pheno.AI and NVIDIA led the development of GluFormer, an AI model that can predict an individual’s future glucose levels and other health metrics based on past glucose monitoring data.

The researchers showed that, after adding dietary intake data into the model, GluFormer can also predict how a person’s glucose levels will respond to specific foods and dietary changes, enabling precision nutrition. The research team validated GluFormer across 15 other datasets and found it generalizes well to predict health outcomes for other groups, including those with prediabetes, type 1 and type 2 diabetes, gestational diabetes and obesity.
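Setting the paper’s specifics aside, the general recipe behind such glucose “language models” is easy to picture: quantize continuous glucose-monitor readings into discrete tokens, then train a causal transformer to predict the next reading. Here’s a minimal PyTorch sketch of that pattern; the bin edges, model sizes and omitted positional encoding are assumptions for illustration, not GluFormer’s published architecture.

```python
# Illustrative pattern only: quantize CGM readings into tokens and train a
# causal transformer to predict the next reading. Bin edges and model sizes
# are assumptions, not GluFormer's published architecture.
import torch
import torch.nn as nn

N_BINS = 128  # glucose readings (mg/dL) quantized into discrete tokens

def tokenize(glucose_mg_dl):
    # Map readings in [40, 400] mg/dL to integer tokens 0..N_BINS-1.
    g = glucose_mg_dl.clamp(40, 400)
    return ((g - 40) / (400 - 40) * (N_BINS - 1)).long()

class GlucoseLM(nn.Module):
    def __init__(self, d=128, heads=4, layers=4):
        super().__init__()
        self.embed = nn.Embedding(N_BINS, d)  # positional encoding omitted
        block = nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(d, N_BINS)  # logits for the next reading

    def forward(self, tokens):
        L = tokens.size(1)
        causal = nn.Transformer.generate_square_subsequent_mask(L)
        h = self.encoder(self.embed(tokens), mask=causal)
        return self.head(h)

# Toy usage: one day of 5-minute CGM readings for a batch of 8 people.
readings = 40 + 360 * torch.rand(8, 288)
logits = GlucoseLM()(tokenize(readings))
print(logits.shape)  # torch.Size([8, 288, 128])
```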

Read the GluFormer paper.

LATTE3D: Enabling Near-Instant Generation, From Text to 3D Shape 

Another 3D generator released by NVIDIA Research this year is LATTE3D, which converts text prompts into 3D representations within a second — like a speedy, virtual 3D printer. Crafted in a popular format used for standard rendering applications, the generated shapes can be easily served up in virtual environments for developing video games, ad campaigns, design projects or virtual training grounds for robotics.

Read the LATTE3D paper.

MaskedMimic: Reconstructing Realistic Movement for Humanoid Robots

To advance the development of humanoid robots, NVIDIA researchers introduced MaskedMimic, an AI framework that applies inpainting — the process of reconstructing complete data from an incomplete, or masked, view — to descriptions of motion.

Given partial information, such as a text description of movement, or head and hand position data from a virtual reality headset, MaskedMimic can fill in the blanks to infer full-body motion. It’s become part of NVIDIA Project GR00T, a research initiative to accelerate humanoid robot development.
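To make the inpainting analogy concrete, the underlying pattern looks roughly like this: zero out the unobserved parts of a pose, feed the observed values plus a visibility mask to a network, and train it to reconstruct the full body. The 24-joint layout, joint indices, MLP backbone and loss below are illustrative assumptions, not MaskedMimic’s actual design.

```python
# Minimal motion-inpainting pattern: mask most of a full-body pose and train
# a network to reconstruct the rest. The 24-joint layout, joint indices and
# MLP backbone are illustrative assumptions.
import torch
import torch.nn as nn

N_JOINTS, D = 24, 3  # joint positions only, for simplicity

class PoseInpainter(nn.Module):
    def __init__(self, hidden=512):
        super().__init__()
        # Input: masked pose plus a binary mask marking observed joints.
        self.net = nn.Sequential(
            nn.Linear(N_JOINTS * D * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, N_JOINTS * D))

    def forward(self, pose, mask):
        masked = pose * mask  # zero out unobserved joints
        x = torch.cat([masked.flatten(1), mask.flatten(1)], dim=1)
        return self.net(x).view(-1, N_JOINTS, D)

# Toy training step: observe only head and hands, as a VR headset would.
pose = torch.randn(32, N_JOINTS, D)
mask = torch.zeros_like(pose)
mask[:, [15, 20, 21]] = 1.0  # assumed head/hand joint indices
model = PoseInpainter()
loss = nn.functional.mse_loss(model(pose, mask), pose)
loss.backward()
print(loss.item())
```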

Read the MaskedMimic paper.

StormCast: Boosting Weather Prediction, Climate Simulation 

In the field of climate science, NVIDIA Research announced StormCast, a generative AI model for emulating atmospheric dynamics. While other machine learning models trained on global data have a spatial resolution of about 30 kilometers and a temporal resolution of six hours, StormCast achieves a 3-kilometer, hourly scale.

The researchers trained StormCast on approximately three-and-a-half years of NOAA climate data from the central U.S. When applied with precipitation radars, StormCast offers forecasts with lead times of up to six hours that are up to 10% more accurate than the U.S. National Oceanic and Atmospheric Administration’s state-of-the-art 3-kilometer regional weather prediction model.
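Mechanically, emulators like this typically run as autoregressive rollouts: each generative step advances the atmospheric state one hour, and chaining six steps yields a six-hour forecast. The sketch below shows only that rollout pattern, with an assumed grid size and a toy stand-in model rather than the published StormCast architecture.

```python
# The autoregressive rollout pattern: one generative step per forecast hour.
# Grid size and the stand-in model are assumptions, not StormCast itself.
import torch

C, H, W = 3, 256, 256  # assumed channels and grid tile over the central U.S.

def rollout(model, state, hours=6):
    """state: (C, H, W) current conditions; returns hourly forecasts."""
    forecasts = []
    for _ in range(hours):
        state = model(state)  # one generative step = one hour of dynamics
        forecasts.append(state)
    return torch.stack(forecasts)  # (hours, C, H, W)

# Toy stand-in model (identity plus noise), just to exercise the loop.
model = lambda s: s + 0.01 * torch.randn_like(s)
print(rollout(model, torch.randn(C, H, W)).shape)  # torch.Size([6, 3, 256, 256])
```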

Read the StormCast paper, written in collaboration with researchers from Lawrence Berkeley National Laboratory and the University of Washington.

NVIDIA Research Sets Records in AI, Autonomous Vehicles, Robotics

Through 2024, models that originated in NVIDIA Research set records across benchmarks for AI training and inference, route optimization, autonomous driving and more.

NVIDIA cuOpt, an optimization AI microservice used for logistics improvements, has 23 world-record benchmarks. The NVIDIA Blackwell platform demonstrated world-class performance on MLPerf industry benchmarks for AI training and inference.

In the field of autonomous vehicles, Hydra-MDP, an end-to-end autonomous driving framework by NVIDIA Research, achieved first place on the End-To-End Driving at Scale track of the Autonomous Grand Challenge at CVPR 2024.

In robotics, FoundationPose, a unified foundation model for 6D object pose estimation and tracking, obtained first place on the BOP leaderboard for model-based pose estimation of unseen objects.

Learn more about NVIDIA Research, which has hundreds of scientists and engineers worldwide. NVIDIA Research teams are focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.

New Adobe Premiere Color Grading Mode Accelerated on NVIDIA GPUs

New NVIDIA RTX-accelerated features streamline creative workflows in Adobe Premiere, while an update to NVIDIA Project G-Assist enhances system optimization.
by Joel Pennington

NAB Show 2026, running April 18-22 in Las Vegas, is set to showcase a wave of new features and optimizations for top video editing applications. Bringing together over 60,000 content professionals from across the broadcast, media and entertainment industries, the event highlights how video editors, livestreamers and professional creators are exploring new tools, accelerated by NVIDIA RTX technology, to enhance and streamline their creative workflows.

At the show, Adobe is announcing a new Adobe Premiere Color Mode in beta. 

Designed to function as a dedicated grading environment nested directly within Premiere, it offers a clean, responsive interface that lets editors stay in their creative flow rather than relying on external tools for color correction. Tapping into GPU acceleration on systems equipped with NVIDIA GeForce RTX and NVIDIA RTX PRO GPUs, the streamlined workflow operates in 32-bit color depth for the first time, delivering significantly faster performance and higher quality.

NVIDIA also launched a new update to NVIDIA Project G-Assist — an experimental AI assistant that helps tune, control and optimize GeForce RTX systems. 

Color Meets Compute

Premiere’s Color Mode is a clean, responsive interface within Adobe Premiere that enables editors to color grade native video. Every element is designed to guide editors through the grading process without distractions. A large program monitor anchors the experience, providing immediate visual feedback as adjustments are made to enable faster decision-making and more precise control.

A clip grid view allows editors to visualize progression across shots in a sequence. This makes it easier to maintain consistency across scenes and ensure a cohesive look throughout a project. 

Controls are organized into focused modules, each tailored to a specific aspect of color grading. Multiple modules can be active simultaneously, giving editors flexibility while maintaining clarity. Each control features a unique heads-up display (HUD), providing contextual guidance without cluttering the interface.

Color grading is one of the most computationally intensive tasks in post-production. Every adjustment — bidirectional controls, multi-zone tonal shaping and stacked color operations — runs on NVIDIA GPUs, accelerating playback, iteration and visual feedback.

Editors can work with up to six luminance adjustment zones, moving beyond traditional highlights, midtones and shadows models. This allows for more nuanced tonal control and finer adjustments across the image. 
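One way to picture multi-zone tonal control: weight each pixel’s adjustment by how close its luminance sits to one of N zone centers, generalizing the classic shadows, midtones and highlights split. The sketch below uses Gaussian zone weights as an illustrative assumption; it is not necessarily how Premiere implements its zones.

```python
# Gaussian luminance-zone weights: each pixel's adjustment is weighted by how
# close its luminance sits to one of n_zones centers. The zone shapes here
# are an assumption, not Premiere's actual implementation.
import numpy as np

def zone_weights(luma, n_zones=6, softness=0.15):
    """luma: array with values in [0, 1]. Returns (n_zones, ...) weights
    that sum to 1 at every pixel."""
    centers = np.linspace(0, 1, n_zones)
    w = np.exp(-((luma[None] - centers.reshape(-1, *[1] * luma.ndim)) ** 2)
               / (2 * softness**2))
    return w / w.sum(axis=0)

# Toy usage: lift only the second-darkest zone of a grayscale image by +0.1.
img = np.random.rand(64, 64).astype(np.float32)
w = zone_weights(img)
graded = img + 0.1 * w[1]
print(w.shape, graded.shape)  # (6, 64, 64) (64, 64)
```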

Visual scopes are context-aware, dynamically adapting based on the selected tool. HUD overlays provide visual cues directly within the scopes, helping editors understand how their adjustments affect the image without needing to interpret complex graphs.

The entire system now operates at 32-bit color precision, delivering maximum color fidelity and preventing unwanted clipping. Editors retain full control, with the ability to clip colors intentionally when needed for creative effect. Color styles can also be applied flexibly at the sequence, clip, reel or custom group level, making it easier to manage looks across complex projects.
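A quick illustration of why that precision matters: stacked grading operations routinely push values outside the displayable range, and a floating-point pipeline preserves that headroom where an 8-bit integer pipeline clips it permanently. The numbers below are purely illustrative.

```python
# Stacked operations: boost exposure, then pull it back down. A float
# pipeline keeps out-of-range headroom; an 8-bit-style pipeline clips it.
import numpy as np

pixel = np.array([0.9, 0.5, 0.2], dtype=np.float32)  # linear RGB

boosted_f32 = pixel * 2.0          # float keeps 1.8, even though it's > 1.0
restored_f32 = boosted_f32 * 0.5   # headroom survives: red is back to 0.9

boosted_u8 = np.clip(pixel * 2.0, 0, 1)  # integer-style pipeline clips at 1.0
restored_u8 = boosted_u8 * 0.5           # red is now 0.5: detail gone

print(restored_f32)  # [0.9 0.5 0.2] -- original values preserved
print(restored_u8)   # [0.5 0.5 0.2] -- highlight detail permanently lost
```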

Download the Adobe Premiere (beta) to get started with Color Mode. 

Project G-Assist: Enhanced Recommendations and Controls 

The NVIDIA Project G-Assist on-device AI assistant helps users get the most out of their hardware. Today’s update adds an advanced detection system for gaming settings, as well as an enhanced knowledge system, enabling G-Assist to deliver higher accuracy when providing advice or adjusting settings for esports and AAA gaming.

The assistant can also now control more settings across systems. It can configure advanced RTX features from the NVIDIA App, including NVIDIA DLSS Overrides, Smooth Motion, RTX HDR, Digital Vibrance and encoder settings.

Download Project G-Assist v0.2.1 from the NVIDIA App.

#ICYMI: The Latest Updates for RTX AI PCs

📹 Learn how Niko Pueringer of visual effects shop Corridor Crew built his own green screen key tool, powered by NVIDIA RTX GPUs. Stop by the Puget Systems booth at NAB on Monday, April 20, at 1 p.m. PT for a special presentation, or tune in on NVIDIA Studio’s YouTube channel on Tuesday, April 21, at 12 p.m. PT to watch the full session.

🖼️ Also at NAB, join NVIDIA’s Sabour Amirazodi for a special presentation at the ASUS booth on Tuesday, April 21, at 11 a.m. PT. Amirazodi will showcase how guiding generative AI can produce creative outputs like storyboards or entire movie trailers — based on a single image input. 

📽️ Check out content creator Gavin Herman’s Studio Session, “How to Edit Professional Talking Head Videos in DaVinci Resolve,” on the NVIDIA Studio YouTube channel. Generative workflow specialists can also watch a two-hour, instructor-led workshop on using NVIDIA GPU acceleration for ComfyUI.

🦞 LM Studio is now an official OpenClaw provider. OpenClaw can now run local models through LM Studio on NVIDIA GPUs, unlocking faster on-device performance. (A minimal local-inference sketch follows this list.)

🦥 Unsloth and NVIDIA have teamed up to eliminate hidden bottlenecks that slow down fine-tuning on NVIDIA GPUs, improving fine-tuning performance by 15%. 

✨ Google’s Gemma 4 family of omni-capable models is built for local AI across a wide range of devices. Google and NVIDIA have optimized Gemma 4 for NVIDIA GPUs, enabling efficient performance on NVIDIA RTX-powered PCs and workstations, NVIDIA DGX Spark personal AI supercomputers and NVIDIA Jetson Orin Nano edge AI modules.

📽️ Check out this NVIDIA GTC session on how developers can build, run and optimize AI agents locally on NVIDIA GPUs, covering everything from quantization to backends like Ollama and applications like OpenClaw and ComfyUI.

👀 Wondershare Filmora has added a new Eye Contact Correction feature based on the NVIDIA Broadcast Eye Contact feature. Running in the cloud on NVIDIA GPUs, it refines the gaze of subjects in post-production for a more natural, confident and camera-ready look, delivering polished, professional videos in seconds.

Filmora’s AI Eye Contact Correction feature, powered in the cloud by NVIDIA GPUs.
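For readers who want to try the local-model items above, the LM Studio and Ollama entries both boil down to pointing a client at a local server. Here’s a minimal sketch of both patterns, assuming the servers are running locally; every model name is a placeholder for whatever you’ve loaded or pulled.

```python
# Minimal sketches of local inference through the two backends mentioned
# above. Assumes both servers are running locally; model names are
# placeholders for whatever you have loaded or pulled.
import requests
from openai import OpenAI

# 1) LM Studio exposes an OpenAI-compatible server, by default on port 1234.
lm = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
reply = lm.chat.completions.create(
    model="local-model",  # LM Studio serves whichever model is loaded
    messages=[{"role": "user", "content": "Say hello from a local GPU."}])
print(reply.choices[0].message.content)

# 2) Ollama serves a simple REST API; stream=False returns one JSON object.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3",  # assumed model name; pull it first
          "prompt": "Why does local inference reduce latency?",
          "stream": False},
    timeout=120)
print(resp.json()["response"])
```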

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Follow NVIDIA Workstation on LinkedIn and X

Into the Omniverse: NVIDIA GTC Showcases Virtual Worlds Powering the Physical AI Era

by Heather McDiarmid

Editor’s note: This post is part of Into the Omniverse, a series focused on how developers, 3D practitioners and enterprises can transform their workflows using the latest advances in OpenUSD and NVIDIA Omniverse.

NVIDIA GTC last week showcased a turning point in physical AI: Robots, vehicles and factories are scaling from single use cases and isolated deployments to sophisticated enterprise workloads across industries. 

At the center of this shift are new frontier models for physical AI, including NVIDIA Cosmos 3, NVIDIA Isaac GR00T N1.7 and NVIDIA Alpamayo 1.5. 

NVIDIA also released the NVIDIA Physical AI Data Factory Blueprint, designed to push the state of the art in world modeling, humanoid skills and autonomous driving, as well as the NVIDIA Omniverse DSX Blueprint for AI factory digital twin simulation.

Open source agentic frameworks such as OpenClaw extend the AI stack all the way to operations — enabling long-running “claws” that use tools, memory and messaging interfaces to orchestrate workflows, manage data pipelines and execute tasks autonomously on dedicated machines.

“With NVIDIA and the broader ecosystem, we’re building the claws and guardrails that let anyone create powerful, secure AI assistants,” said Peter Steinberger, creator of OpenClaw, in an NVIDIA press release from GTC. 

OpenUSD is a driving force behind the scalability of physical AI — providing a common scene-description language that lets teams bring computer-aided design (CAD) data, simulation assets and real-world telemetry into a shared, physically accurate view of the world.

Simulating the AI Factory Before It’s Built

Modern AI factories are complex — spanning thermals, power grids, network load and mechanical systems. Building them on time and on budget becomes much easier when using simulation technology. 

To tackle this, NVIDIA introduced the Omniverse DSX Blueprint at GTC, a reference architecture that unifies simulation across every layer of an AI factory through a single digital twin. This enables operators to optimize performance and efficiency before a rack is installed in the real world.

Compute Is Data: Real-World Data Is No Longer the Moat

Real-world data used to function as a moat for physical AI — but it doesn’t scale. The real world is messy, unpredictable and full of edge cases, and the pipelines to process, simulate and evaluate data are fragmented. The bottleneck isn’t just data — it’s the entire data factory.

To help address this, NVIDIA introduced at GTC its Physical AI Data Factory Blueprint, an open reference architecture that transforms compute into large-scale, high-quality training data. Built on NVIDIA Cosmos open world foundation models and the NVIDIA OSMO operator, it unifies data curation, augmentation and evaluation into a single pipeline, enabling developers to generate diverse, long-tail datasets from limited real-world inputs.

Leading physical AI developers including FieldAI, Hexagon Robotics, Linker Vision, Milestone Systems, Skild AI and Teradyne Robotics are already tapping the blueprint to speed up robotics projects, vision AI agents and autonomous vehicle programs.

Microsoft Azure and Nebius are the first cloud platforms to offer the blueprint, turning world-scale compute into turnkey data production engines.

“Together with cloud leaders, we’re providing a new kind of agentic engine that transforms compute into the high-quality data required to bring the next generation of autonomous systems and robots to life,” said Rev Lebaredian, vice president of Omniverse and simulation technologies at NVIDIA, in this press release. “In this new era, compute is data.”

From OpenUSD to Reality: Seamless Design to Deployment

Converting CAD files to OpenUSD is a critical step in the physical AI pipeline — transforming engineering data into simulation-ready assets that developers can use to build, test and validate robots in physically accurate virtual environments. 

Using tools like the NVIDIA Omniverse Kit software development kit and NVIDIA Isaac Sim, teams can optimize and enrich 3D data for real-time rendering, simulation and collaborative workflows.  
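For developers curious what “simulation-ready OpenUSD” looks like at the file level, here’s a minimal sketch using the open source pxr Python bindings (pip package usd-core). Real CAD converters do far more; the prim paths, the placeholder cube and the attribute values below are illustrative assumptions.

```python
# Hand-authoring a tiny, simulation-ready OpenUSD stage with the open source
# pxr bindings (pip package usd-core). Real CAD converters do far more; the
# prim paths and values below are illustrative assumptions.
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("factory_cell.usda")
UsdGeom.SetStageUpAxis(stage, UsdGeom.Tokens.z)  # common robotics convention
UsdGeom.SetStageMetersPerUnit(stage, 1.0)        # real-world scale matters

# A placeholder for a converted CAD part: a 10 cm cube under a root Xform.
UsdGeom.Xform.Define(stage, "/World")
part = UsdGeom.Cube.Define(stage, "/World/ConveyorStop")
part.GetSizeAttr().Set(0.1)  # size in stage units (meters)

stage.GetRootLayer().Save()
print(stage.GetRootLayer().ExportToString())  # human-readable .usda text
```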

Companies including FANUC and Fauna Robotics are using this seamless CAD-to-OpenUSD workflow to speed up robotic system design and validation.

Transforming Manufacturing and Logistics Through Industrial Digital Twins

“Factories themselves are now robotic systems,” Lebaredian said during his special address on digital twins and simulation at GTC. 

All factories are born in simulation. The NVIDIA Mega Omniverse Blueprint provides enterprises with a reference architecture to design, test and optimize robot fleets and AI agents in a physically accurate facility digital twin before a single robot is deployed on the floor. 

KION, working with Accenture and Siemens, is using this blueprint to build large-scale warehouse digital twins that train and test fleets of NVIDIA Jetson-based autonomous forklifts for GXO, the world’s largest pure-play contract logistics provider. 

Physical AI Steps From Simulation to the Real World

NVIDIA is partnering with the global robotics ecosystem — including leading robot brain developers, industrial robot giants and humanoid pioneers — to enhance production-level physical AI. 

ABB Robotics, FANUC, KUKA and Yaskawa, which have a combined global install base of over 2 million robots, are using NVIDIA Omniverse libraries and NVIDIA Isaac simulation frameworks to validate complex robot applications and production lines through physically accurate digital twins. These companies have also integrated NVIDIA Jetson modules into their controllers to enable real-time AI inference. 

Robot development starts with the robot brains, which is why leading developers including FieldAI and Skild AI are building theirs using NVIDIA Cosmos world models for data generation and Isaac simulation frameworks to validate policies in simulation. 

Meanwhile, Generalist AI is using NVIDIA Cosmos to explore generating synthetic data. This combination allows robots to become proficient in a wide range of tasks — from supply chain monitoring to food delivery — at an exceptional pace.

Read all of NVIDIA’s announcements from GTC on this online press kit and watch the keynote replay. Catch up on all Physical AI Days sessions from GTC and watch the developer livestream replay.

More Than Meets the Eye: NVIDIA RTX-Accelerated Computers Now Connect Directly to Apple Vision Pro

NVIDIA and Apple’s collaboration brings native integration of NVIDIA CloudXR 6.0 to visionOS, securely delivering NVIDIA RTX-powered simulators and professional 3D graphics applications — like Immersive for Autodesk VRED on Innoactive’s XR streaming solutions — to Apple Vision Pro.
by Richard Kerris