MD Anderson Researchers Harness AI to Transform Cancer Care

Scientists from the leading cancer center tap into the power of AI as they reshape their approach to data.

November 9, 2021 by Mona Flores

To unlock real insights from data, AI and data science research can’t live on the outskirts of an institution — it has to become part of an organization’s core strategy.

The University of Texas MD Anderson Cancer Center, the top-ranked cancer hospital in the U.S., is doing just that, with a new focus on data governance and dozens of researchers pursuing AI-accelerated oncology projects to improve patient care.

“We are focusing on the data in context, ensuring we have a coordinated metadata supply chain to address the current challenges in making AI models translate to impact in the clinic,” said Dr. Caroline Chung, who was recently appointed MD Anderson’s first chief data officer. “To build better and more robust predictive models, we need a coordinated strategy that covers every step from data generation to the clinical use of machine learning insights.”

This data governance strategy will influence the way hospital data is collected and used for insight generation, and enable findability, accessibility, interoperability and reusability of the data.

“It’s a big culture change,” said Chung. “The more data we can capture with contextual information, the more complex questions we can ask and the greater potential we have to use machine learning insights to help our clinicians improve their interactions with patients to guide the data-driven treatment decisions with the best patient outcomes aligned with the goals of care.”

By building a pipeline that collects the high-quality data researchers need, stores it securely and tracks how it’s being used, MD Anderson aims to better support projects to help clinicians analyze radiology data, deliver cancer treatment and predict complications like sepsis.

Many of these projects are already underway, accelerated by the speed of new GPU-powered technologies, such as NVIDIA DGX systems. New investments coming online at MD Anderson will give researchers access to thousands of additional GPU cores to support AI projects across the institution.

Applying AI to Diagnostic Imaging

The first step in oncology is detecting tumors — the earlier the better. MD Anderson is developing early detection AI applications to help diagnose patients with pancreatic cancer, which has a five-year survival rate of just 10 percent.

“Pancreatic cancer is often diagnosed after it’s already metastasized, meaning it’s spread to other organs,” said Dr. Eugene Koay, co-director of Gastrointestinal Radiation Oncology at MD Anderson. “We’re working on AI models to analyze the pancreas anytime we see it in a CT scan, MRI study or endoscopic ultrasound, whether or not the patient’s appointment is related to the pancreas.”

Not all pancreatic tumors are the same. Some are slow moving, others are aggressive. Some originate from cysts in the pancreas, others don’t.

In collaboration with the Early Detection Research Network, Koay and his team are working on convolutional neural networks that identify which cases are most likely to develop into malignant cancer, so clinicians can better support patients at risk.

Imaging Insights Inform Treatment Planning

When preparing for radiation therapy to treat cancerous cells, oncologists rely on a process known as contouring to trace the tumors that will be targeted by radiation treatment.

It’s a time-consuming process, and oncologists often have a backlog of radiotherapy treatment plans to create for patients. Dr. Laurence Court, associate professor of Radiation Physics at MD Anderson, hopes to reduce the burden of manual contouring with AI tools, enabling hospitals to treat thousands more cancer patients each year.

He’s especially interested in the impact these AI clinical tools could have in low-resource settings, where a shortage of radiologists and oncologists makes it harder to access lifesaving radiotherapy treatments.

Contouring is also used to plan for MRI-assisted radiosurgery, an advanced form of brachytherapy in which a radiation dose is delivered to cancerous tissue through implanted seeds. MD Anderson radiation oncologist Dr. Steven Frank uses this therapy to treat prostate cancer.

Precise contouring of the prostate and surrounding organs on MRI ensures that radioactive seeds are delivered to the right areas to treat the cancer without harming neighboring tissues.

By adopting an AI model that uses advances in GPU technologies, MD Anderson oncologists have improved the quality of contours for brachytherapy treatment planning and treatment quality assessment, said Dr. Jeremiah Sanders, a medical imaging physics fellow at MD Anderson who’s developing translational AI in Frank’s lab.

Sanders and Frank are also working on a model for use after a brachytherapy procedure — an AI application that analyzes MRI studies of the prostate to determine the quality of the radiation delivery. Insights from this model can help clinicians determine if additional treatment is needed and how to manage patients after their treatments.

Keeping a Watchful AI on Model Accuracy

For an AI model to succeed in a clinical setting, medical researchers need to catch the cases where the neural network struggles and retrain it to improve the application’s performance.

Dr. Kristy Brock, professor of Imaging Physics and Radiation Physics at MD Anderson, is working on an anomaly detection project to determine the cases where an AI model that contours liver tumors from CT scans fails — such as unusual images where a patient has a stent in the liver or fluid around the organ.

By identifying these rare failures, researchers can introduce additional training examples that are similar to cases the neural network previously stumbled on. This continuous training method selectively bolsters training data to improve model performance more efficiently.

“We don’t want to keep collecting data that looks the same as our first 150 scans,” Brock said. “We want to identify cases that will increase the variability of our sample dataset, which in turn boosts the model’s accuracy and generalizability.”

MD Anderson is one of several leading healthcare institutions adopting AI to improve medical research and patient care. Learn more about AI in healthcare at NVIDIA GTC, running online through Nov. 11.

Tune in to a healthcare special address by Kimberly Powell, NVIDIA’s VP of healthcare, on Nov. 9 at 10:30am Pacific. Watch NVIDIA founder and CEO Jensen Huang’s GTC keynote address below. Subscribe to NVIDIA healthcare news here.

At Cannes Lions, NVIDIA Partners Reshape Advertising and Marketing With AI

June 18, 2026 by Jamie Allan

0 Comments

The digital era gave the advertising and marketing industry speed; the AI era is giving it autonomous operations.

For companies building next-generation technologies for advertising and marketing, the question is no longer whether to adopt AI but whether their infrastructure can support it at the speed and scale the industry demands.

At Cannes Lions, running June 22-26 in France, industry leaders including Alembic, Amazon Web Services (AWS), Criteo, Higgsfield, KERV.ai and Taboola are showcasing how NVIDIA technologies help unlock greater creativity and enable faster, autonomous operations at enterprise scale.

Decision Intelligence at Enterprise Scale

Causal AI platform Alembic helps solve one of enterprises’ biggest challenges: proving what marketing initiatives actually drive growth, not just reporting on what happened. Modeling true causation simultaneously across every channel, market and audience requires AI infrastructure that can process enormous, fast-changing datasets without reducing them to correlation-based assumptions.

NVIDIA DGX Vera Rubin NVL72 systems enable Alembic to scale its Causal AI models to analyze more variables, run larger simulations and quantify the true drivers of growth across marketing investments. Alembic will be the first Causal AI company to use NVIDIA DGX Vera Rubin SuperPODs for enterprise-scale causal modeling, giving executives a single source of unbiased truth on what drove business outcomes and where capital is being wasted, so they can act with confidence on future decisions.

Alembic’s inference runs on private supercomputing infrastructure inside Equinix data centers where the enterprise data already lives, keeping AI workloads local. World Wide Technology extends this to secure and regulated environments. Together, the companies offer a complete enterprise AI stack purpose-built for executives and data leaders accountable for capital decisions.

Smarter Bidding at Auction Speed

For advertisers, serving ads and relevant recommendations across billions of daily transactions requires AI that’s accurate, fast and affordable enough to run at scale.

Amazon Web Services (AWS) is bringing cloud infrastructure, foundation models and NVIDIA GPU-accelerated computing together into a cohesive stack for the adtech industry that can scale for the era of AI agents. AWS is giving advertisers and demand-side platforms, supply-side platforms and independent software vendors a production-ready reference implementation to run AI-powered bidding directly inside auctions — powered by NVIDIA Triton Inference Server, which delivers deep learning inference fast enough to fit within real-time auction windows.

That means adtech companies can move from rules-based decisioning to AI-powered models for bid price optimization, audience activation and deal scoring directly within the live auction pipeline.

Advertising company Criteo helps retailers show the right product to the right shopper at the right moment, across one of the largest recommendation networks in digital advertising. Keeping those recommendations relevant means continuously retraining its AI on billions of shopper timelines, a process where speed directly translates to quality.

Collaborating with NVIDIA, Criteo achieved a roughly 2x speedup in model training on NVIDIA Blackwell GPUs, driven by the NVIDIA cuEmbed open library. That efficiency already frees roughly 17,000 GPU hours a year, and the companies are now scaling the work further.

Taboola is applying the same infrastructure logic to conversational AI, using NVIDIA GPUs to power DeeperDive, its AI answer engine, and extending that infrastructure to AI platforms and chatbots so they can generate revenue from advertising.

Agentic AI Across the Marketing Workflow

In marketing and other industries, AI agents are increasingly acting as digital coworkers, taking on long-running tasks across planning, execution and optimization. But these agents are only deployable for enterprises when they come with proper controls, including safety guardrails, auditability and role-based permissioning.

The NVIDIA Agent Toolkit, which includes NVIDIA NemoClaw blueprints and the NVIDIA OpenShell secure runtime, provides these controls.

For example, Higgsfield AI, an AI video and image generator production platform, offers Higgsfield Supercomputer agents that manage the full marketing automation lifecycle: from campaign ideation, planning, creative production to posting and autonomous campaign optimization — in a single interface. It orchestrates leading large language models alongside 35+ image, audio and video models, including Higgsfield’s proprietary Soul and Soul 2.0 models built on NVIDIA Blackwell architecture.

As part of the collaboration, NVIDIA Agent Toolkit software, including NVIDIA Nemotron open models, powers specialized subagents within the Higgsfield Supercomputer, running continuously inside every campaign. NemoClaw and OpenShell are being integrated to provide the enterprise trust layer.

The result: the full marketing lifecycle, from ideation and creative production through posting, performance analysis and optimization, is available in a single interface. Marketing campaigns for nearly 400 of the Fortune 500 companies are created on the platform.

Video courtesy of Higgsfield.

Contextual and Content Intelligence at Scale

AI understanding content at the level of meaning requires advanced infrastructure. NVIDIA’s multimodal stack provides the vector search, data processing and video understanding capabilities that make this kind of intelligence viable at production scale.

AI-powered media leader KERV’s Moment Match Engine evaluates a multitude of signals across every video frame and media asset to understand individual scenes, objects and products, providing content recommendations based on ad creative — the visual and textual elements of an advertisement — to drive improved engagement.

Video courtesy of KERV.ai.

KERV.ai recently optimized its processing pipeline, achieving over 10x improvements in speed and efficiency when using the NVIDIA Nemotron 3 Nano Omni open model in the platform. KERV’s solution analyzes what each ad or media brief contains, who it resonates with and which exact moment within content environments to target.

On MediaPerf, an open benchmark for AI video understanding, Nemotron 3 Nano Omni — adopted by ecosystem partners including PYLER, which uses NVIDIA DGX B200 systems — delivered the highest throughput and lowest inference cost of any model evaluated, open or closed source.

Learn more about how NVIDIA powers advertising and marketing technologies.

Featured video courtesy of Higgsfield.

Coherent Breaks Ground on Expanded Texas Facility, Scaling AI’s Optical Backbone

Coherent’s expansion at its Sherman, Texas, campus scales what it calls the world’s first volume production 6-inch indium phosphide fab, a key supplier across NVIDIA’s AI stack.

June 16, 2026 by NVIDIA Newsroom

0 Comments

AI runs at the speed of light. More and more, that light is made in Texas.

Coherent broke ground today on an expanded manufacturing building in Sherman, Texas.

The company makes the lasers, optical components and compound semiconductors that wire AI systems together — and runs what it calls the world’s first 6-inch indium phosphide fab.

NVIDIA founder and CEO Jensen Huang and Coherent CEO Jim Anderson were on hand for the ceremony, joined by Sherman Mayor Shawn Temann and Adriana Cruz, executive director of Texas Economic Development and Tourism, who delivered remarks.

The expanded building will scale production of the same InP wafers that carry data between chips, servers and data centers at the speed of light — the optical backbone of modern AI infrastructure.

It’s the kind of milestone that turns a commitment into construction: a concrete step in expanding advanced semiconductor manufacturing in the United States.

“AI is the ultimate general-purpose technology,” Huang said during a conversation with Anderson at the groundbreaking. “Because intelligence is fundamental — the ability to process information, to reason and solve problems — it affects every single industry.”

Public programs like the CHIPS Act, funded at roughly $50 billion, were designed to bring chip manufacturing back to the U.S.

As part of today’s event, Coherent is announcing a $50 million CHIPS Act grant to help finance the expanded Sherman facility — building on roughly $17 million in earlier support from the Texas CHIPS program and the Sherman Economic Development Corporation.

NVIDIA’s own commitment to produce up to $500 billion of AI infrastructure in the U.S. through industry partnerships with new sites in Arizona and Texas adds private-sector momentum.

“Coherent is a world-class company, and the work you do is vital to our future, vital to the future of artificial intelligence and vital to reindustrializing the United States,” Huang said.

NVIDIA founder and CEO Jensen Huang and Coherent CEO Jim Anderson.

Compound semiconductors like indium phosphide and gallium arsenide — the materials behind the high-speed networking and optical interconnects that modern AI runs on — don’t get the headlines that logic chips do. But their domestic supply chains have been thin for years. Today’s event was an argument that the gap is closing.

When 576 GPUs span eight racks and operate as a single system — as they will in NVIDIA Vera Rubin Ultra NVL576, which links eight NVLink racks of 72 NVIDIA Rubin Ultra GPUs into one 576-GPU domain — copper can’t carry the signal across that distance.

To connect hundreds of thousands of processors separated by hundreds or thousands of feet across a data center, the only way to solve that problem is silicon photonics, Huang explained.

As signaling rates climb, the reach of a metal trace shrinks, and spanning eight racks in copper would burn power on retimers and signal conditioning that a data center would rather spend on compute.

Optics pays a one-time penalty to move from electrical to light, but once paid, distance is nearly free. At NVL576 scale, light is the most power-efficient option.

NVIDIA and Coherent aren’t new to each other — they’ve worked together for roughly two decades.

In March, they deepened the relationship into a multiyear strategic partnership: NVIDIA is investing $2 billion in Coherent to support R&D, future capacity and U.S.-based manufacturing, alongside a multibillion-dollar purchase commitment for advanced laser and optical networking products.

Sherman, a city of roughly 45,000 people an hour north of Dallas, has become the latest dateline for the AI era — emblematic of a boom built as much on picks, shovels and manufacturing muscle as on software.

“When we get to full capacity, this site will support more than 550 direct jobs — and thousands of jobs, direct and indirect,” Anderson said.

What the factory ships isn’t a single product dropped into a single slot. It’s the lasers, transceivers and pluggable optical modules that move data across NVIDIA networking — each enabling a different part of the system.

“As AI systems grow larger and more powerful, connectivity is just as important as compute,” Anderson said. “AI runs on compute, but it scales on connectivity — and Sherman is where that connective tissue gets built.”

Today’s event made that visible.

Before the groundbreaking, guests toured the existing fab and previewed the equipment that will populate the expanded building once it’s running. An NVIDIA rack stood on the factory floor, one of the six stops on the tour.

The tour was followed by a fireside chat with Huang and Anderson, where the two CEOs discussed the partnership and what scaling domestic optical manufacturing means for the AI buildout ahead.

“Today marks an important milestone — not just for Coherent, but for American manufacturing and for the future of AI infrastructure,” Anderson said.

The semiconductor laser was born in U.S. labs — Bell Labs demonstrated a room-temperature version in 1970 — before the technology and its manufacturing largely migrated overseas.

“We were founded as a manufacturing company in 1971. We’ve always been a U.S. manufacturing company — and after 50 years, the most advanced 6-inch indium phosphide line in the world is right here in Sherman,” Anderson said.

That manufacturing gap shows up in the wafers themselves: while silicon fabs run on 12-inch wafers, most of the world’s InP production is still stuck on 3- and 4-inch wafers — lower yields and far fewer components per run.

Moving to 6-inch wafers roughly quadruples the usable area of a 3-inch wafer (area scales with the square of the diameter), driving down cost and unlocking the volume the AI buildout demands.

It took 50 years to build the first line, Huang said — and in one year, they’ve quadrupled it, a measure of the demand for accelerated computing.

Inside, the core processes are familiar: lithography, photoresist, depositing and etching materials, layer by layer. The difference is the material. On an InP substrate, engineers grow exotic compound-semiconductor layers and tune them for precise optical properties — the physics that lets a chip emit and modulate light.

Today, that InP travels inside Coherent’s pluggable optics — transceivers about the size of a USB stick that plug into the front of NVIDIA networking switches and move data between racks across the data center floor, where copper can’t reach. Each module carries an indium phosphide laser.

Those same modules now help enable NVIDIA Spectrum-X Photonics and Quantum-X Photonics switches with co-packaged optics: Coherent supplies the external laser module that plugs into the switch’s front plate.

And as NVIDIA works to keep optics from becoming the next bottleneck, demand for those lasers only climbs.

“Ten years from now, I think we’ll look back and realize AI is what made it possible to invest in sustainable energy, upgrade our energy grid and reconstitute a workforce,” Huang said. “You can’t have only information workers in an economy — you also have to have builders. We have an opportunity over the next 10 years to reshape our communities and be much more balanced.”

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

The new DiffusionGemma open model generates text in parallel — not one token at a time — and is optimized to run on the NVIDIA RTX PRO platform, NVIDIA DGX Spark systems and GeForce RTX GPUs.

June 10, 2026 by Michael Fukuyama

0 Comments

Today, Google DeepMind released DiffusionGemma — an experimental open model built for exceptionally fast text generation. NVIDIA has optimized DiffusionGemma to run even faster across NVIDIA GeForce RTX GPUs, the NVIDIA RTX PRO platform and NVIDIA DGX Spark systems, from local PCs to the cloud.

Rather than generating text one word at a time, DiffusionGemma generates multiple words in parallel to output whole blocks of text, opening a new, low-latency frontier for the kind of single-user workloads that developers, researchers and AI enthusiasts run every day.

Features of the new model include:

Parallel generation: DiffusionGemma denoises up to 256 tokens per step instead of predicting one at a time.
Built on Gemma 4: DiffusionGemma is built on Gemma 4, a 26-billion-parameter mixture-of-experts model that activates just 3.8 billion parameters per step, pairing a diffusion head with Google’s Gemma 4 architecture.
Up to 4x faster performance: The boost means fast text generation, where single-user generation usually stalls — on local hardware.
Open and local: DiffusionGemma is open weights under a permissive Apache 2.0 license and runs entirely on RTX and DGX Spark — no cloud, no per-token cost — with day-zero support in Hugging Face Transformers, vLLM and Unsloth.

A Different Way to Generate Text

Almost every large language model (LLM) in wide use today is autoregressive — meaning it generates text one token at a time, with each new word depending on the one before it. That sequential process is what makes interactive AI feel like it’s typing.

DiffusionGemma takes a different path. Built on the Gemma 4 26B mixture-of-experts architecture, it generates text the way diffusion models generate images: by starting from noise and refining a whole block of text at once. Each step denoises up to 256 tokens in parallel rather than emitting a single token and waiting to compute the next.

The result is a model that thinks in blocks instead of sequentially. For latency-sensitive, single-user work — such as interactive chat, agentic loops or on-device assistants that plan and act — that parallelism translates into responses fast enough to keep pace with how developers think and iterate.

DiffusionGemma Flies on NVIDIA GPUs

Generating one token at a time is fundamentally a memory-bound problem — a traditional LLM spends most of its time waiting on memory bandwidth, not doing math, which leaves a lot of compute on the table.

Diffusion flips the equation. Pulling a full 256-token block through the transformer in parallel is a compute-bound workload — exactly what NVIDIA GPUs are built for. NVIDIA Tensor Cores accelerate the dense parallel math, and the CUDA software stack lets the model run efficiently from day one without bespoke tuning. In short, the model’s design plays directly to the GPU’‘s strengths.

That shows up in the numbers. DiffusionGemma delivers 1,000 tokens/sec on a single NVIDIA H100 Tensor Core GPU, 150 tokens/sec on NVIDIA DGX Spark and up to 2,000 tokens/sec on NVIDIA DGX Station — roughly 4x faster than an equivalent autoregressive model running in the same single-user regime.

That advantage holds across NVIDIA’s full lineup, running:

Locally on the NVIDIA DGX Spark deskside personal AI supercomputer — powered by the NVIDIA GB10 Grace Blackwell Superchip with 128GB of unified memory — with the preinstalled NVIDIA AI software stack ready for prototyping, fine-tuning and fully local agent workflows.
On NVIDIA RTX PRO 6000 workstations, providing developers, researchers and AI professionals with the headroom to run local low-latency generation and agentic loops as part of a professional workflow.
On DGX Station, delivering best-in-class, local high-speed inference with up to 2,000 tokens/sec for low-latency text generation and agentic loops with 748GB of coherent memory.
On GeForce RTX GPUs, with llama.cpp support coming soon.

Get Started Locally

The fastest way to start testing and prototyping the model is through Hugging Face Transformers, which runs DiffusionGemma on a GeForce RTX 5090 or DGX Spark out of the box. For higher-throughput inference, vLLM provides day-zero serving support.

For adapting the model to a specific task or domain, fine-tuning is available through Unsloth and NVIDIA NeMo framework, with ready-made DGX Spark playbooks to get a local environment running quickly. Check out the vLLM playbooks for DGX Spark , RTX PRO and DGX Station.

Try Diffusion Gemma on Hugging Face or test it for free using NVIDIA-hosted application programming interfaces at build.nvidia.com.

Go deeper on the architecture and local deployment by reading the NVIDIA technical blog and the Google DeepMind announcement.

#ICYMI: The Latest From RTX AI Garage

🎬 NVIDIA researchers released SANA-WM, an open source world model that turns a single image and a camera path into a minute-long, 720p video with precise 6-DoF control. At just 2.6 billion parameters, its distilled version generates a full 60-second clip in 34 seconds on a single NVIDIA GeForce RTX 5090 GPU using the NVFP4 format — delivering up to 36x higher throughput than comparable open models while running on one GPU. Read the paper.

🛠️ Building Windows agents just got a full toolset — NVIDIA and Microsoft rolled out turnkey agent sandboxing on native Windows — Microsoft eXecution Containers plus the NVIDIA OpenShell runtime — alongside up to 2x faster agentic inference and native Windows support for Hermes Agent.

🤖DGX Spark goes from unboxing to a running agent in minutes — A streamlined NVIDIA NemoClaw install gets developers to a working local agent fast, with Qwen3.6-35B running up to 2.6x faster on vLLM. And the new cluster assistant in NVIDIA Sync links up to four DGX Spark units into one 512GB pool — enough for ~400-billion-parameter models.

Plug in to RTX Spark on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX Spark newsletter.

See notice regarding software product information.

Month: November 2021

At Cannes Lions, NVIDIA Partners Reshape Advertising and Marketing With AI

Decision Intelligence at Enterprise Scale

Smarter Bidding at Auction Speed

Agentic AI Across the Marketing Workflow

Contextual and Content Intelligence at Scale

Coherent Breaks Ground on Expanded Texas Facility, Scaling AI’s Optical Backbone

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

A Different Way to Generate Text

DiffusionGemma Flies on NVIDIA GPUs

Get Started Locally

#ICYMI: The Latest From RTX AI Garage

Share on Mastodon

Applying AI to Diagnostic Imaging

Imaging Insights Inform Treatment Planning

Keeping a Watchful AI on Model Accuracy

Related News

Decision Intelligence at Enterprise Scale

Smarter Bidding at Auction Speed

Agentic AI Across the Marketing Workflow

Contextual and Content Intelligence at Scale

Related News

Related News

A Different Way to Generate Text

DiffusionGemma Flies on NVIDIA GPUs

Get Started Locally

#ICYMI: The Latest From RTX AI Garage

Related News