Stretch a strand of human DNA out to its full length and it’s two meters long. Yet all that material – and the information it carries – gets balled up inside the nucleus of a single human cell.
That’s more than just a remarkable feat of biological origami, explained Erez Lieberman-Aiden in a keynote speech at our GPU Technology Conference.
Lieberman-Aiden, a fellow at the Harvard Society of Fellows, said that the way this material folds up determines how the information carried on a strand of DNA is expressed. When things go right, it generates skin cells, blood cells, and brain cells; when they don’t, it can generate cancer cells.
It turns out DNA – the very stuff of life itself – is a lot like a block of ramen noodles, Lieberman-Aiden explained. Both uncooked ramen and the DNA inside our cells are folded up into a shape called a fractal globule – a discovery he and his researchers made with the help of NVIDIA’s GPUs.
It’s a shape with some remarkable properties, allowing the human body to make use of this very dense ball of instructions.
“This is a ball of yarn with no knots…,” he said. “So, if there is a gene in there that you want to transcribe, you can pull it out, and when you’re done you can put it back. The genome’s architecture seems optimized to itself be a parallel processor.”
Kind of like a GPU.
That wild cross-disciplinary insight has shaken up the field of genomics – and it’s the kind of thinking that Lieberman-Aiden is known for. A polymath who has made a career out of mixing different disciplines to come up with new insights, Lieberman-Aiden studied math, philosophy and physics as an undergraduate at Princeton; history at Yeshiva University for a master’s degree; and math and bioengineering at Harvard and MIT for his PhDs.
Set to join the Baylor College of Medicine, Lieberman-Aiden has used mathematics and vast quantities of data to discover how verbs regularize over time. He also launched the field of ‘culturomics’ – which applies quantitative techniques to tease trends out of the massive amount of text that has been digitized by Google.
That data crunching helped Lieberman-Aiden and his team figure out the shape DNA takes inside the human nucleus. Lieberman-Aiden compares his approach to the way one can tease out relationships among people by looking at photographs on Facebook.
People who appear in the same photographs more often likely live near one another, for example. And by looking at who appears in the same photographs together – and who does not – one can get insights into relationships among family and coworkers.
Lieberman-Aiden and his team use a similar technique to tease out relationships between chunks of DNA that may seem far apart, when measured along the length of a strand of DNA, but may actually be quite close to one another when bunched up inside a nucleus.
They chop up the human genome into chunks and then freeze the relationships between those chunks to generate hundreds of millions or even billions of snapshots.
Looking at which chunks of DNA appear near other chunks of DNA is what allowed Lieberman-Aiden to identify the unique shape – a fractal globule – that DNA takes within the nucleus of a cell.
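To make the snapshot idea concrete, here is a minimal Python sketch of how co-occurrence counts could be tallied. It is an illustration only, not the team's actual pipeline: the function name `contact_matrix`, the chunk size, and the toy positions are all assumptions made up for this example.

```python
def contact_matrix(pairs, genome_length, bin_size):
    """Count how often two chunks (bins) of the genome share a snapshot.

    pairs: iterable of (position_a, position_b) tuples, one per snapshot.
    All names and sizes here are hypothetical, chosen for illustration.
    """
    n_bins = -(-genome_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


# Toy "genome" of 100 letters cut into chunks of 10 letters each.
snapshots = [(5, 95), (8, 92), (12, 18), (40, 45), (3, 97)]
m = contact_matrix(snapshots, genome_length=100, bin_size=10)

# Chunks 0 and 9 sit at opposite ends of the strand, yet they keep
# turning up in the same snapshots, hinting that they touch when folded.
print(m[0][9])
```

In this toy data the first and last chunks are far apart along the strand but show up together in three of the five snapshots, which is the kind of signal that hints at how the strand is folded.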
The challenge: the human genome – when expressed with the letters C-G-A-T – is over three billion ‘letters’ long. And Lieberman-Aiden’s technique relies on looking at the hundreds of millions, even billions of “snapshots” generated by modern DNA sequencing techniques.
Crunching those numbers wouldn’t be practical without GPUs: the matrices used to examine the relationships are simply too big.
“This is a massive computation… it’s completely unsolvable by a CPU in a reasonable amount of time,” Lieberman-Aiden said. Not so when using GPUs, however. “On an NVIDIA Tesla GPU it takes about five minutes.”
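A rough back-of-the-envelope calculation shows how quickly such a matrix grows. The talk did not state the chunk sizes used, so the sizes below, and the assumption of a dense matrix with four bytes per entry, are hypothetical choices for illustration only.

```python
GENOME_LETTERS = 3_000_000_000  # roughly three billion letters, per the talk


def dense_matrix_bytes(bin_size, bytes_per_entry=4):
    """Memory for a dense chunk-by-chunk matrix (assumed layout, not the team's)."""
    n_bins = -(-GENOME_LETTERS // bin_size)  # ceiling division
    return n_bins * n_bins * bytes_per_entry


# Hypothetical chunk sizes, from coarse to fine.
for bin_size in (1_000_000, 100_000, 1_000):
    gigabytes = dense_matrix_bytes(bin_size) / 1e9
    print(f"chunk size {bin_size:>9,} letters: about {gigabytes:,.3f} GB")
```

Under these assumptions, cutting the chunk size by a factor of ten multiplies the matrix size by a hundred, which is why finer views of the genome get expensive so fast.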
Funny, that’s just about the amount of time it takes to cook a nice bowl of ramen.