The 13-year project to sequence the human genome, the “book of life”, was proclaimed “complete” in April 2003 amid high expectations. The $3bn (£2.5bn) Human Genome Project was supposed to cure chronic diseases and reveal our genetic makeup.
Even as press conferences celebrated this new scientific insight, the instruction manual for human life had already delivered a surprise.
At the time, it was thought that most of the human genome would consist of instructions for producing proteins, the building blocks of all living things, which carry out a wide range of functions inside and between cells. With more than 200 cell types in the body, it seemed obvious that each would require its own genes to operate. Distinct sets of proteins were believed to be vital to the evolution of our species and our cognitive abilities. After all, we are the only species that can sequence its own genome.
Instead, just 2% of the three billion letters of the human genome are protein-related. Our DNA contains only about 20,000 protein-coding genes. To the surprise of geneticists, humans share many of their protein-making genes with some of the simplest animals. The scientific community was confronted with an unsettling possibility: was much of our knowledge of human nature wrong?
“I just remember the incredible shock,” says Samir Ounzain, a molecular scientist and CEO of Haya Therapeutics, which is using genetics to treat cardiovascular disease, cancer, and other chronic illnesses. “That was the moment where people started wondering, ‘maybe we have the wrong conceptualisation of biology?’”
The remaining 98% of our DNA, a seemingly meaningless mess of letters, became known as dark matter, or the dark genome. Some geneticists thought the dark genome was junk DNA, the remains of broken genes that were no longer useful.
For others, the dark genome was always essential to understanding humankind. “Evolution has absolutely no tolerance for junk,” says Kári Stefánsson, CEO of Icelandic company deCODE genetics, which has sequenced more whole genomes than any other institution. “Maintaining genome size must be evolutionary.”
Two decades later, scientists are beginning to understand the dark genome. One of its main roles appears to be regulating the expression of protein-making genes. It helps control how our genes respond to nutrition, stress, pollution, exercise, and sleep, a field known as epigenetics.
Proteins constitute life’s hardware, whereas the dark genome processes and responds to environmental information, explains Ounzain. Thus, understanding the dark genome helps us comprehend human complexity and how we become who we are.
“If you think of us as a species, we’re master adapters to the environment at every level,” adds Ounzain. That adaptation, he says, is information processing, and the answer to what distinguishes humans from flies and worms increasingly appears to lie in the dark genome.
Transposons in evolution
In the mid-2000s, scientists found that the non-protein-coding portions of the human genome were littered with repetitive DNA sequences known as transposons, which made the genome difficult to study. Such repetitive sequences make up roughly half of the genome in humans and other mammals.
“Even assembling the first human genome was made more problematic by the presence of these repetitive sequences,” says Jef Boeke, director of the Dark Matter Project at New York University Langone, an academic medical center in New York City. “A unique sequence makes analyzing any sequence easier.”
At first, geneticists neglected transposons. Most genetic research concentrates on the exome, the genome’s tiny protein-coding section. Over the past decade, however, more advanced DNA sequencing tools have allowed geneticists to explore the dark genome in increasing depth. In one experiment, researchers deleted a transposon fragment in mice, and half of the pups died before birth. This suggests that some transposon sequences may be essential to our existence.
According to Boeke, transposons may be ancient, stretching back to the earliest life forms. Other scientists believe they came from viruses that invaded our DNA over time and were gradually repurposed by the body.
“Most of the time, transposons are pathogens that infect us, and they can infect cells in the germline, the type of cells we pass on to the next generation,” explains Dirk Hockemeyer, assistant professor of cell biology at University of California, Berkeley. “They can be inherited and stabilized into the genome.”
Boeke calls the dark genome a living relic of significant DNA changes from ancient times. Transposons can move from one area of the genome to another, creating or reversing mutations in genes, sometimes with dramatic effects.
A transposon moving into a gene may have caused the great ape family to lose their tails, allowing our species to walk upright. “Here you have this one-time thing that happened which had a huge effect on evolution, giving rise to a whole lineage of great apes including us,” explains Boeke.
Studying the dark genome is helping us understand evolution, and it may also help explain why diseases emerge. According to Ounzain, most genetic variants linked to chronic diseases like Alzheimer’s, diabetes, and heart disease are in the dark genome, not the protein-coding regions.
Dark genomes and illness
Panay, a Philippine island known for its white beaches and steady stream of holidaymakers, hides a tragic secret. X-linked dystonia parkinsonism (XDP), a movement disorder, is more common on the island than anywhere else. Like Parkinson’s disease, XDP affects people’s ability to move and react rapidly.
Since it was first described in the 1970s, XDP has been diagnosed only in people of Filipino ancestry. This was a mystery until geneticists discovered that they all share the same variant of the TAF1 gene. A transposon in the gene appears to cause symptoms by regulating the gene’s activity in a way that harms the body over time. The variant may have originated 2,000 years ago and then spread through the population.
“The TAF1 gene is an essential gene, meaning it’s required for the growth and multiplication of all cell types,” adds Boeke. “When you tweak its expression, you get a very specific defect that manifests as this horrible form of Parkinsonism.”
This is a basic illustration of how dark genome DNA sequences may activate or repress genes in response to environmental stimuli.