Scientists decode the human genome – this time with no gaps

Updated 1/4/2022 at 21:35

  • In 2001, it was announced that the human genome had been deciphered.
  • Two years later, scientists reported again that the human genome had now been fully sequenced.
  • About 20 years later, the news arrives for the third time: scientists have decoded the human genome.
  • Here is what is behind it.

About 20 years after the human genome was first deciphered, scientists are presenting a new and, this time, truly complete human genome sequence. The order of the more than three billion individual building blocks of the DNA molecule has now been determined in full, more than 100 scientists from international institutions report in six studies in the journal Science.

The new reference genome is expected not only to provide new and fundamental insights into human biology, but also to improve disease research and treatment. The additional genetic information can also help explore questions about human evolution and the spread of humans across the Earth.

In 2001, about eight percent of the genetic information was missing

In 2001, scientists from the publicly funded Human Genome Project (HGP) and Craig Venter, a private genetics researcher and then president of Celera Genomics, unveiled the first drafts of the human genome. These still had significant gaps. In 2003, the HGP scientists announced the end of sequencing.


But this version was not perfect either. About 8 percent of the genetic information was missing. “We gained a tremendous understanding of human biology and disease by knowing about 90 percent of the human genome, but many important aspects remained hidden from science because we didn’t have the technology to understand these (remaining) parts of the genome,” explains David Haussler of the University of California, Santa Cruz. “Now we can stand on top of the mountain and see the entire landscape below and get a complete picture of our human genetic heritage.”

The genetic information of all living things is encoded in the four basic building blocks of the DNA molecule, the so-called bases. The order of these individual building blocks – often called letters – determines, among other things, which proteins are formed in the body, and these in turn control virtually all life processes. In humans, all of this information is spread over 23 chromosomes, which are present in two copies in almost all cells.
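
How the order of letters determines a protein can be illustrated with a minimal sketch. It assumes a short, made-up DNA snippet and uses only a five-entry excerpt of the standard genetic code, in which every group of three letters (a codon) stands for one amino acid or for a stop signal. The names CODON_TABLE and translate are hypothetical and chosen for illustration only.

```python
# Small excerpt of the standard genetic code (DNA codon -> amino acid).
# Only the codons used in the example are listed; the full table has 64 entries.
CODON_TABLE = {
    "ATG": "Met",   # methionine, also the usual start signal
    "GCC": "Ala",   # alanine
    "AAA": "Lys",   # lysine
    "TGG": "Trp",   # tryptophan
    "TAA": "Stop",  # stop signal, ends the protein
}


def translate(dna: str) -> list[str]:
    """Read the sequence three letters at a time and look up each codon."""
    protein = []
    usable_length = len(dna) - len(dna) % 3  # ignore an incomplete final codon
    for i in range(0, usable_length, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon, "?")  # "?" = not in this excerpt
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


# Made-up example sequence, not taken from the human genome.
print(translate("ATGGCCAAATGGTAA"))  # ['Met', 'Ala', 'Lys', 'Trp']
```

Changing a single letter in the input changes the codon that is looked up, which is why the exact order of the letters matters.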

Sequencing machines can now read longer sections of the genome in one piece

To determine the order of the letters in a DNA molecule – in other words, to be able to read the genetic code – scientists first cut the DNA into small pieces. Sequencing machines then determine the order of the individual building blocks in each piece. The overlaps between the pieces allow them to be assembled back into the original order. However, difficulties arise in regions of the genome where certain letter sequences are repeated many times, the so-called repetitive sequences. They are found mainly at the ends of the chromosomes and in the centromere region – the region that divides each chromosome into a short and a long arm and that plays an important role in cell division.
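
The idea of reassembling pieces by their overlaps can be sketched in a few lines. The following is a deliberately simplified greedy approach on made-up pieces, not the method used by the researchers; real assembly software is far more elaborate. The function names overlap and greedy_assemble and the minimum overlap of three letters are assumptions made for this illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest end of `a` that matches the start of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two pieces that share the largest overlap."""
    reads = list(reads)
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return reads  # nothing left to merge
        # Join the two pieces, writing the shared letters only once.
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


# Three made-up pieces that overlap by three letters each.
pieces = ["ATGCGT", "CGTACC", "ACCTGA"]
print(greedy_assemble(pieces))  # ['ATGCGTACCTGA']
```

In this toy case every overlap is unique, so the pieces fit together in only one way. In repetitive regions many pieces share the same overlap, and a simple procedure like this one can no longer tell which piece belongs where.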

“The parts of the human genome that we haven’t been able to study in more than 20 years are important to our understanding of how the genome works, genetic diseases, and human diversity and evolution,” said Karen Miga of UC Santa Cruz.

For several years now, improved sequencing machines have made it possible to read longer segments of the genome in one piece. One method can read up to a million building blocks in one go with an acceptable error rate, while another manages 20,000 letters with almost no errors – a “game changer”, the researchers write. To simplify sequencing, the scientists also used special human cancer cells that contain only paternal DNA. Of the two sex chromosomes X and Y, only the X chromosome was analyzed.
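
Why longer reads help with repetitive regions can be shown with a toy example. The sketch below assumes an invented mini-genome with two identical repeat blocks and counts in how many places a read fits exactly. The sequence and the function name count_placements are hypothetical, and real reads are thousands of letters long rather than a handful.

```python
def count_placements(genome: str, read: str) -> int:
    """Count the positions at which the read matches the genome exactly."""
    last_start = len(genome) - len(read)
    return sum(genome.startswith(read, i) for i in range(last_start + 1))


# Invented toy genome: two identical repeat blocks between unique stretches.
repeat = "ACACACACAC"
genome = "GATTC" + repeat + "TGGCA" + repeat + "CTTAG"

short_read = "ACACAC"                # lies entirely inside a repeat block
long_read = "TTC" + repeat + "TGG"   # spans one whole block plus unique flanks

print(count_placements(genome, short_read))  # several possible positions
print(count_placements(genome, long_read))   # exactly one position
```

The short read fits in several places, so its true origin is ambiguous. The long read reaches into the unique sequence on both sides of the repeat and can therefore be placed in only one position.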

Errors in previous sequences have been corrected

As a result, scientists now have the complete letter sequence from end to end for each chromosome. The new sequence contains 200 million previously unknown bases, which include information on 99 genes that are each likely to code for a protein. Numerous errors in previous sequences were corrected, and the short arms of five chromosomes were fully decoded for the first time.

Despite its high quality, the reference genome now presented does not capture the full complexity of the genome found across the human population, writes the American researcher Deanna Church in a commentary in Science. Nevertheless, a correctly assembled human genome, including the repeat regions, makes it easier to analyze these biomedically important regions in other humans and in primates.

The human genome sequence now available in full will not be the last one scientists present to the public. For one thing, work is already underway on a genome that combines the genetic information of both mother and father. For another, the Human Pangenome Consortium wants to sequence the DNA of 350 people from different regions of the world to better understand the diversity of the human population. (ff / dpa)

