First full sequence of a human genome completed
After two decades, researchers have generated the first complete, gapless sequence of a human genome
Scientists have generated the first complete, gapless human genome, two decades after the Human Genome Project first produced a draft human genome sequence. The research was completed by the Telomere to Telomere (T2T) consortium, whose leadership included researchers at the US National Human Genome Research Institute (NHGRI), part of the US National Institutes of Health; the University of California, Santa Cruz; and the University of Washington, Seattle. Six papers encompassing the completed sequence appear in Science, along with companion papers in several other journals.
Having a complete, gap-free sequence of the roughly three billion bases of human DNA is critical for understanding the full spectrum of human genomic variation and for understanding the genetic contributions to certain diseases. Analyses of the complete genome sequence will significantly add to the knowledge of chromosomes, including more accurate maps for five chromosome arms, which opens new lines of research. This helps answer basic biology questions about how chromosomes properly segregate and divide. The T2T consortium used the now-complete genome sequence as a reference to discover more than two million additional variants in the human genome. These studies provide more accurate information about the genomic variants within 622 medically relevant genes.
“Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint,” said Dr Eric Green, director of NHGRI.
According to the researchers, over the past decade, two new DNA sequencing technologies emerged that produced much longer sequence reads. The Oxford Nanopore DNA sequencing method can read up to one million DNA letters in a single read with modest accuracy, while the PacBio HiFi DNA sequencing method can read about 20,000 letters with nearly perfect accuracy. Researchers in the T2T consortium used both DNA sequencing methods to generate the complete human genome sequence.
The full sequence builds upon the work of the Human Genome Project, which mapped about 92 percent of the genome, and on research undertaken since then. Thousands of researchers have developed better laboratory tools, computational methods and strategic approaches to decipher the complex sequence.
That last eight percent includes numerous genes and repetitive DNA and is comparable in size to an entire chromosome. Researchers generated the complete genome sequence using a special cell line that has two identical copies of each chromosome, unlike most human cells, which carry two slightly different copies. The researchers noted that most of the newly added DNA sequences were near the repetitive telomeres and centromeres.
The cost of sequencing a human genome using “short-read” technologies, which provide several hundred bases of DNA sequence at a time, is only a few hundred dollars, having fallen significantly since the end of the Human Genome Project. However, using these short-read methods alone still leaves some gaps in assembled genome sequences. The massive drop in DNA sequencing costs has come alongside increased investment in new DNA sequencing technologies that generate longer DNA sequence reads without compromising accuracy.
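To give a rough sense of scale for the read lengths quoted above, the short Python sketch below estimates how many reads would be needed to span a genome of about three billion bases once, end to end. It is an idealized lower bound, not a description of the consortium's pipeline: real sequencing relies on many overlapping reads, and the figure of 300 bases is an assumed stand-in for "several hundred bases".

```python
GENOME_BASES = 3_000_000_000  # "roughly three billion bases", as stated in the article

# Read lengths in bases. The two long-read figures are the ones quoted in the
# article; 300 is an assumption standing in for "several hundred bases".
READ_LENGTHS = {
    "short-read (assumed 300 bases)": 300,
    "PacBio HiFi (about 20,000 bases)": 20_000,
    "Oxford Nanopore (up to 1,000,000 bases)": 1_000_000,
}


def reads_for_one_pass(genome_bases: int, read_length: int) -> int:
    """Minimum number of non-overlapping reads needed to span the genome once."""
    # Integer ceiling division, so a partial final read still counts as one read.
    return (genome_bases + read_length - 1) // read_length


for name, length in READ_LENGTHS.items():
    print(f"{name}: {reads_for_one_pass(GENOME_BASES, length):,} reads")
```

Under these assumptions, the short-read case needs thousands of times more pieces than the long-read cases. Fewer, longer pieces are easier to place unambiguously, which is one reason long reads help close gaps in highly repetitive regions.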
“This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which in turn will empower genetic studies of human disease,” concluded Green.