First report from the world’s largest whole genome sequencing effort

Scientists have published a report on the whole genome sequences of 150 thousand participants in the UK Biobank.

Researchers from deCODE genetics in Iceland have published a report in Nature on the whole genome sequences of 150 thousand participants in the UK Biobank. This is the first report from the largest whole genome sequencing effort to date, in which scientists are set to sequence 500 thousand whole genomes in three years.

The scientists found 600 million single-nucleotide polymorphisms (SNPs) and indels in these 150 thousand genomes, corresponding to seven percent of those that can theoretically occur in the human genome. It is, however, likely that some of the theoretically possible variants are incompatible with life.

This large dataset allowed the scientists to separate regions that tolerate large diversity in sequence from those that do not. The assumption is that regions that are intolerant to sequence diversity are important to human survival and procreation. It has long been thought that coding exons are the regions most important to human survival. However, when the one percent of the genome with the best-conserved sequences is examined, only 13 percent of it consists of coding exons.

“Data of this type and quantity are going to revolutionise our ability to identify and characterise intergenic sequences of importance to human diversity, be it to risk of disease and response to treatment or some other attributes,” said Kari Stefansson, one of the authors of the paper.

The researchers also report on the association of variants that were not identified through whole exome sequencing with diseases and other phenotypes.

The scientists determined that 85 percent of the participants could trace most of their ancestry to the British Isles. The scientists also found a large group of participants who can trace their ancestry mostly to Africa and South Asia. This study is likely to represent the largest set of whole-genome-sequenced individuals of African and South Asian origin. However, the imbalance in the ethnic mix of those contributing sequences to this study, as well as to other studies already published, is unfortunate from both a societal and a scientific point of view.

Therefore, the scientists are determined to work towards more ethnically balanced sequencing cohorts in the future.
