Statistical foundation for Next Generation Sequencing built

Posted: 24 July 2018

A statistical foundation for Next Generation Sequencing has been built from samples of 1,036 individuals…


Scientists at the National Institute of Standards and Technology (NIST) have established a statistical foundation for using Next Generation Sequencing (NGS) in DNA profiling.

Dr Katherine Gettings, the NIST biologist who led the study, said: “The data we’ve published will make it possible for labs that use NGS to generate…statistics.”

To create a DNA profile, analysts examine sections of DNA where the genetic code repeats itself. These sections are genetic markers called short tandem repeats (STRs). The number of repeats varies from person to person, so it can be used to identify people. This approach dates from the 1990s; NGS, a later development, makes sequencing cost-effective for biomedical research.

NGS can also be used to create DNA profiles that include the specific genetic sequences inside the markers. This approach is useful if only a minute amount of DNA has been collected, or if the DNA has begun to break down, so that only a partial profile can be recovered.

DNA analysts can calculate match statistics for STR-based profiles because scientists have previously measured how frequently different versions of the markers occur in the population. With these frequencies, the chance of randomly encountering a given DNA profile can be calculated.
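As a rough illustration of how population frequencies turn into a match statistic, the sketch below multiplies per-marker genotype frequencies together. The article does not say which statistical model the labs use; this example assumes Hardy-Weinberg proportions within a marker and independence between markers (the so-called product rule). The marker names, allele labels and frequency values are hypothetical and are not taken from the NIST data.

```python
from math import prod

# Hypothetical allele frequencies per marker (illustrative values, not NIST data).
ALLELE_FREQS = {
    "marker_A": {"12": 0.20, "13": 0.30, "14": 0.50},
    "marker_B": {"8": 0.10, "9": 0.40, "10": 0.50},
}


def genotype_frequency(freqs, allele_1, allele_2):
    """Expected genotype frequency, assuming Hardy-Weinberg proportions."""
    p, q = freqs[allele_1], freqs[allele_2]
    # Two identical alleles: p^2. Two different alleles: 2pq.
    return p * p if allele_1 == allele_2 else 2 * p * q


def random_match_probability(profile, allele_freqs=ALLELE_FREQS):
    """Multiply genotype frequencies across markers (assumes independent markers)."""
    return prod(
        genotype_frequency(allele_freqs[marker], a1, a2)
        for marker, (a1, a2) in profile.items()
    )


# A hypothetical two-marker profile: two alleles per marker.
profile = {"marker_A": ("12", "13"), "marker_B": ("10", "10")}
rmp = random_match_probability(profile)
print(f"Random match probability: {rmp:.4f}")
```

With these made-up numbers the result is about 0.03, or roughly 1 in 33. Each additional marker multiplies in another factor below 1, which is why profiles built from many markers become very rare.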

These frequencies were measured by NIST using a library of DNA samples from 1,036 individuals. To calculate the gene frequencies, Dr Gettings and her team used the original anonymised samples, donated by people who consented to their DNA being used in research. The researchers generated NGS-based profiles for the samples by sequencing 27 markers and calculating the frequencies of the various genetic sequences found at each marker.
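The counting step can be sketched in a few lines. The example below tallies how often each sequence appears at a single marker, and compares that with a tally based only on the number of repeats. It is a minimal sketch rather than the NIST team's pipeline: the sequences are invented, a four-letter repeat unit is assumed, and the function names are hypothetical.

```python
from collections import Counter

REPEAT_UNIT_LENGTH = 4  # assumed repeat unit size for this made-up marker

# Hypothetical observations at one marker: each sample contributes two sequences.
# These sequences are invented for illustration, not taken from the NIST data.
samples = [
    ("TCTA" * 10, "TCTA" * 11),
    ("TCTA" * 10, "TCTA" * 5 + "TCTG" + "TCTA" * 5),
    ("TCTA" * 11, "TCTA" * 11),
]


def sequence_frequencies(samples):
    """Frequency of each distinct sequence (the extra detail NGS provides)."""
    counts = Counter(seq for pair in samples for seq in pair)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def length_frequencies(samples):
    """Frequency of each repeat count, ignoring the sequence itself."""
    counts = Counter(len(seq) // REPEAT_UNIT_LENGTH for pair in samples for seq in pair)
    total = sum(counts.values())
    return {repeats: n / total for repeats, n in counts.items()}


seq_freqs = sequence_frequencies(samples)
len_freqs = length_frequencies(samples)
print(len(len_freqs), "versions by repeat count;", len(seq_freqs), "versions by sequence")
```

In this toy data, two of the sequences have the same number of repeats but differ by one letter. Counting by repeat number alone treats them as the same version, while counting by sequence separates them, which is why sequence-based profiles need their own frequency tables.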

Since the team sequenced 27 markers from 1,036 samples, with each marker occurring twice per sample, the number of marker copies tested was more than 55,000 (27 × 1,036 × 2 = 55,944).

The data needed to generate match statistics for NGS-based profiles has been published; however, labs will need to develop further methods for managing the greater amounts of data generated by NGS.

Researcher Dr Peter Vallone said, “We’re laying the foundation for the future.”

The research was published in Forensic Science International: Genetics.
