NGS: enabling biomarker discovery at single-cell resolution

Investigations using Next-Generation Sequencing (NGS), once the domain of well-funded labs, have now become mainstream in the hunt for biomarkers for various disorders – ranging from breast and colon cancers to cardiomyopathies, diabetes, congenital cataracts, liver diseases and mitochondrial disorders. NHS England (NHSE) has recently announced the use of NGS-based diagnostic tests for a limited number of disorders, including certain types of cancers, starting from October 2018.1


These tests will be available in England; however, as we adopt NGS techniques at the frontline of providing care for patients, researchers like myself must look toward future applications of current technologies to precision medicine. In 2010, I wrote a review article on NGS for RNA wherein I proposed the need for single cell NGS to support the transition into personalised medicine.2 Today, we see single cell NGS beginning to gather pace in clinically relevant investigations3–7 and new long read NGS platforms are being used to improve diagnostic accuracy.7–10

NGS plays a crucial role in identifying differences between samples from disease and normal subjects at the level of DNA, RNA and epigenome. The aim of such pre-clinical investigations is not only to identify disease signatures to devise tests for diagnosis and monitor prognosis, but also to uncover actionable targets to stabilise or reverse molecular changes causing or resulting from disease.

As we move towards personalising treatment for individual patients, it has become clear that single cell NGS and long read sequencing can help us identify new disease biomarkers and change how we treat patients.12,13

Shortcomings in current NGS datasets, caused by sequencer and biological limitations

NGS investigations, conducted on extracted DNA or RNA, require extensive sample processing to meet the requirements of current second-generation sequencing instruments. Because these sequencers read only short fragments, long molecules of high-quality DNA or RNA must first be fragmented via sonication, restriction digestion, tagmentation or chemical treatment. The resulting short reads limit the detection of structural variants (inversions, translocations, large deletions) and hamper distinction between pseudogenes and genes.

For RNA sequencing, we reverse-transcribe RNA into DNA because current NGS platforms are unable to directly read RNA. This erases information on RNA modifications; thus, despite deep RNA sequencing, we remain unaware of disease-related RNA modifications.

Library preparation for the investigation of methylation signatures is more complex. Fragmented DNA is chemically modified through bisulfite conversion, where methylated cytosines are protected while unmethylated cytosines are converted to uracils through deamination. Sequencers read uracils as thymines, and this conversion results in low-diversity libraries, which are difficult to sequence and require complex analysis algorithms. Because of this complexity, NGS-based methylation investigations remain underutilised, and the methylation profiles of most disorders remain uncharacterised.
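The effect of bisulfite conversion on a sequenced read can be illustrated with a minimal Python sketch. The 12-base fragment, the methylated position and the function names are hypothetical, chosen only for illustration. The sketch assumes complete conversion of a single strand and no sequencing errors, which real analysis tools cannot take for granted.

```python
from typing import Dict, List, Set


def bisulfite_convert(sequence: str, methylated_positions: Set[int]) -> str:
    """Simulate the read obtained after bisulfite conversion and sequencing.

    Unmethylated cytosines are deaminated to uracil and read as thymine (T);
    methylated cytosines are protected and still read as cytosine (C).
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylated_positions(reference: str, read: str) -> List[int]:
    """Infer methylated cytosines: a reference C that is still C in the read."""
    return [
        index
        for index, (ref_base, read_base) in enumerate(zip(reference.upper(), read))
        if ref_base == "C" and read_base == "C"
    ]


def base_fractions(sequence: str) -> Dict[str, float]:
    """Fraction of each base, to show the loss of diversity after conversion."""
    return {base: sequence.count(base) / len(sequence) for base in "ACGT"}


# Hypothetical fragment with a single methylated cytosine at index 1.
reference = "ACGTCCGATCGA"
read = bisulfite_convert(reference, methylated_positions={1})

print(read)
print(call_methylated_positions(reference, read))
print(base_fractions(reference)["C"], base_fractions(read)["C"])
```

In this toy fragment the cytosine count falls from four bases to one, which is the low-diversity effect described above, and the methylated site can only be recovered by comparing the read with the unconverted reference.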

In addition, such pre-sequencing multi-step multi-sample library preparations are prone to technical biases and are preferably conducted on liquid handling systems in a controlled setup. Indeed, within the Genomics Research Platform at the NIHR Guy’s and St Thomas’ Biomedical Research Centre (BRC), we conduct library preparations following strict standard operating protocols on liquid handling systems with numerous quality control steps, to avoid introducing artefacts and to minimise sample-to-sample variations and batch effects.

Despite extensive quality control measures, the investigations described above may not accurately reveal disease biomarkers or druggable targets for some disorders. This is because the sample input requirements for such library preparations warrant nucleic acid extraction from hundreds to millions of cells. Consequently, the sequenced reads represent the collective average of a mixture of cells or cell types, which may fail to unmask true indicators of disease. For example, screening DNA, RNA or methylation status in tumours composed of 10% malignant cells and 90% non-malignant heterogeneous cell types is unlikely to uncover malignant cell-specific mutations, disease-specific rare transcripts or methylation signatures. Selective isolation of rare malignant cells from such tumours may not be possible due to lack of surface biomarkers or because the low cell yield may fail to meet sample input requirements.
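The dilution in the 10% malignant tumour example can be made concrete with a short calculation. This is a simplified sketch, not a variant-calling model: it assumes every cell carries two copies of the locus, that the mutation is heterozygous in all malignant cells and absent elsewhere, and that reads sample alleles evenly. The 30x coverage figure and the function names are hypothetical.

```python
def expected_allele_fraction(
    malignant_fraction: float,
    mutant_copies_per_malignant_cell: int = 1,
    copies_per_cell: int = 2,
) -> float:
    """Expected fraction of sequenced alleles that carry the mutation."""
    return malignant_fraction * mutant_copies_per_malignant_cell / copies_per_cell


def expected_mutant_reads(allele_fraction: float, coverage: int) -> float:
    """Expected number of reads supporting the mutation at a given coverage."""
    return allele_fraction * coverage


# Bulk tumour sample: 10% malignant cells, 90% non-malignant cells.
bulk_fraction = expected_allele_fraction(malignant_fraction=0.10)

# A single malignant cell analysed on its own.
single_cell_fraction = expected_allele_fraction(malignant_fraction=1.0)

coverage = 30  # hypothetical sequencing depth at the locus
print(bulk_fraction, expected_mutant_reads(bulk_fraction, coverage))
print(single_cell_fraction, expected_mutant_reads(single_cell_fraction, coverage))
```

Under these assumptions the mutation accounts for about 5% of alleles in the bulk sample, or one to two supporting reads at 30x, which is hard to separate from sequencing error. In an isolated malignant cell it accounts for half of the alleles.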

The solution lies in single cell investigations, which address biological limitations, alongside long read sequencing, which addresses current sequencer limitations.

The promise of single-cell preparations

In recognition of the importance of single-cell genomics in translational research, in late 2015 the Medical Research Council in the UK provided infrastructure and research grants to a few universities to establish single-cell facilities, which has ushered in the current single-cell genomics rush in the UK. Over the last two years, translational research investigations have transitioned from NGS of nucleic acids extracted from bulk cells to NGS of single cells, thus bypassing the extraction process.

Researchers generate debris-free, single-cell suspensions of live cells through tissue dissociation, magnetic beads, fluorescence-activated cell sorting (FACS), laser capture microdissection or size-specific cell isolation platforms. These cell suspensions are then loaded on specialised single cell instruments capable of separating individual cells into microfluidic valve-controlled chambers, into oil partitions or onto etched wells on microchips, to enable molecular reactions. Alternatively, individual single cells can be sorted via FACS directly into 96-well plates. It is crucial to perform the first steps of single cell library preparation in incredibly low volumes; this has been shown to reduce noise in single-cell datasets.14 The specialised single-cell instruments comply with this requirement. For cells sorted directly into plates in my lab, library preparation is conducted on a high precision contactless acoustic droplet transfer-based liquid handling system capable of transferring nanolitre volumes, in keeping with the low volume requirement.

Challenges in single-cell preparations and their solutions

Since each human single cell contains a fixed amount of DNA (~6.6 pg) and limited RNA (0.1 pg to 20 pg), single-cell library preparation protocols require extensive PCR amplification to acquire sequenceable amounts of the library. Because amplification copies some molecules more than others, read counts no longer reflect the original number of molecules, and the resulting datasets are non-quantitative. To address this, some methods have incorporated unique molecular identifiers (UMIs) to tag each molecule, so that PCR duplicates can be identified and excluded from downstream analysis to obtain a quantitative readout.15,16
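The principle behind UMI-based counting can be sketched in a few lines of Python. The read records, cell and gene labels, UMI sequences and function names below are hypothetical. The sketch collapses reads by exact UMI match only, whereas published pipelines typically also correct for sequencing errors within the UMI itself.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (cell barcode, gene, UMI)


def count_reads(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Raw read counts per (cell, gene), inflated by PCR duplicates."""
    return dict(Counter((cell, gene) for cell, gene, _umi in reads))


def count_molecules(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Molecule counts per (cell, gene): reads sharing a UMI are counted once."""
    umis_seen = defaultdict(set)
    for cell, gene, umi in reads:
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Hypothetical reads: one GAPDH molecule was amplified into three reads.
reads = [
    ("cell1", "GAPDH", "AACG"),
    ("cell1", "GAPDH", "AACG"),
    ("cell1", "GAPDH", "AACG"),
    ("cell1", "GAPDH", "TTGC"),
    ("cell1", "TP53", "AACG"),
]

print(count_reads(reads))
print(count_molecules(reads))
```

Here four GAPDH reads collapse to two molecules, while the TP53 read is kept as a separate molecule even though it shares a UMI sequence with a GAPDH read, because UMIs are only compared within the same cell and gene.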

DNA NGS on single cells is conducted by isothermal or PCR amplification of the whole genome of a single cell, while single-cell polyadenylated RNA NGS is available in many formats (3’ end, 5’ end and full length) on multiple instruments of low throughput (96 individual cells) and high throughput (up to 10,000 individual cells). While low-throughput protocols enable deeper sequencing of individual cells and aid identification of cell-specific transcripts missed in bulk RNA sequencing, high throughput methods generate low complexity libraries to broadly categorise cells into subtypes. Thus, in the future, comprehensive single-cell investigations may require a combination of high-throughput NGS of RNA ends to identify cell subtypes followed by deeper mRNA sequencing to identify all transcripts expressed in a cellular subtype.
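The trade-off between throughput and depth follows from simple arithmetic: a sequencing run yields a finite number of reads, and these are shared among all the cells in the library. The sketch below uses a hypothetical run output of 400 million reads and assumes reads are split evenly across cells, which real libraries only approximate.

```python
def reads_per_cell(total_reads: int, number_of_cells: int) -> int:
    """Average reads available per cell if a run is shared evenly."""
    if number_of_cells <= 0:
        raise ValueError("number_of_cells must be positive")
    return total_reads // number_of_cells


TOTAL_READS = 400_000_000  # hypothetical output of one sequencing run

low_throughput = reads_per_cell(TOTAL_READS, 96)       # plate-based protocol
high_throughput = reads_per_cell(TOTAL_READS, 10_000)  # high-throughput protocol

print(low_throughput)
print(high_throughput)
```

With these illustrative numbers, each of 96 cells receives roughly four million reads, whereas each of 10,000 cells receives about 40,000, which is consistent with using high-throughput methods for broad cell classification and low-throughput methods for deep transcript discovery.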

Single-cell investigations and long-read NGS provide higher resolution

Methods such as ATAC-seq are being utilised to generate chromatin accessibility maps for individual cell types.17 Importantly, combined investigations at single-cell resolution, such as genome and transcriptome (G&T-seq),18–22 epitope and transcriptome,23 and mRNA and TCR (an in-house method developed in the Genomics Research Platform within our BRC), have enabled investigation of genetic, phenotypic and epigenetic features of disease alongside transcriptional changes, which is critical in translational investigations. Furthermore, it is now possible to screen cell-specific effects of perturbations at a single-cell resolution.24

NGS on single cells is currently conducted on short-read second-generation sequencers. NGS of selected single-cell libraries on third-generation long-read single-molecule sequencers can facilitate transcript assembly from single cells, aid identification of structural variants, and allow distinction between mutations in genes and pseudogenes. Some third-generation sequencers can directly sequence DNA, thus enabling methylation analysis without bisulfite conversion.25 Direct RNA NGS, which is possible on a third-generation sequencing platform, has the potential to uncover RNA modifications that have hitherto eluded us due to limitations of second-generation sequencers.26 Thus, clinically relevant investigations are gradually evolving to include single-cell outputs and long-read sequencing. Bearing this in mind, most genomics research platforms including ours have initiated service offerings on single-cell NGS with short-read as well as long-read sequencers to enable comprehensive investigations in translational projects.

The future

Today, NGS of single cells is a rapidly advancing field. It is also gaining popularity among investigators as data analysis pipelines become more standardised and user-friendly. However, as the transition to personalised medicine becomes a reality, a comprehensive catalogue of the normal genetic/epigenetic/transcriptional landscape is needed at a single cell resolution for every cell type, to discern changes that occur in disease states or after drug treatment. Our Genomics Research Platform is actively contributing to this transition. Through a unique combination of our genomics expertise in devising bespoke single-cell library preparations and our ‘research aware’ clinical specialists in initiating clinically-relevant single-cell investigations, our goal is to fulfil the vision of our BRC to support translational research. We believe that high-resolution NGS of single cells will be a key contributor to personalised medicine.


Dr Alka Saxena studied medicine at Shivaji University, India. After seven years of clinical practice, Alka completed her PhD at Melbourne University, Australia, and studied monogenic disorders at the University of Western Australia, where she became an Assistant Professor, and at the RIKEN Omics Science Centre.

Since 2013 Alka has been Head of Genomics at the NIHR Guy’s and St Thomas’ BRC and Honorary Senior Research Fellow at King’s College London. Alka’s team develop cutting-edge Single Cell Genomics technology and provide library preparation and sequencing services.

Alka is co-founder of the Samartha Saxena Foundation, a charitable Trust established in her son’s memory.


  1. NHS England. National Genomic Test Directories,
  2. Saxena A, Carninci P. Whole transcriptome analysis: What are we still missing? Wiley Interdiscip Rev Syst Biol Med. 2011;3(5):527–43.
  3. Liang SB, Fu LW. Application of single-cell technology in cancer research. Biotechnol Adv. 2017;35(4):443–9.
  4. Krasnitz A, Kendall J, Alexander J, Levy D, Wigler M. Early Detection of Cancer in Blood Using Single-Cell Analysis: A Proposal. Trends Mol Med. 2017;23(7):594–603.
  5. Wang L, Livak KJ, Wu CJ. High-dimension single-cell analysis applied to cancer. Mol Aspects Med. 2018;59:70–84.
  6. Tsoucas D, Yuan GC. Recent progress in single-cell cancer genomics. Curr Opin Genet Dev. 2017;42:22–32.
  7. Hedlund E, Deng Q. Single-cell RNA sequencing: Technical advancements and biological applications. Mol Aspects Med. 2018;59:36–46.
  8. Ammar R, Paton TA, Torti D, Shlien A, Bader GD. Long read nanopore sequencing for detection of HLA and CYP2D6 variants and haplotypes. F1000Research. 2015.
  9. Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, Von Haeseler A, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods.
  10. Toma I, Siegel MO, Keiser J, Yakovleva A, Kim A, Davenport L, et al. Single-molecule long-read 16S sequencing to characterize the lung microbiome from mechanically ventilated patients with suspected pneumonia. J Clin Microbiol. 2014.
  11. Nakano K, Shiroma A, Shimoji M, Tamotsu H, Ashimine N, Ohki S, et al. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area. Human Cell.
  12. Picelli S. Single-cell RNA-sequencing: The future of genome biology is now. RNA Biol. 2017;14(5):637–50.
  13. Heath JR, Ribas A, Mischel PS. Single-cell analysis tools for drug discovery and development. Nat Rev Drug Discov. 2016;15(3):204–16.
  14. Wu AR, Neff NF, Kalisky T, Dalerba P, Treutlein B, Rothenberg ME, et al. Quantitative assessment of single-cell RNA-sequencing methods. Nat Methods. 2014;11(1):41–6.
  15. Kivioja T, Vähärautio A, Karlsson K, Bonke M, Enge M, Linnarsson S, et al. Counting absolute numbers of molecules using unique molecular identifiers. Nat Methods.
  16. Islam S, Zeisel A, Joost S, La Manno G, Zajac P, Kasper M, et al. Quantitative single-cell RNA-seq with unique molecular identifiers. Nat Methods. 2014.
  17. Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature. 2015.
  18. Hu Y, Huang K, An Q, Du G, Hu G, Xue J, et al. Simultaneous profiling of transcriptome and DNA methylome from a single cell. Genome Biol. 2016;17(1):1–11.
  19. Macaulay IC, Teng MJ, Haerty W, Kumar P, Ponting CP, Voet T. Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq. Nat Protoc. 2016;11(11):2081–103.
  20. Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, et al. G&T-seq: Parallel sequencing of single-cell genomes and transcriptomes. Nat Methods. 2015;12(6):519–22.
  21. Hou Y, Guo H, Cao C, Li X, Hu B, Zhu P, et al. Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res. 2016;26(3):304–19.
  22. Dey SS, Kester L, Spanjaard B, Bienko M, Van Oudenaarden A. Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol. 2015;33(3):285–9.
  23. Stoeckius M, Hafemeister C, Stephenson W, Houck-Loomis B, Chattopadhyay PK, Swerdlow H, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods. 2017.
  24. Dixit A, Parnas O, Li B, Chen J, Fulco CP, Jerby-Arnon L, et al. Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens. Cell. 2016.
  25. Rand AC, Jain M, Eizenga JM, Musselman-Brown A, Olsen HE, Akeson M, et al. Mapping DNA methylation with high-throughput nanopore sequencing. Nat Methods. 2017.
  26. Garalde DR, Snell EA, Jachimowicz D, Sipos B, Lloyd JH, Bruce M, et al. Highly parallel direct RNA sequencing on an array of nanopores. Nat Methods. 2018.
