Next Generation Sequencing (NGS)-Introduction
July 12, 2019
For the past 30 years, the Sanger method has been the dominant approach and the gold standard for DNA sequencing. The commercial launch of the first massively parallel pyrosequencing platform in 2005 marked the beginning of a new era of high-throughput genomic analysis, now called next-generation sequencing (NGS).
Next-generation high-throughput DNA sequencing techniques, which open up fascinating new opportunities in biomedicine, were chosen as the method of the year for 2007 by Nature Methods [1]. This massively parallel sequencing technology has revolutionized the biological sciences. With its ultra-high throughput, scalability, and speed, NGS allows researchers to perform a wide variety of applications and to study biological systems at a level never before possible. Today’s complex genomic research questions require a depth of information beyond the capabilities of traditional DNA sequencing technologies. NGS has filled that gap and has become an everyday research tool for addressing these questions: it determines the exact order of nucleotides in a given DNA or RNA molecule in a very short time [2]. However, the novel technology did not gain acceptance easily. Until a few years ago, the Sanger enzymatic dideoxy technique, first described in 1977, was the method used for sequencing [3].
The Sequence concept of biology:
In 1977, Frederick Sanger developed a DNA sequencing technology based on chain termination (now known as Sanger sequencing), and Walter Gilbert developed another based on chemical modification of DNA and subsequent cleavage at specific bases. Because of its high efficiency and low radioactivity, Sanger sequencing was adopted as the primary technology in the “first generation” of laboratory and commercial sequencing applications [4]. Advances in the Sanger technique ultimately enabled the completion of the first human genome sequence in 2004 [5]. However, the Human Genome Project required vast amounts of time and resources, and it was clear that faster, higher-throughput and cheaper technologies were needed. For this reason, the National Human Genome Research Institute (NHGRI) launched a funding program in the same year (2004) with the goal of reducing the cost of sequencing a human genome to US$1000 within ten years [6]. This stimulated the development and marketing of next-generation sequencing (NGS) technologies, in contrast to the automated Sanger method, which is considered a first-generation technology. These new sequencing methods share three significant improvements. First, they rely on preparing NGS libraries in a cell-free system instead of requiring bacterial cloning of DNA fragments. Second, thousands to many millions of sequencing reactions are run in parallel, rather than hundreds. Third, the sequencing output is detected directly without electrophoresis; base interrogation is performed cyclically and in parallel. The huge number of reads generated by NGS has enabled complete genomes to be sequenced at unprecedented speed [7].
Next generation sequencing analysis:
Rapid and inexpensive NGS methods enable high-throughput gene expression profiling, genome annotation, and non-coding RNA discovery. DNA sequencing is one of the most important tools in molecular biology, and several well-established methods are available today for obtaining sequence information. Traditionally, sequences have been decoded with dideoxy chain-termination technology. The classical dideoxy method, invented by Frederick Sanger in the 1970s, uses an enzymatic reaction. Starting from short primers, DNA polymerase elongates the complementary strands. Incorporation of one of the four differently labeled dideoxynucleotides causes a detectable chain termination and thereby allows the unknown DNA sequence to be identified. In contrast to the Sanger method, which is similar to natural DNA replication [8], NGS platforms perform massively parallel sequencing, in which millions of DNA fragments from a single sample are sequenced simultaneously. Massively parallel sequencing makes high-throughput sequencing possible, so that an entire genome can be sequenced in less than one day. NGS thus provides an alternative for sequencing DNA that is much cheaper and more efficient than traditional Sanger sequencing. Entire small genomes can now be sequenced in a day, whereas Sanger sequencing can analyze only a single sample in that time, and NGS can also be used to discover disease-related genes and regulatory elements. Furthermore, RNA-Seq can provide information on a sample’s entire transcriptome without requiring prior knowledge of an organism’s genetic sequence. This technique offers a powerful alternative to microarrays in gene expression studies [9].
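The chain-termination principle described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the template string is made up, every position yields exactly one terminated fragment, and the function names are invented for this example. Sorting the fragments by length stands in for the size separation step, and reading the label of each fragment in that order recovers the newly synthesized strand.

```python
import random

# Watson-Crick pairing used by the polymerase when copying the template.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminated_fragments(template):
    """Return (length, label) pairs for the chain-terminated strands.

    `template` is read in the direction the polymerase travels. At each
    position a labeled dideoxynucleotide can be incorporated instead of a
    normal nucleotide, which stops elongation at that length.
    """
    return [(length, COMPLEMENT[base])
            for length, base in enumerate(template, start=1)]


def read_sequence(fragments):
    """Order fragments by length, as size separation does, and read the labels."""
    return "".join(label for _, label in sorted(fragments))


template = "TACGGATC"                # made-up example template
fragments = terminated_fragments(template)
random.shuffle(fragments)            # the reaction mix holds fragments in no order
print(read_sequence(fragments))      # ATGCCTAG, the strand complementary to the template
```

The point of the sketch is that the sequence is recovered from fragment length plus terminal label alone, which is why the classical method depends on a size separation step that NGS platforms avoid.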
Transcriptome sequencing
Since the development of microarray technology and the complete sequencing of the human genome, methodologically sound technologies for transcriptome analysis have been available and widely used. mRNA expression was previously measured using microarrays or real-time PCR [10]. In recent years, high-throughput sequencing technologies have evolved rapidly, and RNA-Seq is emerging as the major system for quantitative transcriptome profiling [12]. These technologies can generate millions of reads in a relatively short time and at low cost. Using such platforms to sequence cDNA samples (RNA-Seq) has been shown to be a powerful method for analyzing eukaryotic transcriptomes [13]. RNA-Seq provides a digital measure of gene expression and is considered an attractive approach that competes to replace microarrays for unbiased and comprehensive transcriptome analysis. The transcriptome is the complete set of transcripts in a cell, and their quantity, for a specific developmental stage or physiological condition. Understanding the transcriptome is essential for interpreting the functional elements of the genome, revealing the molecular constituents of cells and tissues, and understanding development and disease. The key objectives of transcriptomics are: to catalog all transcript species, including mRNAs, non-coding RNAs and small RNAs; to determine the transcriptional structure of genes in terms of their start sites, 5′ and 3′ ends, splicing patterns and other post-transcriptional modifications; and to quantify the changing expression levels of each transcript during development and under different conditions. Initially, Sanger sequencing of cDNA or EST libraries was used [14,15], but this approach is relatively low throughput, expensive and generally not quantitative.
To overcome these limitations, tag-based methods were developed, including serial analysis of gene expression (SAGE), cap analysis of gene expression (CAGE) and massively parallel signature sequencing (MPSS). These tag-based sequencing approaches are high throughput and can provide precise, ‘digital’ gene expression levels. However, most are based on expensive Sanger sequencing technology, and a significant portion of the short tags cannot be uniquely mapped to the reference genome. In addition, only a portion of the transcript is analyzed, and isoforms are generally indistinguishable from each other. These drawbacks limit the use of traditional sequencing technology for annotating transcriptome structure. The development of novel high-throughput DNA sequencing methods has recently provided a new way to both map and quantify transcriptomes. This method, called RNA-Seq (RNA sequencing), has clear advantages over existing approaches and is expected to revolutionize the way eukaryotic transcriptomes are analyzed [13].
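To make the idea of a digital expression measure concrete, the sketch below counts reads per gene and rescales the counts to counts per million (CPM). CPM is one common normalization and is an assumption here, since the text does not name a specific measure; the gene names, read assignments and function name are also made up for illustration, and the reads are assumed to have already been assigned to genes.

```python
from collections import Counter


def counts_per_million(read_assignments):
    """Convert a list of per-read gene assignments into CPM values.

    Each read contributes one count to its gene, so the measurement is
    'digital'. Dividing by the total number of assigned reads makes
    samples of different sequencing depth comparable.
    """
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical assignments: one entry per sequenced read.
reads = ["geneA"] * 6 + ["geneB"] * 3 + ["geneC"] * 1
for gene, cpm in sorted(counts_per_million(reads).items()):
    print(gene, cpm)
```

Note that CPM does not correct for transcript length, so it is suited to comparing the same gene across samples rather than different genes within one sample.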
Genome sequencing
The 2001 draft sequence of the human genome produced by the Human Genome Project (HGP) was undoubtedly a major scientific achievement, a turning point for human genetics, and a starting point for human genomics [16]. In NGS, bioinformatics analyses are used to piece the sequenced fragments together by mapping the individual reads to the human reference genome. NGS can be used to sequence complete genomes or can be restricted to specific areas of interest, including all 22 000 coding genes (a whole exome) or a small number of individual genes. In clinical practice, there are numerous opportunities to use NGS to improve patient care. The spectrum of DNA variation in a human genome includes small base changes (substitutions), insertions and deletions of DNA, large genomic deletions of exons or whole genes, and rearrangements such as inversions and translocations. Traditional Sanger sequencing is restricted to the discovery of substitutions and small insertions and deletions, so NGS captures a broader spectrum of mutations than Sanger sequencing [17].
NGS also enables SNP discovery. The value of SNP markers has been clearly demonstrated in human genomics, where complete sequencing of the human genome has resulted in the discovery of several million SNPs [18]. SNPs have been applied in areas as diverse as human forensics and diagnostics, and functional genomic studies have capitalized on SNPs located within regulatory genes, transcripts, and Expressed Sequence Tags (ESTs) [19]. SNP genotyping is the downstream application of SNP discovery and is used to identify genetic variation. Applications of SNPs include phylogenetic analysis, marker-assisted selection, genetic mapping of quantitative trait loci (QTL), bulked segregant analysis, genomic selection, and genome-wide association studies (GWAS). All of this makes genetic mapping with SNPs possible.
A genetic map describes the arrangement of traits, genes, and markers relative to each other, as measured by their recombination frequency. Genetic maps are essential tools for plant genetic improvement in molecular breeding, as they allow for gene localization, map-based cloning and QTL identification [20].
In addition, NGS serves as a tool for studying microbial evolution at the genomic level. Microbes evolve rapidly thanks to their short generation times and large population sizes, which evolutionary biologists have exploited to observe evolution in real time. The advent of NGS provides an unparalleled opportunity to gain a sequence-level view of these evolutionary processes in both prokaryotic and eukaryotic organisms, including those with larger, more complex genomes that encode more complex life histories and metabolisms. NGS thus enhances our understanding of evolutionary processes, both in microbes and in general. Mutation is the ultimate source of all genetic variation and provides the fuel for evolution. Consequently, the mutation rate and the distribution of mutational effects are key parameters in determining the properties of an evolving population. Reliable estimates of these parameters in microbes will provide important insight into the evolution of microbial populations [21].
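As a rough illustration of how substitutions such as SNPs are found once reads have been mapped to a reference, the sketch below piles up aligned bases at each reference position and reports positions where a non-reference base dominates. It is a deliberately naive sketch, not a production variant caller: the reads are assumed to be already aligned without gaps, the reference and reads are made up, positions are 0-based, and the `min_depth` and `min_fraction` thresholds are arbitrary choices for this example.

```python
from collections import Counter, defaultdict


def call_substitutions(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Report (position, reference_base, observed_base) for likely substitutions.

    `aligned_reads` is a list of (start, sequence) pairs, where `start` is the
    0-based reference position of the first base of the read.
    """
    pileup = defaultdict(Counter)
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            pileup[start + offset][base] += 1

    calls = []
    for position in sorted(pileup):
        depth = sum(pileup[position].values())
        base, count = pileup[position].most_common(1)[0]
        supported = depth >= min_depth and count / depth >= min_fraction
        if base != reference[position] and supported:
            calls.append((position, reference[position], base))
    return calls


reference = "ACGTACGTAC"  # made-up reference
reads = [(0, "ACGTA"), (2, "GTATGT"), (3, "TATGTA"), (4, "ATGTAC")]
print(call_substitutions(reference, reads))  # [(5, 'C', 'T')]
```

Requiring a minimum depth is what lets the many overlapping reads of NGS separate a real variant from a single sequencing error. A heterozygous variant would need a lower `min_fraction`, and insertions, deletions and rearrangements need gapped alignment, which this sketch does not attempt.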
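The mutation rate mentioned above can be estimated from sequencing data in a simple way. The sketch below assumes a mutation-accumulation style design, which the text does not specify: several independently evolved lines are sequenced after a known number of generations, and the rate per base per generation is the total number of new mutations divided by the number of opportunities (lines × genome size × generations). All numbers and names are hypothetical.

```python
def mutation_rate_per_base(mutations_per_line, genome_size, generations):
    """Estimate the mutation rate per base pair per generation.

    mutations_per_line: new mutations found by sequencing each evolved line
    genome_size:        number of base pairs surveyed in each line
    generations:        generations each line was propagated
    """
    opportunities = len(mutations_per_line) * genome_size * generations
    if opportunities == 0:
        raise ValueError("need at least one line, one base and one generation")
    return sum(mutations_per_line) / opportunities


# Hypothetical experiment: 4 lines, a 4.6 Mb genome, 1000 generations each.
rate = mutation_rate_per_base([2, 1, 3, 2], genome_size=4_600_000, generations=1000)
print(f"{rate:.2e} mutations per base per generation")
```

This point estimate ignores selection against deleterious mutations and any mutations the sequencing missed, so it is best read as a lower bound.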
Limitations:
In the clinical setting, the main disadvantage of NGS is the need to establish the required infrastructure, such as computing capacity and storage, as well as the staff expertise needed to fully analyze and interpret the resulting data. NGS methods can detect a wide range of mutation types, but they are limited mostly by the ability to properly process the sequencer’s raw data, a problem that researchers have reported when processing such data [22].
In conclusion, next-generation sequencing is a powerful approach for addressing biological questions. The technologies are remarkably robust and are still improving. Even with immature data processing, NGS has already led to countless findings and an unprecedented number of diagnoses. This success should continue as the bioinformatics sector matures and starts producing clinical-grade solutions.
References:
[1] Ansorge, W. J. (2009). Next-generation DNA sequencing techniques. New Biotechnology, 25(4), 195–203. doi:10.1016/j.nbt.2008.12.009
[2] Grada, A., & Weinbrecht, K. (2013). Next-generation sequencing: methodology and application. Journal of Investigative Dermatology. doi:10.1038/jid.2013.248
[3] Sanger, F., Nicklen, S., & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, 74(12), 5463–5467. doi:10.1073/pnas.74.12.5463
[4] Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., Lin, D., Lu, L., & Law, M. (2012). Comparison of next-generation sequencing systems. Journal of Biomedicine and Biotechnology, 2012, 251364.
[5] International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature, 431(7011), 931–945. doi:10.1038/nature03001
[6] Schloss, J. A. (2008). How to get genomes at one ten-thousandth the cost. Nature Biotechnology, 26(10), 1113–1115. doi:10.1038/nbt1008-1113
[7] Van Dijk, E. L., Auger, H., Jaszczyszyn, Y., & Thermes, C. (2014). Ten years of next-generation sequencing technology. Trends in Genetics, 30(9), 418–426. doi:10.1016/j.tig.2014.07.001
[8] Sanger, F., Nicklen, S., & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, 74(12), 5463–5467. doi:10.1073/pnas.74.12.5463
[9] Grada, A., & Weinbrecht, K. (2013). Next-generation sequencing: methodology and application. Journal of Investigative Dermatology. doi:10.1038/jid.2013.248
[10] Mutz, K.-O., Heilkenbrinker, A., Lönne, M., Walter, J.-G., & Stahl, F. (2013). Transcriptome analysis using next-generation sequencing. Current Opinion in Biotechnology, 24(1), 22–30. doi:10.1016/j.copbio.2012.09.004
[12] Wang, L., Feng, Z., Wang, X., Wang, X., & Zhang, X. (2009). DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics, 26(1), 136–138. doi:10.1093/bioinformatics/btp612
[13] Wang, Z. et al. (2009). RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics, 10, 57–63.
[14] Gerhard, D. S. et al. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 14, 2121–2127 (2004).
[15] Boguski, M. S., Tolstoshev, C. M. & Bassett, D. E. Jr. Gene discovery in dbEST. Science 265, 1993–1994 (1994).
[16] International Human Genome Sequencing Consortium. (2001). Initial sequencing and analysis of the human genome. Nature, 409, 860–921.
[17] Behjati, S., & Tarpey, P. S. (2013). What is next generation sequencing? Archives of Disease in Childhood – Education & Practice Edition, 98(6), 236–238. doi:10.1136/archdischild-2013-304340
[18] K. A. Frazer, D. G. Ballinger, D. R. Cox et al., “A second generation human haplotype map of over 3.1 million SNPs,” Nature, vol. 449, no. 7164, pp. 851–861, 2007.
[19] Kumar, S., Banks, T. W., & Cloutier, S. (2012). SNP Discovery through Next-Generation Sequencing and Its Applications. International Journal of Plant Genomics, 2012, 1–15. doi:10.1155/2012/831460
[20] J. C. Nelson, “Methods and software for genetic mapping,” in The Handbook of Plant Genome Mapping, pp. 53–74, Wiley-VCH, Weinheim, Germany, 2005.
[21] Brockhurst, M. A., Colegrave, N., & Rozen, D. E. (2010). Next-generation sequencing as a tool to study microbial evolution. Molecular Ecology, 20(5), 972–980. doi:10.1111/j.1365-294x.2010.04835.x
[22] Daber, R., Sukhadia, S., & Morrissette, J. J. D. (2013). Understanding the limitations of next generation sequencing informatics, an approach to clinical pipeline validation using artificial data sets. Cancer Genetics, 206(12), 441–448. doi:10.1016/j.cancergen.2013.11.005