In my role as account manager at BaseClear I encounter many scientists and R&D managers who cannot see the wood for the trees when it comes to choosing a suitable sequencing platform. For this reason, I wanted to summarise what is available at the moment. In this article I will give a short overview of the history of DNA/RNA sequencing, the market-leading sequencing platforms and a brief selection of emerging technologies.
THE HISTORY OF DNA SEQUENCING
DNA was discovered in 1869 by the Swiss physician and biologist Johannes Friedrich Miescher. Strangely enough, it took almost 80 years before people started to experiment with the genetic properties of DNA. A possible explanation is the paradigm of the time: the structure of DNA was thought to be too simple to function as the genetic blueprint of life. Back then, most scientists pointed towards proteins as the most promising candidate. In 1953 Francis Crick and James Watson presented their double-helix model. 24 years later, in 1977, Frederick Sanger developed a DNA sequencing method based on chain-terminating inhibitors. This was the beginning of the ‘gel-based’ sequencing era. In the years that followed, most DNA sequencing was done using radioactive probes and polyacrylamide gel electrophoresis (PAGE). Here, each lane represented one of the four nucleotides. If one worked really hard, the maximum output was a couple of hundred nucleotides a day.
The first automated sequencer was based on capillaries and introduced by Applied Biosystems in 1987. The AB370 was the beginning of the capillary-based sequencing period. Instead of the time-consuming gels and the dangerous (radioactive) probes, fluorescent labels and a laser were used. With this new automated Sanger sequencing system, read lengths of up to 1100 bp per reaction are achievable. With the second generation, which runs 96 capillaries in parallel, it is even possible to sequence up to ~1 Mb per day.
The output per machine took off with the introduction of massively parallel sequencing, also known as Next-Generation Sequencing (NGS). NGS was introduced in 1994-1998 but only became commercially available in 2005. In contrast to the traditional chain-terminating inhibitor method (the Sanger method), NGS is based on spatially separated, clonally amplified DNA templates or single DNA molecules on a flow cell.
The current market leader in the area of Next-Generation Sequencing is Illumina, which holds around 70% of the market. It dominates the market with the iSeq, MiniSeq, MiSeq, NextSeq, HiSeq, HiSeq X and NovaSeq series. The principle of Illumina sequencing is based on the incorporation of fluorescently tagged nucleotides into the clonally amplified DNA templates. Each of the four bases has a unique emission. The Illumina platform is characterized by short read lengths (50-300 nt) and an output that ranges from low to high (0.5 M to 20,000 M reads). With the Illumina platform you can also make use of so-called paired-end reads. Here, clonally amplified DNA templates are read in two directions. In this way, a longer combined read length of up to 2×300 nt can be reached. The Illumina platform offers the best price per nucleotide and can be used for all forms of DNA and RNA sequencing projects, ranging from amplicon sequencing and genome analysis to metagenomics and transcriptomics.
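To make the idea of reading a template in two directions more concrete, here is a minimal Python sketch. It is only a toy model and not Illumina's software: the function names, the 20-nt fragment and the read length of 6 are all made up for illustration, and real reads contain errors that are ignored here.

```python
from typing import Tuple

# Toy illustration of paired-end reads; all names and values are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_length: int) -> Tuple[str, str]:
    """Simulate error-free paired-end reads from a single fragment.

    Read 1 starts at the 5' end of the given strand. Read 2 starts at the
    5' end of the opposite strand, so it equals the reverse complement of
    the last `read_length` bases of the fragment.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


fragment = "ATGCGTACGTTAGCCTAGGA"  # hypothetical 20-nt fragment
read1, read2 = paired_end_reads(fragment, 6)
print(read1, read2)  # ATGCGT TCCTAG
```

The two reads cover both ends of the same fragment, which is why a 2×300 nt run yields up to 600 nt of sequence per template even though each individual read stays short.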
THERMO FISHER SCIENTIFIC
The runner-up, with around 15% market share, is Thermo Fisher Scientific (TFS). At the moment TFS has four devices on the market, namely the Ion Proton, Ion Personal Genome Machine (PGM), Ion S5 and Ion S5 XL system. The TFS platform is based on ion semiconductor sequencing. Here, DNA sequencing is based on the detection of hydrogen ions that are released during the polymerization of DNA. The output ranges from 100 Mb to ~32 Gb. Read lengths range from 100 to 400 nt. The applications for this platform are diverse, ranging from amplicon sequencing to RNA sequencing.
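As a rough illustration of how a sequence can be read from hydrogen ion signals, the Python sketch below assumes a simplified model in which one nucleotide type is offered per flow and the measured signal is proportional to the number of bases incorporated in that flow. The flow order "TACG", the signal values and the function name are hypothetical, and real base calling is considerably more involved.

```python
from typing import List


def decode_flows(flow_order: str, signals: List[float]) -> str:
    """Convert per-flow signal intensities into a base sequence.

    Simplified model (an assumption for illustration): the rounded signal
    of a flow is taken as the number of identical bases incorporated.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]  # cycle through the flow order
        count = max(0, round(signal))                 # noise around 0 means no incorporation
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical signals for two cycles of the flow order T, A, C, G
signals = [1.03, 0.02, 2.10, 0.05, 0.00, 0.97, 0.04, 1.08]
print(decode_flows("TACG", signals))  # TCCAG
```

In this toy model a stretch of identical bases shows up as one stronger signal rather than as separate events, so the accuracy of the read depends on how well the signal strength can be translated into a base count.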
The 454 pyrosequencing platform from Roche was once the gold standard for amplicon sequencing. However, Roche stopped supporting this platform in 2016. As a result, most users have switched to Illumina sequencing (most often MiSeq). The output from the 454 Genome Sequencer FLX (GS FLX) and GS Junior ranged from 35 Mb to 700 Mb with read lengths of 700 nt up to 1 kb.
Pacific Biosciences (PacBio) brought the fourth platform to the market and has a market share of ~5%. At the moment PacBio has two devices on the market: in chronological order, the PacBio RSII and the Sequel system. In contrast to the aforementioned NGS platforms, the PacBio platform does not work with spatially separated, clonally amplified DNA templates, but with single DNA molecules. The full name of this technique is single-molecule real-time (SMRT) sequencing. Here, a single DNA polymerase is fixed at the bottom of a zero-mode waveguide (ZMW), which is part of a SMRT cell. A ZMW is an optical waveguide that guides light energy into a volume that is small compared to the wavelength of the light. When a nucleotide is incorporated by the DNA polymerase, the fluorescent tag is cleaved off and diffuses out of the observation area of the ZMW. The base call is made according to the corresponding fluorescence of the dye. What sets PacBio apart is its read length. This platform produces reads of 5-6 kb on average, with peaks of up to 30 kb. One SMRT cell of the PacBio RSII produces 500 Mb of data on average and up to 1 Gb. Because this platform is relatively expensive and has a relatively low yield, it is most useful as a complement to the shorter reads of Illumina. Combining the two is a cost-effective way to create closed genomes, especially for smaller genomes. The PacBio platform can also be used to detect certain forms of DNA methylation.
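To get a feeling for what a yield of 500 Mb per SMRT cell means in practice, the back-of-the-envelope calculation below converts yield into fold coverage. It is only a sketch: the genome sizes and the target coverage are hypothetical examples, and the function names are made up for illustration.

```python
import math


def fold_coverage(yield_bp: int, genome_size_bp: int) -> float:
    """Average number of times each base of the genome is sequenced."""
    return yield_bp / genome_size_bp


def cells_needed(target_coverage: float, genome_size_bp: int, yield_per_cell_bp: int) -> int:
    """Number of SMRT cells required to reach a target coverage, rounded up."""
    return math.ceil(target_coverage * genome_size_bp / yield_per_cell_bp)


SMRT_CELL_YIELD = 500_000_000   # ~500 Mb average per RSII SMRT cell (from the text)
BACTERIAL_GENOME = 5_000_000    # hypothetical 5 Mb bacterial genome
LARGER_GENOME = 100_000_000     # hypothetical 100 Mb genome

print(fold_coverage(SMRT_CELL_YIELD, BACTERIAL_GENOME))    # 100.0
print(cells_needed(50, LARGER_GENOME, SMRT_CELL_YIELD))    # 10
```

Under these assumptions a single SMRT cell covers a small bacterial genome many times over, while larger genomes quickly require multiple cells. This is one way to see why the platform is particularly attractive for smaller genomes.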
Oxford Nanopore Technologies, with their SmidgION, MinION, GridION and PromethION platforms, can be considered the next step in NGS. Their technology is based on protein nanopores that are set in an electrically resistant membrane. An ionic current is sent through the nanopore by setting a voltage across the membrane. When a DNA strand passes through the nanopore, the changes in the current make it possible to identify, three nucleotides at a time, which nucleotides are passing through. More and more results on the long read lengths and high data output are being published. There have been reports of outputs of 1 Gb to 2 Gb per MinION and read lengths of almost 1 million base pairs. What makes this platform really promising is the expected low price and the compact size. Because it is so small, hand-held sequencing devices are becoming a reality. In 2017, BaseClear became the first certified service provider for Nanopore sequencing worldwide.
BioNano Genomics commercially launched its Irys system in 2012. It is still in an emerging phase in the market and is not really a next-generation sequencer. The Irys system makes use of single-molecule next-generation mapping (NGM) by linearizing DNA molecules in massively parallel nanochannels. Instead of sequencing individual bases, the multi-colour imaging instrument detects sites that have been fluorescently labelled via a site-specific nicking endonuclease/polymerase repair reaction. One application is in complementing NGS genome assemblies. Because of its ability to analyse very long (at least up to 1 Mb) unamplified stretches of DNA at a rate of several Gb per hour, it is especially useful for very large and complex genomes. Here, assembled contigs or scaffolds can be aligned against the genome map produced by the Irys system.
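Aligning contigs against a genome map requires knowing where the labelled sites would fall in each contig. The Python sketch below shows one simple way to predict these positions by searching a contig for a recognition motif on both strands. The motif GCTCTTC, the example contig and the function names are assumptions chosen for illustration; the motif to use depends on the nicking endonuclease in the actual experiment.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def label_sites(sequence: str, motif: str = "GCTCTTC") -> List[int]:
    """Return sorted 0-based start positions of the motif on either strand.

    The default motif is an illustrative assumption, not a fixed property
    of the mapping platform.
    """
    seq = sequence.upper()
    sites = set()
    for pattern in (motif, reverse_complement(motif)):
        start = seq.find(pattern)
        while start != -1:
            sites.add(start)
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(sites)


def site_spacings(sites: List[int]) -> List[int]:
    """Distances between neighbouring label sites, the pattern a map is matched on."""
    return [b - a for a, b in zip(sites, sites[1:])]


contig = "AAGCTCTTCAAAAGAAGAGCTT"  # hypothetical contig with one site per strand
sites = label_sites(contig)
print(sites, site_spacings(sites))  # [2, 13] [11]
```

The resulting pattern of distances between sites, rather than the base sequence itself, is what gets compared with the measured genome map.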
WHAT ABOUT THE FUTURE OF SEQUENCING TECHNOLOGY?
This is of course just a selection of the most important sequencing platforms available. At the moment there are many emerging technologies. Most of them are based on nanopore-like technologies, but there are also some that have the potential to change the market forever. Of course, BaseClear and I are more than happy to keep you informed about the latest developments. Do not hesitate to contact us if you would like more information.