
Genome Sequencing Approaches: Choosing Between Long-Read and Short-Read Sequencing

Short-read sequencing approaches are highly accurate for small DNA sequences, while long-read sequencing allows for the study of complete genomes without fragmentation. Combining both approaches can address individual constraints and yield more comprehensive results in DNA analysis.

Much has been made of the advantages and disadvantages inherent to long-read and short-read genome sequencing.

Both have their own strengths and appropriate use cases: given the high degree of accuracy it offers, short-read sequencing is ideal for the investigation of smaller sequences of DNA. 

While it offers a powerful means of analysing DNA sections, large strands of DNA – such as a complete human genome – have to be fragmented and amplified, which may introduce biases in the samples. 
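To make the fragmentation step concrete, here is a minimal Python sketch of how a longer sequence ends up as a set of short, overlapping reads. The function name `fragment`, the toy sequence, and the fixed read length and step are all assumptions made for illustration: real library preparation shears DNA at random positions and does not tile it at regular intervals.

```python
def fragment(sequence: str, read_length: int, step: int) -> list[tuple[int, str]]:
    """Return (start, read) pairs for fixed-length windows across the sequence.

    Hypothetical helper: real fragmentation is random, not evenly spaced.
    """
    if read_length <= 0 or step <= 0:
        raise ValueError("read_length and step must be positive")
    reads = []
    # Slide a window of read_length along the sequence, moving step bases each time.
    for start in range(0, len(sequence) - read_length + 1, step):
        reads.append((start, sequence[start:start + read_length]))
    return reads


genome = "ACGTTGCAAGGCTTACGGAT"  # toy 20-base sequence, not real data
reads = fragment(genome, read_length=8, step=4)
for start, read in reads:
    print(start, read)
```

Each read overlaps its neighbour by four bases here, which is what later allows the fragments to be pieced back together.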

Here, long-read sequencing is more valuable, as it enables the study of a complete genome without fracturing or segmenting it. 

However, the per-read accuracy of long-read sequencing can be much lower than that of short-read sequencing.

Some of these challenges can be resolved through a combination of short-read and long-read technologies: here, we explore the strengths and weaknesses of both to assess areas where overlap may be crucial to overall success.

Short-Read Sequencing Approaches

Short-read technologies carry out sequencing by either synthesis or ligation: each strategy uses DNA polymerase or ligase enzymes to extend numerous DNA strands in parallel. 

They may be either single-molecule-based or ensemble-based: that is, they sequence either one molecule or multiple identical copies of a DNA molecule that have been amplified together on isolated beads.

These methods can be synchronously controlled or run in real time, subject to the approach being used. 

As one example, real-time short-read sequencing consists of a free-running DNA polymerase that catalyses the incorporation of all available nucleotides.

The method therefore requires newly incorporated nucleotides to be identified as they are being added.

Meanwhile, synchronously controlled approaches use genetic information to facilitate the identification process in an interrupted, step-by-step fashion.

This can be achieved by adding a single type of nucleotide at a time, or by using reversible terminator nucleotides.
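The first of these strategies, offering one nucleotide type at a time, can be sketched in a few lines of Python. This is a simplified model and not any vendor's implementation: the function names `flowgram` and `decode` and the flow order "TACG" are assumptions, and the sketch works directly on the strand being synthesised so that complementary base pairing can be left out.

```python
def flowgram(synthesised: str, flow_order: str = "TACG", max_cycles: int = 100):
    """Simulate flowing one nucleotide type at a time over a growing strand.

    Returns a list of (base, count) pairs, where count is how many bases were
    incorporated during that flow (0 if the flowed base did not match).
    """
    position = 0
    flows = []
    for _ in range(max_cycles):  # bounded so unexpected symbols cannot loop forever
        for base in flow_order:
            if position >= len(synthesised):
                return flows
            run = 0
            # A run of identical bases is incorporated within a single flow.
            while position < len(synthesised) and synthesised[position] == base:
                run += 1
                position += 1
            flows.append((base, run))
    return flows


def decode(flows) -> str:
    """Rebuild the sequence from the per-flow incorporation counts."""
    return "".join(base * count for base, count in flows)


flows = flowgram("TTAGC")
print(flows)
print(decode(flows))
```

Because the instrument knows which nucleotide was offered in each flow, only the size of the signal has to be measured, which is why this style of sequencing proceeds in discrete steps.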

Some examples of short-read sequencing technologies include Illumina’s DNA single-stranded amplification approach, as well as clonal amplification techniques such as 454 pyrosequencing, Ion Torrent, and SOLiD.

These clonal amplification approaches utilise emulsion PCR (polymerase chain reaction), generating microbead-bound DNA clones.

Long-Read Sequencing Approaches 

As previously indicated, long-read sequencing technologies are capable of reading much longer stretches of DNA than short-read technologies – typically between 5,000 and 30,000 base pairs.

These longer reads avoid the amplification bias that can be associated with short-read sequencing approaches, and are long enough to overlap with other reads for better sequence assembly.
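The role of overlaps in assembly can be illustrated with a small Python sketch. The helpers `overlap` and `merge` are hypothetical names, the reads are toy data, and exact string matching is an assumption: real assemblers must tolerate sequencing errors and handle many reads at once.

```python
def overlap(a: str, b: str, min_length: int = 3) -> int:
    """Length of the longest suffix of a that exactly matches a prefix of b.

    Returns 0 if no overlap of at least min_length bases exists.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # Try the longest possible overlap first and shrink until one matches.
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def merge(a: str, b: str, min_length: int = 3):
    """Join two reads across their overlap, or return None if they do not overlap."""
    n = overlap(a, b, min_length)
    return a + b[n:] if n else None


left = "ACGTTGCAAGGC"    # toy reads, not real data
right = "AAGGCTTACGGAT"
print(overlap(left, right))
print(merge(left, right))
```

The longer the reads, the longer and less ambiguous these overlaps tend to be, which is the intuition behind the assembly advantage described above.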

At present, there are two key long-read sequencing technologies on the market: Oxford Nanopore’s long-read sequencing platform, and Pacific Biosciences’ Single-Molecule Real-Time (SMRT) sequencer.

The SMRT sequencer can generate reads in excess of 10,000 bases in less than two hours.

Sequencing takes place on a chip of zero-mode waveguides – tiny structures that create highly confined optical observation volumes – with a DNA polymerase fixed at the bottom of each.

The polymerase synthesises a complementary strand, and the fluorescence emitted during incorporation is measured to identify the corresponding nucleotides.

Oxford Nanopore’s platform is capable of producing reads of up to one million base pairs: the method utilises changes in molecular ion flow as nucleotides pass through a nanopore – hence the name.

The DNA molecule is threaded through a bioengineered channel in a biological membrane. 

Electrical current across the channel is dependent on the specific nucleotide passing through at any given moment, and this current signal is then used to determine the base sequence.
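As a deliberately simplified illustration of turning a current trace into bases, the Python sketch below assigns each measurement to the nearest of four reference levels. Everything in it is an assumption made for illustration: the name `call_bases`, the `LEVELS` values (invented, in arbitrary units), and the idea of one clean measurement per base. In practice the signal reflects several neighbouring nucleotides at once and base calling relies on statistical or machine-learning models.

```python
# Hypothetical mean current level for each base, in arbitrary units.
# These numbers are invented for illustration and are not instrument data.
LEVELS = {"A": 50.0, "C": 60.0, "G": 70.0, "T": 80.0}


def call_bases(samples: list[float]) -> str:
    """Assign each current measurement to the base with the nearest reference level."""
    return "".join(
        min(LEVELS, key=lambda base: abs(LEVELS[base] - sample))
        for sample in samples
    )


trace = [51.2, 78.9, 69.5, 61.0]  # toy measurements, one per base
print(call_bases(trace))
```

Noise in the measured current blurs the boundaries between levels, which is one way to picture why per-read accuracy can be lower for this kind of platform.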

Short-read genome sequencing is better-suited to detailed investigations of a concise section of DNA, while long-read approaches allow researchers to develop a more complete picture of the molecular information being investigated. 
