Commentary

NGS Data Analysis and Use: Turning Challenges Into Success

By Oliver Picken
03 November 2021
The applications of NGS are swiftly expanding, and new methods for data storage, analysis, and visualisation are needed. However, the complexity of sample processing for NGS has created problems in managing, analysing, and storing the datasets. 

The global next-generation sequencing data analysis market is predicted to reach USD 1.72 billion by 2028. Although the market has evolved thanks to the latest technological developments, many research organisations struggle to adopt the advanced tools and platforms needed to perform and speed up NGS data analysis. One of the key issues is a lack of sufficient computational infrastructure and bioinformatics tools.

The NGS data analysis process includes three main steps: primary, secondary, and tertiary analysis. Some stages can be automated on the sequencing instrument, while others are completed afterwards. Generally, NGS data analysis includes assembly, alignment, identification of mutations, verification, and visualisation.

Next-generation sequencing (NGS) is an emerging technology used to analyse DNA and RNA sequences. It can be used to sequence whole genomes or specific regions of interest at a much lower cost than traditional sequencing.

History of NGS: Data and Cost 

Figure: The Cost of Sequencing a Human Genome. Courtesy of the National Human Genome Research Institute.

NGS became commonplace in the 2000s. It now provides an efficient, rapid, low-cost approach to DNA sequencing compared to traditional Sanger methods.

The “$1,000 genome” catchphrase began to gain attention in 2001, following the first draft of the Human Genome Project. The phrase highlighted the main issue at the time, which was cost: mapping the entire human genome was estimated to cost around $2.7 billion, while $1,000 was the goal for affordable personalised genome sequencing.

This goal was reached around 2015, making genome sequencing accessible to millions of people worldwide, and prices have continued to drop. While these rapid advancements have opened new opportunities in life science and pharmaceuticals, the volume of data generated has created problems for analysis. Below we look at some of the key ways researchers are working to overcome the challenges of NGS data analysis.

NGS Data Analysis: Volume and Standardisation  

Single experiments can produce terabytes of data. Such high volumes of data can provide incredible insights, but only if they are processed and analysed correctly. Given sufficient computational resources, the overall workflows can be streamlined and accelerated by establishing centralised standard pipelines.

The aims of a study define the analysis methods. With that said, scientists around the globe have made some progress in creating standardised pipelines for data processing and analysis. However, analysis workflows need to strike a balance between standardisation and flexibility. They still need to be adaptable enough for use in customised analyses and to adopt novel methods quickly.  
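
One way to picture this balance is a shared default pipeline whose individual steps can be swapped out for a particular study. The Python sketch below is illustrative only. The step names echo tasks mentioned earlier in this article (alignment, identification of mutations, visualisation), and every function in it (`align`, `call_variants`, `summarise`, `call_variants_ignoring_short_reads`, `run_pipeline`) is a hypothetical placeholder working on toy strings, not a real bioinformatics tool.

```python
from typing import Callable, Dict, Optional

# A step takes the working data (a plain dict here) and returns an updated copy.
Step = Callable[[dict], dict]


def align(data: dict) -> dict:
    # Placeholder for alignment: record where each read first occurs in the
    # reference string (-1 if it does not occur at all).
    positions = [data["reference"].find(read) for read in data["reads"]]
    return {**data, "positions": positions}


def call_variants(data: dict) -> dict:
    # Placeholder for identification of mutations: flag reads that did not match.
    unmatched = [r for r, p in zip(data["reads"], data["positions"]) if p < 0]
    return {**data, "candidates": unmatched}


def summarise(data: dict) -> dict:
    # Placeholder for visualisation: produce a one-line text summary.
    summary = f"{len(data['candidates'])} of {len(data['reads'])} reads need review"
    return {**data, "summary": summary}


# The standardised part: one agreed, ordered set of default steps.
DEFAULT_STEPS: Dict[str, Step] = {
    "alignment": align,
    "variant_calling": call_variants,
    "visualisation": summarise,
}


def run_pipeline(data: dict, overrides: Optional[Dict[str, Step]] = None) -> dict:
    # The flexible part: a study may replace any named step, but the order of
    # steps stays fixed and unknown step names are rejected.
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_STEPS)
    if unknown:
        raise ValueError(f"unknown step(s): {sorted(unknown)}")
    for name, default_step in DEFAULT_STEPS.items():
        data = overrides.get(name, default_step)(data)
    return data


def call_variants_ignoring_short_reads(data: dict) -> dict:
    # A study-specific replacement step: skip unmatched reads under 4 bases.
    unmatched = [
        r for r, p in zip(data["reads"], data["positions"]) if p < 0 and len(r) >= 4
    ]
    return {**data, "candidates": unmatched}


sample = {"reference": "ACGTACGTTAGC", "reads": ["ACGT", "TTAG", "GGGG", "CC"]}
default_result = run_pipeline(sample)
custom_result = run_pipeline(
    sample, {"variant_calling": call_variants_ignoring_short_reads}
)
print(default_result["summary"])
print(custom_result["summary"])
```

Both runs share the same alignment and summary steps. Only the customised analysis replaces the variant-calling step, so the results stay comparable wherever the default was kept.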

Larger datasets can require considerable time and computing power to analyse, which is often cost-prohibitive for all but the wealthiest laboratories. As a result, the majority of NGS software systems are deployed as cloud-based services. Cloud computing provides a different infrastructure for tackling the computational challenges of NGS data analysis and has created new possibilities to analyse NGS data at reasonable cost, especially for laboratories lacking a dedicated bioinformatics infrastructure.

NGS Data Sharing: Consistency and Quality Standards

The lack of data sharing is becoming a significant problem. One of the critical obstacles to data sharing is data quality and consistency. Peter Causey-Freeman (Lecturer in Healthcare Sciences, Clinical Bioinformatics – University of Manchester) explains:

“There are all sorts of different data sets coming out, particularly in clinical genomics. It’s very difficult to interpret whether a variant is damaging or not because we need to access data from multiple sources, however a lack of consistency in nomenclature and quality standards makes this very difficult. Open-source databases such as ClinVar and LOVD are great, but they don’t really apply any stringent criteria on how the decisions made by diagnostic labs are presented, and if you can’t map the decision-making process, it is very difficult to see how relevant the data is and whether it can be used in your setting.

“Developing a more cohesive data sharing strategy so we can bring together clinical data more efficiently would be useful and could be adopted by these databases, enabling them to enforce stricter standards, ultimately making diagnostic decisions faster, more efficient, and more reliable in the future.”

An additional concern with sharing NGS data is privacy.

Data Privacy: A Balancing Act 

The balance between protecting individual privacy and sharing genomic information for research purposes has been a topic of considerable controversy. Gathering useful insights requires large amounts of sequence data, together with health and demographic information, to assess which genetic variants correlate with particular phenotypic outcomes. This in turn requires large numbers of people who are willing to share their whole genome sequence data and associated information.

NGS data has features that make protecting the privacy of research participants and patients more challenging. Sharing additional information alongside, for example, a whole genome sequence makes it easier to identify an individual and connect them to their private health information.
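
The effect of sharing additional information can be illustrated with a toy calculation. The Python sketch below uses entirely made-up records and hypothetical field names (`birth_year`, `postcode_area`, `sex`). It counts how many records share each combination of attributes and reports the size of the smallest group. A smallest group of one means at least one person can be singled out from those attributes alone, before any sequence data is considered.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Made-up records for illustration only; they do not describe real people.
RECORDS: List[Dict[str, object]] = [
    {"birth_year": 1980, "postcode_area": "M13", "sex": "F"},
    {"birth_year": 1980, "postcode_area": "M13", "sex": "M"},
    {"birth_year": 1980, "postcode_area": "M14", "sex": "F"},
    {"birth_year": 1980, "postcode_area": "M14", "sex": "F"},
    {"birth_year": 1975, "postcode_area": "M13", "sex": "M"},
    {"birth_year": 1975, "postcode_area": "M13", "sex": "M"},
    {"birth_year": 1975, "postcode_area": "M13", "sex": "M"},
]


def smallest_group(records: Sequence[Dict[str, object]], fields: Sequence[str]) -> int:
    # Group records by the values of the shared fields and return the size of
    # the smallest group; 1 means somebody is unique on those fields.
    groups = Counter(tuple(record[f] for f in fields) for record in records)
    return min(groups.values())


# Each extra shared attribute can only keep the groups the same size or shrink them.
for shared in (
    ["birth_year"],
    ["birth_year", "postcode_area"],
    ["birth_year", "postcode_area", "sex"],
):
    print(shared, "-> smallest group:", smallest_group(RECORDS, shared))
```

With these toy records the smallest group shrinks from three people to two and then to one as more attributes are shared. That is the sense in which extra information makes an individual easier to identify; this smallest-group size is the quantity that the k-anonymity measure refers to as k.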

Conclusion  

NGS data analysis and usage are enabling new, pioneering treatments, but there are growing concerns that the volume, quality and identifying features of the data create problems for ethical and representative utilisation. Despite these challenges, some of the industry's brightest minds are applying themselves to solutions, and rapid progress is being made each year. For more on this subject, consider attending one of our upcoming Next Gen Omics series events where industry leaders share their latest research and innovations.

Speaker Biographies

Peter Causey-Freeman (Lecturer in Healthcare Sciences, Clinical Bioinformatics – University of Manchester)

Peter Causey-Freeman is a lecturer and researcher in healthcare sciences specialising in medical genomics (bioinformatics) at the University of Manchester. He works in the Division of Informatics, Imaging and Data Sciences and the Integrated Interdisciplinary Innovations in Healthcare Science Hub. He teaches trainee NHS Scientists clinical genomics and genomics bioinformatics in the Manchester Academy for Healthcare Science Education, and also teaches on a wider Masters programme and online PGCert in genomics medicine. He also co-leads the unit Introduction to programming for clinical bioinformatics.

Copyright Oxford Global Marketing Limited. All rights reserved.
