The Sound of Streamlined Diagnosis
How HiFi sequencing could revolutionize how we diagnose rare diseases – in just a single test run
Markella Loi | 4 min read
Many rare diseases have a complex presentation with overlapping symptoms – and patients are often misdiagnosed or subjected to multiple “process-of-elimination” tests. Microarray analysis, optical genome mapping, and short-read exome sequencing make up the standard approach to identifying rare disease markers. But these methods fail to provide answers in more than 50 percent of cases, necessitating further tests and medical procedures.
Long-read high-fidelity (HiFi) sequencing is a promising tool that could help eliminate this complex multi-stage testing process and deliver a more comprehensive view of the genome in a single test. Here, Neil Ward, VP EMEA at PacBio, explains how this technology could be a time- and cost-effective alternative to current diagnostic tools for rare eye diseases and other complex genetic conditions.
Could you describe the key features of this sequencing technology?
HiFi sequencing has its origins in the nanofluidic designs and single molecule real-time chemistry developed by Stephen Turner and Jonas Korlach at Cornell University in the early 2000s.
HiFi sequencing begins when circularized fragments of sample DNA are applied to the surface of a nanofluidic chip called a SMRT (single molecule, real-time) cell. Once on the chip, copies of the DNA molecule are made with fluorescently labeled, free-floating nucleotides. As these nucleotides are added to the newly replicated strand, light is emitted – with different nucleotides emitting different light waves. The system measures these light waves to determine the nucleotides incorporated into the new strand and therefore the DNA sequence. Because the DNA used in HiFi sequencing is circular, multiple copies of each piece of DNA are generated, allowing the sequencing system to cross-reference each copy and maximize accuracy.
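The cross-referencing step described above can be illustrated with a minimal sketch. It assumes the repeated passes over the circular molecule have already been aligned to the same length and uses a simple per-position majority vote; the function name `consensus_from_passes` and the example sequences are hypothetical, and production consensus callers rely on more sophisticated statistical models than this.

```python
from collections import Counter


def consensus_from_passes(passes):
    """Majority-vote consensus across equal-length, pre-aligned passes.

    Illustrative simplification only: real consensus calling also has to
    handle insertions, deletions, and per-base quality information.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")

    consensus = []
    # zip(*passes) yields one column per position: the base each pass reported.
    for column in zip(*passes):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five hypothetical passes over the same molecule; two contain a single error.
passes = [
    "ACGTTGCA",
    "ACGATGCA",  # error at index 3
    "ACGTTGCA",
    "ACCTTGCA",  # error at index 2
    "ACGTTGCA",
]
print(consensus_from_passes(passes))  # ACGTTGCA
```

Because the two errors fall at different positions and in different passes, each is outvoted by the other four copies, which is the intuition behind why more passes tend to give a more accurate read.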
The applications of HiFi technology have expanded beyond sequencing just the genome to include the methylome, epigenome, and transcriptome – and, as the technology advances, it is becoming faster and more affordable without compromising on accuracy. It can provide researchers with very consistent sequencing performance with reads that are 15,000 to 20,000 bases or more in length, with 99.99 percent accuracy.
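To put the accuracy figure in context, the short calculation below converts it to the Phred quality scale commonly used in sequencing (Q = -10 · log10 of the error probability) and estimates the expected number of errors in a read. It assumes the quoted 99.99 percent is a uniform per-base accuracy, which is a simplification; the function names are illustrative only.

```python
import math


def phred_quality(accuracy):
    """Convert a per-base accuracy (0-1) to a Phred quality score."""
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def expected_errors(read_length, accuracy):
    """Expected number of erroneous bases, assuming uniform per-base accuracy."""
    return read_length * (1.0 - accuracy)


accuracy = 0.9999  # the 99.99 percent figure quoted above
print(round(phred_quality(accuracy)))  # 40

for length in (15_000, 20_000):
    print(length, round(expected_errors(length, accuracy), 2))
```

Under this assumption, 99.99 percent corresponds to roughly Q40, or about one to two expected errors across a read of 15,000 to 20,000 bases.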
How does long-read sequencing compare with short-read in terms of time and cost?
Short reads and long reads are very similar in time per genome: approximately a day of sequencing time. One big difference is that short-read sequencing scales more effectively. Long-read instruments can currently measure millions of large DNA fragments simultaneously, whereas high-throughput short-read sequencers are capable of measuring billions of small DNA fragments at the same time. That higher density enables a lower price per sample when multiple samples are sequenced in the same run.
Another important difference is that all short-read platforms rely on amplifying copies of the starting DNA, and this introduces bias. Long-read sequencing does not need to amplify the starting DNA, and that gives more uniform representation of the genome. Short read lengths can also make data analysis challenging, as many reads can align to more than one location in the reference genome. Reads longer than 10,000 bases are long enough to align uniquely in the human genome and therefore deliver a significantly more comprehensive view of genetic variation across multiple regions of the genome, which is essential for complex rare disease cases. So, in the context of diagnosing rare diseases, long reads can be more cost-effective as only a single test is required.
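The point about ambiguous alignment can be shown with a toy example. The sketch below builds a hypothetical reference containing the same repeat twice and counts exact-match placements for a short read that lies inside the repeat and a longer read that extends into the unique flanking sequence. All sequences and the `count_placements` helper are invented for illustration; real aligners use indexed search and tolerate mismatches rather than requiring exact matches.

```python
def count_placements(reference, read):
    """Count the positions where `read` matches `reference` exactly."""
    count = 0
    start = reference.find(read)
    while start != -1:
        count += 1
        # Continue searching one base further along so overlaps are counted.
        start = reference.find(read, start + 1)
    return count


# Hypothetical reference: the same 10-base repeat appears twice,
# separated and flanked by unique sequence.
repeat = "ACGTACGTAC"
reference = "TTGCA" + repeat + "GGATC" + repeat + "CCTAG"

short_read = repeat[2:8]             # lies entirely inside the repeat
long_read = "GCA" + repeat + "GGA"   # spans the repeat plus unique flanks

print(count_placements(reference, short_read))  # 2
print(count_placements(reference, long_read))   # 1
```

The short read fits equally well in both copies of the repeat, while the longer read is anchored by the unique sequence on either side and can be placed in only one location.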
What does this mean for patients with rare eye diseases or impairments in vision?
More than 70 percent of rare diseases have genetic origins, meaning that genomic technologies should be able to provide insights into the causal mechanisms of many of these diseases. Yet when short-read exome sequencing is used, only 30-40 percent of patients receive a diagnosis. With its higher-quality data, it is no surprise that long-read sequencing has been so helpful in finding explanations for rare diseases and delivering insights to people faster.
There are hundreds of genes associated with vision impairment. Because of this, it’s often more cost-effective to sequence the whole genome and focus the bioinformatic analysis on those known genes. However, many of these genes are challenging to sequence, for different reasons. For example, the RPGR gene is involved in retinitis pigmentosa, but it is difficult to amplify and therefore isn’t well sequenced by short-read approaches. There are also families of genes, such as those encoding opsins, where the genome may have multiple copies of very similar genes, which can be difficult to distinguish from one another with short reads.
How would you describe the future of this technology?
There is still research to do before HiFi can be integrated into routine practice, but many researchers are pioneering this work. It is quickly becoming the gold standard for genomic sequencing, although cost and time have previously been barriers to adoption at scale. As the technology continues to advance, the cost is decreasing and throughput is increasing. Long-read whole genome sequencing is becoming competitive with other sequencing systems on the market that offer lower-quality reads with reduced accuracy for a lower cost.
We hope that eventually HiFi sequencing will replace current workflows and inform new avenues of research that could explain the genetic origins of unsolved rare disease mysteries that researchers are still grappling with. The difference in data quality and accuracy when HiFi is used is substantial, and we really do believe that these new insights will make a real difference to the lives of those currently waiting for an explanation.
Associate Editor, The Analytical Scientist