Precision medicine instruments produce enormous amounts of data, sometimes measured in terabytes or even petabytes. Machine learning (ML), a subfield of artificial intelligence, uses algorithms that learn from data to make accurate predictions.

Supervised ML algorithms use labeled training data to make classifications. For example, ML can identify low-prevalence mutations in circulating tumor DNA (ctDNA) by distinguishing real mutations from artifacts introduced by sequencing technologies. The algorithm learns from examples of mutations and artifacts provided in the training data. The predictions made by these algorithms are then evaluated on independent data, separate from the training set. This allows scientists to determine whether the algorithm generalizes beyond its initial data, that is, whether it spots patterns that can be applied to an individual's ctDNA to produce accurate results.
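As a toy illustration of this train-then-evaluate loop, the sketch below classifies hypothetical mutation candidates with a simple nearest-centroid model. The features, values, and thresholds here are invented for illustration; production variant classifiers are far more sophisticated:

```python
# Toy supervised classifier for variant calling (illustrative only).
# Each candidate mutation is described by two hypothetical features:
# variant allele frequency (VAF) and mean base quality.
# Labels: 1 = real mutation, 0 = sequencing artifact.

def centroid(rows):
    """Mean feature vector of a list of (vaf, quality) tuples."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows) / n for i in range(len(rows[0])))

def train(examples):
    """Learn one centroid per class from labeled training data."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Assign the class whose centroid is closest (Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(model, key=lambda label: dist(model[label], features))

# Labeled training set: artifacts tend toward very low VAF and low quality.
training = [
    ((0.25, 38.0), 1), ((0.12, 35.0), 1), ((0.30, 37.0), 1),
    ((0.01, 22.0), 0), ((0.02, 25.0), 0), ((0.005, 20.0), 0),
]
model = train(training)

# Evaluate on held-out examples the model never saw during training.
print(predict(model, (0.20, 36.0)))  # -> 1 (real mutation)
print(predict(model, (0.01, 23.0)))  # -> 0 (artifact)
```

The key point mirrors the text: the model is judged on examples it never saw, which is what tells us whether it has learned a generalizable pattern rather than memorizing its training set.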

ML offers a robust way to harness the predictive qualities of genomic data and advance precision oncology. The first advantage is speed: instead of being a hindrance, large datasets become an asset, as they provide more information to train the system.

ML can also generate new knowledge and improve accuracy. The complex relationships embedded in data can be challenging to unwind, making it difficult to draw conclusions. However, with the right algorithms, we can identify those relationships to make accurate predictions and to expand our biological understanding. 

Collectively, these advantages improve precision oncology's cost-effectiveness and our confidence in molecular testing. Being able to make better predictions from existing data can accelerate the identification of genetic biomarkers that match patients with targeted treatments shown to improve outcomes, and it can reduce the need for expensive, time-consuming orthogonal validation tests.

Applications in Oncology

Projecting tumor evolution is one example of how ML predictions can be brought into patient care. Phylogenetic trees closely track this process, as early mutations give rise to tumor subclones, which in turn spawn subclones of their own. Humans have a difficult time determining how a tumor has evolved and where its evolution is headed, but ML has the requisite capabilities to make those predictions.

One of the most important ML applications is predicting which mutations may be driving a particular cancer. This is not always clear-cut, as there may be several oncogenes and each one could potentially play the lead role. ML can help untangle which ones are dominant and should be targeted for treatment.

The next step is drug selection. In this scenario, scientists combine the genomic information from the tumor with a drug's chemical structure to predict whether that therapy will be effective.

The same approach can be applied to combination therapies, as ML could delineate how two or more drugs work together against cancer. Machine learning can help predict whether a specific combo will be effective, as well as the side effects a patient might expect. In addition to improving patient care, this could also accelerate clinical trials.

Another very important application is prognostics. On a statistical level, cancer recurs quite frequently; however, on an individual patient level, it's far from certain that a patient's cancer will return. Clinicians are often forced to take a wait-and-see approach, using scans, biopsies and ctDNA tests to determine whether a tumor has returned.

Machine learning could help alleviate this uncertainty, providing more accurate prognostics to guide treatment. If a patient is at much higher risk for a recurrence, clinicians might pursue more aggressive treatments or other strategies that could mitigate risk. On the other hand, a patient with a low chance of recurrence could avoid unnecessary treatments.

Next Steps

One of the major bottlenecks for ML is data labeling, during which biologists and/or computer scientists tag the data, providing context to help the ML algorithm learn. Unfortunately, data labeling is often a manual process that can be cumbersome, expensive and time-consuming. It forces scientists to make choices: Do I take the extra time to include the maximum amount of data to inform the ML model, or do I move forward, more rapidly, with what I have?

Canexia Health and others are exploring semi-supervised learning, in which unlabeled and labeled data are combined into training sets. Even though much of the data is unlabeled, combining it with the labeled variety gives the algorithm important information to make accurate decisions.
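A minimal sketch of one common semi-supervised strategy, self-training, appears below: a model fit on the labeled points assigns pseudo-labels to unlabeled points it is confident about, then refits on the enlarged set. The 1-D data and threshold model are invented for illustration and do not reflect any company's actual pipeline:

```python
# Toy self-training sketch (illustrative semi-supervised learning).
# Labeled points seed a simple 1-D threshold classifier; confident
# predictions on unlabeled points become pseudo-labels, and the
# model is refit on the combined set.

def fit_threshold(points):
    """Fit a 1-D classifier: the midpoint between the two class means."""
    zeros = [x for x, y in points if y == 0]
    ones = [x for x, y in points if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def self_train(labeled, unlabeled, margin=1.0, rounds=5):
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(rounds):
        t = fit_threshold(labeled)
        # Only pseudo-label points far enough from the decision boundary.
        confident = [x for x in unlabeled if abs(x - t) > margin]
        if not confident:
            break
        labeled += [(x, 1 if x > t else 0) for x in confident]
        unlabeled = [x for x in unlabeled if x not in confident]
    return fit_threshold(labeled)

labeled = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]     # a few labeled points
unlabeled = [0.5, 1.5, 8.5, 9.5, 4.9]                   # many more unlabeled
t = self_train(labeled, unlabeled)
print(t)  # -> 5.0
```

Note the confidence margin: the ambiguous point (4.9, near the boundary) is never pseudo-labeled, which is how self-training avoids reinforcing its own mistakes on borderline cases.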

One downside of classical machine learning is the need for human intervention to transform raw data into a set of engineered features usable for model training. This is less of a problem with deep learning, a form of machine learning that uses artificial neural networks (algorithms loosely inspired by the structure of biological brains). Deep learning models are end-to-end, meaning they automatically extract relevant features from the raw data.
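To make "end-to-end" concrete, the toy sketch below passes raw numeric input through a tiny neural network whose hidden-layer activations play the role of automatically learned features. The weights are fixed and purely illustrative; in a real model they are learned from data by gradient descent:

```python
import math

# Toy "end-to-end" neural network forward pass (illustrative only).
# The raw input needs no hand-engineered features: the hidden layer
# computes its own internal representations from the raw values.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(raw, w_hidden, w_out):
    # Hidden layer: each unit computes a weighted combination of the
    # raw input -- these activations are the "learned features".
    features = [sigmoid(sum(w * x for w, x in zip(row, raw)))
                for row in w_hidden]
    # Output layer: classify directly from those learned features.
    return sigmoid(sum(w * f for w, f in zip(w_out, features)))

raw_signal = [0.2, 0.9, 0.1, 0.7]            # stand-in for raw, unengineered data
w_hidden = [[1.5, -2.0, 0.5, 1.0],           # two hidden units (fixed, illustrative)
            [-1.0, 1.0, 2.0, -0.5]]
w_out = [2.0, -1.5]

score = forward(raw_signal, w_hidden, w_out)
print(0.0 < score < 1.0)  # sigmoid output is a probability-like score
```

The contrast with the classical setting is in the inputs: nothing in `raw_signal` was curated by a human; the network's hidden layer does the feature extraction itself.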

This approach could be particularly useful to improve ctDNA “limits of detection,” the lowest concentrations of detectable genetic material that will provide accurate results, which could expand the number of patients who can benefit from this technology. 

Ultimately, the goal is to include multi-modal data from diverse sources, such as ctDNA and imaging, to better understand where a cancer is now and where it could be going. While a challenging application, we are excited to be working on this and look forward to sharing more in the future.
