Cancer Diagnosis from Blood Samples
At a Glance
Client
Industry
Challenge
Design an AI-capable model that analyzes data from matrix-assisted laser desorption/ionization (MALDI) and custom LC-MS/MS systems to determine whether patient samples test positive for certain types of lung cancer.
Solution
Built XGBoost classification models that predict a binary value for each new spectrum, one model for each of the four classifiers. Built and tested deep learning models that, despite limited and noisy training data, achieved accuracy scores exceeding those of legacy approaches developed over the past 20 years.
Expertise & Technology
Business Challenge
Design and build four AI-capable models that analyze data from matrix-assisted laser desorption/ionization (MALDI) and custom LC-MS/MS systems to determine whether patient samples test positive for certain types of lung cancer, while keeping accuracy robust across different instruments and operators.
SFL Scientific Solution
SFL Scientific built XGBoost classification models that predict a binary value for each new spectrum, one model for each of the four classifiers. Each classifier corresponds to a label assigned on the basis of features extracted from processed MALDI-TOF spectra.
The four classifier outputs are intended to be combined via a decision tree to generate a final diagnostic test result. The work therefore focused on predicting the four underlying classifiers accurately and simply, which retains flexibility in how the decision tree for the final-stage diagnostic is built later.
The models were trained and tested on tabular numerical spectral data with 282 features.
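The combination step described above can be illustrated with a minimal sketch. The source does not name the four classifiers or give the rules of the decision tree, so the classifier names (clf_a to clf_d), the branching rules, and the function name final_result below are all hypothetical placeholders, not the actual diagnostic logic.

```python
from typing import Mapping

# Hypothetical names: the source does not name the four classifiers.
CLASSIFIERS = ("clf_a", "clf_b", "clf_c", "clf_d")


def final_result(preds: Mapping[str, int]) -> str:
    """Combine four binary classifier outputs into one test result.

    The branching rules are illustrative only; the real decision tree
    is not described in the source.
    """
    missing = [name for name in CLASSIFIERS if name not in preds]
    if missing:
        raise ValueError(f"missing classifier outputs: {missing}")
    if any(preds[name] not in (0, 1) for name in CLASSIFIERS):
        raise ValueError("classifier outputs must be 0 or 1")

    # Hypothetical root split: clf_a routes the sample to one of two branches.
    if preds["clf_a"] == 1:
        # Hypothetical rule: clf_b decides within this branch.
        return "positive" if preds["clf_b"] == 1 else "negative"
    # Hypothetical rule: clf_c and clf_d must both fire in the other branch.
    if preds["clf_c"] == 1 and preds["clf_d"] == 1:
        return "positive"
    return "negative"


example = {"clf_a": 1, "clf_b": 1, "clf_c": 0, "clf_d": 0}
print(final_result(example))
```

Keeping the four predictions separate from the combining function means the tree can be redefined later without retraining the underlying models.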
Results
The final models all achieved accuracy of 80% or higher. Once integrated and approved by regulatory agencies, the techniques will significantly enhance diagnostic capability, for example by reducing the time and variability of diagnosis, and will dramatically improve patient treatment planning.