Developing deep learning tools for medical image classification
For many organizations, realizing the full capability of Artificial Intelligence begins with exploration and an effective roadmap. Because AI can itself create new offerings, products, and services, here we consider deploying machine learning and related tools on medical image data to systematically improve diagnostics. Health problems are fundamental issues for people worldwide, and improving services, delivery methods, and diagnostic accuracy is critical to better health outcomes. The collection and automation of health data play an equally critical role as the industry works towards automated diagnosis and decision-support systems for better patient care.
Examining challenges in medical imaging
Deep learning techniques, based on neural networks and other algorithms, have tremendous potential to influence the practice of radiology and medical imaging at large. Unlike many other fields of medicine, nearly all of the primary data collected in medical imaging is now digital, and with advances in modern electronic health records, cloud networking, infrastructure, and clinical data distribution, these digital data lend themselves to analysis by Artificial Intelligence.
The result is an increasingly large amount of image data available to answer clinically meaningful questions and to develop advanced medical applications. Deep learning models, coupled with image processing, can be used for detection, diagnosis, segmentation, and simulation using raw MRI, CT, X-ray, and digital pathology data. This boom in innovation is partly due to the significant advances in the computational processing of very large datasets. NVIDIA is at the forefront of this computing renaissance, with its GPU architecture, and collectively, these advances constitute a disruptive force that will create new markets and drive transformation for many medical device, provider, and payer organizations.
There have been several recent successes in applying deep learning to medical imaging tasks, notably algorithms for detecting diabetic retinopathy, detecting skin lesions in dermatoscopic images, detecting cancer metastases in pathology and radiology, and identifying numerous anatomical and bone fracture conditions in X-ray images. Medical imaging is an optimal opportunity for deep learning because there is often a direct mapping from the input image pixel data to a specific diagnosis, and because regulatory approval for AI-based tools is increasing.
SFL Scientific partners with organizations to accelerate the development of these novel tools by combining medical image analysis and deep learning. SFL Scientific teamed with InformAI, a Houston-based AI company, to apply data-centric strategies and technology solutions to create clinical decision-support tools with medical-grade accuracy that meet the confidence and consistency standards required for rigorous healthcare use. These tools will assist clinicians with “augmented intelligence” by providing speed, filtering, and diagnostic capabilities, as well as helping mitigate the black-box mysticism of AI.
InformAI and SFL Scientific are fortunate to be working on applications that have a clinical impact and improve patient outcomes. The two companies are poised at the tip of the iceberg of a market for scientists, physicians, and researchers to extract critical care information from increasingly available unstructured data.
Medical applications for deep learning
Every year, over 700,000 surgical procedures are performed to remedy sinus-related conditions, and $10B is spent annually in the United States on such medical care. Computed tomography (CT) is the primary imaging modality used to diagnose these conditions, and surgeries are performed to address soft tissue inflammation, nasal deviations, fractures, and tissue masses. Together with InformAI’s R&D team, SFL Scientific worked towards developing solutions to classify and predict sinus-related conditions and disease states from 3D CT head scans. The goal was to create a suite of diagnostic tools centered around deep learning image classifiers to predict the occurrence of certain diseases and to assist radiologists and physicians by accelerating the diagnostic process.
Deep learning applications often rely on extremely large datasets; however, ground truth data annotated by expert radiologists and physicians is hard to obtain for many reasons, including privacy concerns, commercialization aspects, time and manpower considerations, and, of course, cost. Simply put, annotation of medical data is expensive, tedious, and requires a dedicated team with access to archived patient cases. For certain rare medical conditions, it is nearly impossible to collect enough representative examples for training datasets.
In late 2017, InformAI and SFL Scientific started working together to develop a deep learning-based system for computer-aided diagnostics, bringing a new level of intelligence to legacy imaging solutions and clinics. To accomplish these goals, InformAI assembled some of the largest patient image datasets in the industry for AI model training development.
One of the biggest issues facing healthcare radiology professionals is the operational fatigue that comes with information overload and visual strain associated with the review of medical images. In fact, this problem represents the “weak link” of a medical diagnosis that often slows down time to decision, and furthermore, may affect both the volume and quality of those medical decisions. To complicate matters, many parts of the world lack the volume of trained clinical staff to diagnose these indications within their local communities. For all these reasons, computer-aided diagnostic tools can affect the overall success of patient treatment and potentially reduce medical errors by working as an unbiased “assist tool” to reduce variability and increase the specificity of readings. The amount of information available to clinicians is overwhelming, so systems need to aim at reducing the time to identify abnormalities and help improve the productivity of clinicians and the associated diagnostic accuracy. Similarly, automating routine image examination stands to reduce the burden placed on clinical staff who are already pressed for time, especially in emergency or surgical scenarios.
How do you teach a machine to behave like a radiologist and be trusted as an assist tool in a clinical setting? When radiologists study patient scans, they consider patient history and external variables, leveraging years of training and experience to identify patterns in medical images that historically are indicative of a medical issue. Today, there are vast archives of patient records that can be used for such an algorithmically driven project; this vault of normal and anomalous findings in medical records and corresponding imaging tests is used as the training input to build custom AI algorithms. Leveraging its unique position with its clients, InformAI worked with its Texas Medical Center partners to amass an image library consisting of 18 million images and, using experts, structured the medical labeling of over 20,000 patient CT studies. Jim Havelka, CEO of InformAI, believes it is the largest labeled and annotated image dataset of its kind involving paranasal sinus conditions.
Classification of disease in 3D
For machine learning, predictive performance is only as good as the underlying criteria, data quality, and annotation definitions. A framework was created to review and segment over 23 diseases within the sinus scans into specific regions and groups, greatly reducing the region of interest of the resultant 3D image stack. The cost of maintaining an expert annotation team, and the volume of data required to do so, motivated the development of learning methods that can use weakly-labeled training sets with global, binary labels on those regions to rapidly differentiate abnormalities from the healthy state. By feeding increasingly complex datasets to the neural network, with preprocessing, subsampling, and augmentation techniques to computationally produce a larger volume of data, these algorithms over time begin to behave like a trained radiologist: quickly identifying anomalies and providing confidence intervals for areas that require a second, human opinion. Despite these recent advances, standalone tools with no physician involvement remain a challenge: making personalized predictions from large amounts of noisy, biased, and unstructured data is inherently difficult.
A thorough exploratory data analysis is critical, and inspection of the available images must precede solution design; understanding the variability, image size, disease incidence, modalities, and other parameters directly affects the final solution.
Instead of the normal shades of gray that a human radiologist sees in CT scans, a computer represents each image as a matrix of numbers encoding pixel brightness. Traditional computer vision techniques typically involve computing the presence of numerical patterns in this matrix, such as boundaries for low-level features, and applying machine learning algorithms designed to distinguish images based on these features. Significant expertise and time are required to engineer the best features for distinguishing specific conditions and separating classes of images. This feature-engineering problem is traditionally difficult, and sidestepping it is the basis of deep learning, which uses hierarchical abstractions and stacked functional layers to learn representations and features directly from the data.
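As a minimal illustration of the pixel-matrix view above, the sketch below builds a tiny synthetic "slice" (not real CT data) and computes one classic hand-engineered feature, the intensity-gradient magnitude, which highlights tissue boundaries:

```python
import numpy as np

# To a computer, a CT slice is just a matrix of intensity values
# (here a tiny synthetic 8x8 "slice" with a bright square of dense tissue).
slice_2d = np.zeros((8, 8))
slice_2d[2:6, 2:6] = 1000.0

# A classic hand-engineered feature: vertical and horizontal intensity
# gradients, whose magnitude highlights boundaries (edges).
gy, gx = np.gradient(slice_2d)
edge_magnitude = np.sqrt(gx**2 + gy**2)

# Edges appear only at the tissue boundary; flat regions give zero.
print(edge_magnitude[4, 0] == 0.0)  # background: no edge
print(edge_magnitude[4, 2] > 0.0)   # boundary of the bright square: edge
```

A deep network learns filters like this gradient operator (and far more complex ones) automatically, rather than having them designed by hand.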
Convolutional neural networks (CNNs) have been used very successfully for image classification and other imaging tasks. Convolution operations extract image features and produce matrices; these “layers” in a CNN generate output matrices stacked in a volume, and this volume can then serve as the input to another layer that detects more complex features in the input image. The final layers of the network convert these learned features into class probabilities at the output nodes. As no pre-trained deep networks are readily available for 3D CT or MRI image datasets for baseline testing, CNNs for these images need to be trained from raw patient data with large labeled datasets.
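The convolution operation described above can be sketched in plain numpy; this is an illustrative single-filter, single-channel version (frameworks such as TensorFlow apply many filters at once and add learned biases and nonlinearities):

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a 3D volume ('valid' padding, stride 1),
    producing one output feature volume."""
    kd, kh, kw = kernel.shape
    d, h, w = volume.shape
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                patch = volume[z:z + kd, y:y + kh, x:x + kw]
                out[z, y, x] = np.sum(patch * kernel)
    return out

# Tiny synthetic CT sub-volume and a 3x3x3 averaging kernel.
vol = np.ones((6, 6, 6))
kernel = np.full((3, 3, 3), 1.0 / 27.0)

feature_map = conv3d_valid(vol, kernel)
print(feature_map.shape)  # (4, 4, 4): each 'valid' layer shrinks the volume
```

Stacking the outputs of many such filters gives the feature volume that feeds the next layer, as the text describes.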
InformAI compiled large volumes of 3D CT images in DICOM format from its partner network of Houston-based medical centers and recruited expert radiologists to assist in annotation and diagnosis. SFL Scientific worked with InformAI to transform the input images and build the training dataset, with extensive work to tune the locations of the disease zones in response to feedback from the radiologist teams and clinical researchers. To create a functional deep learning pipeline, a suite of tools was developed to take anonymized patient studies and annotate and process the raw DICOM data into a format suitable for AI model development. Further tools automated the extraction of target 3D image segments from the series of 3D CT scans to prepare and ingest into the TensorFlow CNN models. The extracted segments corresponded to the regions that radiologists use to make diagnoses from the coronal, sagittal, and axial views of the human head. The software tools also reduced scan size, segmented the disease region of interest, and lowered image noise to improve AI model prediction accuracy, reducing both the memory requirements and the time required for training. These volumes are quite large, with scans of approximately 400 x 400 x 300 voxels, so the memory required surpassed what is typically available on consumer GPU hardware.
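The cropping, downsampling, and normalization steps above can be sketched as follows; the volume is a small synthetic stand-in for a full ~400 x 400 x 300 voxel scan, and the region-of-interest bounds are illustrative, not clinical:

```python
import numpy as np

# Small synthetic stand-in for a full head scan stacked from an ordered
# series of DICOM slices (real scans are ~300 x 400 x 400 voxels).
rng = np.random.default_rng(0)
scan = rng.normal(size=(30, 40, 40))

# Crop an assumed region of interest (illustrative bounds), discarding
# anatomy the classifier does not need.
roi = scan[10:22, 12:28, 12:28]

# Downsample by 2 along each axis, cutting memory roughly eightfold.
roi_small = roi[::2, ::2, ::2]

# Normalize intensities to zero mean, unit variance for the network.
roi_small = (roi_small - roi_small.mean()) / roi_small.std()

print(roi.shape)        # (12, 16, 16)
print(roi_small.shape)  # (6, 8, 8)
```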
Overcoming the complexity of 3D image data
As part of this image preprocessing, registration algorithms were developed to accurately align anatomical features within regions of interest, resampling and normalizing the data in the process. As the disease features are very small, sometimes only a few voxels, training a CNN to detect and generalize a particular disease within a volume approximately 300,000 times the disease size requires many training epochs and computational scalability; using a typical computational platform to train these large CNNs is not practical. SFL Scientific brought together partners such as NVIDIA and AWS to secure a mini-cluster of high-end GPUs, such as NVIDIA V100s, to serve as the computational platform for project development. Given the ingestion and transfer requirements for this large volume of data, the teams leveraged Amazon S3 as a central repository, as other storage schemes were not practical for maintaining the high data throughput required during training. By developing all code initially in Python and TensorFlow, the systems support transferability and ease of deployment, and operate in easily provisioned environments across locations.
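The resampling step in such a registration pipeline can be sketched in pure numpy; this is a hedged nearest-neighbor version (production registration would use a dedicated interpolation library) that brings scans with differing voxel spacings onto a common isotropic grid:

```python
import numpy as np

def resample_isotropic(volume, spacing_mm, target_mm=1.0):
    """Resample a 3D volume to an isotropic grid via nearest-neighbor
    lookup, so anatomical features align across patients."""
    new_shape = tuple(int(round(n * s / target_mm))
                      for n, s in zip(volume.shape, spacing_mm))
    # Map each output voxel back to its nearest source voxel.
    idx = [np.minimum((np.arange(m) * target_mm / s).astype(int), n - 1)
           for m, n, s in zip(new_shape, volume.shape, spacing_mm)]
    return volume[np.ix_(idx[0], idx[1], idx[2])]

# A 2x2x2 volume with 2 mm voxels becomes 4x4x4 at 1 mm spacing.
vol = np.arange(8).reshape(2, 2, 2).astype(float)
out = resample_isotropic(vol, spacing_mm=(2.0, 2.0, 2.0))
print(out.shape)  # (4, 4, 4)
```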
Together, InformAI and SFL Scientific developed the deep learning technology stack, involving 3D CNN models of approximately 400 million parameters, to detect the targeted list of medical conditions. As the accuracy of the final model relies on the quality of the dataset used for training the deep neural network, the extended curation, assembly, and pre-processing development was critical to performance. To refine the InformAI model, SFL Scientific conducted an extensive exploratory data analysis (EDA) on the datasets to understand the incidence and correlations of the medical conditions, concluding that, despite the availability of thousands of annotated scans, certain class imbalances, rare conditions, and positive examples of conditions of interest would still be underrepresented and would prevent achieving the accuracies necessary for viability in a real-world clinical setting.
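One standard way to quantify and partially compensate for the class imbalance such an EDA reveals is inverse-frequency class weighting; the label counts below are hypothetical, purely to illustrate the computation:

```python
import numpy as np

# Hypothetical binary labels for one condition across annotated scans:
# the positive (diseased) class is heavily underrepresented.
labels = np.array([0] * 950 + [1] * 50)

# Inverse-frequency class weights let rare positives contribute as much
# to the training loss as the abundant negatives.
counts = np.bincount(labels)
weights = len(labels) / (len(counts) * counts)

print(round(float(weights[0]), 2), float(weights[1]))  # 0.53 10.0
```

Weighting alone cannot create unseen variation in rare conditions, which is why the augmentation strategies described next were also needed.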
To solve this problem, SFL Scientific used data augmentation techniques to increase the effective size of the training sets by exploiting the symmetries of 3D space: random rotations, random transformations, random zooming by small amounts, mirroring, and so on. Typically, two types of data augmentation can be deployed: lossless and lossy. Given the 3D nature of the data, lossless augmentation refers to the symmetries of the cube, where rotations by 90º in any direction, brightness/contrast adjustments, or de-noising steps preserve the fidelity of the image data. Lossy augmentation consists of small rotations (less than 90º), zooming, and down-sampling of the data for multiple views. While lossless augmentations can be generated quickly on GPUs as the data is read into the network, lossy augmentation requires slow operations that create new data on storage, as each 3D transformation must be individually computed and applied to each voxel.
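The lossless cube symmetries can be sketched with numpy; the particular combination of flips and 90-degree rotations below is illustrative, not the project's exact recipe:

```python
import numpy as np

rng = np.random.default_rng(42)

def lossless_augment(volume):
    """Apply a random lossless cube symmetry: axis flips plus a
    90-degree rotation. No interpolation, so voxel values are
    rearranged but never altered."""
    out = volume
    for axis in range(3):
        if rng.random() < 0.5:
            out = np.flip(out, axis=axis)
    # Rotate by a random multiple of 90 degrees in a random plane.
    axes = tuple(rng.choice(3, size=2, replace=False))
    out = np.rot90(out, k=int(rng.integers(0, 4)), axes=axes)
    return out

vol = rng.normal(size=(32, 32, 32))
aug = lossless_augment(vol)

# The transform preserves the exact multiset of voxel values.
print(np.allclose(np.sort(vol, axis=None), np.sort(aug, axis=None)))  # True
```

Lossy augmentations (small-angle rotations, zooms) would instead interpolate between voxels, which is why they are precomputed to storage rather than applied on the fly.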
For high-throughput performance, lossy data needs to be generated ahead of time and saved independently, effectively increasing the amount of data to be stored. By creating these various data assets with smaller inputs and features, curriculum learning can then be employed to progressively train the model, in the same manner that humans learn increasingly complicated tasks by incrementally adding difficulty. Specifically, learning performance is typically much better when the examples are not presented randomly during training, but fed in a meaningful order that illustrates gradually more complex concepts, producing a much more robust training strategy. Further, curriculum learning can be seen as a general strategy for the global optimization of non-convex functions.
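A minimal curriculum-learning sketch follows; the example names and difficulty scores are hypothetical, and a real pipeline would derive difficulty from properties such as lesion size or image noise:

```python
# Hypothetical (name, difficulty) pairs for sinus-condition examples.
examples = [
    ("large_polyp", 0.1), ("clear_deviation", 0.2),
    ("small_polyp", 0.5), ("subtle_mucosal_thickening", 0.9),
]

def curriculum_stages(examples, n_stages=2):
    """Yield progressively larger training subsets, easiest first,
    so each stage adds harder cases to what was already learned."""
    ordered = sorted(examples, key=lambda e: e[1])
    step = max(1, len(ordered) // n_stages)
    for end in range(step, len(ordered) + 1, step):
        yield [name for name, _ in ordered[:end]]

stages = list(curriculum_stages(examples))
print(stages[0])   # easiest half first
print(stages[-1])  # the final stage trains on everything
```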
For these novel and complex identification tasks, building the network architecture and training from scratch is the only option to achieve state-of-the-art results and create clinically viable products. Each process can take several weeks to months and requires numerous iterations between pre-processing, annotation and label quality, and benchmarking the network across different sets of parameters, images, or diseases. Hyperparameter tuning is difficult for deep learning models because the number of iterations combined with the time per iteration makes exhaustive search intractable. Semi-automated methods that progressively iterate over different combinations are beginning to be developed; however, there is no substitute for a deep understanding of the mathematical options and leveraging cutting-edge techniques, similar to executing traditional R&D processes. Tuning in this case usually involves manual inspection of performance, identification of areas of confusion, and dependable experimental design.
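One common semi-automated approach of the kind mentioned above is random search over a small hyperparameter space; the space below is illustrative, not the project's actual configuration:

```python
import random

random.seed(0)

# Assumed hyperparameter space (illustrative values only). Batch sizes
# stay tiny because each 3D volume is memory-heavy.
space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [2, 4, 8],
    "dropout": [0.2, 0.3, 0.5],
}

def sample_configs(space, n_trials):
    """Randomly sample configurations instead of exhaustively gridding."""
    return [{k: random.choice(v) for k, v in space.items()}
            for _ in range(n_trials)]

trials = sample_configs(space, n_trials=5)
for cfg in trials:
    # A train-and-validate run would go here; each run can take hours
    # for a ~400M-parameter 3D CNN, so n_trials must stay small.
    pass

print(len(trials))  # 5
```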
Models can be deployed as cloud services or on-premise alongside screening areas and surgical suites. If organizations choose a cloud-based solution (all major cloud providers have extensive GPU resources and can support data pipelines and deep learning environments), the generic paradigm to effectively address scaling is to host the model, training, inferencing, model feedback loop, and model versioning remotely. Then, each model solution developed can become an “edge deployment” of the best model trained. Typically, such deployment-at-the-edge solutions address potential latency constraints, as well as scalability ones: The necessary edge software is installed locally, and scalability can be achieved with proper cloud configuration to respond elastically to on-demand use.
InformAI worked with its healthcare partners to amass an image library consisting of 18 million images and structured medical labeling of over 20,000 patient studies. InformAI is focused on building AI-enabled tools to assist healthcare organizations with improving operational efficiency, patient outcomes, and medical diagnosis. By working as an assist tool, the developed InformAI Sinus Classifier can help radiologists and physicians speed up the evaluation and triage of 3D CT scans, from workflow to filtering, and diagnosis of sinus medical conditions using an image viewer/AI algorithm product at the point-of-care.
SFL Scientific develops corporate and technical vision and provides implementation, model development, and computing capabilities that are instrumental to building these products and services. As a preferred service provider of NVIDIA and AWS, SFL Scientific creates enterprise-grade data and computing architectures, where organizations are able to assemble and deploy accurate production tools backed by expertise in a healthcare and biotechnology environment.
The ability to couple large medical imaging datasets from our leading healthcare partners with the exceptional computational performance of the NVIDIA V100 GPUs and novel deep learning model development from SFL Scientific have been critical in building our portfolio of AI image classifier applications.
—Jim Havelka, Founder & CEO