TY  - JOUR
TI  - Temporal stability assessment in shear wave elasticity images validated by deep learning neural network for chronic liver disease fibrosis stage assessment
AU  - Gatos, I.
AU  - Tsantis, S.
AU  - Spiliopoulos, S.
AU  - Karnabatidis, D.
AU  - Theotokas, I.
AU  - Zoumpoulis, P.
AU  - Loupas, T.
AU  - Hazle, J.D.
AU  - Kagadis, G.C.
JO  - Medical Physics
PY  - 2019
VL  - 46
IS  - 5
SP  - 2298-2309
PB  - John Wiley and Sons Ltd
SN  - 0094-2405
DO  - 10.1002/mp.13521
KW  - Clustering algorithms; Convolutional neural networks; Deep learning; Deep neural networks; Diagnosis; Fuzzy clustering; Fuzzy systems; Learning algorithms; Medical imaging; Shear flow; Shear waves; Stiffness; System stability; Wavelet transforms, Chronic liver disease; Fuzzy C mean; Fuzzy c-means clustering algorithms; Interclass correlation coefficients; Interobserver variability; Learning neural networks; Measurement reliabilities; Shear wave elastography, Image enhancement, Article; artificial neural network; chronic liver disease; clinical evaluation; clinical examination; comparative study; controlled study; convolutional neural network; deep learning; diagnostic accuracy; fuzzy system; human; human tissue; image processing; image quality; liver cirrhosis; liver fibrosis; liver stiffness; major clinical study; radiologist; shear wave elastography; transfer of learning; wavelet transformation; case control study; chronic disease; diagnostic imaging; elastography; fibrosis; liver; pathology; procedures; reproducibility; time factor, Case-Control Studies; Chronic Disease; Deep Learning; Elasticity Imaging Techniques; Fibrosis; Humans; Image Processing, Computer-Assisted; Liver; Liver Cirrhosis; Reproducibility of Results; Time Factors
AB  - Purpose: To automatically detect and isolate areas of low and high stiffness temporal stability in shear wave elastography (SWE) image sequences and define their impact in chronic liver disease (CLD) diagnosis improvement by means of clinical examination study and deep learning algorithm employing convolutional neural networks (CNNs). Materials and Methods: Two hundred SWE image sequences from 88 healthy individuals (F0 fibrosis stage) and 112 CLD patients (46 with mild fibrosis (F1), 16 with significant fibrosis (F2), 22 with severe fibrosis (F3), and 28 with cirrhosis (F4)) were analyzed to detect temporal stiffness stability between frames. An inverse Red, Green, Blue (RGB) colormap-to-stiffness process was performed for each image sequence, followed by a wavelet transform and fuzzy c-means clustering algorithm. This resulted in a binary mask depicting areas of high and low stiffness temporal stability. The mask was then applied to the first image of the SWE sequence, and the derived, masked SWE image was used to estimate its impact in standard clinical examination and CNN classification. Regarding the impact of the masked SWE image in clinical examination, one measurement by two radiologists was performed in each SWE image and two in the corresponding masked image measuring areas with high and low stiffness temporal stability. Then, stiffness stability parameters, interobserver variability evaluation and diagnostic performance by means of ROC analysis were assessed. The masked and unmasked sets of SWE images were fed into a CNN scheme for comparison. Results: The clinical impact evaluation study showed that the masked SWE images decreased the interobserver variability of the radiologists’ measurements in the high stiffness temporal stability areas (interclass correlation coefficient (ICC) = 0.92) compared to the corresponding unmasked ones (ICC = 0.76). In terms of diagnostic accuracy, measurements in the high-stability areas of the masked SWE images (area-under-the-curve (AUC) ranging from 0.800 to 0.851) performed similarly to those in the unmasked SWE images (AUC ranging from 0.805 to 0.893). Regarding the measurements in the low stiffness temporal stability areas of the masked SWE images, results for interobserver variability (ICC = 0.63) and diagnostic accuracy (AUC ranging from 0.622 to 0.791) were poor. Regarding the CNN classification, the masked SWE images showed improved accuracy (ranging from 82.5% to 95.5%) compared to the unmasked ones (ranging from 79.5% to 93.2%) for various CLD stage combinations. Conclusion: Our detection algorithm excludes unreliable areas in SWE images, reduces interobserver variability, and augments CNN's accuracy scores for many combinations of fibrosis stages. © 2019 American Association of Physicists in Medicine
ER  -