
An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Grace Dasovich, Robert Kim. Midterm Presentation, August 21, 2009.


Presentation Transcript


  1. An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC • Grace Dasovich, Robert Kim • Midterm Presentation, August 21, 2009

  2. Outline • Related Work • Data • Modeling Approach and Results • Similarity Measures • Artificial Neural Network • Multivariate Linear Regression • Conclusions • Future Work

  3. Related Work • Computer-Aided Diagnosis (CADx) based on low-level image features • Armato et al. developed a linear discriminant classifier using features of lung nodules • Need to find the relationship between the image features and radiologists’ ratings

  4. Related Work • Image features and the semantic ratings • Lung Interpretations • Barb et al. developed Evolutionary System for Semantic Exchange of Information in Collaborative Environments (ESSENCE) • Raicu et al. used ensemble classifiers and decision trees to predict semantic ratings • Samala et al. used several combinations of image features and the radiologists’ ratings to classify nodules

  5. Related Work • Similarity • Li et al. investigated four different methods to compute similarity measures for lung nodules • Feature-based • Pixel-value-difference • Cross correlation • ANN

  6. Data • LIDC Dataset • 149 Unique Nodules • One slice per nodule, largest nodule area • 9 Semantic Characteristics • Calcification and Internal Structure had little variation, thus were not used • 64 Content Features • Shape, size, intensity, and texture

  7. Outline • Related Work • Data • Modeling Approach and Results • Similarity Measures • Artificial Neural Network • Multivariate Linear Regression • Conclusions • Future Work

  8. Similarity Measures • Cosine Similarity • Jeffrey Divergence • Euclidean Distance
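The three measures named on this slide can be sketched as below. The formulas shown on the following slides did not survive in this transcript, so the definitions here are the standard textbook ones and are an assumption about what was used; in particular, the Jeffrey divergence is written in its common symmetric form with the mean vector m = (a + b) / 2. The function names and the two example rating vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # undefined for an all-zero vector

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def jeffrey_divergence(a, b):
    """Symmetric divergence against the mean vector m = (a + b) / 2.

    Assumes non-negative entries; a zero entry contributes nothing.
    """
    total = 0.0
    for x, y in zip(a, b):
        m = (x + y) / 2.0
        if x > 0:
            total += x * math.log(x / m)
        if y > 0:
            total += y * math.log(y / m)
    return total

# Hypothetical semantic rating vectors (seven characteristics) for two nodules.
nodule_a = [5, 4, 1, 2, 3, 4, 2]
nodule_b = [4, 4, 2, 2, 3, 5, 1]

print(cosine_similarity(nodule_a, nodule_b))
print(euclidean_distance(nodule_a, nodule_b))
print(jeffrey_divergence(nodule_a, nodule_b))
```

Cosine similarity grows as two nodules become more alike, while the other two are distances that shrink toward zero, so the sign of any correlation against them has to be read accordingly.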

  9. Similarity Measures

  10. Similarity Measures

  11. Similarity Measures • Computed feature distance measures

  12. Outline • Related Work • Data • Modeling Approach and Results • Similarity Measures • Artificial Neural Network • Multivariate Linear Regression • Conclusions • Future Work

  13. Methods • Two three-layer ANNs • Input (64 neurons), hidden layer (5 neurons), output (1) • Input (64 neurons), hidden layer (5 neurons), output (7) • Input = 64 feature distances • Output = semantic similarity or difference in semantic ratings • Hyperbolic tangent function, backpropagation algorithm, 200 iterations
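The architecture on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' implementation: only the 64-5-1 (or 64-5-7) layout, the hyperbolic tangent activation, backpropagation, and the 200 iterations come from the slide. The class name `TinyMLP`, the weight initialisation range, the learning rate, the squared-error loss, the tanh output unit, and the per-pair updates are all assumptions.

```python
import math
import random

class TinyMLP:
    """Three-layer network: n_in inputs, n_hidden tanh units, n_out tanh outputs."""

    def __init__(self, n_in=64, n_hidden=5, n_out=1, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        """Return (hidden activations, outputs) for one input vector."""
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.w1]
        y = [math.tanh(sum(w * v for w, v in zip(row, h)) + row[-1])
             for row in self.w2]
        return h, y

    def train_step(self, x, target, lr):
        """One backpropagation update on a single (input, target) pair."""
        h, y = self.forward(x)
        # Output deltas: error times the tanh derivative (1 - y^2).
        d_out = [(yk - tk) * (1 - yk * yk) for yk, tk in zip(y, target)]
        # Hidden deltas: propagate the output deltas back through w2.
        d_hid = [(1 - h[j] * h[j]) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j in range(len(h))]
        for k, row in enumerate(self.w2):
            for j in range(len(h)):
                row[j] -= lr * d_out[k] * h[j]
            row[-1] -= lr * d_out[k]
        for j, row in enumerate(self.w1):
            for i in range(len(x)):
                row[i] -= lr * d_hid[j] * x[i]
            row[-1] -= lr * d_hid[j]

    def fit(self, inputs, targets, iterations=200, lr=0.01):
        """Sweep over all training pairs `iterations` times."""
        for _ in range(iterations):
            for x, t in zip(inputs, targets):
                self.train_step(x, t, lr)

# Single-output and seven-output variants, as on the slide.
net_single = TinyMLP(n_in=64, n_hidden=5, n_out=1)
net_seven = TinyMLP(n_in=64, n_hidden=5, n_out=7)
```

Because the output unit is a tanh, targets would need to be scaled into (-1, 1) before training; how the teaching data was scaled is not stated in the slides.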

  14. Methods • ANN with a single output • 640 random pairs from all 109 nodules • 231 pairs from nodules with malignancy > 3 • 496 pairs from nodules with area > 122 mm²
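The pair counts on this slide can be checked with a little arithmetic. 231 = C(22, 2) and 496 = C(32, 2), which would be consistent with taking every pair among 22 and 32 nodules respectively; that reading is an inference from the numbers, not something the slides state. The sketch below also shows one way to draw 640 random pairs out of the 5886 possible pairs among 109 nodules; the function names and the seed are hypothetical.

```python
import itertools
import math
import random

def all_pairs(ids):
    """Every unordered pair of distinct nodule ids."""
    return list(itertools.combinations(ids, 2))

def random_pairs(ids, n_pairs, seed=0):
    """Draw n_pairs distinct unordered pairs uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(all_pairs(ids), n_pairs)

print(math.comb(109, 2))   # pairs available among 109 nodules
print(math.comb(22, 2))    # 231
print(math.comb(32, 2))    # 496

training_pairs = random_pairs(range(109), 640)
```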

  15. Methods • ANN with seven outputs • 640 random pairs from all 109 nodules

  16. Methods • Leave-one-out method • Cosine similarity or Jeffrey divergence or difference in Semantic ratings used as teaching data • An ANN trained with entire dataset minus one image pair • The pair left out used for testing • Correlation between calculated radiologists’ similarity and ANN output calculated
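The leave-one-out procedure described here can be sketched generically. The loop below holds out one image pair at a time, trains on the rest, predicts the held-out pair, and finally correlates the predictions with the radiologist-derived targets. To keep the example short and fast, a one-nearest-neighbour predictor stands in for the ANN; that stand-in, the function names, and the toy data are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def leave_one_out(inputs, targets, train, predict):
    """Hold out each pair once, train on the rest, collect held-out predictions."""
    predictions = []
    for i in range(len(inputs)):
        train_x = inputs[:i] + inputs[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = train(train_x, train_y)
        predictions.append(predict(model, inputs[i]))
    return predictions

# Stand-in model (NOT the ANN from the slides): one-nearest-neighbour lookup.
def train_nearest(train_x, train_y):
    return list(zip(train_x, train_y))

def predict_nearest(model, x):
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))
    return min(model, key=squared_distance)[1]

# Toy data: one feature distance per pair, target equal to that distance.
toy_inputs = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
toy_targets = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

held_out = leave_one_out(toy_inputs, toy_targets, train_nearest, predict_nearest)
print(pearson_r(toy_targets, held_out))
```

Swapping in a neural network only requires `train` and `predict` callables with the same shape; with 640 pairs this means 640 separate training runs.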

  17. Methods • ANN with a single output • 640 random pairs from all 109 nodules • 231 pairs from nodules with malignancy > 3 • 496 pairs from nodules with area > 122 mm² • ANN with seven outputs • 640 random pairs from all 109 nodules

  18. Results • ANN using 640 random pairs

  19. Results • ANN using 231 pairs with malignancy rating > 3

  20. Results • ANN using 496 pairs with area > 122 mm²

  21. Results • ANN output vs. target values using Jeffrey divergence for the 640 pairs (r = 0.438)

  22. Results • ANN using random 640 pairs and the Jeffrey divergence with seven semantic ratings

  23. Outline • Related Work • Data • Modeling Approach and Results • Similarity Measures • Artificial Neural Network • Multivariate Linear Regression • Conclusions • Future Work

  24. Methods • Normalization of Features • Min-Max Technique • Z-Score Technique • Pair Selection • Looked for matches among the k most similar images under both the semantic and the content measures
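Both normalization techniques and the pair-selection idea can be sketched as follows. The min-max and z-score formulas are the standard ones; the pair-selection rule (keep a pair when one nodule appears among the other's k most similar nodules under both measures) is one plausible reading of the slide, not a confirmed description of the method. The function names and the small similarity matrices are hypothetical.

```python
import statistics

def min_max_normalize(values):
    """Rescale a feature to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score_normalize(values):
    """Center a feature at 0 with unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def top_k(similarity_row, self_index, k):
    """Indices of the k most similar other nodules, most similar first."""
    others = [j for j in range(len(similarity_row)) if j != self_index]
    return sorted(others, key=lambda j: similarity_row[j], reverse=True)[:k]

def matched_pairs(semantic_sim, content_sim, k):
    """Pairs (i, j) where j is in i's top k under both similarity matrices."""
    pairs = []
    for i in range(len(semantic_sim)):
        common = (set(top_k(semantic_sim[i], i, k))
                  & set(top_k(content_sim[i], i, k)))
        pairs.extend((i, j) for j in sorted(common))
    return pairs

# Hypothetical 3-nodule similarity matrices (higher means more similar).
semantic_sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
content_sim = [[1.0, 0.8, 0.3], [0.8, 1.0, 0.1], [0.3, 0.1, 1.0]]

print(min_max_normalize([2, 4, 6]))
print(z_score_normalize([1, 2, 3]))
print(matched_pairs(semantic_sim, content_sim, k=1))
```

If the measures are distances rather than similarities, the sort order in `top_k` would need to be reversed.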

  25. Methods • Multivariate Regression Analysis • Select features with highest correlation coefficients • Feature distance measures
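The regression step can be sketched as: rank features by the absolute value of their correlation with the target, keep the strongest ones, fit an ordinary least-squares model, and report R². The slides do not say how many features were kept or which solver was used, so `n_keep`, the normal-equation solver, and all function names and data below are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(columns, target, n_keep):
    """Indices of the n_keep columns with the largest |r| against the target."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: abs(pearson_r(columns[j], target)),
                    reverse=True)
    return ranked[:n_keep]

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_linear(columns, target):
    """Least squares via the normal equations; returns [intercept, coef_1, ...]."""
    design = [[1.0] * len(target)] + [list(c) for c in columns]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * t for u, t in zip(ci, target)) for ci in design]
    return solve(xtx, xty)

def r_squared(columns, target, coefs):
    """Coefficient of determination of the fitted model."""
    preds = [coefs[0] + sum(c * col[i] for c, col in zip(coefs[1:], columns))
             for i in range(len(target))]
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot

# Hypothetical feature-distance columns and a target built from the first two.
columns = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 6], [3, 1, 2, 1, 3]]
target = [1 + 2 * a + 3 * b for a, b in zip(columns[0], columns[1])]

kept = select_features(columns, target, n_keep=2)
coefs = fit_linear([columns[j] for j in kept], target)
print(kept, coefs, r_squared([columns[j] for j in kept], target, coefs))
```

The normal equations are fine for a handful of selected features; with many correlated features a QR- or SVD-based solver would be numerically safer.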

  26. Methods • Nodule Analysis • Determine differences between selected and non-selected nodules • Define requirements for our model

  27. Results

  28. Results

  29. Results • R² = 0.871

  30. Results

  31. Results

  32. Results

  33. Results • A. Equivalent Diameter, B. Standard Deviation of Intensity, C. Malignancy, D. Subtlety

  34. Conclusions: Preliminary Issues • The ANN is also not yet sufficient to predict semantic similarity from content • Best correlation: 0.438 • Malignancy correlation: 0.521 • Jeffrey divergence performed better with the ANN, unlike with the linear model • A semantic gap still exists

  35. Conclusions • Our linear model applies to a specific type of nodule • Characteristics: High malignancy, high texture, low lobulation, and low spiculation • Features: Larger diameter, greater intensity • Linear models are not sufficient for determination of similarities • R² of 0.871 with chosen nodules

  36. Future Work • Reduce variability among radiologists • Use only nodules with radiologists’ agreement • Find best combination of content features • 64 may be too many • Currently only using 2D

  37. Future Work • Different semantic distance measures • Some ratings are ordinal, whereas Jeffrey divergence is meant for categorical data • Different methods of machine learning • Incorporate radiologists’ feedback into training • Ensemble of classifiers

  38. Thanks for Listening • Any Questions?
