
November 29, 2004 Michael McNitt-Gray (UCLA), Anthony P. Reeves (Cornell),

The Lung Image Database Consortium (LIDC) Data Collection Process. This presentation is based on the RSNA 2004 InfoRAD theater presentation titled “The Lung Imaging Database Consortium (LIDC): Creating a Publicly Available Database to Stimulate Research in CAD Methods for Lung Cancer” (9110 DS-i).


Presentation Transcript


  1. The Lung Image Database Consortium (LIDC) Data Collection Process. This presentation is based on the RSNA 2004 InfoRAD theater presentation titled “The Lung Imaging Database Consortium (LIDC): Creating a Publicly Available Database to Stimulate Research in CAD Methods for Lung Cancer” (9110 DS-i). November 29, 2004. Michael McNitt-Gray (UCLA), Anthony P. Reeves (Cornell), Roger Engelmann (U. Chicago), Peyton Bland (U. Michigan), Chris Piker (U. Iowa), John Freymann (NCI) and The Lung Image Database Consortium (LIDC)

  2. Principal Goals To establish standard formats and processes for managing thoracic CT scans and related technical and clinical data for use in the development and testing of computer-aided diagnostic algorithms.

  3. Principal Goals To establish standard formats and processes for managing thoracic CT scans and related technical and clinical data for use in the development and testing of computer-aided diagnostic algorithms. To develop an image database as a web-accessible international research resource for the development, training, and evaluation of computer-aided diagnostic (CAD) methods for lung cancer detection and diagnosis using helical CT.

  4. The Database • The database will contain: • A collection of CT scan images • Technical factors about the CT scan • Non-patient information in DICOM header • For Nodules > 3 mm diameter • Radiologist drawn boundaries • Description of characteristics • For Nodules < 3 mm • Radiologist marks centroid, no characteristics • Pathology results or diagnosis information whenever available • All in a searchable relational database

  5. How to do this? The LIDC Data Collection Process • For nodule detection, recent research has demonstrated that the results from a single reader are not sufficient

  6. How to do this? The LIDC Data Collection Process • At least two and perhaps four readers may be required. • Not practical to do joint reading sessions across five institutions • LIDC will NOT do a forced consensus read. We won’t force agreement on the location of a nodule or its boundary.

  7. Truth – Detection: LIDC – Initial Approach • Multiple Reads with Multiple Readers • First Read – 4 readers, each reads independently (Blinded) • Compile 4 blinded reads and distribute to readers • Second Read – Same 4 readers, this time unblinded to the results of the other readers from the first reading. • Still, no forced consensus on either the location of nodules or their boundaries.

  8. Blinded Reads – Each Reader Reads Independently (Blinded to Results of Other Readers)

  9. Blinded Read for Reader 1 – Marks Only One Nodule Reader 1

  10. Blinded Read for Reader 2 – Marks Two Nodules (Note: One nodule is same as Reader 1) Reader 2

  11. Blinded Read for Reader 3 – Marks Two Nodules (Note: Again, One nodule is same as for Reader 1) Reader 3

  12. Blinded Read for Reader 4 – Did Not Mark Any Nodules Reader 4

  13. 2nd Round - UnBlinded Reads Readings in Which Readers Are Shown Results of Other Readers Each Reader Marks Nodules After Being Shown Results From Their Own and Other Readers’ Blinded Reads (Each Reader Decides to Include or Ignore).

  14. Unblinded Read for Reader 1 – Now Marks Two Nodules (Originally only marked one) Reader 1

  15. Unblinded Read for Reader 2 – Still Marks Two Nodules (No Change) Reader 2

  16. Unblinded Read for Reader 3 – Now Marks Three Nodules (Originally only marked two) Reader 3

  17. Unblinded Read for Reader 4 – Now Marks Three Nodules (Originally did not mark any) Reader 4

  18. Results of Unblinded Reads from All Four Readers (figure labels: 4/4 Markings, 2/4 Markings, 2/4 Markings). We will capture one aspect of reader variability in this way.
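The per-nodule agreement labels on this slide (such as 4/4 or 2/4 markings) can be tallied with a few lines of code. The following is only a minimal sketch: the reader names, nodule labels, and data are hypothetical, and it assumes markings from different readers have already been matched to the same nodule, a step the slides do not describe.

```python
from collections import Counter

# Hypothetical unblinded markings: reader -> set of nodule labels.
# Assumes marks from different readers were already matched to nodules.
unblinded = {
    "R1": {"A", "B"},
    "R2": {"A", "C"},
    "R3": {"A", "B"},
    "R4": {"A", "C"},
}


def agreement_counts(markings):
    """Count how many readers marked each nodule."""
    counts = Counter()
    for nodules in markings.values():
        counts.update(nodules)
    return dict(counts)


counts = agreement_counts(unblinded)
n_readers = len(unblinded)
for nodule in sorted(counts):
    print(f"nodule {nodule}: {counts[nodule]}/{n_readers} markings")
```

Because no consensus is forced, every count from 1/4 to 4/4 is a valid outcome and is kept as data rather than resolved.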

  19. Radiologist Review & Reconcile • 4 Radiologists Perform Blinded Read (R1B, R2B, R3B, R4B) • Submit to Requesting Site; this site compiles markings and re-sends case • 4 Radiologists see all (anonymized) markings • 4 Radiologists Perform Unblinded Read (R1U, R2U, R3U, R4U) • Database (will contain Blinded AND Unblinded reads) • Nodules for each condition (R1B, R2B, R3B, R4B, R1U, R2U, R3U, R4U): • Location • Outline (where appropriate) • Label (where appropriate)

  20. Case 5, Slice 19

  21. Radiologist 1 - Method 1

  22. Radiologist 1 - Method 2

  23. Radiologist 1 - Method 3

  24. Radiologist 2 - Method 1

  25. Radiologist 2 - Method 3

  26. Radiologist 3 - Method 1

  27. Radiologist 3 - Method 2

  28. Radiologist 3 - Method 3

  29. Radiologist 4 - Method 1

  30. Radiologist 4 - Method 2

  31. Radiologist 4 - Method 3

  32. Radiologist 5 - Method 1

  33. Radiologist 5 - Method 3

  34. How to Represent This Variability? Create a Probabilistic Description of Nodule Boundary • For each voxel, sum the number of occurrences (across reader markings) that it was included as part of the nodule • Create a probabilistic map of nodule voxels • Higher probability voxels are shown as brighter; lower probability voxels are darker • Can apply a threshold and show only voxels above some probability value, if desired.
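The voxel-counting idea on this slide can be illustrated on a single 2-D slice. This is a sketch under stated assumptions, not the LIDC implementation: the masks are made-up 3×3 binary grids (1 means the reader included that pixel in the nodule), and the function names are hypothetical.

```python
def probability_map(masks):
    """Fraction of reader masks that include each pixel."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[r][c] for m in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


def apply_threshold(pmap, level):
    """Keep only pixels whose probability is strictly above `level`."""
    return [[1 if p > level else 0 for p in row] for row in pmap]


# Hypothetical outlines from four readers on the same slice.
masks = [
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
]

pmap = probability_map(masks)
fixed = apply_threshold(pmap, 0.5)
for row in pmap:
    print(row)
```

With four readers, a strict threshold of 0.5 keeps only pixels marked by at least three of them; using `>=` instead would also keep pixels marked by exactly two, so the comparison operator is a design choice that affects the resulting contour.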

  35. Probabilistic Description of Boundary

  36. Apply Threshold if Desired

  37. Challenge: Define the Boundary of a Nodule • Do we need to have agreement between radiologists on boundaries? • LIDC’s answer is no. • LIDC Approach will be to: • Construct a probabilistic description of boundaries to capture reader variability • Use a threshold value (50% centile or 1% centile) to give fixed contours.

  38. Pathology Information • In those cases in which pathology is available, we will extract from reports: • Whether histology or cytology was performed • If histology, try to establish the cell type according to WHO classifications • If cytology, establish whether it was benign or malignant

  39. Pathology Information • If no pathology, other diagnostic information may be substituted when available (such as 2 years Dx F/U with no change in radiographic appearance). • If neither is available, then case will be used for detection purposes only.
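The decision rules on the two pathology slides can be written as a small function. This is an illustrative sketch only: the dictionary keys, the `summarize_case` name, and the "detection and diagnosis" label are assumptions, since the slides state only that cases with neither pathology nor other diagnostic information are used for detection purposes only.

```python
def summarize_case(pathology=None, other_diagnosis=None):
    """Classify how a case can be used, following the slide's rules.

    pathology: None, or a dict with 'method' ('histology' or 'cytology')
               and 'result' (cell type, or 'benign'/'malignant').
    other_diagnosis: None, or free text such as a follow-up note.
    """
    if pathology is not None:
        method = pathology["method"]
        if method == "histology":
            # Histology: try to record the cell type.
            return {"use": "detection and diagnosis",
                    "source": "histology",
                    "cell_type": pathology["result"]}
        if method == "cytology":
            # Cytology: record only benign versus malignant.
            return {"use": "detection and diagnosis",
                    "source": "cytology",
                    "malignant": pathology["result"] == "malignant"}
        raise ValueError(f"unknown pathology method: {method!r}")
    if other_diagnosis is not None:
        # No pathology: other diagnostic information is substituted.
        return {"use": "detection and diagnosis",
                "source": "other",
                "note": other_diagnosis}
    # Neither available: detection purposes only.
    return {"use": "detection only", "source": None}


print(summarize_case(other_diagnosis="2-year follow-up, no change"))
print(summarize_case())
```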

  40. Database Implementation How to capture and collect all of this data? 5 Phases of Data Collection • Initial Review • review case for inclusion in database; • anonymize case; • Index case, e.g. Full Chest/Limited Chest, Image Quality. • Blinded Read • identifying and drawing nodules independently • Unblinded Read • confirming using an overread, labeling nodules (characteristics) • Subject info • demographics, smoking history, pathology. • Export Data to NCI-hosted database (public)

  41. Database Implementation How to capture and collect all of this data? We have developed an internal standard for representing a region of interest (ROI) that is 3-D based on xml. This is portable across software drawing tools. We are also using xml to capture radiologist interpretation of nodule characteristics (shape, subtlety, etc.) by using a limited set of descriptors
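To make the idea of a portable, XML-based 3-D ROI concrete, here is a minimal sketch that writes a nodule outline as one contour per slice and reads it back. The element and attribute names (`roi`, `slice`, `point`, `noduleId`, `z`, `x`, `y`) are invented for illustration; the actual LIDC schema is not given in the slides.

```python
import xml.etree.ElementTree as ET


def roi_to_xml(nodule_id, contours):
    """Serialize a 3-D ROI.

    contours: dict mapping slice z position -> list of (x, y) points.
    Element names here are hypothetical, not the LIDC standard.
    """
    roi = ET.Element("roi", {"noduleId": nodule_id})
    for z, points in sorted(contours.items()):
        slice_el = ET.SubElement(roi, "slice", {"z": str(z)})
        for x, y in points:
            ET.SubElement(slice_el, "point", {"x": str(x), "y": str(y)})
    return ET.tostring(roi, encoding="unicode")


def xml_to_roi(text):
    """Parse the XML produced by roi_to_xml back into Python data."""
    roi = ET.fromstring(text)
    contours = {}
    for slice_el in roi.findall("slice"):
        contours[float(slice_el.get("z"))] = [
            (int(p.get("x")), int(p.get("y")))
            for p in slice_el.findall("point")
        ]
    return roi.get("noduleId"), contours


contours = {
    12.5: [(100, 120), (101, 121), (100, 122)],
    13.75: [(100, 121), (101, 122)],
}
text = roi_to_xml("nodule-1", contours)
print(text)
```

Because the representation is plain XML, any drawing tool that can emit the same elements could exchange outlines with the others, which is the portability the slide refers to.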

  42. Database Implementation How to capture and collect all of this data? We have designed and tested a communication protocol to send image data and xml messages • Read Request messages (with a code/mechanism to distinguish blinded from unblinded read request) • Read Response messages (with a code/mechanism to distinguish blinded from unblinded read response)
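A read request that carries a code distinguishing blinded from unblinded reads might look like the sketch below. The message layout (`readRequest`, `readType`, `priorMarking`) is hypothetical; the slides say only that such a code or mechanism exists. The check that a blinded request carries no prior markings is likewise an assumption drawn from the description of the blinded read.

```python
import xml.etree.ElementTree as ET

READ_TYPES = ("blinded", "unblinded")


def build_read_request(case_id, read_type, prior_markings=()):
    """Build a hypothetical XML read request message.

    prior_markings: anonymized marking labels shown to the reader;
    only meaningful for an unblinded read.
    """
    if read_type not in READ_TYPES:
        raise ValueError(f"unknown read type: {read_type!r}")
    if read_type == "blinded" and prior_markings:
        raise ValueError("a blinded request must not include prior markings")
    msg = ET.Element("readRequest", {"caseId": case_id, "readType": read_type})
    for label in prior_markings:
        ET.SubElement(msg, "priorMarking", {"label": label})
    return ET.tostring(msg, encoding="unicode")


def parse_read_type(text):
    """Return the read type code from a request message."""
    return ET.fromstring(text).get("readType")


blinded = build_read_request("case-5", "blinded")
unblinded = build_read_request("case-5", "unblinded", ["mark-1", "mark-2"])
print(blinded)
print(unblinded)
```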

  43. Database Implementation How to capture and collect all of this data? Designed and implemented database for each host site for all case data. Designed and are implementing the central NCI hosted database.

  44. Database Implementation Communication Model • Each Site Plays Dual Roles • As a Requesting Site • Identify Case and collect data • Phase 1- Initial Review • Manage it through blinded and unblinded read process • Create database entry for case • Phase 4 – Demographics, Pathology • Phase 5 – Export to NCI • NOTE: Site does not READ/MARK its own cases • As a Servicing Site • Perform blinded (Phase 2) and unblinded (Phase 3) reads
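The dual-role model can be sketched as a lookup of which role performs each phase, plus a rule that the requesting site never reads its own cases. The site labels below are placeholders for the five institutions, and the function name is hypothetical.

```python
# Phase number -> (phase name, role that performs it), per the slide.
PHASES = {
    1: ("Initial Review", "requesting"),
    2: ("Blinded Read", "servicing"),
    3: ("Unblinded Read", "servicing"),
    4: ("Subject info", "requesting"),
    5: ("Export to NCI", "requesting"),
}

# Placeholder labels for the five participating sites.
SITES = ["A", "B", "C", "D", "E"]


def servicing_sites(requesting_site, sites=SITES):
    """Sites that read a case: every site except the one that requested it."""
    if requesting_site not in sites:
        raise ValueError(f"unknown site: {requesting_site!r}")
    return [s for s in sites if s != requesting_site]


readers = servicing_sites("C")
print(readers)
for number, (name, role) in sorted(PHASES.items()):
    print(f"Phase {number}: {name} ({role} site)")
```

With five sites, excluding the requesting site leaves exactly four servicing sites, which matches the four readers used for each case.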

  45. LIDC Message System (diagram; steps numbered 1 to 11). Labels: Requesting Site X (Initial Review, Anonymize; Compile Responses; Other Subject data fields linked to case) • Servicing Site (labeled A and B, C, D, E; Nodule Marking Tools) • Send Image Data • XML Reading Assignment message • XML Reading Response message • SSH or SCP for transfer

  46. Access to LIDC Database • Cases Exported to NCI • NCI hosts Database • Publicly Available • Query Based on Data Elements Collected • Imaging Data such as Slice Thickness, etc. • Pathology or F/U Data • Other Fields • Obtain • Image Data including DICOM headers • Serial Imaging when available • Radiologists’ Identification, Contours and Characterization of Nodules • Diagnosis Data (Path, Radiographic F/U, etc) whenever available • Case Demographics whenever available • Currently Implementing MIRC model (see infoRAD exhibit for demo)
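A query based on collected data elements, such as slice thickness and the availability of a diagnosis, could look like the following in a relational database. This is a sketch using an in-memory SQLite database with an invented two-table schema and made-up rows; it does not reflect the actual NCI-hosted database or its access mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per scan, one row per marked nodule.
conn.executescript("""
CREATE TABLE scan (
    case_id TEXT PRIMARY KEY,
    slice_thickness_mm REAL
);
CREATE TABLE nodule (
    nodule_id INTEGER PRIMARY KEY,
    case_id TEXT REFERENCES scan(case_id),
    diagnosis TEXT
);
""")
conn.executemany(
    "INSERT INTO scan VALUES (?, ?)",
    [("case-1", 1.25), ("case-2", 5.0)],
)
conn.executemany(
    "INSERT INTO nodule (case_id, diagnosis) VALUES (?, ?)",
    [("case-1", "malignant"), ("case-1", None), ("case-2", "benign")],
)

# Thin-slice cases with at least one nodule that has a diagnosis.
rows = conn.execute(
    """
    SELECT s.case_id, COUNT(n.nodule_id)
    FROM scan AS s
    JOIN nodule AS n ON n.case_id = s.case_id
    WHERE s.slice_thickness_mm <= ? AND n.diagnosis IS NOT NULL
    GROUP BY s.case_id
    ORDER BY s.case_id
    """,
    (2.5,),
).fetchall()
print(rows)
conn.close()
```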

  47. Database Implementation TASKS COMPLETED (see reports on website): • Specification of Inclusion Criteria: • CT scanning technical parameters • Patient inclusion criteria • Process Model for Data collection • Determination of Spatial "truth" Using Blinded and Unblinded reads • Development of Boundary Drawing Tools • Development and implementation of xml standard for ROIs

  48. Database Implementation TASKS COMPLETED • Defined Common Data Elements for LIDC • Database design – tables and relationships between tables • Communication protocol • Establishing Public Database and Access Mechanism at NCI

  49. Other Products Publications/Presentations • LIDC Overview manuscript • Radiology 2004 Sep;232(3):739-748. • Assessment Methodologies manuscript • Academic Radiology April 2004 • (Acad Radiol 2004; 11:462–475) • Special Session SPIE Medical Imaging • Sunday evening session at SPIE, 2005

  50. Summary • LIDC mission – to create public database • Current understanding of problem dictated multiple readers • Multi-Institutions dictated distributed, asynchronous reads
