Jules J. Berman, Ph.D., M.D. APIII, Pittsburgh, PA Monday, September 10, 2007 7:30 am – 8:30 am. Implementing an RDF Schema for Pathology Images, From the Association for Pathology Informatics. Pathology images have no value unless they are annotated with information that describes the image.
Important descriptors of an image might include file information, image capture information, image format information, specimen information, patient information, pathology information, and region-of-interest information.
The API (Association for Pathology Informatics) wants to provide anyone using pathology image data with optional methods for annotating any kind of pathology image, in any image format they prefer. We did not want to create yet another new standard that obligates people to use a particular image format. Yet, we want to provide methods that could be understood by colleagues using existing, free standards for specifying data.
From 2004 to 2007, the API sponsored LDIP, the Laboratory Digital Imaging Project, which consisted of API members and imaging software developers. The original purpose of LDIP was to develop a new, open data specification for pathology images. LDIP held monthly conference calls, and the minutes of its discussions are available for anyone to review at: www.ldip.org
In 2007, after much discussion, the API Council determined that there were, in existence, adequate methods for annotating images. LDIP was dissolved, and the API Council accepted the primary goal of providing the field of pathology informatics with a document that describes available open annotation methods. As a secondary goal, the API would provide a very short RDF Schema that would permit those who prefer RDF annotations to type their metadata under general classes and properties that have particular relevance to pathologists (more about this later).
A technical white paper that contains detailed methods for annotating images is published today at: www.julesberman.info/rdfimage.pdf

This paper is distributed under an open source license, and can be downloaded, copied, re-distributed, and even re-posted at other web sites.
The paper describes methods for six levels of image annotation, organized by increasing difficulty and complexity. The methods use existing standards (including RDF, JPEG, Exif, Dublin Core, XML Schema, and W3C Semantic Image Annotation) and do not create any new standards; the only new artifact is one very short RDF Schema document.
Level 1. Compose a free-text description of your image, along with any other information you'd like to add (such as your name), and add it as a Comment field in the header of the image file. The Comment will not alter the binary content of the image or the visual form of the image. When the file is copied, it will retain the header comment, and anyone receiving the image can read what you've added, using a simple Perl or Ruby script provided in the white paper, or using a simple extraction program prepared in any preferred programming language.
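As an illustration of Level 1, here is a minimal Python sketch, assuming the image is a JPEG and that the Comment is stored as a standard JPEG COM segment (marker 0xFFFE). It is not the Perl or Ruby script from the white paper; the function names `add_jpeg_comment` and `read_jpeg_comments` are hypothetical, and the demo uses a tiny synthetic byte stream rather than a real image.

```python
import struct

SOI = b"\xff\xd8"  # JPEG start-of-image marker
COM = b"\xff\xfe"  # JPEG comment marker


def add_jpeg_comment(jpeg_bytes: bytes, comment: str) -> bytes:
    """Return a copy of the JPEG stream with a COM segment added.

    The segment is placed after any leading APPn segments, so the
    existing header segments and the image data are left byte-for-byte
    unchanged.
    """
    if not jpeg_bytes.startswith(SOI):
        raise ValueError("not a JPEG stream (missing SOI marker)")
    payload = comment.encode("utf-8")
    if len(payload) > 65533:  # the 2-byte length field counts itself
        raise ValueError("comment too long for one COM segment")

    # Skip over leading APP0..APP15 segments (markers 0xFFE0..0xFFEF).
    pos = 2
    while (pos + 4 <= len(jpeg_bytes)
           and jpeg_bytes[pos] == 0xFF
           and 0xE0 <= jpeg_bytes[pos + 1] <= 0xEF):
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        pos += 2 + length

    segment = COM + struct.pack(">H", len(payload) + 2) + payload
    return jpeg_bytes[:pos] + segment + jpeg_bytes[pos:]


def read_jpeg_comments(jpeg_bytes: bytes) -> list[str]:
    """Collect the text of every COM segment found in the header."""
    comments = []
    pos = 2
    while pos + 4 <= len(jpeg_bytes) and jpeg_bytes[pos] == 0xFF:
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):  # start-of-scan or end-of-image
            break
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if marker == 0xFE:
            body = jpeg_bytes[pos + 4:pos + 2 + length]
            comments.append(body.decode("utf-8", errors="replace"))
        pos += 2 + length
    return comments


# Synthetic stand-in for a real file: SOI, one APP0 segment, EOI.
tiny_jpeg = SOI + b"\xff\xe0" + struct.pack(">H", 4) + b"JF" + b"\xff\xd9"
annotated = add_jpeg_comment(tiny_jpeg, "Annotated by Jules Berman")
print(read_jpeg_comments(annotated))
```

With a real file, the same functions would be applied to the bytes read from disk and the result written back out; other image formats store comments differently and would need their own handling.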
Level 2. Insert the Dublin Core file descriptors into your Comment. The Dublin Core is a basic set of metadata elements, designed by librarians, that provides a minimal description of the contents of an electronic document. When the file is copied, it will retain the Dublin Core metadata, and anyone receiving the image can read what you've added, using a simple Perl or Ruby program provided in the white paper, or using a simple extraction program prepared in any preferred programming language.
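One way to prepare a Level 2 Comment is sketched below in Python. It assumes the Dublin Core descriptors are written as a small XML fragment using the Dublin Core element namespace; the wrapper element name `metadata`, the function name `dublin_core_comment`, and the sample values are assumptions for illustration, and the layout recommended in the white paper may differ.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_comment(fields: dict[str, str]) -> str:
    """Build an XML fragment of Dublin Core descriptors for a Comment."""
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are placeholders, not data from the white paper.
comment_text = dublin_core_comment({
    "title": "Example pathology image",
    "creator": "Jules Berman",
    "format": "image/jpeg",
    "date": "2007-09-10",
})
print(comment_text)
```

The resulting string can be stored in the image header exactly as a free-text Comment would be, and a recipient can parse it with any XML library.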
Level 3. Insert an RDF (Resource Description Framework) document into your image file. The RDF document can later be extracted, and its triples can be integrated with other data.
All data can be specified using RDF, developed by the W3C. RDF files are collections of statements expressed as data triples:

<identified subject> <metadata> <data>
“Jules Berman” “blood glucose level” “85”
“Mary Smith” “eye color” “brown”
“Samuel Rice” “eye color” “blue”
“Jules Berman” “eye color” “brown”

When you bind a key/value pair to a specified object, you're moving from the realm of data structure (i.e., XML) into the realm of data meaning.
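To make the triple idea concrete, the following Python sketch pulls triples out of a very small RDF/XML document. It handles only the simplest flat form (`rdf:Description` elements with `rdf:about` and plain-text property children) and is not a general RDF parser; the `http://example.org/` URIs and the function name `triples_from_rdf_xml` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical example document; the example.org URIs are placeholders.
EXAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/person/Jules_Berman">
    <ex:blood_glucose_level>85</ex:blood_glucose_level>
    <ex:eye_color>brown</ex:eye_color>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/person/Mary_Smith">
    <ex:eye_color>brown</ex:eye_color>
  </rdf:Description>
</rdf:RDF>
"""


def triples_from_rdf_xml(rdf_xml: str) -> list[tuple[str, str, str]]:
    """Return (subject, metadata, data) triples from flat RDF/XML."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree reports tags as "{namespace}local"; joining
            # the two parts gives the full URI of the property.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


for triple in triples_from_rdf_xml(EXAMPLE_RDF):
    print(triple)
```

For production use, a dedicated RDF library would be a safer choice, since RDF/XML allows many equivalent spellings that this sketch does not recognize.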
RDF permits data to be merged between different files.

Medical file:
“Jules Berman” “blood glucose level” “85”
“Mary Smith” “eye color” “brown”
“Samuel Rice” “eye color” “blue”
“Jules Berman” “eye color” “brown”

Hat file:
“Sally Frann” “hat size” “8”
“Jules Berman” “hat size” “9”
“Fred Garfield” “hat size” “9”
“Fred Garfield” “hat_type” “bowler”

Merged Jules Berman database:
“Jules Berman” “blood glucose level” “85”
“Jules Berman” “eye color” “brown”
“Jules Berman” “hat size” “9”
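The merge can be reproduced in a few lines of Python by treating each file as a set of (subject, metadata, data) tuples. This is only a sketch of the idea using the triples from the slide; the function name `merge_for_subject` is hypothetical, and real RDF subjects would be URIs rather than bare names.

```python
medical_file = {
    ("Jules Berman", "blood glucose level", "85"),
    ("Mary Smith", "eye color", "brown"),
    ("Samuel Rice", "eye color", "blue"),
    ("Jules Berman", "eye color", "brown"),
}

hat_file = {
    ("Sally Frann", "hat size", "8"),
    ("Jules Berman", "hat size", "9"),
    ("Fred Garfield", "hat size", "9"),
    ("Fred Garfield", "hat_type", "bowler"),
}


def merge_for_subject(subject, *files):
    """Union the triples of all files, then keep one subject's triples."""
    merged = set().union(*files)
    return sorted(t for t in merged if t[0] == subject)


for triple in merge_for_subject("Jules Berman", medical_file, hat_file):
    print(triple)
```

Because every triple carries its own subject, no shared table layout is needed: the union of the two files is itself a valid collection of triples.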
Level 4. Insert your image into an RDF document. The image can be extracted from the RDF document.
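A possible way to carry out Level 4 is sketched below, assuming the image bytes are base64-encoded and stored as the text of a property inside an RDF/XML document. The property name `image_data`, its `http://example.org/` namespace, and both function names are hypothetical; the white paper may use a different encoding or property.

```python
import base64
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/image-terms#"  # hypothetical namespace


def embed_image(image_bytes: bytes, subject_uri: str) -> str:
    """Wrap image bytes, base64-encoded, in a small RDF/XML document."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": subject_uri},
    )
    data = ET.SubElement(description, f"{{{EX_NS}}}image_data")
    data.text = base64.b64encode(image_bytes).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def extract_image(rdf_xml: str) -> bytes:
    """Recover the original image bytes from the RDF/XML document."""
    root = ET.fromstring(rdf_xml)
    data = root.find(f".//{{{EX_NS}}}image_data")
    if data is None or data.text is None:
        raise ValueError("no embedded image found")
    return base64.b64decode(data.text)


# Placeholder bytes stand in for the contents of a real image file.
fake_image = b"\xff\xd8 placeholder image bytes \xff\xd9"
document = embed_image(fake_image, "http://example.org/images/slide1")
print(extract_image(document) == fake_image)
```

Base64 is used here because arbitrary binary data cannot be placed directly in XML text; it increases the stored size by roughly one third.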
Level 5. Point to your image file from an RDF document. The RDF document and the image file (for example, a JPEG) can be separate documents linked by URLs.
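For Level 5, the RDF document only needs to name the image by its URL. The Python sketch below builds such a document with Dublin Core properties and then lists the URLs it describes; the image URL, the sample values, and the function names are assumptions for illustration rather than the layout prescribed by the white paper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def describe_remote_image(image_url: str, title: str, creator: str) -> str:
    """Build an RDF/XML document whose subject is the image's URL."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": image_url},
    )
    ET.SubElement(description, f"{{{DC_NS}}}title").text = title
    ET.SubElement(description, f"{{{DC_NS}}}creator").text = creator
    ET.SubElement(description, f"{{{DC_NS}}}format").text = "image/jpeg"
    return ET.tostring(root, encoding="unicode")


def described_urls(rdf_xml: str) -> list[str]:
    """List the URL of every resource the document describes."""
    root = ET.fromstring(rdf_xml)
    return [
        description.get(f"{{{RDF_NS}}}about")
        for description in root.iter(f"{{{RDF_NS}}}Description")
    ]


# The URL below is a placeholder, not a real image location.
document = describe_remote_image(
    "http://example.org/images/slide1.jpg",
    "Example pathology image",
    "Jules Berman",
)
print(described_urls(document))
```

Keeping the annotation separate from the image means the image file itself never has to be modified, at the cost of relying on the URL staying valid.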
Level 6. Break up your annotative data and your image binaries into multiple documents that can be pointed to from any of the files and that can exclude or include RDF or image binary data as desired. The RDF data can be distributed into multiple documents, and each RDF document may point to more than one image file.
By annotating our images, we can ensure that the image conveys meaning and value. By using RDF, we can ensure that the individual triples can be integrated with heterogeneous data sources beyond those of images. By using existing international standards, we attain interoperability and avoid the confusion and complexity that arise whenever a new standard is created. See: www.julesberman.info/rdfimage.pdf