
Semantic Web: Promising Technologies, Current Applications & Future Directions

Invited and colloquia talks at: Swinburne Institute of Technology, Melbourne (July 18); University of Adelaide, Adelaide (July 23); University of Melbourne, Melbourne (July 31); Victoria University, Melbourne


Presentation Transcript


  1. Semantic Web: Promising Technologies, Current Applications & Future Directions • Invited and colloquia talks at: Swinburne Institute of Technology, Melbourne (July 18); University of Adelaide, Adelaide (July 23); University of Melbourne, Melbourne (July 31); Victoria University, Melbourne • Australia, 2008 • Amit P. Sheth, amit.sheth@wright.edu • Kno.e.sis Center, Comp. Sc & Engg, Wright State University, Dayton OH, USA • Thanks: Kno.e.sis team and collaborators

  2. Outline • Semantic Web: key capabilities and technologies • Real-world applications demonstrating the benefit of Semantic Web technologies • Exciting ongoing research

  3. Evolution of the Web • Web of people: social networks, user-created casual content; Twine, GeneRIF, Connotea • Web of resources: data = service = data, mashups; ubiquitous computing • Web of databases: dynamically generated pages; web query interfaces • Web of pages: text, manually created links; extensive navigation • Web as an oracle / assistant / partner: “ask the Web”, using semantics to leverage text + data + services; Powerset • Figure axis labels: 1997, 2007, Semantic Technology Used

  4. 1 2 3 of Semantic Web

  5. 1 • Ontology: Agreement with a common vocabulary/nomenclature, conceptual models and domain Knowledge • Schema + Knowledge base • Agreement is what enables interoperability • Formal description - Machine processability is what leads to automation
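
A minimal Python sketch of the "Schema + Knowledge base" idea on this slide. All class, property, and individual names are hypothetical and chosen only for illustration; they are not taken from the ontologies discussed in the talk. The point is that once the schema is formal, a program can check facts against it.

```python
# Hypothetical mini-ontology: every name here is illustrative only.
SCHEMA = {
    ("Drug", "subclass_of", "Thing"),
    ("PrescriptionDrug", "subclass_of", "Drug"),
    ("interacts_with", "domain", "Drug"),
    ("interacts_with", "range", "Drug"),
}

KNOWLEDGE_BASE = {
    ("drug_a", "instance_of", "PrescriptionDrug"),
    ("drug_b", "instance_of", "Drug"),
    ("drug_a", "interacts_with", "drug_b"),
}


def superclasses(cls, schema):
    """Return cls plus every class reachable through subclass_of."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in schema:
            if s == current and p == "subclass_of" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def is_instance(individual, cls, schema, kb):
    """True if the individual is asserted to belong to cls or a subclass."""
    for s, p, o in kb:
        if s == individual and p == "instance_of":
            if cls in superclasses(o, schema):
                return True
    return False


def violations(schema, kb):
    """List facts whose subject or object breaks a declared domain/range."""
    domains = {s: o for s, p, o in schema if p == "domain"}
    ranges = {s: o for s, p, o in schema if p == "range"}
    bad = []
    for s, p, o in sorted(kb):
        if p in domains and not is_instance(s, domains[p], schema, kb):
            bad.append((s, p, o))
        elif p in ranges and not is_instance(o, ranges[p], schema, kb):
            bad.append((s, p, o))
    return bad
```

Calling `violations(SCHEMA, KNOWLEDGE_BASE)` on the sample data returns an empty list; adding a fact about an individual with no declared class would be reported.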

  6. 2 • Semantic Annotation (Metadata Extraction): Associating meaning with data, or labeling data so it is more meaningful to the system and people. • Can be manual, semi-automatic (automatic with human verification), automatic.
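
As a rough illustration of automatic annotation, the sketch below labels text spans with concept identifiers by dictionary lookup. The label table and the `ex:` identifiers are assumptions made up for this example; real annotators of the kind the talk describes are considerably more sophisticated.

```python
import re

# Hypothetical label-to-concept table; the identifiers are illustrative only.
LABELS = {
    "congenital muscular dystrophy": "ex:CongenitalMuscularDystrophy",
    "muscular dystrophy": "ex:MuscularDystrophy",
    "glycosyltransferase": "ex:Glycosyltransferase",
}


def annotate(text, labels=LABELS):
    """Return (start, end, surface, concept) spans, preferring longer labels."""
    spans = []
    taken = [False] * len(text)  # marks characters already annotated
    for label in sorted(labels, key=len, reverse=True):
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.start(), match.end()
            if any(taken[start:end]):
                continue  # a longer label already covers this text
            for i in range(start, end):
                taken[i] = True
            spans.append((start, end, match.group(0), labels[label]))
    return sorted(spans)
```

Longest-match-first is one simple design choice: it keeps "congenital muscular dystrophy" from also being tagged as the more general "muscular dystrophy". A human verification step, as the slide notes, would sit on top of output like this.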

  7. 3 • Reasoning/Computation: semantics enabled search, integration, answering complex queries, connections and analyses (paths, sub graphs), pattern finding, mining, hypothesis validation, discovery, visualization

  8. Different foci • TBL – focus on data: Data Web (“In a way, the Semantic Web is a bit like having all the databases out there as one big database.”) • Others focus on reasoning and intelligent processing

  9. SW Stack: Architecture, Standards

  10. From Syntax to Semantics • Figure labels: Shallow semantics; Deep semantics; Expressiveness, Reasoning

  11. a little bit about ontologies

  12. Many Ontologies Available Today • Open Biomedical Ontologies, http://obo.sourceforge.net/

  13. From simple ontologies

  14. Drug Ontology Hierarchy (showing is-a relationships) • Classes in the figure: owl:thing, prescription_drug_brand_name, brandname_undeclared, brandname_composite, prescription_drug, monograph_ix_class, cpnum_group, prescription_drug_property, indication_property, formulary_property, non_drug_reactant, interaction_property, property, formulary, brandname_individual, interaction_with_prescription_drug, interaction, indication, generic_individual, prescription_drug_generic, generic_composite, interaction_with_monograph_ix_class, interaction_with_non_drug_reactant

  15. to complex ontologies

  16. N-Glycosylation metabolic pathway • N-glycan_beta_GlcNAc_9 • N-glycan_alpha_man_4 • GNT-V attaches GlcNAc at position 6 • N-acetyl-glucosaminyl_transferase_V • UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 <=> UDP + N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-R2 • UDP-N-acetyl-D-glucosamine + G00020 <=> UDP + G00021 • GNT-I attaches GlcNAc at position 2

  17. A little bit about semantic metadata extractions and annotations

  18. Extraction for Metadata Creation • Sources: WWW, enterprise repositories, Nexis, UPI, AP, feeds/documents, digital videos, digital audios, digital images, digital maps, data stores, ... • Create/extract as much (semantic) metadata automatically as possible; use ontologies to improve and enhance extraction • Pipeline (figure): sources → EXTRACTORS → METADATA

  19. Automatic Semantic Metadata Extraction/Annotation

  20. Semantic Web in Action Supporting Clinical Decision Making

  21. 1. Supporting Clinical Decision Making • Status: In use today • Where: Athens Heart Center • What: Use of Semantic Web technologies for clinical decision support

  22. Operational Since January 2006

  23. Active Semantic Electronic Medical Records (ASEMR) Goals: • Increase efficiency with decision support • formulary, billing, reimbursement • real time chart completion • automated linking with billing • Reduce Errors, Improve Patient Satisfaction & Reporting • drug interactions, allergy, insurance • Improve Profitability Technologies: • Ontologies, semantic annotations & rules • Service Oriented Architecture Thanks -- Dr. Agrawal, Dr. Wingeth, and others. ISWC2006 paper

  24. ASEMR - Demonstration

  25. Further Opportunity: Clinical and Biomedical Data • Scientific Literature: PubMed, 300 documents published online each day • Health Information Services: Elsevier iConsult • User-contributed Content (Informal): GeneRifs, NCBI • Public Datasets: Genome, Protein DBs, new sequences daily • Clinical Data: personal health history • Laboratory Data: lab tests, RTPCR, mass spec • Data ranges from text to binary • Search, browsing, complex query, integration, workflow, analysis, hypothesis validation, decision support

  26. Semantic Web in Action Querying Integrated Data Sources

  27. 2. Querying Integrated Data Sources • Status: Completed research • Where: NIH • What: Querying integrated data sources • Enriching data with ontologies for integration, querying, and automation • Ontologies beyond vocabularies: the power of relationships

  28. Using Data to Test Hypothesis • Question: Link between glycosyltransferase activity and congenital muscular dystrophy? • Figure labels: Gene name, Interactions, GO, gene, Sequence, PubMed, OMIM, Glycosyltransferase, Congenital muscular dystrophy • Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07

  29. On the World Wide Web • (GeneID: 9215) has_associated_disease Congenital muscular dystrophy, type 1D • (GeneID: 9215) has_molecular_function Acetylglucosaminyl-transferase activity • Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07

  30. With the Semantically Enhanced Data • Graph (figure): glycosyltransferase (GO:0016757); GO:0008194 and GO:0016758 (isa links); acetylglucosaminyl-transferase (GO:0008375); EG:9215 (LARGE) has_molecular_function acetylglucosaminyl-transferase (GO:0008375); EG:9215 (LARGE) has_associated_phenotype Muscular dystrophy, congenital, type 1D (MIM:608840) • Query: SELECT DISTINCT ?t ?g ?d { ?t is_a GO:0016757 . ?g has_molecular_function ?t . ?g has_associated_phenotype ?b2 . ?b2 has_textual_description ?d . FILTER regex(?d, "muscular dystrophy", "i") . FILTER regex(?d, "congenital", "i") } • From medinfo paper. Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
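
To make the query on this slide concrete, here is a small Python sketch that evaluates the same pattern over an in-memory list of triples. The exact isa edges are an assumption read off the flattened figure, and the sketch follows is_a transitively (also an assumption, since the slide's query states a single is_a step). It is an illustration, not the system used in the work cited.

```python
import re

# Triples transcribed from the slide; the precise isa edges are assumed.
TRIPLES = [
    ("GO:0008194", "is_a", "GO:0016757"),
    ("GO:0016758", "is_a", "GO:0016757"),
    ("GO:0008375", "is_a", "GO:0008194"),
    ("GO:0008375", "is_a", "GO:0016758"),
    ("EG:9215", "has_molecular_function", "GO:0008375"),
    ("EG:9215", "has_associated_phenotype", "MIM:608840"),
    ("MIM:608840", "has_textual_description",
     "Muscular dystrophy, congenital, type 1D"),
]


def descendants(term, triples):
    """All terms that reach `term` through one or more is_a edges."""
    found = set()
    frontier = [term]
    while frontier:
        parent = frontier.pop()
        for s, p, o in triples:
            if p == "is_a" and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def query(root, keywords, triples=TRIPLES):
    """Return sorted (function term, gene, description) matches."""
    terms = descendants(root, triples)
    results = set()
    for gene, p1, term in triples:
        if p1 != "has_molecular_function" or term not in terms:
            continue
        for g2, p2, phenotype in triples:
            if g2 != gene or p2 != "has_associated_phenotype":
                continue
            for ph, p3, text in triples:
                if ph != phenotype or p3 != "has_textual_description":
                    continue
                # Case-insensitive match, like a regex filter with flag "i".
                if all(re.search(k, text, re.IGNORECASE) for k in keywords):
                    results.add((term, gene, text))
    return sorted(results)
```

With the sample triples, `query("GO:0016757", ["muscular dystrophy", "congenital"])` links the LARGE gene (EG:9215) to the phenotype through its molecular function, which is the connection the slide sets out to test.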

  31. Semantic Web in Action Industry Examples

  32. Zemanta • Twine • Digger • Calais (Thomson Reuters) • Powerset • Talis

  33. Emerging Research Areas

  34. Fact Extraction and Schema Creation Knowledge Extraction from Community-Generated Content

  35. Fact Extraction From Community Content • Search helps us find relevant pages/articles • But: It doesn’t answer questions.

  36. Fact Extraction From Community Content • Fact extraction is the first step towards answering questions. • A well-known new company that does fact extraction from Wikipedia is Powerset, http://www.powerset.com

  37. Fact Extraction From Community Content • Problem: without a guiding schema, extracted predicates are just terms • → useful for humans, but not for machines

  38. Fact Extraction From Community Content • Expert-created schemas are expensive and usually very restricted

  39. Fact Extraction From Community Content • Solution: Have a community-generated schema • → Wikipedia hierarchy for terms and concepts

  40. Hierarchy Creation Query: “cognition”

  41. Fact Extraction From Community Content • Solution: Have a community-generated schema • → Wikipedia hierarchy for terms and concepts • See Automatic Domain Model creation • → Wikipedia Infoboxes for relationship types

  42. Learn Patterns that indicate Relationships • Query: Australia → Sydney, Australia → Canberra • in Sydney, New South Wales, Australia • Sydney is the most populous city in Australia • Canberra, the Australian capital city • Canberra is the capital city of the Commonwealth of Australia • Canberra, the Australian capital • We know that countries have capitals. Which one is Australia’s?

  43. Add relationships

  44. Fact Extraction From Community Content • The accumulation of many pattern occurrences gives the necessary support • Canberra, Australia → minimal positive support • The Australian capital of Canberra → additional major support
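
The accumulation of support can be sketched as a weighted count of pattern matches. The patterns, the weights (1.0 for explicit "capital" phrasings, 0.1 for bare co-occurrence), and the function name below are all assumptions for illustration; the talk does not specify how support was actually scored.

```python
import re

# Hypothetical patterns and weights for a "capital of" relationship.
PATTERNS = [
    ("{city}, the {adj} capital", 1.0),
    ("{city} is the capital city of .*{country}", 1.0),
    ("the {adj} capital of {city}", 1.0),
    ("{city}, {country}", 0.1),  # bare co-occurrence: weak support only
]

# Text snippets taken from the slides.
SNIPPETS = [
    "in Sydney, New South Wales, Australia",
    "Sydney is the most populous city in Australia",
    "Canberra, the Australian capital city",
    "Canberra is the capital city of the Commonwealth of Australia",
    "Canberra, the Australian capital",
]


def support(snippets, city, country, adjective):
    """Sum the weights of every pattern occurrence found in the snippets."""
    total = 0.0
    for snippet in snippets:
        for template, weight in PATTERNS:
            pattern = template.format(
                city=re.escape(city),
                country=re.escape(country),
                adj=re.escape(adjective),
            )
            if re.search(pattern, snippet, flags=re.IGNORECASE):
                total += weight
    return total
```

On the slide snippets, `support(SNIPPETS, "Canberra", "Australia", "Australian")` accumulates evidence from three snippets, while the same call for Sydney finds none, so Canberra would be chosen as the capital.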

  45. Summary • Create Domain models from seed queries or seed concepts • Connect the concepts in the created domain models with valid relationships • Learn pertinent patterns for relationships • Find evidence for relationships in text • Wikipedia • WWW

  46. Discovering Undiscovered Knowledge Connecting the Dots

  47. How are Harry Potter and Dan Brown related?

  48. Motivation • Undiscovered Public Knowledge [Swanson 89] • Hidden connections in text • Our objective: build mechanisms to reveal these connections • Our approach: • Populate existing ontology schemas via information extraction from text • Use the extracted information to • Support browsing • Text retrieval • Knowledge discovery

  49. Discovering the Undiscovered Knowledge • Swanson’s discoveries: associations between Migraine and Magnesium [Hearst99] • stress is associated with migraines • stress can lead to loss of magnesium • calcium channel blockers prevent some migraines • magnesium is a natural calcium channel blocker • spreading cortical depression (SCD) is implicated in some migraines • high levels of magnesium inhibit SCD • migraine patients have high platelet aggregability • magnesium can suppress platelet aggregability • Data sets generated using these entities (marked red on the slide) as boolean keyword queries against PubMed • Bidirectional breadth-first search used to find paths in resulting RDF • Figure: graph connecting Magnesium and Migraine via Stress (leads to loss, associated with), Spreading Cortical Depression (inhibits, implicated), platelet aggregability (suppress, associated with), and Calcium Channel Blockers (is a, prevents)
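
A minimal sketch of bidirectional breadth-first search over the associations listed on this slide. The edge list is transcribed from the slide's statements; treating edges as undirected, the predicate spellings, and the function names are assumptions for illustration, not the implementation used in the work described.

```python
from collections import deque

# Associations from the slide, written as (subject, predicate, object).
EDGES = [
    ("Stress", "associated_with", "Migraine"),
    ("Stress", "leads_to_loss_of", "Magnesium"),
    ("Calcium Channel Blockers", "prevents", "Migraine"),
    ("Magnesium", "is_a", "Calcium Channel Blockers"),
    ("Spreading Cortical Depression", "implicated_in", "Migraine"),
    ("Magnesium", "inhibits", "Spreading Cortical Depression"),
    ("Migraine", "associated_with", "Platelet Aggregability"),
    ("Magnesium", "suppresses", "Platelet Aggregability"),
]


def adjacency(edges):
    """Build an undirected adjacency map (direction is ignored here)."""
    adj = {}
    for s, _, o in edges:
        adj.setdefault(s, set()).add(o)
        adj.setdefault(o, set()).add(s)
    return adj


def build_path(meet, parents_fwd, parents_bwd):
    """Join the two half-paths that touch at `meet`."""
    path = []
    node = meet
    while node is not None:
        path.append(node)
        node = parents_fwd[node]
    path.reverse()
    node = parents_bwd[meet]
    while node is not None:
        path.append(node)
        node = parents_bwd[node]
    return path


def bidirectional_bfs(edges, start, goal):
    """Return a connecting path as a list of nodes, or None if there is none."""
    if start == goal:
        return [start]
    adj = adjacency(edges)
    if start not in adj or goal not in adj:
        return None
    parents_fwd, parents_bwd = {start: None}, {goal: None}
    frontier_fwd, frontier_bwd = deque([start]), deque([goal])

    def expand(frontier, parents, other_parents):
        # Expand one full level; report the first node seen by both searches.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in sorted(adj[node]):
                if nxt in parents:
                    continue
                parents[nxt] = node
                if nxt in other_parents:
                    return nxt
                frontier.append(nxt)
        return None

    while frontier_fwd and frontier_bwd:
        # Grow the smaller frontier to keep both searches balanced.
        if len(frontier_fwd) <= len(frontier_bwd):
            meet = expand(frontier_fwd, parents_fwd, parents_bwd)
        else:
            meet = expand(frontier_bwd, parents_bwd, parents_fwd)
        if meet is not None:
            return build_path(meet, parents_fwd, parents_bwd)
    return None
```

Searching from both ends keeps each frontier small, which matters when the graph is built from a large literature corpus rather than eight edges. The sketch returns one connecting path; it does not enumerate all of them or guarantee the shortest.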

  50. Background Knowledge Used • UMLS: a high-level schema of the biomedical domain • 136 classes and 49 relationships • Synonyms of all relationships, using variant lookup (tools from NLM) • 49 relationships + their synonyms = ~350, mostly verbs • MeSH • 22,000+ topics organized as a forest of 16 trees • Used to query PubMed • PubMed • Over 16 million abstracts • Abstracts annotated with one or more MeSH terms • Example variant mappings for T147: effect, induce, etiology, cause, effecting, induced
