
“Penuria nominum” – shortage of words. Knowledge beyond the capacity of language? By György Surján, ESKI, Hungary. Commentary to Judith Blake, Beyond Data Integration: Data Management for Knowledge Discovery. Ontology and Biomedical Informatics, Rome, 29 April – 2 May 2005.



  1. “Penuria nominum” – shortage of words. Knowledge beyond the capacity of language? By György Surján, ESKI, Hungary. Commentary to Judith Blake, Beyond Data Integration: Data Management for Knowledge Discovery. Ontology and Biomedical Informatics, Rome, 29 April – 2 May 2005

  2. Overview: • (A commentary by an outsider) • Modern science is analytical • The problem of identity • The capacity of language is limited • Genomics and proteomics deal with extremely large databases • Ontologies are bound to reality by language tags

  3. 1. Modern science is analytical

  4. Mouse and human genes agree to about 90%. Mouse and human bodies are built from rather similar building blocks. The difference lies not in the building elements but in the way those elements are integrated. The difference is not only phenotypic but functional: humans are able to create a sculpture demonstrating the beauty of the human body.

  5. Would changing 10% of its genes enable a mouse to create sculptures like Michelangelo? Q1. Is an analytical approach sufficient to explain the differences between living organisms?

  6. 2. The problem of identity. Importance of identity in ontology: entities with different identity criteria cannot belong to the same class.

  7. We have a strong feeling of self-identity throughout our whole life, despite all the changes that happen to us.

  8. Identity is independent of similarity and recognition.

  9. Identity of genes or proteins • Entities may gain or lose parts without losing their identity • Are genes that lose some nucleotides still identical? Q2. What are the identity criteria for genes and proteins?

  10. Humans and developed animals obviously have identity. Elementary particles have no identity. Q3. At which level of organisation does identity emerge? (Do biological macromolecules have identity?)

  11. 3. Capacity of language is limited • Shepherd in the 19th century: ~300-400 words • Anatomy (intermediate language certificate): ~4000 terms • SNOMED 3.1, Encyclopaedia Britannica: ~120 000 terms • WordNet: ~150 000 strings • UMLS Metathesaurus: > 1 500 000 terms • Our language capacity is huge, but nevertheless finite

  12. Limiting factors: • 1. Capacity of the human brain • 2. Number of terms shared by a community

  13. Example of numbers • English has different names for the first 13 numbers (zero to twelve); beyond that we use combinations • hundred 10^2 • thousand 10^3 • million 10^6 • billion 10^9 • … • ? 10^80 • We have a linguistic solution for expressing extremely large numbers, at the price of losing precision: 94869313860999624578839454223454292345623754278394542323452456598564789345634987 ≈ 9.486 × 10^79
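The trade-off on this slide, a short name for a huge number at the cost of its trailing digits, can be made concrete with a minimal Python sketch. The helper name `to_scientific` and the choice of four significant digits are assumptions made for illustration; only the 80-digit number itself comes from the slide.

```python
# The 80-digit number shown on the slide.
DIGITS = "94869313860999624578839454223454292345623754278394542323452456598564789345634987"


def to_scientific(digits, significant=4):
    """Truncate a digit string to `significant` digits.

    Returns (mantissa, exponent) so that the number is roughly
    mantissa x 10^exponent; the exponent is the digit count minus one.
    """
    exponent = len(digits) - 1
    mantissa = digits[0] + "." + digits[1:significant]
    return mantissa, exponent


mantissa, exponent = to_scientific(DIGITS)

# Rebuild the truncated value as an exact integer and measure what was lost.
exact = int(DIGITS)
approx = int(DIGITS[:4]) * 10 ** (len(DIGITS) - 4)
relative_error = (exact - approx) / exact

print(f"{mantissa} x 10^{exponent}, relative error {relative_error:.1e}")
```

The short form keeps 4 of the 80 digits: the name becomes sayable, and the remaining 76 digits are given up.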

  14. Up to now, mankind has not met any situation that exhausted the capacity of human language, not because the number of things to be expressed was smaller than this capacity, but because we could always find an acceptable compromise. We do not know where the limits of our language capacity lie, but the feeling of this limitation was well known centuries ago (penuria nominum): in the 17th century Harsdörfer proposed a machine with 5 wheels containing 256 syllables, prefixes and suffixes, able to generate about 97 million (mostly nonsense) German words, in order to find the real name of God and also to give a different name to every particular in the world instead of referring to particulars by the names of their classes (U. Eco: Between La Mancha and Babel)

  15. Size of genomics databases: • GO: ~18 000 terms • Human genome: ~30 000 (?) genes • GenBank: ~42 000 000 sequences
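To put these sizes next to the vocabulary figures from slide 11, the following Python sketch computes how many times larger GenBank is than each vocabulary. All figures are the approximate ones quoted on the slides (the UMLS figure is given there only as a lower bound); the names `VOCABULARIES` and `ratio_to_genbank` are illustrative assumptions.

```python
# Approximate sizes as quoted on the slides (2005).
VOCABULARIES = {
    "Anatomy (intermediate language certificate)": 4_000,
    "SNOMED 3.1 / Encyclopaedia Britannica": 120_000,
    "WordNet": 150_000,
    "UMLS Metathesaurus (lower bound)": 1_500_000,
}
GENBANK_SEQUENCES = 42_000_000


def ratio_to_genbank(size):
    """How many times larger GenBank is than a vocabulary of `size` terms."""
    return GENBANK_SEQUENCES / size


for name, size in VOCABULARIES.items():
    print(f"{name}: GenBank is about {ratio_to_genbank(size):,.0f} times larger")
```

Even against the largest vocabulary listed, the number of GenBank sequences is more than an order of magnitude larger.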

  16. Are we able to use 42 million names? Q4. Is it possible to describe molecular biology using human language? Is there any other representation tool to be used for that purpose?

  17. 5. Ontologies are bound to reality by language tags. Formal languages are used to describe structures.

  18. [Diagram with the labels: Language, ID, language tag, Reality, and the question "What is the meaning?"]

  19. If ontologies are bound to reality by language, then it is hard to create (or use) an ontology where the problem field exceeds the capacity of language. Q5. If language fails in genomics and proteomics, is there a need and a possibility for alternative methods of ontology engineering that do not require language?

  20. Summary of questions Q1. Is an analytical approach sufficient to explain the differences between living organisms? Q2. What are the identity criteria for genes? Q3. At which level of organisation does identity emerge? (Do biological macromolecules have identity?) Q4. Is it possible to describe molecular biology using human language? Is there any other representation tool to be used for that purpose? Q5. If language fails in genomics and proteomics, is there a need and a possibility for alternative methods of ontology engineering that do not require language?
