Quranic Arabic Corpus • Data Mining & Text Analytics • By Ismail Teladia & Abdullah Alazwari
Introduction • What is the Quran? The holy book of Muslims; revealed from 610 AD; 6,236 verses in 114 chapters • What is a corpus? A collection of written or spoken language • What is the Quranic Arabic Corpus? 77,430 words of Quranic Arabic • Researcher: Kais Dukes
Features of QAC: • Morphological Annotation • Syntactic Treebank • Semantic Ontology
Morphological Annotation • Part-of-speech tagging • Uses natural language computing technology • Word-by-word annotation of grammar, syntax, and morphology
Details of a Word's Grammar • Clicking a word gives more detail: type of word, translation, gender, case, and root • It also shows the verse in which the word appears and plays an audio recitation of that verse
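A minimal sketch of how the per-word details listed above could be held in code. The `WordAnnotation` class, its field names, the helper `words_with_root`, and the sample values are all illustrative assumptions written by hand for this example; they are not the corpus's actual schema or data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordAnnotation:
    """One annotated word (hypothetical record layout)."""
    chapter: int
    verse: int
    position: int            # position of the word within the verse
    word_type: str           # part of speech, e.g. "noun"
    translation: str
    gender: Optional[str]    # None when the feature does not apply
    case: Optional[str]
    root: Optional[str]      # root written in a simplified Latin transliteration

    def location(self) -> str:
        """Return a chapter:verse:word reference string."""
        return f"{self.chapter}:{self.verse}:{self.position}"


def words_with_root(words: List[WordAnnotation], root: str) -> List[WordAnnotation]:
    """Select every word that shares the given root."""
    return [w for w in words if w.root == root]


# Hand-written sample values for illustration only, not copied from the corpus.
sample = [
    WordAnnotation(1, 2, 1, "noun", "praise", "masculine", "nominative", "hmd"),
    WordAnnotation(1, 2, 3, "noun", "lord", "masculine", "genitive", "rbb"),
]

for word in words_with_root(sample, "rbb"):
    print(word.location(), word.word_type, word.case, word.translation)
```

Keeping the record immutable (`frozen=True`) suits annotation data, which is looked up far more often than it is changed.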
Syntactic Treebank • Verse-by-verse dependency graphs • Meaning of each verse (broken down) • Sentence structure (dependencies) • Case • Based on mathematical graph theory
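The idea of a dependency graph can be sketched as a set of labelled head-to-dependent edges. The token names, the relation labels, and the functions `build_children`, `find_root`, and `walk` below are hypothetical placeholders chosen for illustration; they are not the treebank's actual tag set or tooling, and the sketch assumes the input forms a tree.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (head, dependent, relation)

# Hypothetical toy sentence: a verb with a subject and an object,
# where the object is modified by an adjective.
edges: List[Edge] = [
    ("verb", "subject_noun", "subject"),
    ("verb", "object_noun", "object"),
    ("object_noun", "adjective", "adjective"),
]


def build_children(edge_list: List[Edge]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each head to its (dependent, relation) pairs."""
    children: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, dependent, relation in edge_list:
        children[head].append((dependent, relation))
    return children


def find_root(edge_list: List[Edge]) -> str:
    """The root is the only node that is never a dependent."""
    heads = {head for head, _, _ in edge_list}
    dependents = {dep for _, dep, _ in edge_list}
    roots = heads - dependents
    if len(roots) != 1:
        raise ValueError("expected exactly one root")
    return next(iter(roots))


def walk(children, node: str, relation: str = "root", depth: int = 0):
    """Depth-first traversal returning (depth, node, relation) triples."""
    visited = [(depth, node, relation)]
    for dependent, rel in children.get(node, []):
        visited.extend(walk(children, dependent, rel, depth + 1))
    return visited


tree = build_children(edges)
for depth, node, relation in walk(tree, find_root(edges)):
    print("  " * depth + f"{node} ({relation})")
```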
Ontology of Concepts • Knowledge representation • Relationships between concepts • Historic places and people • Named entity tagging • E.g. Sun, Moon, Star, and Earth are classified under “Astronomical Body” • Uses predicate logic
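The classification example above can be sketched as a simple "is a" predicate over a concept table. Only the four entries under "Astronomical Body" come from the slide; the parent concept "Physical Object" is an invented placeholder added to show a transitive query, and the names `IS_A`, `is_a`, and `members` are hypothetical.

```python
from typing import Dict, List

# Child concept -> parent concept.
IS_A: Dict[str, str] = {
    "Sun": "Astronomical Body",
    "Moon": "Astronomical Body",
    "Star": "Astronomical Body",
    "Earth": "Astronomical Body",
    "Astronomical Body": "Physical Object",  # hypothetical parent, not from the slide
}


def is_a(concept: str, category: str) -> bool:
    """True if `category` is reachable from `concept` by following parent links."""
    seen = set()
    current = concept
    while current in IS_A and current not in seen:
        seen.add(current)          # guards against accidental cycles in the table
        current = IS_A[current]
        if current == category:
            return True
    return False


def members(category: str) -> List[str]:
    """All concepts that fall under `category`, directly or indirectly."""
    return sorted(c for c in IS_A if is_a(c, category))


print(is_a("Sun", "Astronomical Body"))
print(members("Astronomical Body"))
```

Storing a single parent per concept keeps the lookup simple; an ontology in which a concept has several parents would need a mapping from each concept to a set of parents instead.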
Visual Representation of Ontology • 300 linked concepts with 350 relations
Conclusion • Uses of the QAC: • Analysing Arabic text of each verse • Linking Arabic words through dependencies • Finding relationships between concepts • Website used daily by 2,500 people from 165 countries
Bibliography • http://corpus.quran.com