Sanskrit Linguistic Processing. Character-encoding, morphology, and lexicography. Peter M. Scharf, Brown University, 23 December 2009. Roman-based Standards. Devanagarī-based Standards. Nominal inflection. Verbal inflection. Vedic Unicode. Encoding Vedic Characters.
Sanskrit Linguistic Processing
Character-encoding, morphology, and lexicography
Peter M. Scharf, Brown University, 23 December 2009
Roman-based Standards Peter M. Scharf, 23 Dec. 2009:
Devanagarī-based Standards
Nominal inflection
Verbal inflection
Vedic Unicode
Encoding Vedic Characters
The Vedic Unicode Proposal recommends adding Vedic characters to the Unicode standard so that tone marks, such as those that appear in red in this palm-leaf manuscript of the Vājasaneyisaṃhitā, may be accurately represented in print.
Vedic Unicode Charts
Devanāgarī Extended
Vedic Extensions
LIES Appendix B
The Sanskrit Library Phonetic Basic encoding scheme (SLP1) attempts to meet high standards of unambiguous encoding while restricting encoding to 75 codepoints in the ASCII character set. SLP1 utilizes 57 codepoints to encode segments: 53 to represent phonetic segments and four to represent punctuation. In addition, SLP1 utilizes 18 codepoints to encode phonetic features: three to indicate stricture, six to indicate length, eight to indicate tone, and one to indicate nasalization….
SLP1 Basic Segments
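As a rough illustration of the one-codepoint-per-segment design, the sketch below converts the SLP1 basic segments of classical Sanskrit to IAST Roman transliteration. The mapping table and the function name `slp1_to_iast` are supplied here for illustration and are not taken from the slides; Vedic-only segments and the modifiers of Appendix B.3 are not handled and pass through unchanged.

```python
# Illustrative SLP1 -> IAST table for the classical Sanskrit segments only.
# Each SLP1 segment is a single ASCII character, so a per-character lookup suffices.
SLP1_TO_IAST = {
    # vowels
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au",
    # anusvāra and visarga
    "M": "ṃ", "H": "ḥ",
    # stops and nasals
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    # semivowels, sibilants, h
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}


def slp1_to_iast(text):
    """Convert SLP1 text to IAST; characters outside the table are left as-is."""
    return "".join(SLP1_TO_IAST.get(ch, ch) for ch in text)


print(slp1_to_iast("vedArTadIpikA"))  # vedārthadīpikā
```

Because no SLP1 segment spans more than one character, the conversion needs no lookahead, which is the practical benefit of the unambiguous encoding described in Appendix B.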
B.3 Modifiers
Modifiers are added after a character to indicate variations in segment stricture, length, accent, and nasalization, in the order stated. Prolonged length, accent, and nasalization occur in classical Sanskrit as well as Vedic. Modifiers are used in combination to indicate special features of stricture, length, accent, and nasalization in Vedic.
B.3.1 Stricture
_ heaviness [used for semivowels y or v]
= lightness [used for semivowels y or v]
! lack of release (abhinidhāna) [used for stops or semivowels y, v, or l]
B.3.2 Length
* subsegmental epenthetic vowel (svarabhakti)
# length of half a mora
1 length of one mora [used in Vedic after short agitated kampa; short e, o; and heavy anusvāra]
1# slightly lengthened
2 length of two morae [used for dvimātra anusvāra in Vedic]
3 prolonged length of three morae [used for pluta vowels]
4 prolonged length of four or more morae [used in raṅga]
B.3.3 Accent
/ high pitch
\ low pitch
^ circumflex
6 extra low tone
7 low tone
8 high tone
9 extra high tone
+ sharpness
B.3.4 Nasalization
~ nasalization
Yamas
20 epenthetic nasalized segments: k~, kh~, . . . , b~, bh~
4 epenthetic nasalized segments: k~, kh~, g~, gh~
20 replacements for a non-nasal stop before a nasal: k~, kh~, . . . , b~, bh~ (Ṛkprātiśākhya)
B.4.4 Syllabified visarga and anusvāra accent
H/ high-pitched visarga
H\ low-pitched visarga
H^ svarita visarga
M\ low-pitched anusvāra
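The modifier rules of B.3 (modifiers follow the character, in the order stricture, length, accent, nasalization) can be sketched as a small parser. This is only an illustration under stated assumptions: the function name `split_modifiers` and the dictionary layout are hypothetical, the input is assumed to be one base character followed by its modifiers, and combinations such as `1#` are treated simply as two length symbols in a row.

```python
# Modifier symbols as listed in B.3.1-B.3.4, grouped in the prescribed order.
CLASSES = [
    ("stricture", "_=!"),
    ("length", "*#1234"),
    ("accent", "/\\^6789+"),
    ("nasalization", "~"),
]


def split_modifiers(unit):
    """Split one SLP1 character plus trailing modifiers into (base, features).

    Raises ValueError if a trailing symbol is not a modifier or if the
    modifiers do not follow the order stricture, length, accent, nasalization.
    """
    if not unit:
        raise ValueError("empty unit")
    base, tail = unit[0], unit[1:]
    found = {name: "" for name, _ in CLASSES}
    rank = 0  # index of the latest modifier class seen so far
    for ch in tail:
        for i, (name, symbols) in enumerate(CLASSES):
            if ch in symbols:
                break
        else:
            raise ValueError(f"{ch!r} is not a modifier")
        if i < rank:
            raise ValueError(f"{name} modifier {ch!r} is out of order")
        rank = i
        found[name] += ch
    return base, found


print(split_modifiers("a3/~"))  # a pluta vowel with high pitch and nasalization
print(split_modifiers("H^"))    # svarita visarga from B.4.4
```

Counting the symbols in `CLASSES` gives 3 + 6 + 8 + 1 = 18, which matches the 18 feature codepoints stated in Appendix B.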
Nominal Declension
Verbal Conjugation
XML Rules for guṇa
Executable Perl code
XML Full-form Lexicon
Morphological Analyzer
Cologne Digital Sanskrit Dictionaries
CDSL Monier Williams
Digital Dictionaries of South Asia
Digital Sanskrit Library Integration
Flexible input and display, linking text to the full-form lexicon, and aligning inflectional and morphological tags
Sanskrit Library Text-lexicon Integration
Sanskrit Library Morphological Analysis
Monier Williams: anuttama
Sanskrit Library Input/Display Preferences
Sanskrit Library Lexical Sources Preferences
Böhtlingk’s Sanskrit-Wörterbuch in kürzerer Fassung: anuttama
Böhtlingk and Roth’s Grosses Sanskrit-Wörterbuch: anuttama
Apte's Practical Sanskrit-English Dictionary: anuttama
Macdonell's A Practical Sanskrit Dictionary: anuttama
Sanskrit Linguistic Processing
Text-image alignment and digital critical editing
Monier Williams Digital Image
Machine-readable text
Below is a segment of Ṣaḍguruśiṣya’s Vedārthadīpikā in SLP1 encoding.
Syllable Tags
Below is a segment of Ṣaḍguruśiṣya’s Vedārthadīpikā with orthographic syllable XML tags inserted.
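One way to picture orthographic syllable tagging of SLP1 text is the sketch below. It rests on assumptions that the slides do not confirm: the tag name `syl` is hypothetical, an orthographic syllable is taken to be any run of consonants plus a vowel with an optional anusvāra or visarga, and a consonant run with no following vowel is tagged as its own unit. Accent and other modifiers are not handled.

```python
import re

# SLP1 vowels and consonants of classical Sanskrit (single ASCII characters).
VOWELS = "aAiIuUfFxXeEoO"
CONSONANTS = "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"

# Either consonants + vowel (+ anusvāra M or visarga H), or stranded consonants.
SYLLABLE = re.compile(rf"[{CONSONANTS}]*[{VOWELS}][MH]?|[{CONSONANTS}]+")


def tag_syllables(text):
    """Wrap each orthographic syllable in a hypothetical <syl> element.

    Characters that are not part of a syllable (spaces, punctuation)
    are copied through unchanged.
    """
    return SYLLABLE.sub(lambda m: f"<syl>{m.group(0)}</syl>", text)


print(tag_syllables("saMskftam"))
# <syl>saM</syl><syl>skf</syl><syl>ta</syl><syl>m</syl>
```

Tagging at the level of the written syllable, rather than the word, gives each unit of the machine-readable text an anchor that can later be matched against what appears on a manuscript page.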
Variant Readings
An XML file contains variant readings from various manuscripts and editions of Ṣaḍguruśiṣya’s Vedārthadīpikā.
Page Boundaries
An XML file of entries associates page boundaries in the manuscript Wai321 of Ṣaḍguruśiṣya’s Vedārthadīpikā with orthographic syllable tags in the machine-readable edition and in manuscript variant tags.
Word-spotting
A highlighted passage in a manuscript of Ṣaḍguruśiṣya’s Vedārthadīpikā: Wai321, folio 131, recto, line 8.
VAD Digital Critical Edition