Social Computing for Linguistics & Linguistics for Social Computing
Monojit Choudhury, Microsoft Research India, monojitc@microsoft.com
SoC (Social Computing) for Linguistics: SoC platforms for linguistic experiments; SoC models for linguistic theories.
Linguistics for SoC: language is shaped by social interaction patterns; emails, blogs, SMS, queries (CMCs).
Web Search Queries as an Emerging Language
[Figure: mean length of queries]
Web Search Queries as an Emerging Language: queries are difficult to interpret
• Unsupervised segmentation
• Basic units
• Taxonomy of function units
• Distributional patterns
• Word co-occurrence network
• Search expertise, language acquisition
• User behavior studies
Web search queries are an evolving protolanguage.
• Collaborators: Rishiraj Saha Roy & Niloy Ganguly (IIT Kharagpur), Srivatsan Laxman & Kalika Bali (MSR India)
Use of Indian languages on online social media
• Transliteration
• Spelling change
• Code mixing
• Indian English
• Collaborators: Kalika Bali & Nimmi Rangaswamy (MSR India)
Sociolinguistic Experiments: OSN Games with a Purpose
CodeGambler is an OSN game to study the population-level emergence of categorization of a continuous space (colors) into discrete category terms (color names).
• Collaborators: Animesh Mukherjee (IIT Kgp), Vittorio Loreto (University of Rome)
Sociolinguistic Experiments: Crowdsourcing for data
• Flat segmentation (queries and sentences)
• Nested segmentation
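One way to picture the two annotation styles named on this slide is as data structures. The sketch below assumes that a flat segmentation is a list of contiguous, non-overlapping word groups and that a nested segmentation is a tree over the same words. The example query, its segment boundaries, and the helper names `leaves` and `depth` are invented for illustration and do not come from the talk.

```python
def leaves(tree):
    """Return the words of a nested segmentation in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree:
        words.extend(leaves(child))
    return words


def depth(tree):
    """Nesting depth: 0 for a single word, 1 for a flat group of words."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(child) for child in tree)


# Hypothetical query: "free online chess game download".
# Flat segmentation: one level of contiguous segments.
flat = [["free"], ["online", "chess", "game"], ["download"]]

# Nested segmentation: segments may contain smaller segments (a tree).
nested = ("free", (("online", ("chess", "game")), "download"))

# Both annotations cover the same words in the same order.
flat_words = [word for segment in flat for word in segment]
```

Here `leaves(nested)` recovers the original word order, so a nested annotation can always be compared against a flat one over the same query.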
SoC-inspired Linguistic Theory: Word Co-occurrence Network
• Small world
• Two-regime power law degree distribution
• Kernel-periphery structure
• Low rank
[Figure: example word co-occurrence network]
• Collaborators: Animesh, Niloy (IIT Kgp), Chris Biemann (University of Darmstadt), Ravi Kannan (MSR India)
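A word co-occurrence network and its degree distribution can be sketched in a few lines. The code below is a minimal illustration, not the construction used in the talk: it assumes that nodes are distinct words and that an undirected edge links two words that appear next to each other in a sentence. The function names and the toy sentences are made up for the example.

```python
from collections import Counter, defaultdict


def build_wcn(sentences):
    """Build an undirected word co-occurrence network.

    Assumed definition: nodes are distinct words, and an edge links two
    words that are adjacent in at least one sentence.
    """
    neighbors = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        for word in words:
            neighbors[word]  # register every word as a node, even if isolated
        for left, right in zip(words, words[1:]):
            if left != right:  # skip self-loops caused by repeated words
                neighbors[left].add(right)
                neighbors[right].add(left)
    return dict(neighbors)


def degree_distribution(network):
    """Map each degree k to the number of nodes with exactly k neighbors."""
    return dict(Counter(len(adjacent) for adjacent in network.values()))


# Toy corpus, invented for illustration only.
toy_sentences = [
    "language is a complex network",
    "a network is a web of words",
]
wcn = build_wcn(toy_sentences)
dd = degree_distribution(wcn)
```

On a real corpus, plotting `dd` on log-log axes is the usual way to inspect whether the degree distribution follows a power law, and whether it shows more than one regime.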
The WCN (word co-occurrence network) of Bollywood lyrics is a very small world with a tiny kernel. Learn 1000 Hindi words and you will understand every Bollywood song!
The WCN (word co-occurrence network) of Web search queries is not (yet) a small world! It has a tiny kernel but a very large periphery.
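Whether a network is a "small world" is commonly judged by two quantities: a short average shortest-path length and a high average clustering coefficient. The sketch below computes both for an undirected graph stored as a symmetric adjacency dictionary. It is a generic illustration on a toy graph, not the analysis behind the slides, and the function names are chosen for this example.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over all ordered reachable node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    coefficients = []
    for adjacent in graph.values():
        k = len(adjacent)
        if k < 2:
            coefficients.append(0.0)
            continue
        adj = list(adjacent)
        # Count edges that exist among this node's neighbors.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if adj[j] in graph[adj[i]]
        )
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


# Toy graph: a triangle (a, b, c) with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
apl = average_path_length(toy)
acc = average_clustering(toy)
```

A small-world claim normally compares these two values against a random graph with the same number of nodes and edges; that comparison is left out here to keep the sketch short.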
Thank You!
http://research.microsoft.com/people/monojitc/