LING/C SC 581: Advanced Computational Linguistics Lecture 28 April 18th
2019 HLT Lecture Series talk • Teaching demo slides on course website Speaker: Mans Hulden, University of Colorado Boulder Time: 12-1 pm, Wednesday, April 24 Location: Chavez 308 Title: Black-box Linguistics Abstract: Neural networks have in a short time brought about previously unimaginable advances in computational linguistics and natural language processing. The main criticism against them from a linguistic point of view is that neural models - while fine for language engineering tasks - are thought of as black boxes, and that their parameter opacity prevents us from discovering new facts about the nature of language itself, or about specific languages. In this talk I will examine that assumption and argue that there are ways to uncover new facts about language, even with a black-box learner. I will discuss specific experiments with neural models that reveal new information about the organization of sound systems in human languages, give us insight into the limits of complexity of word-formation, give us models of why and when irregular morphology - surely an inefficiency in a communication system - can persist over long periods of time, and reveal what the boundaries of pattern learning are.
A look forwards? GPT-2 https://www.wired.com/story/ai-text-generator-too-dangerous-to-make-public/
A look forwards? GPT-2 https://openai.com/blog/better-language-models/
A look forwards • Language Models are Unsupervised Multitask Learners • Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei**, Ilya Sutskever** • *, **Equal contribution. OpenAI, San Francisco, California, United States. Correspondence to: Alec Radford <alec@openai.com>. • Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset - matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
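For a concrete sense of the zero-shot conditioning the abstract describes, here is a minimal sketch of sampling continuations from the publicly released small GPT-2 checkpoint. The Hugging Face transformers package, the "gpt2" model name, and the decoding settings are assumptions chosen for illustration; none of them come from the slides or the paper.

# A minimal sketch (not from the slides): sampling text from the released
# small GPT-2 weights via the Hugging Face `transformers` package.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # small public checkpoint (assumed here)
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Neural language models"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Zero-shot continuation: the model is only conditioned on the prompt,
# with no task-specific fine-tuning.
output = model.generate(
    input_ids,
    max_length=60,
    do_sample=True,                        # sample rather than greedy decode
    top_k=40,                              # keep only the 40 most likely next tokens
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))

The same conditioning idea underlies the paper's task transfer: prepending a document and a question, rather than a bare prompt, and letting the language model continue the text as an answer.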