The world’s best chatbot. Marcus Liwicki, EISLAB Machine Learning (chair), Luleå University of Technology. EISLAB: Embedded Intelligent Systems LAB.
Areas of Machine Learning • Machine Learning • Supervised Learning (training data with labels available): Classification, Regression • Unsupervised Learning (data is available, but no labels): Clustering, Feature Learning • Reinforcement Learning: Artificial Curiosity
Machine Learning @LTU • Fundamental Research • Document Analysis • eHealth • Space • Speech • And of course: natural language processing • http://bit.ly/liwicki-vdl-17 (all my lecture material)
Overview of Today • Background: a short overview of ML • Tools and links – just for your reference (slides available) • Creating the world's best chatbot • Natural language processing • Semantic hashing • Conclusion
Useful Toolkits (Most Popular) • Keras: https://elitedatascience.com/keras-tutorial-deep-learning-in-python • deeplearnjs: https://deeplearnjs.org/ • Deeplearning4j: https://deeplearning4j.org/ • MXNet: https://mxnet.incubator.apache.org/how_to/finetune.html • TensorFlow (and interesting visualizations in TensorBoard) • https://www.tensorflow.org/get_started/ • https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard • Caffe2 and Caffe • https://caffe2.ai/ • http://caffe.berkeleyvision.org/ • PyTorch & Torch • http://pytorch.org/ • http://torch.ch/ • From research to business: easy-to-use UIs and more for end-users • https://cloud.google.com/ml-engine/docs/ • https://aws.amazon.com/machine-learning/ • https://azure.microsoft.com/en-us/overview/machine-learning/ • https://developer.nvidia.com/embedded/learn/tutorials • https://developer.nvidia.com/digits • http://deepcognition.ai/resources/ • https://orange.biolab.si/
Useful Models • The number one place for finding pre-trained models • https://github.com/BVLC/caffe/wiki/Model-Zoo • (also gives hints for successful applications) • A bit easier to understand, because it is curated • https://modeldepot.io/ • Small, but with demos • http://pretrained.ml/ • https://github.com/keras-team/keras/tree/master/examples • Individual task: Look at both websites and try to find a model working in a domain which is interesting for YOU
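Once you have picked a model from one of these zoos, loading it usually takes only a few lines. A minimal, hedged sketch (assuming TensorFlow/Keras is installed; ResNet50 stands in for whatever model you chose, and the image file name is made up):

```python
# Minimal sketch: load a pre-trained ImageNet model and classify one image.
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

model = ResNet50(weights="imagenet")           # downloads pre-trained weights on first use

img = image.load_img("my_photo.jpg", target_size=(224, 224))  # hypothetical file name
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])     # top-3 predicted ImageNet classes
```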
Other Useful Links • https://teachablemachine.withgoogle.com/ • http://playground.tensorflow.org • https://experiments.withgoogle.com/ai • https://transcranial.github.io/keras-js/#/imdb-bidirectional-lstm • https://transcranial.github.io/keras-js/#/mnist-acgan • https://quickdraw.withgoogle.com/
We Can Learn From Failures & Successes • Deep Learning and AI are not the answer to everything • https://www.techrepublic.com/article/top-10-ai-failures-of-2016/ • An extension of reinforcement learning is Artificial Curiosity • It could (and definitely would) go terribly wrong • https://blog.statsbot.co/deep-learning-achievements-4c563e034257
Recent Achievement @LTU: Intent Classification • Example queries (multilingual): I want.., Can I.., I need.., Gimme.., What.., Do I.., Je veux.., Ich.., Jag.., Hur.. • Challenges • Varying domain • Small data (but DL needs data) • Writing errors • Approach • Standard ML (a baseline is sketched below) • DL
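As an illustration of the "Standard ML" branch (a hedged sketch, not the actual LTU system): a TF-IDF plus linear-classifier baseline copes with small data and, via character n-grams, with writing errors. The intents and example queries below are invented:

```python
# Minimal intent-classification baseline: character n-gram TF-IDF + linear SGD classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

train_texts   = ["I want a ticket", "Can I book a room", "Gimme the price", "What does it cost"]
train_intents = ["buy", "book", "price", "price"]        # made-up intent labels

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # char n-grams tolerate typos
    SGDClassifier(max_iter=1000),
)
clf.fit(train_texts, train_intents)
print(clf.predict(["how much is it?"]))        # predicts one of the intents above
```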
LTU is Leading in Intent Classification • Benchmark comparison of LTU against top start-ups, Microsoft, IBM, Google, open-source tools, and SAP • https://github.com/kumar-shridhar/HackathonLulea
Natural Language Processing • Languages often seem to behave in arbitrary ways and forms • cabz, cats • Ambiguity, sarcasm and irony are often not apparent from purely textual information • Domain-specific terms and phrases that may not even be grammatically correct • to short a stock • no woman no cry
Differences in Principles • Word order • English: John ate apples • Japanese: Jon wa ringo o tabeta • Null subject • English: It is raining • Spanish: Está lloviendo • Ambiguity • John and Henry’s parents arrived at the house. -> how many people? • Recursion • He said that _ she knew that _ they are there _ where we have been _ when
Discussion • Which is better? • Rule-based NLP (grammar rules, like software constructs) • Deep-Learning-based NLP (exploits the large amounts of text available)
Word Embeddings • Example sentence: the cat sat on the mat • Classical (Bag of Words, BoW) • Various ways of choosing the values: binary, count, TF-IDF • Note: we lose information. Which? (see the sketch below) • One-hot encoding • dog: 1 0 0 0 0 … 0 • cat: 0 0 1 0 0 … 0 • No semantic similarity conveyed, lots of data required
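A minimal sketch that makes the lost information visible (the tiny vocabulary is an assumption for illustration):

```python
# Bag-of-Words over the slide's example sentence: counting word occurrences
# keeps the words but drops their order, which is exactly what BoW loses.
from collections import Counter

vocab = ["the", "cat", "sat", "on", "mat", "dog"]

def bow(sentence):
    counts = Counter(sentence.lower().split())
    return [counts[w] for w in vocab]          # count variant; >0 gives the binary variant

print(bow("the cat sat on the mat"))   # [2, 1, 1, 1, 1, 0]
print(bow("the mat sat on the cat"))   # same vector, different sentence -> order is lost
```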
Vectors can Represent Words • One-hot representation • Let us assume we have 1000 words in the dictionary • A word would be represented by a vector with 1 one and 999 zero elements • Even a sentence or a document can be represented similarly, just with more ones • Example: for the sentence "Hello I am Marcus", the vector has ones at the dictionary positions of "Hello", "I", "am", and "Marcus" (dictionary on the slide: are, am, ..., Hello, I, ..., Marcus) and zeros everywhere else (see the sketch below)
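The same idea as a minimal sketch, with a toy dictionary standing in for the 1000-word one:

```python
# One-hot encoding for a single word, multi-hot encoding for a whole sentence.
dictionary = ["are", "am", "hello", "i", "marcus", "the", "cat"]   # toy dictionary

def one_hot(word):
    return [1 if w == word.lower() else 0 for w in dictionary]

def multi_hot(sentence):
    words = set(sentence.lower().split())
    return [1 if w in words else 0 for w in dictionary]

print(one_hot("Marcus"))               # exactly one 1, all other entries 0
print(multi_hot("Hello I am Marcus"))  # ones at 'am', 'hello', 'i', 'marcus'
```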
Learning from Data: Distributional Hypothesis • Firth, 1957: “You shall know a word by the company it keeps” • The cat sat on the mat • The dog sat on the mat • The elephant sat on the mat • The quickly sat on the mat • Idea: embed words with just 300 or 500 values, not |V| • More dense • Fewer dimensions • Should embed domain semantics • Generalize easily
Word2Vec • Mikolov et al., 2013 (while at Google) • Family of models to train word embeddings (E) • Linear models in an encoder-decoder structure • Two models for training embeddings in an unsupervised manner: • Continuous Bag-of-Words (CBOW): the target word is the output of the combined (summed) context-word embeddings • Skip-Gram: the target word is the input used to predict each context word • [Architecture diagram: words enter and leave as 1-hot vectors of size |V|; a shared embedding matrix E maps them to d-dimensional vectors; example target "cat" with context "PAD the sat on", PAD = padding (ε)] • A sketch of the training pairs follows below
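A minimal sketch (not Mikolov's implementation) of the training pairs the two models actually see, assuming a window size of 2 and the example sentence from the earlier slides:

```python
# Build the (input, output) pairs that CBOW and Skip-Gram are trained on.
sentence = ["the", "cat", "sat", "on", "the", "mat"]
PAD, window = "<PAD>", 2

def context(tokens, i, window):
    """Context words around position i, padded at the sentence borders."""
    padded = [PAD] * window + tokens + [PAD] * window
    centre = i + window
    return padded[centre - window:centre] + padded[centre + 1:centre + window + 1]

for i, target in enumerate(sentence):
    ctx = context(sentence, i, window)
    print("CBOW:     ", ctx, "->", target)   # summed context embeddings predict the target
    print("Skip-Gram:", target, "->", ctx)   # target embedding predicts each context word
```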
And These can be Really Useful and Fun • Task: test the word2vec online demo • https://rare-technologies.com/word2vec-tutorial/#app • Try out your own word combinations • Are there cases where it is particularly good/bad?
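If you prefer to experiment locally rather than in the online demo, here is a hedged gensim sketch (assuming gensim 4.x; the three-sentence corpus is far too small for meaningful vectors and only keeps the code runnable):

```python
# Train a tiny Word2Vec model and query it; real use needs a much larger corpus.
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "mat"],
    ["the", "elephant", "sat", "on", "the", "mat"],
]
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)  # sg=1 -> Skip-Gram

print(model.wv.most_similar("cat", topn=3))                              # nearest neighbours
print(model.wv.most_similar(positive=["dog", "mat"], negative=["cat"]))  # word arithmetic
```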
Advanced Techniques & Tricks • Semantic Hashing • Reduces vocabulary size • Works with unknown words and spelling errors • Pipeline: word "good" -> add # -> "#good#" -> trigrams "#go, goo, ood, od#" -> 1-hot encoding (a vocabulary of 500k words is reduced to 30k trigrams; then a typical embedding follows); see the sketch below • Data Augmentation • Word shuffling • Mis-spelling • Keyboard key closeness • Data Balancing • Imbalanced data for classes • SGD classifier • Classical approach
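A minimal sketch of the semantic-hashing step (an illustration, not the LTU implementation; the three-word vocabulary is made up):

```python
# Semantic hashing: wrap a word in '#', split it into character trigrams, and
# one-hot encode the trigrams instead of the word itself, so typos and unseen
# words still share features with known words.
def semantic_hash(word):
    padded = "#" + word.lower() + "#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

vocab_trigrams = sorted({t for w in ["good", "goood", "great"] for t in semantic_hash(w)})

def encode(word):
    trigrams = set(semantic_hash(word))
    return [1 if t in trigrams else 0 for t in vocab_trigrams]

print(semantic_hash("good"))   # ['#go', 'goo', 'ood', 'od#']
print(encode("goood"))         # the misspelling still overlaps with 'good'
```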
Marvin2025 is happy! • Most of our work is Open Source • Including data and documentation • https://diva-dia.github.io/DeepDIVAweb/ • https://diuf.unifr.ch/main/hisdoc/divaservices • Available as an interactive IPython notebook with tutorial-like explanations: https://github.com/kumar-shridhar/HackathonLulea
Engaging Education with Music • iMuSciCA: www.imuscica.eu • Ongoing EU project • Try it out with Chrome: https://workbench.imuscica.eu/ • Team Teaching with STEAM • Science, Technology, Engineering & Mathematics combined with Arts • Workshop & Concert in Luleå (2019-03-02) • http://www.kulturenshus.com/evenemang/imuscica/ • http://www.ltu.se/eu-steam-2019
Conclusion • Deep Learning is really good (state of the art) in many tasks • Speech, image, handwriting, video recognition • Intent recognition, sentiment analysis • Stock market prediction, big-data forecasting • However, it does not solve everything • More than 1000 classes? • Often biased towards the training set • https://arxiv.org/ftp/arxiv/papers/1801/1801.00631.pdf • And: https://medium.com/@GaryMarcus/in-defense-of-skepticism-about-deep-learning-6e8bfd5ae0f1
Thank You • Lab members & beyond: Gustav, Marcus, Fotini, Pedro, Rajkumar, Priamvada, Oluwatosin, György • And colleagues at LTU, Kaiserslautern, Fribourg, and internationally