Object Recognition based on Shape and Function

Object Recognition based on Shape and Function • Akihiro Eguchi • BS Thesis Defense – November 30, 2011 • Committee: Craig Thompson, Russell Deaton, John Gauch • College of Engineering, Computer Science and Computer Engineering Department

Outline • Personal Background • Introduction • Background • Approach and Architecture • Methodology, Results, and Analysis • Conclusion

1. Personal Background

Akihiro Eguchi • Majors at the University of Arkansas • College of Engineering • Honors B.S., Computer Science • Advisor: Dr. Craig Thompson • Fulbright College of Arts and Sciences • Honors B.A., Psychology • Thesis: “Cultural bias during word learning” • Advisor: Dr. Douglas Behrend • Minor: Mathematics

Projects, Publications, Awards • Projects in Computer Science field • Semantic World • Publications: • International Journal of Computer Information Systems and Industrial Management (2011) • Web Virtual Reality and Three-Dimensional Worlds Workshop, Freiburg, Germany (2010) • Inquiry Journal of Undergraduate Research (2010) • Cyber Infrastructure Days Conference (2010) • Conference on Applied Research in Information Technology (2010) • X10 Workshop on Extensible Virtual Worlds (2010) • Awards: • Honorable Mention, CRA Outstanding Undergraduate Researchers Award (2011) • Winner, University of Arkansas Undergraduate Research Award, U of A. (2010) • Autonomous Floor Mapping Robot • Publications: • 9th IEEE International Symposium on Robotic and Sensors Environments (ROSE), Montreal, QC, Canada (2011) • Minority Game • Publications: • 5th IEEE International Workshop on Multi-Agent Systems and Simulation (MAS&S), Szczecin, Poland (2011) • Smart Housekeeper Robot with Android • Projects in Psychology field • Cultural bias during children’s word learning • Cultural differences in identity perspectives

2. Introduction

Earlier Object Recognition • Required • extensive knowledge of math • expensive equipment -> addressed by using the Kinect • Mostly based only on the shape of objects • difficult to recognize objects like • a uniquely designed chair • different objects that have similar shapes -> addressed with an idea from developmental psychology: “function bias”

Objective • To combine the Kinect sensor with machine learning techniques • To develop a new object recognition model based on shape bias and function bias

3. Background

Key Concept 1: Machine Learning • Gives computers a way to learn without being explicitly programmed • used in autonomous vehicles, checkers playing, signal processing, etc. • Learns a function f such that f(x) = y • Examples include • rote learning • decision trees (ID3) • neural networks • k-nearest neighbor (k-NN) classification

Key Concept 2: Microsoft Kinect for the Xbox • A controller for the Microsoft Xbox 360 gaming console that costs only $150 • dynamic depth image retrieval • human body recognition • skeletal joint tracking • multi-array microphone • The Kinect SDK was officially released in June 2011

Related Work: Developmental Psychology • Human children use two main biases when learning to name objects: • Shape bias (B. Landau et al., 1988) • generalize the name of an object to other objects with a similar shape • Function bias (D. G. K. Nelson et al., 2000) • generalize the name of an object to other objects that seem to have the same function • B. Landau, L. Smith, and S. Jones, “The Importance of Shape in Early Lexical Learning,” Cognitive Development, vol. 3, no. 3, 1988, pp. 299-321. • D. G. K. Nelson, R. Russell, N. Duke, and K. Jones, “Two-Year-Olds Will Name Artifacts by Their Functions,” Child Development, vol. 71, no. 5, 2000, pp. 1271-1288.

Related Work: Object Recognition in CS • Simulation of biases (Grabner et al., 2011) • Shape bias: • Define 3D models that are labeled with the same name (e.g., chair) and run a machine learning algorithm to train the classifier. • Function bias: • First define the use of the object; e.g., a chair is an object to sit on. • Then, let the program learn the posture of sitting. • If the object is sittable, the object is more likely to be a chair. • H. Grabner, J. Gall, L. V. Gool, “What Makes a Chair a Chair?,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), Colorado Springs, CO, June 20-25, 2011, pp. 1529-1536.

Related Work: Activity Recognition • Workflow analysis using RFID (Philipose et al., 2004) • Label objects with RFID tags to record actions • Use Bayesian networks to infer actions • Video analysis with SVM (Schuldt et al., 2004) • 4-second video recordings of walking, jogging, and running • Use an SVM to train a classifier to recognize those actions. • M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Fox, H. Kautz, D. Hahnel, “Inferring Activities from Interactions with Objects,” IEEE Pervasive Computing, vol. 3, no. 4, Oct.-Dec. 2004, pp. 50-57. • C. Schuldt, I. Laptev, B. Caputo, “Recognizing Human Actions: a Local SVM Approach,” Proceedings of the 17th International Conference on Pattern Recognition (ICPR), 2004.

4. Approach and Architecture

Selecting a Learning Technique • Test task: handwritten number recognition • 7x11 grid of 77 cells • Different Techniques • Neural network (RBFN) • 5 nodes, 3 nodes, 300 iterations • accurate, but slow • K-NN classification • simple and fast for small inputs • less accurate, but sufficient for this project
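The K-NN option on the 7x11 grid can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the thesis: the two digit templates, the `flip` noise helper, the use of Hamming distance, and k = 3.

```python
from collections import Counter

def grid(rows):
    """Flatten 11 strings of 7 characters ('#' = ink) into 77 binary cells."""
    return [1 if ch == "#" else 0 for row in rows for ch in row]

def flip(cells, index):
    """Return a copy of cells with one cell inverted (simulated pen noise)."""
    noisy = list(cells)
    noisy[index] = 1 - noisy[index]
    return noisy

def hamming(a, b):
    """Count the cells that differ between two 77-cell grids."""
    return sum(x != y for x, y in zip(a, b))

def knn_classify(train, query, k=3):
    """Return the majority label among the k training grids nearest to query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical 7x11 templates for two digits (assumption, not thesis data).
ZERO = grid([".#####."] + ["#.....#"] * 9 + [".#####."])
ONE = grid(["...#..."] * 10 + ["..###.."])

train = [
    (ZERO, "0"), (flip(ZERO, 5), "0"), (flip(ZERO, 40), "0"),
    (ONE, "1"), (flip(ONE, 0), "1"), (flip(ONE, 76), "1"),
]

query = flip(flip(ONE, 10), 20)  # a "1" with two noisy cells
print(knn_classify(train, query))
```

There is no training phase: the examples are simply stored and compared at query time, which is why the approach stays simple and fast while the input is small.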

Learning to Use the Kinect SDK • Noise reduction • 3D reconfiguration with OpenGL

Architecture of the Object Recognition Model

Plane Surface Removal with the RANSAC Algorithm • Random Sample Consensus (RANSAC) • randomly takes 3 points from the point cloud to define a candidate plane • counts how many points lie on that plane • iterates to find the plane that contains the maximum number of points
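The three RANSAC steps listed above can be sketched in plain Python. This is an illustrative sketch, not the thesis implementation: the iteration count, the distance tolerance, the fixed random seed, and the synthetic floor-plus-object point cloud are all assumptions.

```python
import random

def plane_from_points(p, q, r):
    """Return (normal, d) for the plane n.x + d = 0 through p, q, r,
    or None when the three points are (nearly) collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = sum(c * c for c in n) ** 0.5
    if length < 1e-9:
        return None
    n = [c / length for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d

def ransac_plane(points, iterations=200, tolerance=0.01, seed=0):
    """Return the points lying on the best plane found by random sampling."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate sample, try again
        n, d = model
        inliers = [pt for pt in points
                   if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d) <= tolerance]
        if len(inliers) > len(best):
            best = inliers
    return best

def remove_plane(points, **options):
    """Drop the dominant plane, keeping only the points that stand off it."""
    plane = set(ransac_plane(points, **options))
    return [pt for pt in points if pt not in plane]

# Synthetic cloud (assumption): a flat 10x10 floor plus 20 object points above it.
floor = [(0.1 * x, 0.1 * y, 0.0) for x in range(10) for y in range(10)]
obj = [(0.5 + 0.01 * i, 0.5 + 0.02 * (i % 3), 0.2 + 0.015 * i) for i in range(20)]
remaining = remove_plane(floor + obj)
print(len(remaining))
```

With real Kinect depth data the tolerance would need to reflect sensor noise; the value used here is only a placeholder.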

K-NN for Shape Learning • Learning: • The user names each object or chooses the name from a list of names the user has already told the program • Testing: • The program compares the shape of the target object with previously learned shape information to infer the name

Activity Recognition using the Kinect • Learning: • The Kinect tracks twenty skeletal joints • The program records the coordinates of each joint every 0.1 seconds for 10 seconds • The user names the activity • Testing: • K-NN is used to infer the target activity
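One way to picture the recording described above: 20 joints sampled every 0.1 seconds for 10 seconds gives 100 frames, or 6,000 numbers if each joint has x, y, z coordinates. The sketch below flattens a recording into one vector and applies nearest-neighbor matching. The flattening, the Euclidean distance, and the synthetic `synthetic` recordings are assumptions made for illustration; the slides do not state how the thesis represents or compares recordings.

```python
import math
from collections import Counter

JOINTS = 20    # joints tracked by the Kinect, per the slide
FRAMES = 100   # one sample every 0.1 s for 10 s

def flatten(recording):
    """Turn FRAMES frames of JOINTS (x, y, z) tuples into one flat vector."""
    return [c for frame in recording for joint in frame for c in joint]

def distance(a, b):
    """Euclidean distance between two flat vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(train, recording, k=1):
    """Return the majority activity label among the k nearest recordings."""
    q = flatten(recording)
    nearest = sorted(train, key=lambda item: distance(item[0], q))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def synthetic(hand_amplitude, offset=0.0):
    """Hypothetical data: joint 0 moves up and down, the others stay fixed."""
    rec = []
    for t in range(FRAMES):
        frame = [(0.1 * j + offset, 1.0, 2.0) for j in range(JOINTS)]
        x, y, z = frame[0]
        frame[0] = (x, y + hand_amplitude * math.sin(0.3 * t), z)
        rec.append(frame)
    return rec

train = [
    (flatten(synthetic(0.0)), "sitting"),
    (flatten(synthetic(0.5)), "waving"),
]

print(classify(train, synthetic(0.45, offset=0.01)))
print(classify(train, synthetic(0.02, offset=0.01)))
```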

Demo: Activity Recognition using the Kinect • http://www.youtube.com/watch?v=AxCn0eKWkiQ

Object Recognition Model with Shape Bias and Function Bias • Learning: • Let the program learn the shape • Use activity learning to learn the use • Associate the activity with the name of the object • Testing: • Based on the confidence, the program answers with one of: • "Maybe the object is [answer based on the shape]. But the object may be a [answer based on the action] because you used the object for [the name of action]" • "I think the object is [answer based on the action] because you used the object for [name of action]. But it might be a [answer based on the shape] based on the shape."
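The choice between the two answer templates can be sketched as a small decision function. The slides do not say how confidence is computed or compared, so the numeric confidences, the `>=` comparison, the `respond` function name, and the plain answer given when both cues agree are all assumptions; only the two hedged sentences are adapted from the templates above.

```python
def respond(shape_guess, shape_conf, activity, activity_conf, activity_to_object):
    """Combine a shape-based guess with a function-based guess.

    activity_to_object maps a learned activity name to the object name
    it was associated with during learning.
    """
    function_guess = activity_to_object[activity]
    if shape_guess == function_guess:
        # Both cues agree (hypothetical wording, not from the slides).
        return f"The object is {shape_guess}."
    if shape_conf >= activity_conf:
        # Shape is trusted more, but the observed use is mentioned.
        return (f"Maybe the object is {shape_guess}. But the object may be "
                f"{function_guess} because you used the object for {activity}")
    # The observed use is trusted more, but the shape is mentioned.
    return (f"I think the object is {function_guess} because you used the "
            f"object for {activity}. But it might be {shape_guess} "
            f"based on the shape.")

associations = {"sitting": "chair", "deodorizing": "antiperspirant",
                "killing bugs": "insecticide"}

# An oddly shaped chair: the shape looks like a can, but the use is sitting.
print(respond("antiperspirant", 0.4, "sitting", 0.9, associations))
```

Keeping both guesses in the answer, instead of returning only the winner, lets the user see which cue drove the decision.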

Demo: Object Recognition Model with Shape Bias and Function Bias • http://www.youtube.com/watch?v=4ia76fzxm68

5. Methodology, Results, and Analysis

Methodology • Target Objects • two objects that look similar but have different uses: a can of antiperspirant and a can of insecticide • two other objects that look different but have the same name and function: a conventional chair and an oddly shaped chair

Object Recognition based only on Shape • Learning: • Antiperspirant • Insecticide • Chair (1) • Testing: • Antiperspirant -> “Antiperspirant” / “Insecticide” • Insecticide -> “Antiperspirant” / “Insecticide” • Chair (1) -> “Chair” • Chair (2) -> “Antiperspirant” (incorrect) • because its shape is quite different from that of chair (1)

Function Recognition • Learning: • “killing bugs” with insecticide • “deodorizing” with antiperspirant • “sitting” on a chair (1) • Testing: • Killing bugs with insecticide -> “killing bugs” • Deodorizing with antiperspirant -> “deodorizing” • Sitting on a chair (1) -> “sitting” • Sitting on a chair (2) -> “sitting”

Object Recognition based on both Shape and Function • “Deodorizing” with antiperspirant -> answers correctly, OR -> "Maybe the object is insecticide. But the object may be antiperspirant because you used the object for deodorizing". • “Killing bugs” with insecticide -> answers correctly, OR -> "Maybe the object is antiperspirant. But the object may be insecticide because you used the object for killing bugs". • Sitting on a Chair (1) -> correctly answers “chair” • Sitting on a Chair (2) -> "I think the object is a chair because you used the object for sitting. But it might be a antiperspirant based on the shape."

Analysis • When the action involves little movement, such as sitting, the program works well • When the action involves continuous movement, such as walking, a shift in timing between recordings can be a problem
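A small numeric illustration of why a timing shift hurts moving activities but not static ones. It assumes recordings are compared sample by sample with Euclidean distance and models one joint coordinate as a sine wave; both are assumptions for illustration, not details confirmed by the slides.

```python
import math

def signal(amplitude, phase, frames=100):
    """One joint coordinate over time: periodic motion with a start-time phase."""
    return [amplitude * math.sin(0.3 * t + phase) for t in range(frames)]

def distance(a, b):
    """Euclidean distance between two equally long recordings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

walk = signal(0.5, 0.0)            # learned "walking" recording
walk_late = signal(0.5, math.pi)   # same motion, started half a cycle later
sit = signal(0.0, 0.0)             # learned "sitting" recording (no motion)
sit_late = signal(0.0, math.pi)    # sitting, started at a different time

# The shifted walk ends up farther from the learned walk than sitting is.
print(distance(walk, walk_late) > distance(walk, sit))
# A static pose is unaffected by when the recording starts.
print(distance(sit, sit_late))
```

Under these assumptions, a walk recorded out of phase could be matched to the wrong activity, while sitting is matched the same way regardless of start time.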

6. Conclusion

Summary • Proposed a new way of designing a computational object recognition model • knowledge from developmental psychology • shape + function • Leveraged powerful features of the Kinect sensor • depth map retrieval • human body joint recognition • easier to program an advanced application • less expensive • Results show that the model works as expected on the test objects

Future Work • The machine learning technique used for this model can be improved • Speech recognition technology can be used to name objects and actions • The model can be extended to recognize sequences of activities (workflows) • e.g., brushing your teeth involves turning on the faucet, picking up the toothpaste, brushing, rinsing … • Help to create a future semantic world with smart objects • visual object recognition complements RFID-tag-based recognition • accuracy can be improved by drawing on work in ontologies

Questions
