
SHREC’19 Track: Extended 2D Scene Sketch-Based 3D Scene Retrieval

An overview of the SceneSBR2019 benchmark for 2D scene sketch-based 3D scene retrieval, covering the participating methods, their results, and insights for this research direction, including ResNet50-based sketch recognition and view/majority-vote-based scene classification.


Presentation Transcript


  1. SHREC’19 Track: Extended 2D Scene Sketch-Based 3D Scene Retrieval • Juefei Yuan, Hameed Abdul-Rashid, Bo Li, Yijuan Lu, Tobias Schreck, Ngoc-Minh Bui, Trong-Le Do, Khac-Tuan Nguyen, Thanh-An Nguyen, Vinh-Tiep Nguyen, Minh-Triet Tran, Tianyang Wang

  2. Outline • Introduction • Benchmark • Methods • Results • Conclusions and Future Work

  3. Introduction • 2D Scene Sketch-Based 3D Scene Retrieval • Focuses on retrieving relevant 3D scene models • Using scene sketches as input • Motivation • Vast applications: 3D scene reconstruction, autonomous driving, 3D geometry video retrieval, and 3D AR/VR entertainment • Challenges • 2D sketches lack 3D scene information • Semantic gap between iconic 2D scene sketches and accurate 3D scene models

  4. Introduction (Cont.) • 2D Scene Sketch-Based 3D Scene Retrieval • Brand new research topic within sketch-based 3D retrieval: • A query sketch contains several objects • Objects may overlap with each other • Relative context configurations exist among the objects • Our previous work • SHREC’18 track: 2D Scene Sketch-Based 3D Scene Retrieval • Built the SceneSBR2018 benchmark [1]: 10 scene classes, each with 25 sketches and 100 3D models • Good performance on it called for a more comprehensive dataset • We built the SceneSBR2019 benchmark • To further promote this challenging research direction • The most comprehensive and largest 2D scene sketch-based 3D scene retrieval benchmark to date • [1] J. Yuan et al. SHREC’18 track: 2D scene sketch-based 3D scene retrieval. In 3DOR, pages 1–8, 2018

  5. Outline • Introduction • Benchmark • Methods • Results • Conclusions and Future Work

  6. SceneSBR2019 Benchmark Overview • Overview • We have substantially extended the SceneSBR2018 benchmark with 20 additional classes • Building process • Voting method among three individuals • Scene labels chosen from Places88 [2] • Data collected from Flickr, Google Images and 3D Warehouse • [2] B. Zhou et al. Places: A 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell., 40(6):1452–1464, 2018

  7. SceneSBR2019 Benchmark • 2D Scene Sketch Query Dataset • 750 2D scene sketches • 30 classes, each with 25 sketches • 3D Scene Model Target Dataset • 3,000 3D scene models • 30 classes, each with 100 models • Both datasets are split into training and testing subsets to evaluate learning-based 3D scene retrieval • Table 1. Training and testing dataset information of our SceneSBR2019 benchmark

  8. 2D Scene Sketch Query Dataset Fig. 1 Example 2D scene query sketches (1 per class)

  9. 3D Scene Model Target Dataset Fig. 2 Example 3D target scene models (1 per class)

  10. Evaluation • Seven commonly adopted performance metrics in 3D model retrieval [3]: • Precision-Recall plot (PR) • Nearest Neighbor (NN) • First Tier (FT) • Second Tier (ST) • E-Measure (E) • Discounted Cumulated Gain (DCG) • Average Precision (AP) • We have also developed code to compute them: • http://orca.st.usm.edu/~bli/SceneSBR2019/data.html • [3] B. Li, Y. Lu, C. Li, A. Godil, T. Schreck, et al. A comparison of 3D shape retrieval methods based on a large-scale benchmark supporting multimodal queries. Computer Vision and Image Understanding, 131:1–27, 2015.
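
To make the metric definitions concrete, here is a minimal Python sketch (not the official track code linked above) of three of the listed measures, computed from a single query's ranked list of retrieved model labels; `class_size` would be 100 for SceneSBR2019.

```python
def nn_ft_st(query_label, ranked_labels, class_size):
    """ranked_labels: labels of the retrieved 3D models, best match first.
    class_size: number of relevant models per class (100 in SceneSBR2019)."""
    relevant = [label == query_label for label in ranked_labels]

    nn = 1.0 if relevant[0] else 0.0                  # Nearest Neighbor: is the top result relevant?
    ft = sum(relevant[:class_size]) / class_size      # First Tier: recall within the top C results
    st = sum(relevant[:2 * class_size]) / class_size  # Second Tier: recall within the top 2C results
    return nn, ft, st

# Toy example with class_size = 2:
print(nn_ft_st("beach", ["beach", "castle", "beach"], class_size=2))  # (1.0, 0.5, 1.0)
```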

  11. Outline • Introduction • Benchmark • Methods • Results • Conclusions and Future Work

  12. Methods • ResNet50-Based Sketch Recognition and Adapting Place Classification for 3D Models Using Adversarial Training (RNSRAP) • View and Majority Vote Based 3D Scene Retrieval Algorithm (VMV)

  13. RNSRAP: Sketch Recognition with ResNet50 Encoding and Adapting Place Classification for 3D Models Using Adversarial Training Ngoc-Minh Bui1, 2, Trong-Le Do1, 2, Khac-Tuan Nguyen1, Minh-Triet Tran1, Van-Tu Ninh1, Tu-Khiem Le1, Vinh Ton-That1, Vinh-Tiep Nguyen2, Minh N. Do3, Anh-Duc Duong2 1Faculty of Information Technology, Vietnam National University - Ho Chi Minh City, Vietnam 2Software Engineering Lab, Vietnam National University - Ho Chi Minh City, Vietnam 3University of Information Technology, Vietnam National University - Ho Chi Minh City, Vietnam

  14. Two-Step 3D Scene Classification Fig. 3 Two-step process of the 3D scene classification method

  15. Sketch Recognition with ResNet50 Encoding • (1) Use ResNet50 output to encode a sketch image into a 2048-D feature vector • (2) Data augmentation: • Regular transformations: flipping, rotation, translation, and cropping • Saliency map based image synthesis • (3) Use two types of fully connected neural networks • (4) Use multiple classification networks with different initializations for the two types of neural networks • (5) Fuse the results of those models based on the majority-vote scheme to determine the label of a sketch query image
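
As a rough illustration of steps (1), (3), (4), and (5) above, the following Python/Keras sketch (an assumed reconstruction of the pipeline, not the participants' released code) encodes a sketch with ResNet50 into a 2048-D vector, classifies it with several small fully connected heads, and fuses their predictions by majority vote; the layer sizes and hyperparameters are placeholders.

```python
import numpy as np
from collections import Counter
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras import layers, models

NUM_CLASSES = 30  # SceneSBR2019 has 30 scene classes

# (1) ResNet50 without its classification head; global average pooling
#     yields one 2048-D feature vector per image.
encoder = ResNet50(weights="imagenet", include_top=False, pooling="avg")

def make_head(hidden):
    # One classifier head operating on the 2048-D encoding; several such heads with
    # different widths/initializations stand in for the "two types" of FC networks.
    head = models.Sequential([
        layers.Dense(hidden, activation="relu", input_shape=(2048,)),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    head.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return head

# (4) Multiple classification networks with different initializations
#     (to be trained on features of the augmented sketch set).
heads = [make_head(h) for h in (512, 512, 1024, 1024)]

def predict_label(sketch_batch):
    """sketch_batch: float array of shape (1, 224, 224, 3), RGB order."""
    feats = encoder.predict(preprocess_input(sketch_batch))
    votes = [int(np.argmax(h.predict(feats), axis=1)[0]) for h in heads]
    # (5) Fuse the individual predictions with a majority vote.
    return Counter(votes).most_common(1)[0][0]
```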

  16. Saliency-Based Selection of 2D Screenshots • Use multiple views of a 3D object for classification • Randomly capture multiple screenshots at 3 different levels of detail: • (1) general views, (2) views focusing on a set of entities, and (3) detailed views of a specific entity • Use DHSNet [4] to generate the saliency map of each screenshot • Select promising screenshots of each 3D model for the place classification task • A 3D model can be classified with high accuracy (>92%) with no more than 5 information-rich screenshots • [4] N. Liu et al. DHSNet: Deep hierarchical saliency network for salient object detection. In CVPR (2016), pp. 678–686.
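
One possible reading of the selection step, sketched below: score each candidate screenshot by the mean of its DHSNet saliency map and keep the top few views. The mean-saliency criterion and the helper names here are assumptions made for illustration, not the authors' exact rule.

```python
import numpy as np

def select_screenshots(screenshots, saliency_maps, k=5):
    """screenshots: list of rendered views of one 3D model.
    saliency_maps: matching list of 2D saliency arrays in [0, 1] (e.g. from DHSNet)."""
    scores = [float(np.mean(s)) for s in saliency_maps]  # information-rich views score higher
    top = np.argsort(scores)[::-1][:k]                   # indices of the k best views
    return [screenshots[i] for i in top]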

  17. Rank List Generation • Assign the one or two best labels to each sketch image, and retrieve all 3D models having those labels • The similarity between a sketch and a 3D model: the product of the prediction score of the query sketch and that of the 3D model on the same label • Insert the remaining 3D models, which are considered irrelevant, at the tail of the rank list with a distance of infinity
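
The rank-list rule above can be sketched as follows; the dictionary layout and the `1 - similarity` distance conversion are assumptions for illustration, while the product-of-scores similarity and the infinite-distance tail follow the description.

```python
import math

def build_rank_list(sketch_scores, model_scores, best_labels):
    """sketch_scores: {label: score} predicted for the query sketch.
    model_scores: {model_id: (predicted_label, score)} for every 3D model.
    best_labels: the one or two best labels assigned to the sketch."""
    ranked, tail = [], []
    for model_id, (label, score) in model_scores.items():
        if label in best_labels:
            similarity = sketch_scores[label] * score    # product of the two prediction scores
            ranked.append((model_id, 1.0 - similarity))  # turn similarity into a distance
        else:
            tail.append((model_id, math.inf))            # irrelevant models go to the tail
    ranked.sort(key=lambda pair: pair[1])                # smaller distance = better match
    return ranked + tail
```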

  18. VMV: View and Majority Vote Based 3D Scene Retrieval Algorithm Juefei Yuan1, Hameed Abdul-Rashid1, Bo Li1, Yijuan Lu2, Tianyang Wang3 1School of Computing Sciences and Computer Engineering, University of Southern Mississippi, USA 2Department of Computer Science, Texas State University, USA 3Department of Computer Science & Information Technology, Austin Peay State University, USA

  19. VMV Architecture Fig. 4 VMV architecture

  20. VMV Algorithm • VMV’s six steps • (1) Scene view sampling (Qmacro script) • (2) Data augmentation • Random rotations, reflections, or translations • (3) Pre-training and training on AlexNet1/VGG1 and AlexNet2/VGG2 • (4) Fine-tuning on scene sketches/views • (5) Sketch/view classification • (6) Majority vote-based label matching • Fig. 5 A set of 13 sample views of an apartment scene model
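
Step (6) amounts to a simple majority vote over per-view predictions; a minimal sketch with assumed inputs (one predicted label per sampled view, e.g. from the fine-tuned AlexNet/VGG classifiers) is shown below.

```python
from collections import Counter

def majority_vote(per_view_labels):
    """per_view_labels: predicted class label for each sampled view of one 3D scene model."""
    return Counter(per_view_labels).most_common(1)[0][0]

# Example with 13 views of an apartment scene model:
views = ["apartment"] * 9 + ["library"] * 3 + ["castle"]
print(majority_vote(views))  # -> "apartment"
```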

  21. Outline • Introduction • Benchmark • Methods • Results • Conclusions and Future Work

  22. Precision-Recall Fig. 6 Precision-Recall diagram performance comparisons on the testing dataset of our SceneSBR2019 benchmark for two learning-based participating methods

  23. Other Six Performance Metrics • Table 2. Performance metrics comparison on our SceneSBR2019 benchmark for the two learning-based participating methods • More details about the retrieval performance of each individual query of every participating method are available on the SceneSBR2019 track homepage [5] • [5] SceneSBR2019 track homepage: http://orca.st.usm.edu/~bli/SceneSBR2019/results.html

  24. Discussions • Both submitted approaches utilized CNN models • CNNs contribute substantially to the achieved performance of the two learning-based approaches • Bui’s team utilized object-level semantic information for data augmentation and for refining retrieval results • It is very promising to utilize both deep learning and scene semantic information to support large-scale scene retrieval • The overall performance achieved on the SceneIBR2019 track is better than that on the SceneSBR2019 track • SceneIBR2019 benchmark: • Replaced the query dataset with query images: 1,000 for each class • Much larger 2D image query dataset for better training • More accurate 3D shape information in the query images • Much smaller semantic gap between images and models

  25. Outline • Introduction • Benchmark • Methods • Results • Conclusions and Future Work

  26. Conclusions • Objective: To foster this challenging and interesting research direction: scene sketch-based 3D scene retrieval • Dataset: Built the largest 2D scene sketch-based 3D scene retrieval benchmark to date • Participation: Though challenging, 2 groups successfully participated in the track and contributed 4 runs of 2 methods • Evaluation: Performed a comparative evaluation of the retrieval accuracy • Impact: Provided the largest and most comprehensive common evaluation platform for sketch-based 3D scene retrieval

  27. Future Work • Build a larger 2D scene-based 3D scene retrieval benchmark in terms of the number of categories and the variations within each category • Build/search for other, more realistic 3D scene models • 2D scene sketch-based 3D scene retrieval by incorporating semantic information • Extend the feature vectors by incorporating geolocation estimation features • 2D scene-based 3D scene retrieval related applications • Deep learning models specifically designed for 3D scene retrieval

  28. References • [1] J. Yuan et al. SHREC’18 track: 2D scene sketch-based 3D scene retrieval. In 3DOR, pages 1–8, 2018. • [2] B. Zhou et al. Places: A 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell., 40(6):1452–1464, 2018. • [3] B. Li et al. A comparison of 3D shape retrieval methods based on a large-scale benchmark supporting multimodal queries. Computer Vision and Image Understanding, 131:1–27, 2015. • [4] N. Liu et al. DHSNet: Deep hierarchical saliency network for salient object detection. In CVPR (2016), pp. 678–686. • [5] Extended SceneSBR track homepage: http://orca.st.usm.edu/~bli/SceneSBR2019/results.html

  29. Thank you! Q&A?
