
Vision Computing: Segmentation



Presentation Transcript


  1. Vision Computing: Segmentation Group speakers: • AKRAM HUSSAIN • RIAZ ALI • BEHNAM IRANI • YULIA NOVSKAYA • SMAIL GHOUL • MUHAMMAD ALI MAHMOOD • FERIEL ELHADDAD • AKILA ELHADDAD • YASMINE REZGUI Shanghai Jiao Tong University 2016

  2. Parallel Multi-Dimensional LSTM with Application to Fast Biomedical Volumetric Image Segmentation By: Akram Hussain 115030990007

  3. IMAGE SEGMENTATION: The process of dividing a digital image into multiple parts. The goal of segmentation is to simplify and/or change the representation of an image into something more meaningful and easier to analyze. LONG SHORT-TERM MEMORY (LSTM) NETWORKS: Recurrent neural networks, initially used for sequence processing. Their architecture contains gates that store and read out information from linear units, called error carousels, which retain information over long time intervals. MULTI-DIMENSIONAL LSTM NETWORKS (MD-LSTM): Hidden LSTM units are connected in a grid-like fashion.
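The gating mechanism described above can be sketched as a single LSTM step in NumPy. This is a generic textbook formulation, not the paper's implementation; the gate ordering and weight names are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,).
    Assumed gate order: input, forget, output, candidate."""
    z = W @ x + U @ h_prev + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell input
    c = f * c_prev + i * g     # linear "error carousel" update
    h = o * np.tanh(c)         # gated read-out
    return h, c
```

The line `c = f * c_prev + i * g` is the linear error-carousel update the slide refers to: because the cell state is combined additively, gradients can flow over long time intervals.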

  4. There are many biomedical 3D volumetric data sources, such as CT, MR, and EM. • The authors introduce pyramidal multi-dimensional LSTM networks that process 3D volumetric data in parallel, require less computation, and scale better. • Pyramidal connection topology:

  5. Fig. 1 shows the connection topologies for the standard MD-LSTM: • It evaluates the context of each pixel recursively from neighboring pixel contexts along the axes; as a result, pixels lying on a simplex (an anti-diagonal in 2D) can be processed in parallel. • Rotating this scan order by 45 degrees turns the simplex into a plane. • The resulting gaps are filled by adding extra connections, so that more than 2 elements of the context are processed. Extending this approach to 3D results in the pyramidal connection topology of Fig. 2 (one pyramid per scan direction).
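The parallelism of this scan order can be illustrated with a small sketch that groups grid pixels into anti-diagonal wavefronts (an illustrative helper, not code from the paper):

```python
def antidiagonals(rows, cols):
    """Group the pixels of a rows x cols grid into anti-diagonal
    wavefronts: every pixel in wave k depends only on pixels in
    waves < k, so all pixels within one wave can be updated in
    parallel."""
    waves = [[] for _ in range(rows + cols - 1)]
    for i in range(rows):
        for j in range(cols):
            waves[i + j].append((i, j))  # wave index = i + j
    return waves
```

For a 3x4 grid this yields 6 wavefronts; a sequential raster scan would instead need 12 dependent steps.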

  6. Fig. 2: An MD-LSTM needs 8 LSTMs to scan a volume, while a PyraMiD-LSTM needs only 6, since it takes 8 cubes but only 6 pyramids to fill a volume. For dimension d, the number of LSTMs grows as 2^d for an MD-LSTM (exponentially) but only as 2 × d for a PyraMiD-LSTM (linearly).
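The two growth laws above can be checked directly (helper names are ours, for illustration):

```python
def num_lstms_md(d):
    """Scan directions for a standard MD-LSTM: one per corner of the
    d-dimensional volume, i.e. 2^d."""
    return 2 ** d

def num_lstms_pyramid(d):
    """Scan directions for a PyraMiD-LSTM: one pyramid per face of the
    volume, i.e. 2 * d."""
    return 2 * d
```

In 3D this reproduces the slide's 8 vs. 6; note that in 2D both counts coincide at 4, so the saving only appears for d ≥ 3 and grows with dimension.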

  7. EXPERIMENT: The approach is evaluated on two 3D biomedical image segmentation datasets: electron microscopy (EM) and MR brain images. Training and testing are done on random sub-volumes due to space limitations. PRE-PROCESSING: Simple pre-processing is applied to the three data types of the MR Brain dataset, since they contain large brightness changes under the same label.
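As a rough sketch of such a brightness correction, a per-volume intensity standardization could look like this (a generic stand-in; the paper's actual pre-processing may differ):

```python
import numpy as np

def normalize_volume(vol, eps=1e-8):
    """Per-volume intensity standardization: shift to zero mean and
    scale to unit variance, so that global brightness differences
    between scans do not dominate learning."""
    vol = vol.astype(np.float64)
    return (vol - vol.mean()) / (vol.std() + eps)
```

Applied independently to each modality, this removes per-scan brightness offsets while preserving relative contrast within a volume.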

  8. RESULTS: Performance is compared on EM (electron microscopy) images. As the figure shows, PyraMiD-LSTM achieves a lower rand error than the other techniques, and its warping and pixel errors are close to those of the best method.

  9. RESULT 2: PyraMiD-LSTM networks outperform other methods in rand error, and are competitive in warping and pixel errors on MR images.

  10. CONCLUSION AND FUTURE WORK: The novel, highly parallel PyraMiD-LSTM has already achieved state-of-the-art segmentation results on challenging benchmarks. Comparing the EM results with the MR results, I think the method can still be improved for EM images.

  11. Reference: M. F. Stollenga, W. Byeon, M. Liwicki, and J. Schmidhuber. "Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation." http://arxiv.org/pdf/1506.07452v1.pdf

  12. Convolutional Feature Masking for Joint Object and Stuff Segmentation. Jifeng Dai, Kaiming He, Jian Sun. Microsoft Research (2015). • Presented By: Riaz Ali (115030990008) • Presented To: Associate Prof. Li Song • Shanghai Jiao Tong University • School of Electronic Information and Electrical Engineering • 2016

  13. Outline • Introduction • Proposed Solution • Block Diagram • Conclusion • Future Work • References

  14. Introduction • Semantic segmentation aims to label each image pixel with a semantic category. • It has witnessed considerable progress thanks to the powerful features learned by convolutional neural networks (CNNs). • The current leading approaches exploit shape information by extracting CNN features from masked image regions. • This strategy introduces artificial boundaries on the images and may degrade the quality of the extracted features. • Besides, operating on the raw image domain requires computing thousands of networks for a single image, which is time-consuming.

  15. Proposed solution • In this paper, the authors propose to exploit shape information via masking convolutional features. Proposal segments (e.g., super-pixels) are treated as masks on the convolutional feature maps. • The CNN features of segments are masked out directly from these maps and used to train classifiers for recognition. • The authors further propose a joint method to handle objects and "stuff" (e.g., grass, sky, water) in the same framework.
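The core idea, projecting a segment mask onto the feature maps and zeroing the features outside it, can be sketched as follows (a simplified nearest-neighbor version; the paper's masking layer handles the image-to-feature-map projection more carefully):

```python
import numpy as np

def mask_features(feat, mask):
    """Mask convolutional features with a segment proposal.
    feat: (C, H, W) feature maps; mask: (Himg, Wimg) binary segment
    mask at image resolution. The mask is downsampled (nearest
    neighbor) to the feature-map grid and applied to every channel,
    so only features inside the segment survive."""
    C, H, W = feat.shape
    Hi, Wi = mask.shape
    rows = np.arange(H) * Hi // H          # nearest source rows
    cols = np.arange(W) * Wi // W          # nearest source columns
    small = mask[np.ix_(rows, cols)].astype(feat.dtype)
    return feat * small[None, :, :]
```

Because the expensive convolutions are computed once per image and the cheap masking is done per proposal, thousands of segment proposals can share a single forward pass.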

  16. System Pipeline / Block Diagram

  17. Convolutional Feature Masking Layer

  18. Cont.…

  19. Conclusion • The authors presented convolutional feature masking, which exploits shape information at a late stage in the network. • They further showed that convolutional feature masking is applicable to joint object and stuff segmentation.

  20. Future Work • The authors plan to further study improving object detection by convolutional feature masking. • Exploiting the context information provided by joint object and stuff segmentation would also be interesting.

  21. References [1] J. Dai, K. He, and J. Sun. "Convolutional Feature Masking for Joint Object and Stuff Segmentation." Microsoft Research. [2] Additional diagrams and explanations gathered from web sources.

  22. Learning to Segment Moving Objects in Videos. Katerina Fragkiadaki, Pablo Arbelaez, Panna Felsen. CVPR 2015. Presented by Behnam Irani

  23. Main Contributions • Moving object proposals from multiple segmentations on optical flow boundaries. • A moving objectness detector for ranking per-frame segment and tube proposals. • Random walks in a trajectory motion embedding for extending per-frame segments into spatio-temporal trajectory clusters.
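As an illustration of the first contribution, a motion-boundary map can be derived from a dense optical flow field as the flow's gradient magnitude (a simplified sketch; the paper combines several boundary cues):

```python
import numpy as np

def motion_boundary_strength(u, v):
    """Motion-boundary map from a dense flow field (u, v), both of
    shape (H, W): the gradient magnitude of the flow. Large values
    mark locations where motion changes abruptly, i.e. candidate
    moving-object boundaries to segment on."""
    du_y, du_x = np.gradient(u)
    dv_y, dv_x = np.gradient(v)
    return np.sqrt(du_x**2 + du_y**2 + dv_x**2 + dv_y**2)
```

Running multiple segmentations on such a boundary map, rather than on image edges alone, is what lets the method propose objects that move coherently against the background.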

  24. Overview

  25. Steps explained in more detail 1. Segment moving objects from multiple segmentations on motion boundaries (optical flow / static).

  26. Steps explained in more detail 2. Rank the results (MOP/SOP) using a "Moving Objectness" Detector (MOD).

  27. Steps explained in more detail 3. Extend the top-ranked segments into space-time tubes using constrained segmentation on dense point trajectory motion affinities.

  28. Steps explained in more detail 4. A "Moving Objectness" Convolutional Neural Network Detector (MOD), trained on image and optical flow fields, discards over/under-segmentations and static parts of the scene.

  29. Semantic Image Segmentation via Deep Parsing Network (to appear in ICCV 2015). Yulia Novskaya. School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University. 12 April 2016, Shanghai

  30. Problem: Semantic Image Segmentation

  31. Motivation: Model a high-order MRF as a one-pass CNN, with multiple convolutional layers as the unary term and additional designed layers as the pairwise term (21 output channels, one per class).

  32. Pipeline: Original image → Unary Term → Triple Penalty → Label Context → Joint Tuning; results are compared against the Ground Truth.

  33. Label context learning: pairwise penalties and preferences between classes are learned both in label-label space and in spatial-label space over the 21 PASCAL VOC classes (background plus 20 object categories), e.g. a learned penalty for the person:mbike pair and a favored chair:person configuration.

  34. Quantitative Results (PASCAL VOC 2012 Challenge)

  35. Qualitative Results: Original image, Ground Truth, DPN, FCN, DeepLab

  36. Semantic Image Segmentation via Deep Parsing Network Thanks!

  37. Visual Computing. TOPIC: Fully Convolutional Networks for Semantic Segmentation (IEEE Conference on Computer Vision and Pattern Recognition, 2015). Presented by GHOUL Smail (05). Student ID: 115030990050

  38. Outline • Application domain of this topic • What is the meaning of convolution? • Fully convolutional networks and R-CNN • Classification network • Spectrum of deep features and skip layers • Training & testing • Conclusion & references

  39. Application domain of this topic • To help partially sighted people by highlighting important objects in their glasses; • To let robots segment objects so that they can grasp them; • Road scene understanding; • Useful for autonomous navigation of cars and drones; • Useful for editing images; • Medical purposes, e.g. segmenting tumours, dental cavities, …

  40. Meaning of convolution?
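A minimal sketch of the operation this slide refers to: sliding a kernel over an image and taking dot products (technically cross-correlation, which is what deep-learning frameworks call "convolution"):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image
    and take the dot product at each position. Output shrinks by
    kernel_size - 1 in each dimension (no padding)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

In a CNN the kernel weights are learned, and many such kernels run over many input channels; this scalar loop is only the conceptual core.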

  41. Fully convolutional networks and R-CNN


  43. Add skip layers
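The skip-layer idea, fusing coarse predictions from deep layers with finer predictions from shallower ones, can be sketched as follows (nearest-neighbor upsampling for simplicity; FCN uses learned bilinear upsampling, and the layer names here are ours):

```python
import numpy as np

def fuse_skip(coarse, fine, stride=2):
    """FCN-style skip fusion sketch. coarse: (C, h, w) class scores
    from a deep layer; fine: (C, h*stride, w*stride) scores from a
    shallower layer. Upsample the coarse scores to the finer grid
    and add the two score maps element-wise."""
    up = coarse.repeat(stride, axis=1).repeat(stride, axis=2)
    return up + fine
```

Chaining such fusions (as in FCN-16s and FCN-8s) recovers the spatial detail lost to pooling while keeping the deep layers' semantic accuracy.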


  45. Conclusion • Fully convolutional networks are a rich class of models, of which modern classification convnets are a special case. • Recognizing this, extending these classification nets to segmentation, and improving the architecture with multi-resolution layer combinations dramatically improves the state-of-the-art, while simultaneously simplifying and speeding up learning and inference.

  46. References • J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu. Semantic segmentation with second-order pooling. In ECCV, 2012. • D. C. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber. Deep neural networks segment neuronal membranes in electron microscopy images. In NIPS, pages 2852–2860, 2012. • J. Dai, K. He, and J. Sun. Convolutional feature masking for joint object and stuff segmentation. In CVPR, 2015. • J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. In ICML, 2014. • Y. Ganin and V. Lempitsky. N4-Fields: Neural network nearest neighbor fields for image transforms. In ACCV, 2014. • S. Mallat. A wavelet tour of signal processing. Academic Press, 2nd edition, 1999. • N. Zhang, J. Donahue, R. Girshick, and T. Darrell. Part-based R-CNNs for fine-grained category detection. In ECCV, pages 834–849. Springer, 2014.

  47. JOTS: Joint Online Tracking and Segmentation Presented By: Muhammad Ali Mahmood Published in: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Date of Conference: 7-12 June 2015
