E-V: Efficient Visual Surveillance with Electronic Footprints

E-V: Efficient Visual Surveillance with Electronic Footprints. Jin Teng, Junda Zhu, Boying Zhang, Dong Xuan and Yuan F. Zheng IEEE Infocom 2012. Outline. Deficiency of Visual Surveillance Systems A Brief of Our E-V System A Case Study A Broader View of Our E-V System Final Remarks.

Presentation Transcript


  1. E-V: Efficient Visual Surveillance with Electronic Footprints • Jin Teng, Junda Zhu, Boying Zhang, Dong Xuan and Yuan F. Zheng • IEEE Infocom 2012

  2. Outline • Deficiency of Visual Surveillance Systems • A Brief of Our E-V System • A Case Study • A Broader View of Our E-V System • Final Remarks • 2014/11/16

  3. Visual Surveillance

  4. Failure Examples • Chicago police installed 10,000 surveillance cameras in the city, yet only 1 in 200 crimes is captured by visual surveillance [2]! • One of the bombers in the London bombings (July 2005) was not identified by the surveillance system and escaped [3]!

  5. Why fail? • Large volume of video data • Temporal: 2.07 × 10^6 frames per camera per day • Spatial: a very large number of surveillance cameras in a city, e.g., New York has 4176 video cameras in the lower Manhattan area [1] † • Monitored objects may be visually occluded or have multiple inconsistent appearances • Visual technologies are not efficient and accurate enough for automatic localization and tracking, so a lot of human effort is needed! † Big Apple is Watching You: http://www.slate.com/articles/news_and_politics/explainer/2010/05/big_apple_is_watching_you.html
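
The temporal figure on this slide is consistent with a camera recording around the clock at 24 frames per second. The slide does not state a frame rate, so 24 fps is an assumption made here only to show where a number of this size can come from:

```python
# Assumed frame rate (not stated on the slide); 24 fps reproduces the quoted figure.
ASSUMED_FPS = 24
SECONDS_PER_DAY = 24 * 60 * 60

frames_per_day = ASSUMED_FPS * SECONDS_PER_DAY
print(f"{frames_per_day:.2e} frames per camera per day")
```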

  6. Outline • Deficiency of Visual Surveillance Systems • A Brief of Our E-V System • A Case Study • A Broader View of Our E-V System • Final Remarks

  7. Our Methodology: E-V Integration • Combining electronic and visual signals for efficient surveillance • E-V integration makes it possible to efficiently and accurately localize and identify objects in a large volume of video data

  8. Electronic Signals • Wireless channels: • Wireless address, such as a WiFi MAC address • Content, etc.

  9. Pervasiveness of Electronic Signals • Electronic signals are emitted by many mobile devices • The popularity of mobile devices is increasing • Smartphones as an example: 302.6 million shipped in 2010

  10. Our E-V System: A Bird’s Eye View

  11. Our E-V System: Layers • Specific Applications: Surveillance, Training, Health • Technologies: Localization, Identification, Other Technologies • Sensing Methods: Electronic, Visual, Other Signals

  12. Related Work on E-V Integration • Fuse multiple sensors for tracking [4] • Visual camera + RFID for monitoring [5] • Existing work cannot achieve accuracy and efficiency for visual surveillance at the same time!

  13. Outline • Deficiency of Visual Surveillance Systems • A Brief of Our E-V System • A Case Study • A Broader View of Our E-V System • Final Remarks

  14. A Typical Surveillance Scenario • Find a specific person given some vague visual information, i.e., retrieve that person's appearances in videos spanning a long period of time • If we depend on videos alone, we may need to • Extract all human figures in each frame, which may number in the thousands, and compare them with a designated vague picture • Involve a large amount of human effort to watch the videos, which may last several hours or even days, from a number of cameras • With E-V integration, what can we do?

  15. Problem Formulation: Notations • V-sensing: V-ID and V frame • V-ID: visual identity, such as a human figure • VID*: our target V-ID • V frame: a set of V-IDs with some background, captured by visual sensors (cameras) in a certain area and time • E-sensing: E-ID and E frame • E-ID: electronic identity, such as a MAC address • EID*: our target E-ID • E frame: a set of E-IDs captured by electronic sensors in a certain area and time • Vagueness and completeness • Vagueness: reflects how clearly a V-ID/E-ID can be identified • Completeness: reflects whether the V-IDs/E-IDs in a V/E frame are complete
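
The notation above can be sketched as a pair of small data structures. This is only an illustration: the class names, the `area`/`time` fields, the `corresponds` helper and the sample identifiers are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EFrame:
    """A set of E-IDs (e.g., MAC addresses) sensed in one area at one time."""
    area: str
    time: int
    eids: frozenset


@dataclass(frozen=True)
class VFrame:
    """A set of V-IDs (e.g., detected human figures) seen in one area at one time."""
    area: str
    time: int
    vids: frozenset


def corresponds(e_frame: EFrame, v_frame: VFrame) -> bool:
    """An E frame and a V frame correspond if they cover the same area and time."""
    return (e_frame.area, e_frame.time) == (v_frame.area, v_frame.time)


# Hypothetical sample data; "EID*" stands in for the target's MAC address.
e1 = EFrame("gym", 0, frozenset({"EID*", "EID1"}))
v1 = VFrame("gym", 0, frozenset({"figure-a", "figure-b"}))
v2 = VFrame("library", 0, frozenset({"figure-c"}))
print(corresponds(e1, v1), corresponds(e1, v2))
```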

  16. Problem Formulation: Cases • General case: • Input: EID* (and VID*), and a set of E frames and corresponding V frames • Output: VID* in video frames • Baseline case: • Input: a clear EID* (and a vague VID*), a set of E frames with clear and complete E-IDs, and V frames with vague and complete V-IDs • Output: VID* in video frames (VID* may differ from the given vague VID*)

  17. A Naïve Solution to the Baseline Case • Two steps: • Step 1: Find all E frames that include EID* • Step 2: Identify VID* in their corresponding V frames • Comments: Fewer V frames to process because V frames without VID* are filtered out, but many V frames may still remain • [Figure: E frames 1-3 containing EID*, EID1, EID2 and EID3] Suppose we have three E/V frames. We go through them one by one.
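
Step 1 of the naïve solution amounts to a membership test per E frame. A minimal sketch, assuming each E frame is a plain set of E-ID strings; the function name and the three sample frames are hypothetical and are not the frames drawn on the slide:

```python
def naive_filter(e_frames, target_eid):
    """Return the indices of all E frames that contain the target E-ID."""
    return [i for i, frame in enumerate(e_frames) if target_eid in frame]


# Hypothetical E frames; only their V-frame counterparts need visual processing.
e_frames = [
    {"EID*", "EID1"},
    {"EID2", "EID3"},
    {"EID*", "EID2"},
]
kept = naive_filter(e_frames, "EID*")
print(kept)
```

Every kept index still points at a V frame that has to be searched for VID*, which is the cost the E-Filtering step tries to reduce.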

  18. Our Solution • E-Filtering • Find the minimum number of E frames whose intersection is exactly the given E-ID, i.e., EID* • Far fewer frames for further V-side processing • We formulate it as the Element Distinguishing Problem (EDP) • V-Retrieval • Retrieve the V-ID from the filtered frames through intersection to determine VID* • We formulate it as the n-partite Best Matching Problem (nBM)

  19. E-Filtering Overview • [Figure: three E frames containing EID*, EID1, EID2 and EID3, before and after filtering] • Two E frames are enough to identify EID* through intersection.
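
To make "enough to identify EID* through intersection" concrete, the sketch below searches exhaustively for the smallest group of E frames whose intersection is exactly the target. It is exponential in the number of frames and is meant only as a reference for tiny inputs, not as the method used in the paper; the function name and sample frames are hypothetical.

```python
from itertools import combinations


def min_distinguishing_frames(e_frames, target):
    """Smallest list of frame indices whose intersection is exactly {target}."""
    with_target = [i for i, frame in enumerate(e_frames) if target in frame]
    for size in range(1, len(with_target) + 1):
        for combo in combinations(with_target, size):
            common = set.intersection(*(set(e_frames[i]) for i in combo))
            if common == {target}:
                return list(combo)
    return None  # some other E-ID co-occurs with the target in every frame


# Hypothetical frames: no single frame isolates EID*, but two frames do.
frames = [
    {"EID*", "EID1", "EID2"},
    {"EID*", "EID3"},
    {"EID*", "EID2"},
]
print(min_distinguishing_frames(frames, "EID*"))
```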

  20. Nature of E-Filtering • Finding the minimum number of frames whose intersection is EID* • NP-complete: equivalent to the set cover problem • Whether each E-ID appears in each E frame is summarized in a matrix (rows: E frames, columns: E-IDs), with 1 meaning ‘appears’ and 0 ‘does not appear’ • The selected frames must contain at least one 0 in each non-EID* column • Use these 0s to ‘cover’ all non-EID* columns
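
The matrix view can be sketched as follows, assuming rows are E frames and columns are E-IDs. The helper names and sample data are hypothetical; the check simply restates the slide's condition that every non-EID* column needs a 0 in at least one selected row.

```python
def presence_matrix(e_frames, eids):
    """Rows are E frames, columns are E-IDs; 1 = appears, 0 = does not appear."""
    return [[1 if eid in frame else 0 for eid in eids] for frame in e_frames]


def rows_cover(matrix, eids, target, rows):
    """True if the selected rows all contain the target and every other
    column has at least one 0 among those rows."""
    t = eids.index(target)
    if any(matrix[r][t] == 0 for r in rows):
        return False
    return all(
        any(matrix[r][c] == 0 for r in rows)
        for c in range(len(eids))
        if c != t
    )


eids = ["EID*", "EID1", "EID2", "EID3"]
e_frames = [{"EID*", "EID1"}, {"EID*", "EID2", "EID3"}, {"EID*", "EID2"}]
matrix = presence_matrix(e_frames, eids)
print(matrix)
print(rows_cover(matrix, eids, "EID*", [0, 1]))
print(rows_cover(matrix, eids, "EID*", [1, 2]))
```

Rows 0 and 1 cover every non-target column, while rows 1 and 2 both contain EID2 and therefore cannot tell it apart from EID*.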

  21. Solution: EDP Algorithm • Element Distinguishing Problem (EDP) • The element to be distinguished is EID* • Greedily select the E frame in which the largest number of E-IDs can be told apart from EID* • In the example, the greedy algorithm selects e1 or e3 first, because either one tells us that two E-IDs are not EID* • Repeat the greedy selection until EID* is distinguishable

  22. EDP (cont’d) • Approximation results can be achieved with the greedy heuristic algorithm for the set cover problem
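
A minimal sketch of the greedy selection described on the two EDP slides, written in the style of the standard greedy set-cover heuristic. It is an illustration under assumptions rather than the authors' implementation: E frames are plain sets, ties go to the earliest frame, and the function name and sample frames are hypothetical.

```python
def greedy_edp(e_frames, target):
    """Greedily pick E frames until only the target survives their intersection."""
    # Only frames that contain the target can take part in the intersection.
    candidates = {i: set(f) for i, f in enumerate(e_frames) if target in f}
    if not candidates:
        return None
    # E-IDs that still have to be told apart from the target.
    remaining = set().union(*candidates.values()) - {target}
    chosen = []
    while remaining:
        # A frame distinguishes every remaining E-ID that is absent from it.
        best = max(candidates, key=lambda i: len(remaining - candidates[i]))
        gained = remaining - candidates[best]
        if not gained:
            return None  # some E-ID co-occurs with the target in every frame
        chosen.append(best)
        remaining -= gained
    return chosen or [min(candidates)]


# Hypothetical frames; frame 2 lacks the target and is ignored.
e_frames = [
    {"EID*", "EID1", "EID2"},
    {"EID*", "EID3"},
    {"EID2", "EID3"},
    {"EID*", "EID1"},
]
selected = greedy_edp(e_frames, "EID*")
print(selected)
```

Only the V frames paired with the selected E frames would then be passed on to V-Retrieval.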

  23. V-Retrieval • General problem • Find the corresponding VID* in the frames selected by E-Filtering • VID* is the only V-ID that should appear in all the frames remaining after E-Filtering, so an intersection operation can give VID* • Largest challenge • Indistinct V-IDs: we do not know for sure which person is which across different frames • Solution • nBM algorithm: find the VID with the largest probability of appearing in all V frames

  24. The nBM Algorithm • n-partite Best Match Problem (nBM) • Find the VID* that best matches the visual appearance of EID* • Put all VIDs in different frames in n different circles • n-partite graph (right) • Find whether a VID appears in each V frame based on similarity scores • Use the Maximum Likelihood criterion to choose the VID whose appearance/disappearance agrees best with that of EID* • A dummy VID indicates that VID1 is not similar to any VID in this frame
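
One way to read the maximum-likelihood idea is sketched below. It is a simplification under stated assumptions, not the nBM algorithm from the paper: each V-ID is reduced to a single hypothetical appearance value, `similarity` is a made-up score in (0, 1], per-frame scores are treated as independent probabilities, and `dummy_score` plays the role of the dummy VID when nothing in a frame looks similar.

```python
import math


def similarity(a, b):
    """Hypothetical similarity score in (0, 1] between two appearance values."""
    return math.exp(-abs(a - b))


def pick_best_vid(candidates, v_frames, dummy_score=0.1):
    """Pick the candidate whose best per-frame matches are most likely overall."""
    best_vid, best_loglik = None, -math.inf
    for cand in candidates:
        loglik = 0.0
        for frame in v_frames:
            # Best match in this frame, or the dummy VID if nothing is similar.
            score = max([similarity(cand, v) for v in frame] + [dummy_score])
            loglik += math.log(score)
        if loglik > best_loglik:
            best_vid, best_loglik = cand, loglik
    return best_vid


# Hypothetical V frames left after E-Filtering; one figure recurs near 0.1.
v_frames = [[0.1, 0.5, 0.9], [0.12, 0.55], [0.08, 0.95]]
target_vid = pick_best_vid(v_frames[0], v_frames)
print(target_vid)
```

The threshold-like `dummy_score` corresponds to the noise-resistance versus performance trade-off mentioned later in the deck: a higher value makes the sketch more tolerant of frames in which the target has no good match.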

  25. Practical Considerations • In the baseline case, we assumed that the information of E-IDs and V-IDs is complete • However, in realistic cases, we may have • Ghost V-IDs or missing V-IDs • Missing E-IDs • [Table of cases: √ = solved, □ = practical case of our focus; the baseline case is the one we have studied]

  26. Solutions to Practical Problems • Careful deployment • Make sure that the coverage of the camera and that of the wireless detectors are roughly the same • nBM is probability based, so it is naturally resistant to noise • Select an appropriate threshold in nBM for a better tradeoff between noise resistance and performance • Generalized EDP • Handles missing/ghost E-IDs • Introduces fuzzy logic to improve the robustness of EDP • Uses RSSI for estimation and smoothing • [Figure: presence (1/0) of EIDi over time, before and after smoothing]
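
The slide does not say how the smoothing is done. As one possible illustration, the sketch below applies a sliding-window majority vote to a 0/1 presence series for a single E-ID; the function name, the window size and the sample series are assumptions, and RSSI is not modeled.

```python
def smooth_presence(series, window=3):
    """Majority vote over a sliding window of 0/1 presence observations."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        neighborhood = series[max(0, i - half): i + half + 1]
        smoothed.append(1 if 2 * sum(neighborhood) > len(neighborhood) else 0)
    return smoothed


# Hypothetical observations: one missed detection (index 2), one ghost (index 8).
raw = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
print(smooth_presence(raw))
```

In this sample the isolated gap is filled and the isolated ghost detection is removed, which is the qualitative effect the before/after figure suggests.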

  27. A Quick Recap of Our Solutions

  28. Implementation • Real-world implementation • One camera viewing from above to collect V frames • 1-3 laptops nearby sniffing WiFi traffic to collect E frames • Tested on campus • Gymnasium • Library

  29. Experimental Evaluations • Real-world experiments • Scenario 1: Gymnasium, 6 people, 28 frames • Scenario 2: Library, 8 people, 40 frames • Successfully found the VID* • Minimum number of frames needed for Scenario 1 is 3, and we achieve 3 • Minimum number of frames needed for Scenario 2 is 3, and we achieve 4

  30. Large-Scale Simulation-Based Evaluations • Evaluation settings • Networks of cameras and wireless detectors at three locations • ~120 people moving randomly • Far fewer video frames to process (left) • High accuracy (right)

  31. E-V Surveillance: Problem Space • [Figure labels: Uncooperative, Cooperative, Tracking, Onsite, Offline]

  32. Final Remarks • Existing visual surveillance systems are not efficient • Our E-V system • Integrates E signals and V signals for efficient visual surveillance • Implemented in the real world • Many open issues remain; there is still a long way to go

  33. References [1] Big Apple is Watching You: http://www.slate.com/articles/news_and_politics/explainer/2010/05/big_apple_is_watching_you.html [2] http://articles.chicagotribune.com/2010-05-06/news/ct-oped-0506-chapman-20100506_1_surveillance-cameras-vandalism-effect-on-violent-crime [3] http://news.bbc.co.uk/2/hi/4659093.stm [4] D. Smith, et al., “Approaches to Multisensor Data Fusion in Target Tracking: A Survey”, Knowledge and Data Engineering, IEEE Transactions on, 2006. [5] S. Cho, et al., “Association and Identification in Heterogeneous Sensors Environment with Coverage Uncertainty”, IEEE Advanced Video and Signal Based Surveillance, 2009.

  34. Backup Slides

  35. A Case Study • A typical surveillance scenario • Problem formulation in E-V integration • Our solution • Implementation and evaluations

  36. GEDP Algorithm • Clearly NP-hard • We can reduce EDP to GEDP • Heuristic algorithm based on the subset sum approximation algorithm

  37. The nBM Algorithm • Similarity matrix for all V-IDs which have appeared • n-partite Best Match Problem (nBM) • Find the VID* that best matches the visual appearance of EID* • Put all VIDs in different frames in n different circles • n-partite graph (right)

  38. nBM (cont’d) • Maximum Likelihood matching • Given the observed VID1 … VIDm, which VID is the best candidate? • Calculate the probability of each VIDi across all V frames • Select the VID with the largest probability • [Figure labels: “VID1 is in v2, and appears as VID2”; “VID1 is not in v2”]
