ETISEO: Video Understanding Performance Evaluation
Francois BREMOND, A.T. Nghiem, M. Thonnat, V. Valentin, R. Ma
Orion project-team, INRIA Sophia Antipolis, FRANCE
Francois.Bremond@sophia.inria.fr
http://www-sop.inria.fr/orion/ETISEO/
Outline • Introduction • ETISEO Project • Video Data • ETISEO Results • Metric Analysis • ETISEO General Conclusion
Introduction • There are many evaluation initiatives with different objectives • Individual works • Projects: CAVIAR, ILids, VACE, CLEAR, CANTATA, … • Workshops: PETS, VS, AVSS (CREDS), … • Issues: • No standard annotation (ground truth) • Lack of analysis of video data • which specific video processing problems a sequence contains • how difficult these problems are • Lack of analysis of metrics • only raw numbers, no baseline algorithm
ETISEO Project • 2-year duration, from January 2005 to December 2006 • Aim: to evaluate vision techniques for video surveillance applications • Goals: • Unbiased and transparent evaluation protocol (no funding) • Large involvement (32 international teams) • Meaningful evaluation • provide the strengths and weaknesses of metrics • help developers detect specific shortcomings depending on • scene type (apron, building entrance, etc.) • video processing problem (shadows, illumination change, etc.) • difficulty level (e.g. strong or weak shadows)
ETISEO Project • Approach: 3 critical evaluation concepts • Ground truth definition • Rich and up to the event level • Give clear and precise instructions to the annotator • E.g., annotate both the visible and the occluded parts of objects • Selection of test video sequences • Follow a specified characterization of problems • Study one problem at a time, at several levels of difficulty • Metric definition • various metrics for each video processing task • Performance indicators: sensitivity, precision and F-score • A flexible, automatic evaluation tool and a visualization tool
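To make the three performance indicators named above concrete, here is a minimal Python sketch (an illustration, not the ETISEO evaluation tool itself); the match counts tp, fp and fn against the ground truth are assumed to be computed elsewhere.

```python
def safe_div(num: float, den: float) -> float:
    """Avoid division by zero for empty frames or sequences."""
    return num / den if den else 0.0

def performance_indicators(tp: int, fp: int, fn: int):
    """Sensitivity (recall), precision and F-score from match counts.

    tp: detections matched to a reference object (true positives)
    fp: detections with no matching reference object (false positives)
    fn: reference objects missed by the algorithm (false negatives)
    """
    sensitivity = safe_div(tp, tp + fn)
    precision = safe_div(tp, tp + fp)
    f_score = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return sensitivity, precision, f_score

# Example: 40 matched objects, 10 spurious detections, 20 missed objects
print(performance_indicators(40, 10, 20))   # -> (0.666..., 0.8, 0.727...)
```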
ETISEO Project: Large participation (16 active international teams) • 4 companies: Barco; Capvidia NV; VIGITEC SA/NV; Robert Bosch GmbH • 12 academics: Lab. LASL University ULCO Calais; Nizhny Novgorod State University; Queen Mary, University of London; Queensland University of Technology; INRIA-ORION; University of Southern California; Université Paris Dauphine; University of Central Florida; University of Illinois at Urbana-Champaign; University of Maryland; University of Reading; University of Udine
ETISEO: Video Data • Large annotated data set • 85 video clips with GT • organized by • scene type: apron, building entrance, corridor, road, metro station • video processing problem: noise, shadow, crowd, … • sensor type: single/multi-view, visible/IR, compression, …
Video Data: Airport Apron • Multi-view • Silogic, Toulouse, France
Video Data: INRETS • Building Entrance • Car Park • Light Changes • INRETS-LEOST, Villeneuve d’Ascq, France
Video Data: CEA • Corridor • Street • Video Type & Quality
Video Data: RATP • Subway • People Density
ETISEO: Results • Detection of physical objects • [Chart: detection rate of the 16 teams, evaluated on 6 videos]
ETISEO: Results • Tracking of physical objects • [Chart: detection rate per team]
ETISEO: Results • Good performance comparison per video: automatic, reliable, consistent metrics • 16 participants: • 8 teams achieved high-quality results • 9 teams performed event recognition • 10 teams produced results on all priority sequences • Best algorithms: combine moving regions and local descriptors • A few limitations: • Algorithm results depend on processing time (real time or not), manpower (parameter tuning), previous similar experience, whether a learning stage is required, …: questionnaire • Lack of understanding of the evaluation rules (output XML, time-stamp, ground truth, number of processed videos, frame rate, start frame, …) • Video subjectivity: background, masks, GT (static, occluded, far, portable, contextual object, event) • Many metrics and evaluation parameters • Just evaluation numbers, no baseline algorithm • Need for two further analyses: • Metric Analysis: define for each task: • main metrics: discriminative and meaningful • complementary metrics: provide additional information • Video Data Analysis: impact of videos on evaluation • define a flexible evaluation tool to adapt the GT to the videos
Metric Analysis: Object detection task • Main metric: Number of objects • Evaluates the number of detected objects matching reference objects, using bounding boxes • Unbiased towards large, homogeneous objects • Does not capture object detection quality well • Complementary metric: Object area • Evaluates the number of pixels in the reference data that have been detected • Evaluates object detection quality • Biased towards large, homogeneous objects
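As an illustration of how the two metric families differ, here is a hedged Python sketch; the IoU-based matching rule, the 0.5 overlap threshold and the use of boolean NumPy masks are assumptions made for the example, not the exact ETISEO matching procedure.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) bounding boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def number_of_objects(detected_boxes, reference_boxes, thr=0.5):
    """Main metric: how many reference objects are matched by some detection."""
    return sum(any(iou(r, d) >= thr for d in detected_boxes) for r in reference_boxes)

def object_area(detected_mask: np.ndarray, reference_mask: np.ndarray) -> float:
    """Complementary metric: fraction of reference pixels that are detected."""
    total = reference_mask.sum()
    return float((detected_mask & reference_mask).sum() / total) if total else 0.0
```

In this picture, the single large car counts for one unit in number_of_objects but for many pixels in object_area, which is why the area metric is biased towards large objects.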
Metric Analysis: Example (1) • Sequence ETI-VS2-BE-19-C1 has one big object (car) and several small, weakly contrasted objects (people) • Algorithm 9 correctly detects more objects than algorithm 13 (metric: Number of objects) • Performance results using the metric “number of objects”:
Algorithm  9     1     14    28    12    13    32    8     19    20    17    29    3     15
F-score    0.49  0.49  0.42  0.40  0.39  0.37  0.37  0.35  0.33  0.32  0.30  0.24  0.17  0.11
Metric Analysis: Example (2) • Using the metric Object area, biased towards the big object (car): • algorithm 13 cannot detect some small objects (people) • algorithm 9 has detected difficult objects, but at low precision • The metric Object area is still useful: • it differentiates algorithms 1 and 9: both are good at detecting objects, but algorithm 1 is more precise • Performance results using the metric “object area”:
Algorithm  1     13    9     32    14    12    20    19    28    17    3     29    8     15
F-score    0.83  0.71  0.69  0.68  0.65  0.65  0.64  0.64  0.59  0.55  0.54  0.51  0.50  0.30
Metric Analysis: Advantages & Limitations • Advantages: • various metrics for every video processing task • analysis of the metrics' strengths and weaknesses and of how to use them • insight into video analysis algorithms: for example, shadows, object merging • Still some limitations: • Evaluation results are useful for developers but not for end-users • This is acceptable: ETISEO is neither a competition nor a benchmark • But it is difficult to judge whether one algorithm is good enough for a particular application or type of video
ETISEO: Video Data Analysis • ETISEO limitations: • Generalization of evaluation results is subjective: • comparing tested and new videos • Selection of videos according to difficulty levels is subjective • Videos have only a qualitative scene description, e.g. strong or weak shadow • Two annotators may assign two different difficulty levels • One video may contain several video processing problems at many difficulty levels • The global difficulty level is not sufficient to identify an algorithm's specific problems for improvement
Video Data Analysis • Objectives of Video Data Analysis: • Study dependencies between videos and video processing problems to • characterize videos with objective difficulty levels • determine an algorithm's capacity to solve one video processing problem • Approach: • Treat each video processing problem separately • Define a measure to compute difficulty levels of videos (or other input data) • Select videos containing only the current problem, at various difficulty levels • For each algorithm, determine the highest difficulty level for which this algorithm still has acceptable performance • Approach validation: applied to two problems • Detection of weakly contrasted objects • Detection of objects mixed with shadows
Video Data Analysis: Detection of weakly contrasted objects • Video processing problem definition: the lower the object contrast, the worse the object detection performance • For one algorithm, determine the lowest object contrast for which this algorithm has acceptable performance • Issue: one blob may contain many regions at several contrast levels
Video Data Analysis: Conclusion • Achievements: • An evaluation approach to generalise evaluation results • Implementation of this approach for 2 problems • Limitations: • Needs to be validated on more problems • Works well if the video contains only one problem • If not, it only detects an upper bound on the algorithm's capacity • The difference between this upper bound and the real performance may be significant if: • the test video contains several video processing problems • the same set of parameters is tuned differently to adapt to several dependent problems
General Conclusion • Achievements: • Good performance comparison per video: automatic, reliable, consistent metrics • Emphasis on gaining insight into video analysis algorithms (shadows, occlusion, …) • A few limitations: • Data and rule subjectivity: background, masks, ground truth, … • Partial solutions for metric and video dependencies • Future improvements: a flexible evaluation tool • Given a video processing problem: • selection of metrics • selection of reference videos • selection of ground truth: filters for reference data, sparse GT for long videos • ETISEO's video dataset and automatic evaluation tools are publicly available for research purposes: http://www-sop.inria.fr/orion/ETISEO/
Video Data Analysis: Detection of weakly contrasted objects • At each contrast level, the algorithm performance is x/m, where • x: number of blobs containing the current contrast level that are detected by the given algorithm • m: total number of blobs containing the current contrast level • Algorithm capacity: the lowest contrast level for which the algorithm performance is greater than a given threshold
Video Data Analysis: Detection of weakly contrasted objects • Error rate threshold to determine algorithm capacity: 0.5
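One possible reading of the capacity computation on the two slides above, as a short Python sketch; the per-contrast-level counts x and m are assumed to be extracted beforehand from the ground truth and the detection results, and the 0.5 threshold is applied here to the performance x/m rather than to an error rate.

```python
def algorithm_capacity(m_per_level, x_per_level, threshold=0.5):
    """Lowest contrast level at which performance x/m exceeds the threshold.

    m_per_level: {contrast_level: m}  blobs in the ground truth containing that level
    x_per_level: {contrast_level: x}  of those blobs, how many the algorithm detected
    """
    for level in sorted(m_per_level):        # scan from lowest (hardest) contrast upward
        m = m_per_level[level]
        x = x_per_level.get(level, 0)
        if m and x / m > threshold:
            return level
    return None                              # performance never reaches the threshold

# Hypothetical counts for contrast levels 1 (weak) to 4 (strong)
m = {1: 20, 2: 18, 3: 15, 4: 30}
x = {1: 6, 2: 10, 3: 12, 4: 29}
print(algorithm_capacity(m, x))              # -> 2, i.e. the algorithm copes with contrast >= level 2
```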