Artificial Vision in Road Vehicles

By: Massimo Bertozzi, Alberto Broggi, Massimo Cellario, Alessandra Fascioli, Paolo Lombardi, and Marco Porta • Presented by: Ali Agha • April 6th, 2009

Main Motivation • Help the driver in case of failure, for example, due to a lack of concentration or due to drowsiness.

Active Sensors in ITS • Examples: laser-based sensors and millimeter-wave radars • Drawbacks: low spatial resolution; slow scanning speed; high cost; reflection problems; the maximum signal level must comply with safety rules; interference among sensors of the same type • Advantages: mm-wave radars are robust to rain and fog; they measure some quantities, such as movement, in a more direct way; they require less powerful computing resources, as they acquire a considerably lower amount of data.

Passive Sensors in ITS • Example: vision-based sensors • Advantages: noninvasive; useful in specific applications (lane localization, traffic sign recognition, obstacle identification) • Drawbacks: not robust in fog, at night, or in direct sunshine.

Considerations in designing a vision system for ITS applications • ITS systems require faster processing than other applications, since vehicle speed is bounded by the processing rate. • Computing engines cannot be based on expensive processors. • No assumptions can be made about scene illumination or contrast; processing should be robust to environmental conditions such as sun, rain, or fog, and to their dynamic changes, such as transitions between sun and shadow or entering and exiting a tunnel. • The system must be robust to the vehicle’s movements and must handle drifts in the camera’s calibration.

Road Following

Lane Detection • Two basic problems in lane detection are: • The presence of shadows • Occlusion caused by other vehicles

Focus of attention • Due to both physical and continuity constraints, the processing of the whole image can be replaced by the analysis of specific regions of interest only (the so-called focus of attention), in which the features of interest are more likely to be found.

Focus of attention • J. Goldbeck, et al. (1998) • R. Chapuis, et al. (2001) • Employs a model of both the road and the vehicle’s dynamics to determine the road portion where lane markings are most likely to be found.

Focus of attention • J. Goldbeck, et al. (1998) • R. Chapuis, et al. (2001) • Windows of interest (WOIs) are determined dynamically by means of statistical methods, according to the current state and the previously detected WOIs.

Fixed Lane Width • The assumption of a fixed or smoothly varying lane width allows the enhancement of the search criterion, limiting the search to almost parallel lane markings.

Fixed Lane Width • K. Kim, et al. (1995) • D. Pomerleau, et al. (1996) • Lane markings can be detected using both neural networks and simple vision algorithms: two parallel stripes of the acquired image are selected and filtered using Gaussian masks and zero crossings to find vertical edges. The result is matched against a given model.
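
The Gaussian-mask and zero-crossing step can be illustrated on a single image row. The following is a minimal sketch under stated assumptions, not the cited authors' implementation: it filters one row with a sampled second derivative of a Gaussian and reports strong sign changes as vertical-edge columns. The function names, the kernel size, and the `min_response` threshold are all hypothetical.

```python
import math

def second_derivative_of_gaussian(sigma, radius):
    """Sampled 1-D second derivative of a Gaussian, shifted so it sums to zero."""
    raw = [
        (x * x / sigma**4 - 1.0 / sigma**2) * math.exp(-x * x / (2.0 * sigma**2))
        for x in range(-radius, radius + 1)
    ]
    mean = sum(raw) / len(raw)
    return [value - mean for value in raw]  # flat regions now respond with ~0

def filter_row(row, kernel):
    """Correlate a row with the (symmetric) kernel at fully overlapping positions."""
    width = len(kernel)
    return [
        sum(row[start + j] * kernel[j] for j in range(width))
        for start in range(len(row) - width + 1)
    ]

def vertical_edge_columns(row, sigma=1.5, radius=4, min_response=1.0):
    """Return the column just left of each strong zero crossing of the response."""
    response = filter_row(row, second_derivative_of_gaussian(sigma, radius))
    edges = []
    for i in range(len(response) - 1):
        left, right = response[i], response[i + 1]
        rising = left < -min_response and right > min_response
        falling = left > min_response and right < -min_response
        if rising or falling:
            edges.append(i + radius)  # response[i] is centred on column i + radius
    return edges

# Synthetic row (assumed data): dark asphalt with a bright marking on columns 20..25.
row = [10] * 20 + [200] * 6 + [10] * 20
edge_columns = vertical_edge_columns(row)
```

Requiring both neighbours to exceed `min_response` with opposite signs rejects the near-zero responses of flat regions, at the price of missing a crossing that lands exactly on a sample.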

Fixed Lane Width • K. Kim, et al. (1995) • D. Pomerleau, et al. (1996) • Based on the processing of an image portion; the curvature is determined according to a number of possible curvature models.

Road Shape • The reconstruction of road geometry can be simplified by assumptions on its shape.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Lane markings are modeled as clothoids. In a clothoid, the curvature varies linearly with the curvilinear abscissa (arc length). The advantage of this model is that knowing only two parameters allows the full localization of lane markings and the computation of other parameters.
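
To make the two-parameter clothoid model concrete, the sketch below samples a curve whose curvature is `c0 + c1 * s` at arc length `s`. It is an illustration only: the parameter names `c0` and `c1`, the start pose (origin, heading along x), and the midpoint integration scheme are assumptions, not taken from the cited systems.

```python
import math

def clothoid_points(c0, c1, length, steps=1000):
    """Sample a clothoid starting at the origin with heading along the x axis.

    Curvature at arc length s is c0 + c1 * s, so the heading is its integral:
    heading(s) = c0 * s + 0.5 * c1 * s**2.
    """
    ds = length / steps
    x = y = 0.0
    points = [(x, y)]
    for i in range(steps):
        s_mid = (i + 0.5) * ds  # midpoint rule over each small segment
        heading = c0 * s_mid + 0.5 * c1 * s_mid * s_mid
        x += math.cos(heading) * ds
        y += math.sin(heading) * ds
        points.append((x, y))
    return points

# Special cases: c0 = c1 = 0 is a straight line, c1 = 0 is a circular arc.
straight = clothoid_points(0.0, 0.0, 50.0, steps=100)
quarter_circle = clothoid_points(0.01, 0.0, math.pi * 50.0)  # radius 100, 90 degrees
```

The straight line and the circular arc fall out as special cases, which is one reason a single clothoid segment can cover common road geometries.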

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • A dynamic programming optimization method is used to choose among center-line candidates representing the actual geometry of the road.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Uses a polynomial representation for lane markings: they are modeled as parabolas, and a simplified Hough transform performs the fitting.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Relies on a polynomial curve. It assumes a flat road with either continuous or dashed bright lane markings. The history of previously located lane markings is used to determine the region of interest.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Exploits a polynomial road model to calculate the impact distance from the vehicle to the nearest road side: the straight-line trajectory followed if the driver loses control is intersected with the polynomial function describing the road side.
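
The impact-distance idea reduces to intersecting a line with a polynomial. The sketch below assumes a second-degree road-side polynomial `y = a0 + a1*x + a2*x**2` in vehicle coordinates (x forward, y lateral, in meters) and a straight trajectory `y = slope * x`; the degree, the coordinate frame, and all names are illustrative assumptions rather than details of the cited work.

```python
import math

def impact_distance(a0, a1, a2, slope=0.0):
    """Distance travelled along y = slope * x before reaching the road side.

    The road side is y = a0 + a1*x + a2*x**2. Returns None when the straight
    trajectory never meets it ahead of the vehicle (x > 0).
    """
    # Intersection condition: a2*x^2 + (a1 - slope)*x + a0 = 0
    b = a1 - slope
    if abs(a2) < 1e-12:  # road side is a straight line
        roots = [] if abs(b) < 1e-12 else [-a0 / b]
    else:
        disc = b * b - 4.0 * a2 * a0
        if disc < 0.0:
            roots = []
        else:
            sq = math.sqrt(disc)
            roots = [(-b - sq) / (2.0 * a2), (-b + sq) / (2.0 * a2)]
    ahead = [x for x in roots if x > 0.0]
    if not ahead:
        return None
    # Convert the forward coordinate into path length along the trajectory.
    return min(ahead) * math.hypot(1.0, slope)

# Assumed example: road side 2 m to one side, bending across the vehicle's path.
distance = impact_distance(a0=-2.0, a1=0.0, a2=0.005)
```

Dividing the returned distance by the current speed would give a rough time to impact, which is the kind of quantity a warning function could threshold.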

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Proposes concentric circles to represent lane boundaries; circular shape models can in fact be a better choice than polynomial approximations.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Adopts a more generic road model and uses a contour-based method. In practice, the model covers only straight or slightly curved roads without intersections. The road model is used to follow contours formed by pixels that feature a significant gradient direction value.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Uses an edge-linking process based on Contour Chains and Causal Neighborhood Windows (areas of interest connected to edge elements). After an initial segmentation phase, the longest chains with slope angles close to 45 and 135 degrees are searched for.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • Based on a linear lane model, in which road markers are reconstructed as sequences of straight lines.

Road Shape • Lützeler, et al. & Franke, et al. (1998) & Goldbeck, et al. (1999) • K. A. Redmill, et al. (2001) • S. L. Michael, et al. (1997) • K. A. Redmill, et al. (1997) • F. Chausse, et al. (2000) • J. Goldbeck, et al. (1999) • R. Risack, et al. (1998) • S. M. Wong, et al. (1999) • X. Youchun, et al. (2000) • A. Broggi, et al. (1995) & S. Denasi, et al. (1994) • A generic triangular road model is proposed.

A priori knowledge on the road surface/slope • Knowledge of the specific camera calibration, together with a priori knowledge of the road (e.g., a flat road without bumps), can be exploited to ease the localization of features and/or to simplify the mapping between image pixels and their corresponding world coordinates.

A priori knowledge on the road surface/slope • M. Bertozzi, et al. (1998) • The GOLD system exploits the assumption of a flat road in front of the vehicle. Lane marking detection is performed in a different image domain, a bird’s eye view of the road, which can be obtained thanks to the flat road assumption.
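
The flat-road assumption is what makes a pixel-to-ground mapping possible: each pixel below the horizon defines a ray that meets the road plane at exactly one point. The sketch below shows this for an idealized pinhole camera with no roll, yaw, or lens distortion. It is not the GOLD implementation, and every parameter name (`focal_px`, `cam_height`, `pitch`, and so on) is an assumption made for illustration.

```python
import math

def pixel_to_ground(u, v, focal_px, cx, cy, cam_height, pitch):
    """Map pixel (u, v) to flat-road coordinates (forward, left) in meters.

    Assumes a pinhole camera cam_height above the road, pitched down by
    `pitch` radians. Returns None for pixels at or above the horizon.
    """
    xn = (u - cx) / focal_px  # normalized image coordinate, positive to the right
    yn = (v - cy) / focal_px  # normalized image coordinate, positive downward
    # Downward component of the viewing ray; it must point toward the road.
    down = yn * math.cos(pitch) + math.sin(pitch)
    if down <= 1e-9:
        return None
    scale = cam_height / down  # ray length factor at which the ray hits the road
    forward = scale * (math.cos(pitch) - yn * math.sin(pitch))
    left = -scale * xn
    return forward, left

# Assumed setup: camera 1.5 m high, aimed at the road point 30 m ahead.
pitch = math.atan2(1.5, 30.0)
center_hit = pixel_to_ground(320, 240, 800.0, 320.0, 240.0, 1.5, pitch)
```

Applying this mapping to every pixel of a resampling grid yields the bird's eye view; any bump or slope that violates the flat-road assumption shows up as distortion in that view.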

Lane Detection (Summary)

Pedestrian Detection

Segmentation with motion • Pros: • Uses temporal information and is reliable • Cons: • Does not detect standing pedestrians • Needs a sequence of a few frames • Detects only moving objects, not their velocity

Segmentation with motion • R. Polana et al. (1995) • S. McKenna et al. (1997) • R. Cutler et al. (2000) • Motion detection with optical flow: they analyze the scene with a discrete cube, assigning to each region its average optical flow.

Segmentation with motion • R. Polana et al. (1995) • S. McKenna et al. (1997) • R. Cutler et al. (2000) • Uses a zero-crossing detection algorithm based on convolution with a spatio-temporal Gaussian.

Segmentation with motion • R. Polana et al. (1995) • S. McKenna et al. (1997) • R. Cutler et al. (2000) • Uses a subtraction between an image at time t and a version of the same image stabilized with respect to the image at instant t - τ.
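
Stabilized frame differencing can be sketched in a few lines. The example below is a simplification under loud assumptions: stabilization is reduced to a known integer translation `(dx, dy)` between the two frames, whereas a real system would have to estimate the camera motion. The function names and the threshold are hypothetical.

```python
def shift_image(image, dx, dy):
    """Translate an image by integer (dx, dy); uncovered pixels become None."""
    height, width = len(image), len(image[0])
    shifted = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted

def motion_mask(current, previous, dx, dy, threshold):
    """Mark pixels whose gray level changed after aligning `previous` to `current`."""
    stabilized = shift_image(previous, dx, dy)
    return [
        [
            1 if ref is not None and abs(cur - ref) > threshold else 0
            for cur, ref in zip(cur_row, ref_row)
        ]
        for cur_row, ref_row in zip(current, stabilized)
    ]

# Assumed data: a static gradient scene seen one pixel further right in the
# current frame (camera motion), plus one genuinely new bright pixel.
previous = [[10 * x for x in range(6)] for _ in range(4)]
current = [[0] + row[:-1] for row in previous]
current[1][3] = 255
mask = motion_mask(current, previous, dx=1, dy=0, threshold=30)
```

Pixels with no aligned counterpart are left unmarked, so the image border does not produce false motion.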

Segmentation with Stereo • D. Beymer et al. (1999) • L. Zhao et al. (2000) • In surveillance applications, stereo analysis is sometimes used as a cue to build a disparity map of the background for use with background subtraction.

Segmentation with Stereo • D. Beymer et al. (1999) • L. Zhao et al. (2000) • Range thresholding based on stereo analysis is used for segmentation.
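
Range thresholding can be illustrated with the standard rectified-stereo relation depth = focal length × baseline / disparity. The sketch below assumes a precomputed disparity map in pixels, with 0 meaning "no match"; the parameter names and the depth band are hypothetical and not taken from the cited work.

```python
def range_mask(disparity, focal_px, baseline_m, z_min, z_max):
    """Keep pixels whose stereo depth lies inside [z_min, z_max] meters."""
    mask = []
    for row in disparity:
        mask_row = []
        for d in row:
            if d > 0:  # non-positive disparity is treated as "no measurement"
                depth = focal_px * baseline_m / d
                mask_row.append(1 if z_min <= depth <= z_max else 0)
            else:
                mask_row.append(0)
        mask.append(mask_row)
    return mask

# Assumed rig: 800 px focal length, 0.3 m baseline, so depth = 240 / disparity.
disparity = [[0, 12, 24],
             [48, 24, 12]]
foreground = range_mask(disparity, 800.0, 0.3, z_min=8.0, z_max=15.0)
```

Only the pixels at 10 m (disparity 24) survive here; the nearer and farther pixels are discarded, which is how a depth band isolates candidate objects from the background.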

Focus of Attention • A. Broggi et al. (2000) • C. Curio et al. (2000) • Salient regions in suitable feature maps are interpreted as pedestrian candidates. In the GOLD system, vertical symmetries are associated with candidates for standing pedestrians.
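
A vertical symmetry cue can be scored by mirroring gray levels around a candidate column. The following is only a sketch of that idea, not the GOLD system's actual symmetry measure: the cost function, the `half_width` parameter, and the exhaustive search over columns are assumptions.

```python
def asymmetry_cost(image, axis, half_width):
    """Sum of absolute gray-level differences between mirrored columns."""
    cost = 0
    for row in image:
        for k in range(1, half_width + 1):
            cost += abs(row[axis - k] - row[axis + k])
    return cost

def best_symmetry_axis(image, half_width):
    """Return the column with the lowest mirror-difference cost."""
    width = len(image[0])
    candidates = range(half_width, width - half_width)
    return min(candidates, key=lambda axis: asymmetry_cost(image, axis, half_width))

# Assumed data: three identical rows with a pattern mirrored around column 5.
row = [0, 0, 0, 9, 5, 1, 5, 9, 0, 0, 0]
image = [row, row, row]
axis = best_symmetry_axis(image, half_width=2)
```

A uniform region is also perfectly symmetric under this cost, so in practice such a score would need to be combined with another cue, for example edge content, before a column is accepted as a candidate.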

Focus of Attention • A. Broggi et al. (2000) • C. Curio et al. (2000) • The focus of attention is directed by a combination of 1) a map of the local image entropy, 2) a model-matching module using a shape that represents human legs, and 3) inverse perspective mapping (binocular vision) for the short-distance field.
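
A local entropy map assigns each image window the Shannon entropy of its gray-level histogram, so textured regions score high and uniform ones score zero. The sketch below is a generic illustration of that measure; the window size, the sliding scheme, and the function names are assumptions, not details of the cited system.

```python
import math
from collections import Counter

def patch_entropy(patch):
    """Shannon entropy (bits) of the gray-level histogram of a patch."""
    values = [v for row in patch for v in row]
    total = len(values)
    counts = Counter(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def entropy_map(image, size):
    """Entropy of every size x size window, indexed by its top-left corner."""
    height, width = len(image), len(image[0])
    return [
        [
            patch_entropy([row[x:x + size] for row in image[y:y + size]])
            for x in range(width - size + 1)
        ]
        for y in range(height - size + 1)
    ]

# Assumed data: a flat area on the left, a gray-level change on the right.
image = [[0, 0, 1],
         [0, 0, 1]]
local_entropy = entropy_map(image, size=2)
```

Thresholding such a map is one simple way to discard uniform regions such as sky or road surface before running more expensive matching.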

Recognition of Human Gait • C. Wohler et al. (2000) • R. Cutler et al. (2000) • C. Curio et al. (2000) • The ATDNN performs local spatio-temporal processing to detect the typical pattern of the movement.

Recognition of Human Gait • C. Wohler et al. (2000) • R. Cutler et al. (2000) • C. Curio et al. (2000) • The periodicity of the human gait is often recognized with traditional methods such as the Fourier transform.
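
Detecting periodicity with the Fourier transform amounts to finding the strongest non-zero frequency in a per-frame measurement. The sketch below uses a plain discrete Fourier transform on a hypothetical 1-D signal (for instance, some silhouette measurement sampled once per frame); the choice of signal and the function name are assumptions, not part of the cited methods.

```python
import cmath
import math

def dominant_period(signal):
    """Return the period (in samples) of the strongest DFT component, or None."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # drop the constant (zero-frequency) part
    best_bin, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return None if best_bin is None else n / best_bin

# Assumed data: 64 frames of a measurement that repeats every 16 frames.
signal = [math.sin(2.0 * math.pi * t / 16.0) for t in range(64)]
period = dominant_period(signal)
```

A gait detector could then check whether the recovered period falls inside a plausible range for walking at the camera's frame rate.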

Recognition of Human Gait • C. Wohler et al. (2000) • R. Cutler et al. (2000) • C. Curio et al. (2000) • The detected periodic movement is correlated with an experimental curve derived from the statistical average of human gait periods.

Recognition of Human Shape • D. Beymer et al. (1999) & A. Broggi et al. (2000) • H. Fujiyoshi et al. (2000) • D. Gavrila et al. (2000) • V. Philomin et al. (2000) • C. Papageorgiou et al. (1999) & A. Mohan et al. (2001) • Employs a model for the head and shoulders. This approach is very sensitive to scale variation, so multiple models at different scales are needed.

Recognition of Human Shape • D. Beymer et al. (1999) & A. Broggi et al. (2000) • H. Fujiyoshi et al. (2000) • D. Gavrila et al. (2000) • V. Philomin et al. (2000) • C. Papageorgiou et al. (1999) & A. Mohan et al. (2001) • Uses a skeletonization procedure: the centroid of the area is calculated first, then the distances from the centroid to each border point.
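
The centroid and border-distance computation can be sketched on a binary silhouette mask. This is an illustration under assumptions: the mask is non-empty, a border pixel is any foreground pixel with a 4-connected background (or out-of-image) neighbor, and the points are listed in scan order rather than traced along the contour. All function names are hypothetical.

```python
import math

def centroid(mask):
    """Mean (x, y) position of the foreground pixels of a non-empty mask."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    count = len(points)
    return sum(x for x, _ in points) / count, sum(y for _, y in points) / count

def border_points(mask):
    """Foreground pixels that touch background or the image edge (4-neighborhood)."""
    height, width = len(mask), len(mask[0])

    def is_background(x, y):
        return not (0 <= x < width and 0 <= y < height) or not mask[y][x]

    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if mask[y][x] and any(is_background(x + dx, y + dy) for dx, dy in neighbors)
    ]

def centroid_distances(mask):
    """Distance from the centroid to every border pixel, in scan order."""
    cx, cy = centroid(mask)
    return [math.hypot(x - cx, y - cy) for x, y in border_points(mask)]

# Assumed data: a 3 x 3 blob centred in a 5 x 5 mask.
mask = [[1 if 1 <= x <= 3 and 1 <= y <= 3 else 0 for x in range(5)] for y in range(5)]
distances = centroid_distances(mask)
```

Peaks in the distance values point at protruding parts of the silhouette; locating them reliably would require ordering the border points along the contour, which this sketch does not do.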

Recognition of Human Shape • D. Beymer et al. (1999) & A. Broggi et al. (2000) • H. Fujiyoshi et al. (2000) • D. Gavrila et al. (2000) • V. Philomin et al. (2000) • C. Papageorgiou et al. (1999) & A. Mohan et al. (2001) • Generic forms are tried first, followed by similar, more detailed shapes.

Recognition of Human Shape • D. Beymer et al. (1999) & A. Broggi et al. (2000) • H. Fujiyoshi et al. (2000) • D. Gavrila et al. (2000) • V. Philomin et al. (2000) • C. Papageorgiou et al. (1999) & A. Mohan et al. (2001)

Pedestrian Detection (Summary)
