
Place Recognition and Lifelong Mapping

Place Recognition and Lifelong Mapping. Kurt Konolige, James Bowman, JD Chen, Patrick Mihelich (Willow Garage); Michael Calonder, Vincent Lepetit, Pascal Fua (Ecole Polytechnique Fédérale de Lausanne). Konolige et al., View-Based Maps, RSS 2009.




  1. Place Recognition and Lifelong Mapping • Kurt Konolige, James Bowman, JD Chen, Patrick Mihelich (Willow Garage) • Michael Calonder, Vincent Lepetit, Pascal Fua (Ecole Polytechnique Fédérale de Lausanne) • Konolige et al., View-Based Maps, RSS 2009 • Konolige and Bowman, Lifelong Visual Maps, IROS 2009 • Konolige et al., Mapping, Navigation and Learning for Off-road Traversal, JFR 2008 • Konolige and Agrawal, FrameSLAM: from Bundle Adjustment to Realtime Visual Mapping, TRO 2008

  2. Willow Garage • PR2 Mobile Manipulation Platform • Open-source robotics software • ROS • OpenCV • Robotics and vision algorithms

  3. From 2D laser maps to VIEW MAPS • Locally metric • Global manifold [figure: poses p1, p2, p3]

  4. VSLAM by VIEW MAPS • View Maps: set of stereo views connected by nonlinear Gaussian constraints • Locally metric • Global manifold • Continuous recognition • Continuous detection • TORO [Grisetti et al.] [figure: poses p1, p2, p3]

  5. Crusher Visual Odometry • Stereo: [Matthies, Lacroix, Agrawal, Comport, …] • Monocular: [Nister05, …] • Multi-frame: [Engels06, Mouragnon06, …] • CRUSHER: Carnegie Mellon NREC vehicle • 5 km autonomous traverse • Rough terrain • Log file data

  6. Place Recognition: Vocabulary Trees [Nister and Stewenius CVPR06] • “Bag of words” retrieval • Vocab tree created offline • For recognition: • Image keypoints extracted • Tree encodes approximate NN search • Inverted index of images at leaves • Related work: [Cummins and Newman ICRA07, Callmer et al. ACRA08, Fraundorfer et al. IROS07, Eade and Drummond BMVC08, Williams et al. ICCV07] [Image from Nister and Stewenius CVPR06]
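The retrieval step on slide 6 can be illustrated with a toy inverted index. This is a minimal sketch, not the authors' implementation: it assumes every keypoint has already been quantized to an integer visual-word id (the job of the vocabulary tree, which is omitted here), and the class name `InvertedIndex` and the IDF-weighted overlap score are illustrative choices rather than the scoring used in the talk.

```python
from collections import Counter, defaultdict
import math


class InvertedIndex:
    """Toy bag-of-words image index: visual word id -> {image id: count}."""

    def __init__(self):
        self.postings = defaultdict(dict)   # word -> {image_id: term count}
        self.num_images = 0

    def add(self, image_id, words):
        """Register an image given the visual-word ids of its keypoints."""
        for word, count in Counter(words).items():
            self.postings[word][image_id] = count
        self.num_images += 1

    def query(self, words):
        """Return (image_id, score) pairs, best first, by IDF-weighted overlap."""
        scores = defaultdict(float)
        for word, q_count in Counter(words).items():
            hits = self.postings.get(word, {})
            if not hits:
                continue                    # word never seen in the database
            # Rare words are more discriminative, so they get a larger weight.
            idf = math.log((1 + self.num_images) / len(hits))
            for image_id, d_count in hits.items():
                scores[image_id] += idf * min(q_count, d_count)
        return sorted(scores.items(), key=lambda kv: -kv[1])
```

Because only the posting lists of the query's words are visited, lookup cost depends on how many images share those words rather than on the total database size, which is what makes adding and querying images online practical.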

  7. Place Recognition: Vocabulary Trees [Nister and Stewenius] • “Bag of words” • Vocab tree created offline • New images queried and added online • Performance on Indoor dataset

  8. Geometric Check How good a rejection filter is the geometric check?

  9. Kidnapped Robot / Relocalization

  10. Trajectory synthesis

  11. Indoor VSLAM with View Maps

  12. Place Recognition after 1 week

  13. Challenges • Robust place recognition • Use more stable features, e.g., lines [Jana Kosecka] • Learn discriminating features with their geometry • Relax the geometry • Sub-parts: chairs, tables can move • No geometry, e.g., FAB-MAP [Cummins and Newman] • Map repair: how to integrate new information • Update local metric maps with changes • What happens when PR fails?

  14. Visual environment change • Challenges for lifelong maps: • Map stitching • Map repair • View deletion • Robust recognition

  15. View deletion strategy • View clusters • Distance measure between views • c(v,v’) = k/m – 1, k inliers in m matches • A cluster of set S is a maximally connected subset of S • Neighborhood of v is a set S reachable from v within a distance n_d and angle n_a • LRU algorithm • Max size Q for any neighborhood • Preferentially thin clusters • Delete oldest clusters if necessary
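The deletion policy on slide 15 can be sketched as follows. This is an illustrative reading of the bullets, not the published algorithm: `close(u, v)` is a hypothetical predicate (for example, a threshold on the view distance c), `last_seen` is an assumed timestamp table, and the rule "drop the oldest view of the largest cluster, then fall back to the oldest singleton" is one plausible interpretation of "preferentially thin clusters".

```python
def clusters(views, close):
    """Connected components of `views` under the symmetric predicate `close`."""
    remaining, result = set(views), []
    while remaining:
        stack = [remaining.pop()]
        component = set(stack)
        while stack:
            v = stack.pop()
            linked = {u for u in remaining if close(v, u)}
            remaining -= linked
            component |= linked
            stack.extend(linked)
        result.append(component)
    return result


def prune(views, close, last_seen, max_size):
    """Delete views until at most `max_size` (Q on the slide) remain."""
    views = set(views)
    while len(views) > max_size:
        biggest = max(clusters(views, close), key=len)
        if len(biggest) > 1:
            # Thin the largest cluster first: drop its oldest view.
            victim = min(biggest, key=lambda v: last_seen[v])
        else:
            # Only singleton clusters are left: drop the oldest one.
            victim = min(views, key=lambda v: last_seen[v])
        views.remove(victim)
    return views
```

Thinning clusters before deleting whole ones keeps at least one exemplar of each distinct appearance of a place for as long as the size budget allows.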

  16. Visual Odometry • Stereo: [Matthies, Lacroix, Agrawal, Comport, …] • Monocular: [Nister05, …] • Multi-frame: [Engels06, Mouragnon06, …] - no registration - high precision • Indoor: Willow Garage PR2 • 1 km indoor trajectories • Online

  17. Urban Scenes [images courtesy Andrew Comport, INRIA] • Outdoor sequence in Versailles • 1 m stereo baseline, narrow FOV • ~400 m sequence • Average frame distance: 0.6 m • Max frame distance: 1.1 m • 30 - 88 Hz implementation

  18. LAGR [Learning Applied to Ground Robotics] 200 m autonomous traverse Off-road terrain 15 Hz implementation Autonomous Off-Road Terrain Traversal

  19. Visual SLAM Optimal solution: Bundle Adjustment • ~1000 camera poses • ~1M 3D points • Several days to solve • NxN image matching

  20. Visual SLAM • Landmarks: EKF Visual SLAM [Davison02, Sim03, Solá05, …] - small-scale, O(n^2) - robustness? • FastSLAM [Se03, Eade07, Howard07] - large-scale, O(log n) • Hybrid (PTAM, Submaps, SWF) [Klein07, Eade07, Sibley07] - small scale • Frames: Frame-based SLAM [Lu+Milios97, Gutmann99, Grisetti07, Konolige07/08] - large-scale, O(n) - robustness • ~1000 camera poses • ~1M 3D points • Several days to solve • NxN image matching

  21. Vision Tasks (realtime) • Local Maps • Long-range motion estimation • Global Maps – Place recognition and local map re-use [Andrew Comport ICRA 2007]

  22. Visual Odometry for Motion Estimation • Stereo: [Matthies, Lacroix, Agrawal, Comport, …] • Monocular: [Nister05, …] • Multi-frame: [Engels06, Mouragnon06, …] - no registration - precision? • Local Maps: no registration • Long-range motion estimation • GPS-less estimation [figure: poses p1, p2, p3 and q1, q2, q3]

  23. 6DOF Visual Odometry Principle (SfM)

  24. Visual Odometry [figure: left/right images at times T and T+1] • Extract features - Harris, FAST, SIFT, CenSurE • Match features • DETECTION, not TRACKING • Across successive left images • Stereo: Across left/right stereo images • Find largest consistent subset of matches • Stereo: 3 non-collinear matches yield motion estimate • Monocular: 5 matches yield motion estimate* • RANSAC method • Bundle adjust last N frames and their feature tracks
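The stereo case on slide 24 (three non-collinear matches give a motion hypothesis, and RANSAC keeps the largest consistent subset) can be sketched with NumPy. This is a simplified illustration under stated assumptions: matched features are taken to be already triangulated into 3D points in each frame, consistency is measured by 3D distance with a hypothetical tolerance `tol` (a real system would more likely score reprojection error in the image), and the least-squares fit uses the standard SVD (Kabsch) method.

```python
import numpy as np


def rigid_transform(src, dst):
    """Least-squares R, t with dst ~= R @ src + t (Kabsch / SVD method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, dst_c - R @ src_c


def ransac_motion(src, dst, iters=200, tol=0.05, seed=0):
    """Find the largest set of 3D-3D matches consistent with one rigid motion."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        pick = rng.choice(len(src), size=3, replace=False)   # minimal sample
        R, t = rigid_transform(src[pick], dst[pick])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best.sum():
            best = inliers
    R, t = rigid_transform(src[best], dst[best])             # refit on inliers
    return R, t, best
```

The same inlier count doubles as a geometric check for place recognition: a candidate match returned by the vocabulary tree can be rejected when too few matches agree on a single motion.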

  25. Challenge of Outdoor Environments 5 Datasets - 3 km to 6 km trajectories (autonomous) - 10 Hz stereo, 1 m baseline - Max movement typically 0.8 m - RTK GPS for ground truth

  26. Solutions • Goal: 5 m error in 5 km (0.1%); 1 mrad ~ 0.06 deg • 1. Minimize local drift - Center-surround features for detection stability - Incremental BA - Calibration (remove bias) • 2. Minimize global angular drift - Lever-arm problem - IMU accelerometers give global tilt/roll - Low-drift IMU for yaw drift - Visual SLAM for loop closure
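The numbers on slide 26 follow from a small-angle lever-arm estimate. As a worked sketch (the symbols L for path length and θ for a constant heading error are introduced here for illustration), a heading error of 1 mrad held over a 5 km straight path displaces the end point by about 5 m:

```latex
\[
e \approx L\,\theta = 5000\ \mathrm{m} \times 10^{-3}\ \mathrm{rad} = 5\ \mathrm{m},
\qquad
\frac{e}{L} = \frac{5\ \mathrm{m}}{5000\ \mathrm{m}} = 0.1\%,
\qquad
1\ \mathrm{mrad} = \frac{180^\circ}{\pi} \times 10^{-3} \approx 0.057^\circ \approx 0.06^\circ .
\]
```

This is why the slide treats global angular drift separately from local drift: a very small orientation error dominates the position error once the trajectory is long.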

  27. Stable Feature Detection • Corners vs. Center-surround • Corners: Harris, FAST (~8 ms) • Center-surround: scaled SIFT (~300 ms), SURF (~150 ms), CenSurE (~15 ms) • Agrawal, Blas, Konolige, CenSurE: Center-surround extrema for realtime feature detection and matching, ECCV 2008

  28. Error and Calibration camera T vehicle trajectory, m Camera to vehicle transform T misalignment Stereo system miscalibration => bias trajectory, m

  29. Results, VO 5 km runs RTK GPS Ground Truth Run 1 Run 2

  30. IMU vs. VO • IMU: • High XYZ drift from accelerometers (t^2) • Global gravity normal (noisy) – correct tilt/roll • Low drift yaw angle (~ 1 deg/hr, tactical grade IMU)

  31. VO + IMU • EKF filter: VO feeds the predict step, IMU feeds the update step [figure: filter block diagram; movieIMU.mov]

      Dataset               Length    RMS error         MAX error
      course1-DTED4-run2    3129 m    5.70 m (0.18%)    10.06 m (0.32%)
      course2B-DTED4-run4   6440 m    5.10 m (0.08%)     8.19 m (0.13%)
      course2B-DTED5-run1   4712 m    6.09 m (0.13%)    10.70 m (0.23%)
      course3-DTED5-run1    5293 m    4.85 m (0.09%)     8.58 m (0.16%)
      course3-DTED4-run1    4920 m    9.16 m (0.19%)    15.30 m (0.31%)
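The predict/update split named on slide 31 can be illustrated with a scalar Kalman filter on a single angle such as yaw. This is a deliberately reduced sketch, not the filter used in the talk: the real system estimates a full 6DOF state, while here the state is one angle, the noise variances `q` and `r` are hypothetical tuning values, and angle wrap-around is ignored.

```python
def predict(x, p, vo_delta, q):
    """Propagate angle x (variance p) by a VO increment with noise variance q."""
    return x + vo_delta, p + q


def update(x, p, imu_angle, r):
    """Correct the estimate with an absolute IMU angle of noise variance r."""
    k = p / (p + r)                      # Kalman gain
    return x + k * (imu_angle - x), (1.0 - k) * p
```

Each VO increment grows the variance, which is the drift the earlier slides describe; each absolute IMU reading shrinks it again, so the estimate stays bounded over long runs.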

  32. VO Conclusion • 1. Visual Odometry can provide precise trajectories in GPS-less environments • - Good features have high frame match rates • - Incremental bundle adjustment improves accuracy • ~ 5 cm / √m, ~0.15 deg / √m • 2. Integration with IMU is necessary for large-scale precision • - Noisy gravity normal corrects tilt/roll • - High-quality IMU for yaw correction

  33. Visual SLAM using Skeletons • Local registration is a small optimization problem (VO) • Loop closure is a larger but reducible optimization problem

  34. Marginalization c q z
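Marginalization, as named on slide 34, can be sketched for a linearized Gaussian system in normal-equation form H x = b. This is a generic Schur-complement illustration under that assumption, not the FrameSLAM code; the function name `marginalize` and the index lists `keep` and `drop` are introduced here for the example.

```python
import numpy as np


def marginalize(H, b, keep, drop):
    """Eliminate the `drop` variables from the symmetric system H x = b."""
    Hkk = H[np.ix_(keep, keep)]
    Hkd = H[np.ix_(keep, drop)]
    Hdd = H[np.ix_(drop, drop)]
    W = Hkd @ np.linalg.inv(Hdd)
    # Schur complement: the reduced system has the same solution for `keep`.
    return Hkk - W @ Hkd.T, b[keep] - W @ b[drop]
```

Solving the reduced system gives the same values for the kept variables as solving the full one, which is the sense in which feature variables can be removed while their information is retained as constraints between frames.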

  35. Long-Baseline Matching • Match using CenSurE features • Good matches up to 10 m baseline • High sensitivity • High selectivity • High accuracy • Not invariant to Z-axis rotation • Example: Frame 9 vs. Frame 463, 6.42 m distance, 866 features, 315 matched, 101 inliers

  36. FrameSLAM Results, Versailles Rond • 133 frames, 29 links • 35 ms PCG • VO result vs. FrameSLAM result

  37. FrameSLAM Results, Indoor Lab [courtesy Robert Sim] • Indoor lab sequence • 12 cm stereo baseline, wide FOV • ~100 m sequence, ~8200 key frames • 17 tack points in the VSLAM graph

  38. FrameSLAM Results, Indoor Lab [courtesy Robert Sim] • Indoor lab sequence • 12 cm stereo baseline, wide FOV • ~100 m sequence, ~8200 key frames • Green crosses are uncorrected VO; cyan environment points • Red segments are VSLAM-corrected poses; blue environment points

  39. Challenge of Outdoor Environments 5 Datasets - 3 km to 6 km trajectories (autonomous) - 10 Hz stereo, 1 m baseline - Max movement typically 0.8 m - RTK GPS for ground truth

  40. FrameSLAM Results, Crusher 5K x 2 • VO run 1 • VO run 2 • RTK GPS run 1 • 42K key frames • 2.2K link frames • 286 links • 3.3 s PCG

  41. FrameSLAM Results, Crusher 5K x 2 • VO run 1 • VO run 2 • RTK GPS run 1 • 42K key frames • 2.2K link frames • 286 links • 3.3 s PCG

  42. Small-area 3D Reconstruction Leaving Flatland Morisset, Subramanian [SRI] Rusu [TUM]

  43. 3D Reconstruction Pipeline VSLAM Maps IMU, Odometry Stereo images Hokuyo point cloud 3D Pose estimation Place recognition Octree voxels Meshes Registered Point Clouds Planes

  44. FrameSLAM Conclusion • VO provides accurate local registration • Reduction to frame-frame constraints eliminates all feature variables • => approximation • Further reductions of frames to skeletons gives compact system • => Large systems can be solved quickly • Some method of place recognition is required for closing loops • Many … [Ishiguro01, Ulrich00, Barbosa02, …] • Recent: [Cummins07, Pronobis06, …] • In small areas, realtime 3D reconstruction
