Thesis

Simultaneous Recognition, Localization and Mapping for Wearable Visual Robots

R O Castle, The University of Oxford, December 2009.

Abstract

With the advent of ever smaller and more powerful portable computing devices, and ever smaller cameras, wearable computing is becoming more feasible. The ever increasing numbers of augmented reality applications are allowing users to view additional data about their world overlaid on their world using portable computing devices.

The main aim of this research is to enable a user of a wearable robot to explore large environments automatically viewing augmented reality at locations and on objects of interest.

To implement this research a wearable visual robotic assistant is designed and constructed. Evaluation of the different technologies results in a final design that combines a shoulder mounted self stabilizing active camera, and a hand held magic lens into a single portable system.

To enable the wearable assistant to locate known objects, a system is designed that combines an established method for appearance-based recognition with one for simultaneous localization and mapping using a single camera. As well as identifying planar objects, the objects are located relative to the camera in 3D by computing the image-to-database homography. The 3D positions of the objects are then used as additional measurements in the SLAM process, which routinely uses other point features to acquire and maintain a map of the surroundings, irrespective of whether objects are present or not.

The monocular SLAM system is then replaced with a new method for building maps and tracking. Instead of tracking and mapping in a linear frame-rate driven manner, this adopted method separates the mapping from the tracking. This allows higher density maps to be constructed, and provides more robust tracking. The flexible framework provided by this method is extended to support multiple independent cameras, and multiple independent maps, allowing the user of the wearable two-camera robot to escape the confines of the desk top and explore arbitrarily sized environments.

The final part of the work brings together the parallel tracking and multiple mapping system with the recognition and localization of planar objects from a database. The method is able to build multiple feature rich maps of the world and simultaneously recognize, reconstruct and localize objects within these maps. The object reconstruction process uses the spatially separated keyframes from the tracking and mapping processes to recognize and localize known objects in the world. These are then used for augmented reality overlays related to the objects.