Conference Papers

MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices

Kejie Li, Jia-Wang Bian, Robert Castle, Philip H.S. Torr, and Victor Adrian Prisacariu, Proc IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, British Columbia, June 18 – 22, 2023. arXiv:2303.01932

Paper Project Page

Abstract

High-quality 3D ground-truth shapes are critical for 3D object reconstruction evaluation. However, it is difficult to create a replica of an object in reality, and even 3D reconstructions generated by 3D scanners have artefacts that cause biases in evaluation. To address this issue, we introduce a novel multi-view RGBD dataset captured using a mobile device, which includes highly precise 3D ground-truth annotations for 153 object models featuring a diverse set of 3D structures. We obtain precise 3D ground-truth shapes without relying on high-end 3D scanners by utilising LEGO models with known geometry as the 3D structures for image capture. The distinct data modality offered by high-resolution RGB images and low-resolution depth maps captured on a mobile device, when combined with precise 3D geometry annotations, presents a unique opportunity for future research on high-fidelity 3D reconstruction. Furthermore, we evaluate a range of 3D reconstruction algorithms on the proposed dataset.
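
As a rough illustration of how reconstructions are scored against such ground-truth geometry, the sketch below computes the accuracy / completeness / F-score protocol that is standard for object-reconstruction benchmarks. This is a minimal sketch, not the paper's released evaluation code; the point-cloud inputs and the 2.5 mm threshold are assumptions.

    import numpy as np
    from scipy.spatial import cKDTree

    def f_score(pred_pts, gt_pts, tau=0.0025):
        """pred_pts (N,3), gt_pts (M,3): surface samples in metres."""
        # Accuracy: fraction of predicted points within tau of the ground truth.
        precision = np.mean(cKDTree(gt_pts).query(pred_pts)[0] < tau)
        # Completeness: fraction of ground-truth points within tau of the prediction.
        recall = np.mean(cKDTree(pred_pts).query(gt_pts)[0] < tau)
        denom = precision + recall
        return 0.0 if denom == 0 else 2 * precision * recall / denom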

Object Recognition and Localization While Tracking and Mapping

R O Castle and D W Murray, Proc 8th IEEE/ACM International Symposium on Mixed and Augmented Reality, Orlando, Florida, Oct 19 – 22, 2009. doi:10.1109/ISMAR.2009.5336477

Paper Poster

Abstract

This paper demonstrates how objects can be recognized, reconstructed, and localized within a 3D map, using observations and matching of SIFT features in keyframes. The keyframes arise as part of a frame-rate process of parallel camera tracking and mapping, in which the keyframe camera poses and 3D map points are refined using bundle adjustment. The object reconstruction process runs independently of, and in parallel to, the tracking and mapping processes. Detected objects are automatically labelled on the user’s display using predefined annotations. The annotations are also used to highlight areas of interest upon the objects to the user.
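
The core recognition step the abstract describes can be sketched with off-the-shelf OpenCV (not the authors' implementation): SIFT features from a stored object image are matched against a keyframe, and a RANSAC-fitted homography localizes the planar object. The file names are placeholders.

    import cv2
    import numpy as np

    sift = cv2.SIFT_create()
    obj_img = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)
    key_img = cv2.imread("keyframe.png", cv2.IMREAD_GRAYSCALE)
    kp_o, des_o = sift.detectAndCompute(obj_img, None)
    kp_k, des_k = sift.detectAndCompute(key_img, None)

    # Lowe's ratio test on 2-NN matches discards ambiguous correspondences.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des_o, des_k, k=2)
            if m.distance < 0.7 * n.distance]

    if len(good) >= 4:
        src = np.float32([kp_o[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp_k[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        # The homography maps the object image into the keyframe; combined
        # with the keyframe's bundle-adjusted camera pose, it fixes the
        # object's placement in the 3D map.
        H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)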

Video-rate Localization in Multiple Maps for Wearable Augmented Reality

R O Castle, G Klein, and D W Murray, Proc 12th IEEE International Symposium on Wearable Computers, Pittsburgh, PA, Sept 28 – Oct 1, 2008. This paper won the Best Paper award. doi:10.1109/ISWC.2008.4911577

Paper Demo Poster

Abstract

We show how a system for video-rate parallel camera tracking and 3D map-building can be readily extended to allow one or more cameras to work in several maps, separately or simultaneously. The ability to handle several thousand features per map at video-rate, and for the cameras to switch automatically between maps, allows spatially localized AR workcells to be constructed and used with very little intervention from the user of a wearable vision system. The user can explore an environment in a natural way, acquiring local maps in real-time. When revisiting those areas the camera will select the correct local map from store and continue tracking and structural acquisition, while the user views relevant AR constructs registered to that map.
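
The map-switching behaviour can be sketched as follows, assuming a per-map relocalization score; the relocalization_score method is a hypothetical stand-in for the system's keyframe-based relocalizer.

    class MultiMapTracker:
        def __init__(self, maps, min_score=0.6):
            self.maps = maps            # independently built local maps
            self.active = None          # map currently used for tracking
            self.min_score = min_score  # acceptance threshold (illustrative)

        def relocalize(self, frame):
            # Score the incoming frame against every stored map, e.g. by
            # comparing it to each map's keyframes, and switch to the best.
            scored = [(m.relocalization_score(frame), m) for m in self.maps]
            best_score, best_map = max(scored, key=lambda s: s[0])
            if best_score >= self.min_score:
                self.active = best_map  # resume tracking and mapping here
            return self.active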

Video-rate recognition and localization for wearable cameras

R O Castle, D J Gawley, G Klein, and D W Murray, Proc 18th British Machine Vision Conference, Warwick, Sept 2007. doi:10.5244/C.21.112

Paper

Abstract

Using simultaneous localization and mapping to determine the 3D surroundings and pose of a wearable or hand-held camera provides the geometrical foundation for several capabilities of value to an autonomous wearable vision system. The one explored here is the ability to incorporate recognized objects into the map of the surroundings and refer to them. Established methods for feature cluster recognition are used to identify and localize known planar objects, and their geometry is incorporated into the map of the surrounds using a minimalist representation. Continued measurement of these mapped objects improves both the accuracy of estimated maps and the robustness of the tracking system. In the context of wearable (or hand-held) vision, the system’s ability to enhance generated maps with known objects increases the map’s value to human operators, and also enables meaningful automatic annotation of the user’s surroundings.
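
One concrete way to localize a known planar object and add it to the map is to solve a planar PnP problem; the sketch below uses OpenCV under assumed values (illustrative intrinsics, an A4-sized object, hand-picked pixel coordinates). Composing the result with the camera's world pose from SLAM places the object's corners in the map.

    import cv2
    import numpy as np

    # Known A4-sized planar object: corners in its own frame (metres, z = 0).
    w, h = 0.210, 0.297
    obj_pts = np.array([[0, 0, 0], [w, 0, 0], [w, h, 0], [0, h, 0]], np.float32)
    # Matched corner locations in the image (illustrative pixel coordinates).
    img_pts = np.array([[310, 205], [480, 210], [475, 418], [305, 412]], np.float32)
    K = np.array([[525.0, 0.0, 320.0], [0.0, 525.0, 240.0], [0.0, 0.0, 1.0]])

    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)
    # Object corners in camera coordinates; a minimalist map representation
    # need only store these few points rather than a dense model.
    corners_cam = (R @ obj_pts.T + tvec).T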

Towards simultaneous recognition, localization and mapping for hand-held and wearable cameras

R O Castle, D J Gawley, G Klein, and D W Murray, Proc IEEE International Conference on Robotics and Automation, Rome, April 2007. doi:10.1109/ROBOT.2007.364109

Paper

Abstract

This paper presents a system which combines single-camera SLAM (Simultaneous Localization and Mapping) with established methods for feature recognition. Besides using standard salient image features to build an on-line map of the camera’s environment, this system is capable of identifying and localizing known planar objects in the scene, and incorporating their geometry into the world map. Continued measurement of these mapped objects improves both the accuracy of estimated maps and the robustness of the tracking system. In the context of hand-held or wearable vision, the system’s ability to enhance generated maps with known objects increases the map’s value to human operators, and also enables meaningful automatic annotation of the user’s surroundings. The presented solution lies between high-level enrichment of maps, such as scene classification, and efforts to introduce higher geometric primitives, such as lines, into probabilistic maps.
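
How object geometry might be folded into a probabilistic map can be sketched, in the abstract, as growing the filter state: the recognized object's points are appended to the state vector with a loose initial covariance that subsequent measurements shrink. This is a generic EKF-SLAM state augmentation, assumed for illustration rather than taken from the paper (a full treatment would also carry cross-covariances with the camera pose).

    import numpy as np

    def add_landmarks(x, P, new_pts, sigma0=0.5):
        """x: (n,) state; P: (n,n) covariance; new_pts: (m,3) object points."""
        k = new_pts.size
        x_new = np.concatenate([x, new_pts.ravel()])
        P_new = np.zeros((x.size + k, x.size + k))
        P_new[:x.size, :x.size] = P
        # Independent, loose prior on the new entries (illustrative value).
        P_new[x.size:, x.size:] = (sigma0 ** 2) * np.eye(k)
        return x_new, P_new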