In the past several years, the field of computer vision has chalked up significant achievements, fueled by new algorithms (such as deep neural networks), new hardware (such as consumer 3D cameras), and newly available processing power (such as GPUs). When we consider the problems that tomorrow's household robots and autonomous vehicles will have to solve, however, there is evidently still a long way to go. In this talk, I will discuss current work within Intel's Perceptual Computing group on a scene understanding pipeline, the aim of which is to enable a far more comprehensive understanding of an environment than existing techniques provide. Key elements of the pipeline are a 3D reconstruction of the scene geometry, followed by segmentation of likely object candidates, classification of those candidates, and finally 3D registration to align known models to the scene data. The overall effect is to move from a pixel-based reconstruction of the scene to one that integrates semantic understanding into the capture process.
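The staged structure described above (reconstruction, segmentation, classification, registration) can be sketched as a minimal pipeline. This is an illustrative sketch only: every function name, data structure, and stub below is a hypothetical placeholder of mine, not Intel's actual implementation or API, and each real stage would involve substantial algorithms (e.g. volumetric fusion, point-cloud clustering, learned classifiers, iterative alignment).

```python
from dataclasses import dataclass, field

# NOTE: all names and stub logic here are hypothetical illustrations of the
# pipeline's stage ordering, not the actual Intel Perceptual Computing code.

@dataclass
class Scene:
    point_cloud: list                       # fused 3D points from depth capture
    segments: list = field(default_factory=list)       # candidate object regions
    labels: list = field(default_factory=list)         # class label per segment
    aligned_models: list = field(default_factory=list) # model aligned per segment

def reconstruct(depth_frames):
    """Stage 1: fuse per-frame depth data into one 3D point cloud (stubbed
    here as simple concatenation; a real system would do volumetric fusion)."""
    return Scene(point_cloud=[p for frame in depth_frames for p in frame])

def segment(scene):
    """Stage 2: split the cloud into likely object candidates (stubbed here
    as a single segment; a real system would cluster the geometry)."""
    scene.segments = [scene.point_cloud]
    return scene

def classify(scene):
    """Stage 3: assign a class label to each segment (stubbed with a fixed
    label; a real system would run a learned classifier)."""
    scene.labels = ["object"] * len(scene.segments)
    return scene

def register(scene, model_library):
    """Stage 4: align a known 3D model to each labeled segment (stubbed as a
    dictionary lookup; a real system would run 3D registration such as ICP)."""
    scene.aligned_models = [model_library.get(label) for label in scene.labels]
    return scene

def pipeline(depth_frames, model_library):
    """Run the four stages in order: the output is no longer raw pixels but a
    scene annotated with segments, labels, and aligned models."""
    scene = reconstruct(depth_frames)
    scene = segment(scene)
    scene = classify(scene)
    return register(scene, model_library)
```

The point of the sketch is the data flow: each stage enriches the same scene representation, so the final output carries semantic annotations (labels, aligned models) rather than only reconstructed geometry.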