I will describe a new computational framework for estimating physical scene values, such as depth, surface normals, and optical flow, from visual measurements. The framework reasons with local models (such as planar depth, smooth shape, or affine motion) that are expected to hold piecewise across the scene. Inference is formulated as the minimization of a well-defined cost function, and can be carried out very efficiently by a network of nodes in which (a) each node reasons about the validity of the local model in one of a dense, overlapping set of regions that redundantly cover the image plane, and (b) all nodes collaborate to produce a globally consistent scene map. This architecture lends itself naturally to parallelization on multi-core and streaming hardware, and when evaluated on stereo estimation, it yields more accurate estimates than comparable variational and MRF-based inference methods.
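To make the two roles of the nodes concrete, the following is a minimal sketch of the consensus idea in Python: each "node" fits a local planar depth model over one of a dense set of overlapping patches, and a globally consistent map is formed by averaging the per-node predictions wherever regions overlap. The planar model, the patch layout, and the simple alternating fit-then-average loop here are illustrative assumptions on my part, not the exact cost function or update rules of the framework.

```python
# Sketch: overlapping-region nodes with a local planar model, combined
# by consensus averaging. Illustrative only; parameter names and the
# update scheme are assumptions, not the framework's actual algorithm.
import numpy as np

def fit_plane(xs, ys, zs):
    """Least-squares fit of a local planar model z = a*x + b*y + c."""
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    coeffs, *_ = np.linalg.lstsq(A, zs, rcond=None)
    return coeffs  # (a, b, c)

def consensus_depth(noisy_depth, patch=8, stride=4, iters=5):
    """Alternate between per-region plane fits and global averaging."""
    H, W = noisy_depth.shape
    z = noisy_depth.copy()
    ys, xs = np.mgrid[0:patch, 0:patch].astype(float)
    for _ in range(iters):
        acc = np.zeros_like(z)
        cnt = np.zeros_like(z)
        # (a) Each node reasons over one region of a dense, overlapping
        # set of patches that redundantly cover the image plane.
        for i in range(0, H - patch + 1, stride):
            for j in range(0, W - patch + 1, stride):
                zs = z[i:i+patch, j:j+patch]
                a, b, c = fit_plane(xs.ravel(), ys.ravel(), zs.ravel())
                acc[i:i+patch, j:j+patch] += a * xs + b * ys + c
                cnt[i:i+patch, j:j+patch] += 1
        # (b) All nodes collaborate: overlapping predictions are averaged
        # into a single globally consistent depth map.
        z = np.where(cnt > 0, acc / np.maximum(cnt, 1), z)
    return z

if __name__ == "__main__":
    # Usage: smooth a noisy, synthetic piecewise-planar depth map.
    gt = np.fromfunction(lambda y, x: np.where(x < 32, 0.5 * x, 40 - 0.2 * y),
                         (64, 64))
    noisy = gt + np.random.default_rng(0).normal(0.0, 1.0, gt.shape)
    est = consensus_depth(noisy)
    print("noisy RMSE:", np.sqrt(np.mean((noisy - gt) ** 2)))
    print("est   RMSE:", np.sqrt(np.mean((est - gt) ** 2)))
```

Note that the inner loop over patches is embarrassingly parallel, since each node's fit depends only on its own region; this is the property that makes the architecture amenable to multi-core and streaming implementations.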