A widespread motivation for merging machine learning and control is enabling decision systems to incorporate feedback from cameras, microphones, and other high-dimensional sensing modalities. This talk will highlight some of the pressing research challenges impeding such a merger. Grounding the discussion in the control of autonomous vehicles from vision alone, I will present a possible approach to designing robust controllers when the sensing modality is learned from rich perceptual data. This proposal combines first steps toward quantifying uncertainty in perception systems, designing robust controllers with such uncertainty in mind, and guaranteeing the performance of these designs.