Higher cortical areas carry a wide range of sensory, cognitive, and motor signals supporting complex goal-directed behavior. These signals are mixed in heterogeneous responses of single neurons, making it difficult to identify underlying mechanisms. I will present two approaches we developed for revealing interpretable circuit mechanisms from heterogeneous neural responses during cognitive tasks. First, I will show a flexible statistical framework for discovering single-trial neural population dynamics from spikes. Our framework simultaneously learns the dynamics and their nonlinear embedding in the neural activity space without rigid parametric assumptions. We applied this framework to recordings from the primate cortex during decision-making. The discovered dynamics were inconsistent with simple hypotheses proposed previously and instead revealed an attractor network mechanism. Second, I will show an approach for inferring an interpretable mechanistic model of a cognitive task—the latent circuit—from neural response data. Our theory enables us to causally validate the inferred circuit mechanism via patterned perturbations of activity and connectivity in the high-dimensional system. This work opens new possibilities for deriving testable mechanistic hypotheses from complex neural response data.