John R. Hershey

Model-based methods and deep neural networks have both been tremendously successful paradigms in machine learning. In model-based methods, problem domain knowledge can be built into the constraints of the model. In addition, unsupervised inference tasks, such as adaptation and clustering, are handled in a natural way. However, these benefits typically come at the expense of difficulties during inference.

In contrast, deterministic deep neural networks are constructed in such a way that inference is straightforward, and discriminative training is relatively easy. However, their typically generic architectures often make it unclear how to incorporate specific problem knowledge or to perform flexible tasks such as unsupervised inference.

This tutorial surveys a variety of frameworks that aim to provide the advantages of both approaches, including variational auto-encoders, neural conditional random fields, and "deep unfolding," in which the steps of inference algorithms become layers in a neural network. We show how such frameworks yield new understanding of conventional networks, and discuss novel networks for speech processing resulting from these frameworks. We then discuss what has been learned in recent work and provide a prospectus for future research in this area.
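
As a rough illustration of the deep unfolding idea, the sketch below unrolls a fixed number of iterations of an inference algorithm so that each iteration plays the role of one network layer with its own parameters. It is not taken from the tutorial: the choice of ISTA (iterative soft-thresholding for sparse coding) as the algorithm, and the names `unfolded_ista`, `steps`, and `thresholds`, are assumptions made only for this example. Only the forward pass is shown; in practice the per-layer parameters would be trained discriminatively.

```python
# Minimal sketch of deep unfolding, assuming ISTA as the inference algorithm.
# All names here are hypothetical and chosen for illustration only.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal step of an L1 penalty."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def unfolded_ista(y, A, steps, thresholds):
    """Forward pass: one ISTA iteration per layer, one (step, threshold) per layer."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    for step, thr in zip(steps, thresholds):
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        grad = matvec(At, residual)
        x = soft_threshold([xi + step * gi for xi, gi in zip(x, grad)], thr)
    return x

def objective(x, y, A, lam):
    """Sparse-coding objective: 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    r = [yi - pi for yi, pi in zip(y, matvec(A, x))]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)

# Toy problem: 2 observations, 3 dictionary atoms.
A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5]]
y = [1.0, 0.0]
lam = 0.1
num_layers = 5

# Tied parameters reproduce plain ISTA; untying and training them per layer
# is what would turn the unrolled algorithm into a learned network.
steps = [0.5] * num_layers
thresholds = [0.5 * lam] * num_layers

x_hat = unfolded_ista(y, A, steps, thresholds)
```

With tied values in `steps` and `thresholds`, the function is ordinary ISTA truncated to `num_layers` iterations. Allowing each layer its own values, and fitting them to data, gives a network whose architecture is inherited from the model-based algorithm.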

Prior to joining MERL in 2010, John spent five years at IBM's T.J. Watson Research Center in New York, leading a Noise Robust Speech Recognition team. He also spent a year as a visiting researcher in the speech group at Microsoft Research, after obtaining his Ph.D. from UCSD. He is currently working on machine learning for signal separation, speech recognition, language processing, and adaptive user interfaces.