Modeling Long-Distance Dependencies and Modular Representations for Natural Language Processing

Anna Rumshisky

Over the past several years, deep neural networks have produced stunning performance improvements in many areas of artificial intelligence, in some cases approaching or even surpassing human performance. Despite this success, some general intelligence tasks remain well out of reach of such models. This is particularly apparent in the more challenging natural language processing (NLP) tasks that involve retaining and reasoning over both accumulated knowledge and contextual information. I will argue that part of the reason is that there have been few, if any, recent attempts to align computational models with how humans use language.

In this talk, I will discuss our work on modeling two aspects of human language processing. The first of these concerns the functional specialization of language-processing regions in the brain. Can computational models support a representational modularity that mimics this property? I will present a neural network architecture that uses adversarial training to learn such a modular representation and attempts to dissociate meaning from form in linguistic input. I will show how this model can be used to generate sentences in a different style but with the same meaning as the original input. Second, I will address the well-known property of long-distance dependencies in language. In order to maintain coherent dialogue and form a consistent understanding of text, human speakers are able to process and maintain a representation of previously presented information. Can computational models mimic this ability to construct and maintain a global context representation? I will present a neural model that uses an updatable external memory component to capture contextual information, and uses it to extract information about the order and timing of events described in text.