Author Archives: James Pustejovsky

Fred Jelinek HU/CLSP PIRE Workshop 2014

July 7-11, 2014: Attending the Fred Jelinek HU/CLSP PIRE Workshop 2014 in Prague. Here are the slides from my talk, “Distinguishing Possible and Probably in Linguistic Theory”, on July 7, 2014.

Filed under News & Updates

ISOSpace in SemEval2015

ISOSpace, a specification language for spatial language, is being used in the upcoming SemEval 2015 task, SpaceEval. Check out the latest version of the specification here.

Filed under News & Updates

Natural Language Annotation for Machine Learning

James Pustejovsky and Amber Stubbs


Create your own natural language training corpus for machine learning. Whether you’re working with English, Chinese, or any other natural language, this hands-on book guides you through a proven annotation development cycle—the process of adding metadata to your training corpus to help ML algorithms work more efficiently. You don’t need any programming or linguistics experience to get started.

Using detailed examples at every step, you’ll learn how the MATTER Annotation Development Process helps you Model, Annotate, Train, Test, Evaluate, and Revise your training corpus. You also get a complete walkthrough of a real-world annotation project.

  • Define a clear annotation goal before collecting your dataset (corpus)
  • Learn tools for analyzing the linguistic content of your corpus
  • Build a model and specification for your annotation project
  • Examine the different annotation formats, from basic XML to the Linguistic Annotation Framework
  • Create a gold standard corpus that can be used to train and test ML algorithms
  • Select the ML algorithms that will process your annotated data
  • Evaluate the test results and revise your annotation task
  • Learn how to use lightweight software for annotating texts and adjudicating the annotations

This book is a perfect companion to O’Reilly’s Natural Language Processing with Python.

Publication Date: November 1, 2012

Pustejovsky, James and Amber Stubbs. Natural Language Annotation for Machine Learning. O’Reilly Publishers, 2012.

Filed under Uncategorized

Interpreting Motion: Grounded Representations for Spatial Language

Inderjeet Mani and James Pustejovsky

Interpreting Motion presents an integrated perspective on how language structures constrain concepts of motion and how the world shapes the way motion is linguistically expressed. Natural language allows for efficient communication of elaborate descriptions of movement without requiring a precise specification of the motion. Interpreting Motion is the first book to analyze the semantics of motion expressions in terms of the formalisms of qualitative spatial reasoning. It shows how motion descriptions in language are mapped to trajectories of moving entities based on qualitative spatio-temporal relationships. The authors provide an extensive discussion of prior research on spatial prepositions and motion verbs, devoting chapters to the compositional semantics of motion sentences, the formal representations needed for computers to reason qualitatively about time, space, and motion, and the methodology for annotating corpora with linguistic information in order to train computer programs to reproduce the annotation. The applications they illustrate include route navigation, the mapping of travel narratives, question-answering, image and video tagging, and graphical rendering of scenes from textual descriptions.

The book is written accessibly for a broad scientific audience of linguists, cognitive scientists, computer scientists, and those working in fields such as artificial intelligence and geographic information systems.

Publication Date: April 7, 2012


Filed under Uncategorized

The Generative Lexicon

James Pustejovsky

The Generative Lexicon presents a novel and exciting theory of lexical semantics that addresses the problem of the “multiplicity of word meaning”; that is, how we are able to give an infinite number of senses to words with finite means. The first formally elaborated theory of a generative approach to word meaning, it lays the foundation for an implemented computational treatment of word meaning that connects explicitly to a compositional semantics.

In contrast to the static view of word meaning (where each word is characterized by a predetermined number of word senses) that imposes a tremendous bottleneck on the performance capability of any natural language processing system, Pustejovsky proposes that the lexicon becomes an active and central component in the linguistic description. The essence of his theory is that the lexicon functions generatively: first, by providing a rich and expressive vocabulary for characterizing lexical information; then, by developing a framework for manipulating fine-grained distinctions in word descriptions; and finally, by formalizing a set of mechanisms for specialized composition of aspects of such descriptions of words, so that extended and novel senses are generated as the words occur in context.

The subjects covered include the semantics of nominals (figure/ground nominals, relational nominals, and other event nominals); the semantics of causation (in particular, how causation is lexicalized in language, including causative/unaccusatives, aspectual predicates, experiencer predicates, and modal causatives); how semantic types constrain syntactic expression (such as the behavior of type shifting and type coercion operations); a formal treatment of event semantics with subevents; and a general treatment of the problem of polysemy.

Language, Speech, and Communication series.

Publication Date: January 9, 1998

Pustejovsky, J. The Generative Lexicon, MIT Press, Cambridge. 1995.

Filed under Uncategorized

Annotating, Extracting and Reasoning about Time and Events

Frank Schilder, Graham Katz and James Pustejovsky

This state-of-the-art survey comprises a selection of the material presented at the International Dagstuhl Seminar on Annotating, Extracting and Reasoning about Time and Events, held in Dagstuhl Castle, Germany, in April 2005. The seminar centered on an emerging de facto standard for time and event annotation: TimeML. The volume features nine papers that detail current research and discuss open problems concerning annotation, temporal reasoning, and event identification.

Publication Date: December 14, 2007

Schilder, F., G. Katz, and J. Pustejovsky, eds. Annotating, Extracting and Reasoning about Time and Events. Berlin: Springer, 2007.

Filed under Uncategorized

Semantics and the Lexicon

James Pustejovsky

This book integrates the research being carried out in the field of lexical semantics in linguistics with the work on knowledge representation and lexicon design in computational linguistics. It provides a stimulating and unique discussion between the computational perspective of lexical meaning and the concerns of the linguist for the semantic description of lexical items in the context of syntactic descriptions.

Publication Date: December 3, 2010, Edition: Softcover reprint of hardcover 1st ed. 1993

Pustejovsky, J., ed. Semantics and the Lexicon. Kluwer, Dordrecht, The Netherlands, 1993.

Filed under Uncategorized

Advances in Generative Lexicon Theory

James Pustejovsky, Pierrette Bouillon, Hitoshi Isahara, Kyoko Kanzaki and Chungmin Lee

This collection of papers takes linguists to the leading edge of techniques in generative lexicon theory, the linguistic composition methodology that arose from the imperative to provide a compositional semantics for the contextual modifications in meaning that emerge in real linguistic usage. Today’s growing shift towards distributed compositional analyses evinces the applicability of GL theory, and the contributions to this volume, presented at three international workshops (GL-2003, GL-2005 and GL-2007), address the relationship between compositionality in language and the mechanisms of selection in grammar that are necessary to maintain this property. The core unresolved issues in compositionality, relating to the interpretation of context and the mechanisms of selection, are treated from varying perspectives within GL theory, including its basic theoretical mechanisms and its analytical viewpoint on linguistic phenomena.

Publication Date: December 19, 2012, Edition: 2013


Pustejovsky, James, Pierrette Bouillon, Hitoshi Isahara, Kyoko Kanzaki, and Chungmin Lee, eds. Advances in Generative Lexicon Theory. Springer, 2013.

Filed under Uncategorized

Lexical Semantics: The Problem of Polysemy

James Pustejovsky and Branimir Boguraev

Lexical ambiguity presents one of the most intractable problems for language processing studies and, not surprisingly, it is at the core of research in lexical semantics. Originally published as two special issues of the Journal of Semantics, this collection focuses on the problem of polysemy, from the point of view of practitioners of computational linguistics.


Publication Date: January 2, 1997

Pustejovsky, J. and B. Boguraev, eds. Lexical Semantics: The Problem of Polysemy, Oxford University Press, Oxford. 1997.

Filed under Uncategorized

The Language of Time: A Reader

Inderjeet Mani, James Pustejovsky and Robert Gaizauskas

This reader collects and introduces important work in linguistics, computer science, artificial intelligence, and computational linguistics on the use of linguistic devices in natural languages to situate events in time: whether they are past, present, or future; whether they are real or hypothetical; when an event might have occurred, and how long it could have lasted. Clear, self-contained editorial introductions to each area provide the necessary technical background for the non-specialist, explaining the underlying connections across disciplines.

Publication Date: August 11, 2005

Mani, I., J. Pustejovsky, and R. Gaizauskas, eds. The Language of Time: A Reader. Oxford University Press, 2005.


Filed under Uncategorized