Tandem Workshop on Optimality in Language and Geometric Approaches to Cognition
Workshop Berlin, December 11-13, 2010. Schützenstraße 18, 10117 Berlin (Mitte), ZAS.
Organizers: Anton Benz (ZAS, Berlin), Reinhard Blutner (ILLC, Amsterdam), Manfred Krifka (ZAS/HU Berlin), Peter beim Graben (HU Berlin), Nicolas Stindt (HU, DAAD).

Sponsored by ZAS (Berlin), and the NWO projects "asymmetry in grammar" (Petra Hendriks), "weak referentiality" (Henriette de Swart), "conflicts in interpretation" (Helen de Hoop)

Together with the Department of German Language and Linguistics at the Humboldt-Universität zu Berlin and with the Institute for Logic, Language and Computation at the Universiteit van Amsterdam, the Centre for General Linguistics (ZAS) in Berlin is going to organize a three-day 'Tandem Workshop on Optimality in Language and Geometric Approaches to Cognition' to be held at the Centre for General Linguistics (ZAS), December 11th - 13th, 2010. The aim of the workshop is to encompass symbolic Optimality Theory (OT) and geometric representations of cognitive states and processes at the syntactic, semantic and pragmatic levels. In particular, the workshop will focus on issues such as:

Part A.

  • OT syntax and OT parsing
  • Neural networks and OT
  • Gradedness and OT
  • OT and compositionality
  • OT semantics and OT pragmatics

Part B.

  • Conceptual spaces
  • Vagueness
  • Geometric models of meaning and compositionality
  • Geometric models of linguistic and perceptional ambiguity
  • Tensor product representations
  • Transient dynamic computation

Invited Participants:

Part A

  • Gerlof Bouma (Potsdam, D)
  • Petra Hendriks (Groningen, NL)
  • Lotte Hogeweg (Nijmegen, NL)
  • Geraldine Legendre (Baltimore, USA)
  • Paul Smolensky (Baltimore, USA)
  • Henriette de Swart (Utrecht, NL)
  • Ruben van de Vijver (Potsdam, D)
  • Henk Zeevat (Amsterdam, NL)

Part B

  • Harald Atmanspacher (Freiburg, D)
  • Stefan Evert (Osnabrueck, D)
  • Stefan Frank (London, UK)
  • Peter Gärdenfors (Lund, S)
  • Stefan Kiebel (Leipzig, D)
  • Eduardo Mizraji (Montevideo, Uruguay)
  • Sonja Smets (Groningen, NL)
  • Paul Smolensky (Baltimore, USA)

Pluralization in German: a challenge for frequency-based learning
Gerlof Bouma & Ruben van de Vijver
University of Potsdam.

German singular-plural pairs may involve two alternations: voicing alternation [wek]~[wege] and vowel alternation [hu:n]~[hy:ner]. Both alternations are approximately equally frequent in the lexicon: vowel and voicing alternation appear in about 1/4 of relevant lexemes.
   In production experiments, children generalized the voicing alternation to nonce words more often than the vowel alternation. Adult speakers applied voicing alternation to nonces in the same proportion as in the lexicon. Vowel alternation, on the other hand, remained underrepresented in this group of speakers, too (Van de Vijver & Baer-Henney, submitted).
   These discrepancies between the lexicon and the experimental data, but also within the experimental data itself, pose a problem for models of grammatical interaction that are primarily driven by frequency of occurrence, such as Stochastic OT with the Gradual Learning Algorithm.
  In this talk, we investigate several approaches to solving these problems in a competition-based model of grammar.

Van de Vijver & Baer-Henney. Submitted. Acquisition of alternations: the role of substance, frequency and phonotactics.

Online processing of bidirectional optimization
Petra Hendriks
Rijksuniversiteit Groningen

Although bidirectional optimization has been applied fruitfully to natural language to explain several semantic and pragmatic patterns within and across languages, it has been argued to be untenable as an online mechanism of sentence processing (Beaver & Lee 2004; Blutner & Zeevat 2004; Zeevat 2000). However, recent evidence from psycholinguistic studies and computational cognitive modeling of the acquisition and use of pronouns (e.g., van Rij, van Rijn & Hendriks 2010; in prep) suggests that bidirectional optimization is a local and online process that is subject to cognitive restrictions. This talk explores the various restrictions that may limit the application of bidirectional optimization during online sentence processing.

Optimality Theoretic Lexical Semantics
Lotte Hogeweg
Radboud University Nijmegen

Optimality Theory (OT) has been applied to lexical semantics in several studies (e.g. Zwarts 2004, 2008, Zeevat 2002, Fong 2005, Hogeweg 2009). The aim of this talk is twofold. Firstly, I want to explore the consequences of an OT approach to lexical semantics in more detail. Secondly, since the works mentioned only addressed functional items like prepositions and discourse markers, I will investigate the applicability of OT for the analysis of content words.

An optimality-theoretic treatment of the hedonic implicatures of taste and smell
Manfred Krifka
HU Berlin & ZAS

Modeling comprehension of personal pronouns:
Bidirectional vs. Unidirectional Optimization in adults & children
Géraldine Legendre & Paul Smolensky
Johns Hopkins University, Baltimore

Young children do not seem to have a problem producing singular personal pronouns. However, two-and-a-half-year-old children have been experimentally shown to comprehend 1st and 2nd but not 3rd person pronouns in French (Legendre et al., 2010, GALANA). Unlike children, adults have no difficulty interpreting them, regardless of person. Building on Heim’s (1991) and Sauerland’s (2008) analysis of 3rd person pronouns as triggering an implicated presupposition, we argue that the difference has to do with adults’ ability to integrate both speaker’s and hearer’s perspectives, modeled as bidirectional OT.

Embedding OT grammars in neural networks:
Discrete and gradient effects in production
Paul Smolensky
Johns Hopkins University, Baltimore

Telicity features of bare nominals
Henriëtte de Swart
University of Utrecht

Bare nominals (i.e. nominal structures without a determiner) have traditionally played an important role in aspectual theory. Since Verkuyl (1972), we have known that the quantized/cumulative nature of the nominal argument is responsible for the aspectual contrast between writing a letter in/*for an hour and writing letters for/*in an hour. However, in other languages bare plurals may also be compatible with a telic interpretation, e.g. Russian Petja pro-čital(perf) stat’i, Peter perf-read-past articles, is generally translated as ‘Peter read the articles.’ Bare singulars are even more flexible, as they lead to telic interpretations (Slavic, Hebrew), atelic interpretations (Brazilian Portuguese) or both (Hindi). In Romance languages, bare singulars are restricted to weakly referential positions, which restricts their aspectual freedom. This paper focuses on number morphology, definiteness and discourse referentiality as relevant factors for telicity.
   Languages vary in whether or not they have grammaticized singular/plural distinctions and definite/indefinite articles, and we can model the different classes in an OT typology (cf. de Swart and Zwarts 2009, 2010). These grammatical differences have implications for number neutrality and referentiality features of bare nominals, which determine the possibilities for bare nominals of getting quantized/cumulative interpretations and participating in telic/atelic event descriptions. I argue that bare nominals derive their interpretation from the competition with overtly marked nominals under bidirectional optimization, and show how this view accounts for their cross-linguistic variation in aspectual behavior.

Parity and Automatic Self-Monitoring
Henk Zeevat
Universiteit van Amsterdam

The crucial question about any system of communication is parity: how and at what level do the sender and the receiver agree with each other in successful communication. For natural language, the factor that makes this question non-trivial is the underdetermination of meaning by form: there are many meanings that can be expressed by the same form. This underdetermination is the product of syntactic ambiguity, word sense ambiguity, various types of anaphoricity to be resolved, discourse relations, and other context integration possibilities (relevance). Grice's theory of non-natural meaning puts parity at the level of speaker intention. Linguistic theories do not have anything to say about how parity is reached and how it is reached so often and with so little effort and in so little time. This is because of the Aristotelian conception of grammar as a relation between forms and meanings: underdetermination then merely predicts that parity is a rare event. The proper strategy for the speaker is to monitor her utterance for having the intended reading as the most probable one and replace it by a better one if that is possible within syntactic and lexical means of expression. For the hearer the only rational option is to go for the most probable reading. The best way to estimate that most probable reading is by emulating Bayesian interpretation combining normal cue-driven perception with a simulation of the speaker's formulation process. The combination of these processes of self-monitored production and Bayesian interpretation is the only theory that can predict a high degree of parity for linguistic communication.
     The paper will review a number of phenomena in syntax-semantics where automatised self-monitoring must be assumed for purely descriptive reasons. These include NP selection, word order freezing, optional case marking, and the optional marking of discourse relations. Automatic self-monitoring turns out to be a proper optimisation problem with different semantic features vying with each other for the scarce expressive means.

The Necker-Zeno Model
Harald Atmanspacher
IGPP Freiburg/Br.

The concept of temporal nonlocality is used to refer to states of a (classical) system that are not sharply localized in time but extend over a time interval of non-zero duration. We investigate the question whether, and how, such a temporal nonlocality can be tested in mental processes. For this purpose we exploit the empirically supported Necker-Zeno model for bistable perception, which uses formal elements of quantum theory but does not refer to anything like quantum physics of the brain. We derive so-called temporal Bell inequalities and demonstrate how they can be violated in this model.

Conceptual spaces for matching and representing preferences
Anton Benz & Alexandra Strekalova
ZAS Berlin

Geometrical models of meaning and compositionality
Reinhard Blutner
Universiteit van Amsterdam

I defend the view that comprehending language is as direct as the perception of visual scenery. In both cases, beliefs are formed in a rather direct and holistic way, apparently without reference to serial processes of inferential interpretation. This view contrasts sharply with neo-Gricean and post-Gricean proposals for mechanisms of inferential interpretation. According to these views, inferences to the best explanation are required in order to determine what is said (and what is implicated). These inferences can be seen as enriching or specifying an underlying underdetermined representation. You cannot believe something automatically if it is the result of a (conscious) inference. I will argue that the division of labor between an underspecified representation and an inferential enrichment process has its roots in Boolean semantics, i.e. the underlying system that forms propositions and properties is a Boolean algebra.
    Geometric models of meaning, in contrast, are based on vector-algebras that form so-called orthoalgebras. They are much closer to neural network models that capture the subsymbolic nature of cognition. Gärdenfors (mental spaces), Lakoff (embodied cognition), and Fauconnier & Turner (conceptual blending) are typical representatives. Taking common examples from lexical pragmatics I will demonstrate that with the help of geometric (vector-based) models of pragmasemantics we are able to formulate non-inferential mechanisms of direct interpretation. I will explain how the proposed approach explicates pragmatic notions such as conceptual blending and modulation in a straightforward way.


Some Mathematical Insights Into Distributional Semantic Models
Stefan Evert
University of Osnabrueck

Distributional semantic models (DSMs) -- also known as "word space" or "distributional similarity" models -- are based on the assumption that the meaning of a word can (at least to a certain extent) be inferred from its usage, i.e. its distribution in text. Therefore, these models dynamically build semantic representations -- in the form of high-dimensional vector spaces -- through a statistical analysis of the contexts in which words occur. DSMs are a promising technique for solving the lexical acquisition bottleneck by unsupervised learning, and their distributed representation provides a cognitively plausible, robust and flexible architecture for the organisation and processing of semantic information.
    The computational analysis of DSMs is based on their representation as a large, sparsely populated cooccurrence matrix, drawing on a range of well-known linear algebra techniques. Although DSMs have been in widespread use for almost 20 years in the field of computational linguistics, there is still much confusion about some of their basic mathematical properties.
     In my talk, I address a number of these issues, with a particular focus on the relation between term-context vs. term-term matrices, first-order vs. higher-order association, and syntagmatic vs. paradigmatic word spaces. One major result is an insight into the role of dimensionality reduction by singular value decomposition (SVD), which is related to the statistical technique of principal component analysis (PCA).
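The relation between SVD and PCA mentioned above can be illustrated with a minimal sketch. The co-occurrence counts, the matrix name M and the choice of two retained dimensions are invented for illustration and are not taken from the talk; the sketch only assumes that PCA is applied to the column-centred matrix.

```python
import numpy as np

# Toy term-context co-occurrence counts (rows: terms, columns: contexts).
# The numbers are invented purely for illustration.
M = np.array([
    [4.0, 0.0, 1.0, 3.0],
    [3.0, 1.0, 0.0, 4.0],
    [0.0, 5.0, 4.0, 0.0],
    [1.0, 4.0, 5.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
])

# PCA route: centre each column, then take the eigenvalues of the
# covariance matrix (eigvalsh returns them in ascending order).
X = M - M.mean(axis=0)
cov = X.T @ X / (X.shape[0] - 1)
eigvals = np.linalg.eigvalsh(cov)[::-1]

# SVD route: squared singular values of the same centred matrix,
# rescaled by (n - 1), are the component variances.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
svd_variances = s**2 / (X.shape[0] - 1)

# Dimensionality reduction to k latent dimensions: the scaled left
# singular vectors coincide with the projections onto the first k
# principal axes.
k = 2
reduced = U[:, :k] * s[:k]
pca_scores = X @ Vt[:k].T

print(np.allclose(eigvals, svd_variances))
print(np.allclose(reduced, pca_scores))
```

Without the centring step the SVD still yields a low-rank approximation of the matrix, but its dimensions no longer correspond to principal components.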

The dynamics of incremental sentence comprehension: A situation-space model
Stefan Frank
Division of Psychology and Language Sciences
University College London

A recent connectionist model (Frank, Haselager, & Van Rooij, 2009) treats sentence comprehension as the process of mapping sentences onto vectors in high-dimensional "situation space". A situation vector represents the state-of-affairs described by the sentence. I will show how this modelling framework naturally leads to formal measures of the amount of semantic and syntactic information conveyed by each word in a sentence. Next, an extension to the model is presented: Situation vectors change dynamically over time as words come in one-by-one, yielding simulated word-processing times. In line with empirical reading-time data, the word-information measures turn out to be predictive of word-processing times.

Using conceptual spaces to model actions and events
Peter Gärdenfors and Massimo Warglien
Lund University, Cognitive Science

Actions and events are central for a semantics of natural language. In this article we present a cognitively based model of these notions. After giving a general presentation of the theory of conceptual spaces, we suggest how the analysis of perceptual concepts can be extended to actions and events. Firstly, we will argue that action space can be analyzed in the same way as e.g. color space or shape space. Our hypothesis is that our categorization of actions to a large extent depends on our perception of forces. In line with this, an action will be described as a pattern of forces. An action category will be identified as a convex region of action space. We review some indirect evidence for this representation. Secondly, an event is represented as an interaction between an agent space and a patient space. The agent performs an action, i.e. exerts a force, that will change the properties of the patient, i.e. its location in patient space. This model of events will be suitable for an analysis of the semantics of verbs, in particular the distinction between manner and result verbs.
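One simple way to obtain convex action categories is to assign each force pattern to its nearest prototype, which partitions the space into convex (Voronoi) cells under Euclidean distance. The sketch below assumes this prototype construction; the two-dimensional "force space", the prototype coordinates and the category names are hypothetical and are not taken from the talk.

```python
import numpy as np

# Hypothetical prototypes in a two-dimensional force space
# (assumption for illustration, not from the abstract).
prototypes = {
    "push": np.array([1.0, 0.0]),
    "pull": np.array([-1.0, 0.0]),
    "lift": np.array([0.0, 1.0]),
}

def categorize(force):
    """Return the name of the prototype closest to the force vector."""
    return min(prototypes, key=lambda name: np.linalg.norm(force - prototypes[name]))

# Two force patterns that fall into the same category.
a = np.array([0.9, 0.3])
b = np.array([0.6, -0.4])

# Convexity: every point on the segment between a and b must receive
# the same category as its endpoints.
segment = [a + lam * (b - a) for lam in np.linspace(0.0, 1.0, 11)]
labels = {categorize(p) for p in segment}
print(labels)
```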

Stable heteroclinic sequences as a paradigm for dynamic psycholinguistics
Peter beim Graben
Dept. for German Language and Linguistics
Humboldt-Universität zu Berlin

According to the dynamical system approach to language, representational states such as phrase structure trees are identified with points in a dynamical system's state space. These states can be construed through Smolensky's tensor product representations and filler/role bindings of the linguistic material under study. Then, cognitive computations such as syntactic parsing correspond to continuous state space trajectories connecting the representational states in sequential order. However, little is known yet about how these trajectories are described in terms of dynamical systems theory. Recently, Rabinovich et al. suggested a possible solution for that problem: Representational states are regarded as saddle points, and the unstable manifold of one state is connected to the stable manifold of its successor, thus forming a "stable heteroclinic sequence". In my presentation I shall speculate about possible consequences of this view for dynamic psycholinguistics. Syntactic reanalysis can then be interpreted in terms of bifurcations of the underlying dynamical system.
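Filler/role binding by tensor products can be sketched as follows. The words, the role names, the vector dimensions and the random seed are assumptions made for illustration; the only substantive assumption is that the role vectors are orthonormal, so that unbinding is exact.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fillers: arbitrary 4-dimensional vectors for three words.
fillers = {w: rng.normal(size=4) for w in ("the", "dog", "barks")}

# Hypothetical roles: orthonormal vectors, taken from the Q factor of a
# QR decomposition of a random 3x3 matrix.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
roles = {r: Q[:, i] for i, r in enumerate(("first", "second", "third"))}

def bind(bindings):
    """Superimpose the outer products filler (x) role of all bindings."""
    return sum(np.outer(fillers[f], roles[r]) for f, r in bindings)

def unbind(state, role):
    """Recover the filler bound to a role (exact for orthonormal roles)."""
    return state @ roles[role]

# One representational state: a single point in a 4 x 3 = 12-dimensional space.
state = bind([("the", "first"), ("dog", "second"), ("barks", "third")])
recovered = unbind(state, "second")
print(np.allclose(recovered, fillers["dog"]))
```

With merely linearly independent role vectors, unbinding would use the dual basis of the roles instead of the roles themselves.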

A hierarchy of time-scales and the brain
Stefan Kiebel
Max-Planck-Institut für Kognitions- und Neurowissenschaften

Currently, there is no theory that explains how the large-scale organization of the human brain can be related to our environment. This is astonishing because neuroscientists generally assume that the brain represents events in our environment by decoding sensory input. We propose that the brain models the auditory environment as a collection of hierarchical, dynamical systems, where slower environmental changes provide the context for faster changes. In addition, we suggest that there is a simple mapping between this temporal hierarchy and the anatomical hierarchy of the brain. This theory provides a framework for explaining a wide range of neuroscientific findings by a single computational principle.

Modeling the cognitive spatio-temporal operations using associative memories
and multiplicative contexts
Eduardo Mizraji
Group of Cognitive Systems Modeling
Sección Biofísica, Facultad de Ciencias,
Universidad de la República, Montevideo, Uruguay

Let us assume that the procedures displayed by human cognition to organize our behaviors and to store structured information require coding spatial and temporal relationships in specific neural modules. In the simplest cases, these spatio-temporal relations are condensed in single words (e.g., in prepositions like “below”, “behind”, “in”, “before”, “after”, etc.). These words reflect the existence of neural operators that compute these spatial and temporal relations; these relations share some similarities with some logical gates, and we show in this work that they can be modeled by means of matrix neural memories that act on the Kronecker tensor products of vectors. At a further level of complexity, the cognitive system is capable of storing episodes that unfold in space and time. Here we are challenged by the important dilemma of the compatibility between the apparently stable large-scale neural anatomical connectivity and the plasticity of episodic memories. To address this dilemma, we analyze a model where the spatio-temporal structure of an episode is coded using contextual markers capable of linking different memory modules in a versatile way. Finally, we analyze possible experimental evaluations for the proposed models.
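The general device of a matrix memory acting on Kronecker products can be sketched with a logical gate, the case the abstract compares the spatio-temporal operators to. The encoding of truth values as orthonormal vectors, the helper names and the choice of conjunction are assumptions for illustration; the sketch does not reproduce the spatio-temporal operators of the talk.

```python
import numpy as np

# Assumed encoding: truth values as orthonormal two-dimensional vectors.
t = np.array([1.0, 0.0])  # "true"
f = np.array([0.0, 1.0])  # "false"

def memory(table):
    """Matrix memory: sum of outer products output (x) kron(input1, input2)."""
    return sum(np.outer(out, np.kron(a, b)) for a, b, out in table)

# Conjunction stored as four input-pair -> output associations (a 2 x 4 matrix).
AND = memory([(t, t, t), (t, f, f), (f, t, f), (f, f, f)])

def apply(op, a, b):
    """Let the memory act on the Kronecker product of its two inputs."""
    return op @ np.kron(a, b)

print(apply(AND, t, f))

# Because the memory is linear, graded inputs yield graded outputs.
uncertain = 0.7 * t + 0.3 * f
print(apply(AND, uncertain, t))
```

The same construction extends to other binary relations by changing the association table passed to the hypothetical `memory` helper.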

Dynamic Conditionals as a Unifying Setting for Information Change: From Quantum Logic to Dynamic Belief Revision
Sonja Smets
Rijksuniversiteit Groningen

I will focus on the role of logical dynamics, and in particular on the role of dynamic-epistemic notions of "conditional", in understanding and modeling apparently very different phenomena: (1) knowledge-updates due to inter-agent communication; (2) belief-revising mechanisms used by agents to change their theories (or beliefs) when faced with new information; (3) the qualitative behavior of quantum systems when subjected to measurements by an observing agent. A dynamic-modal approach is new in the context of quantum logic (being first introduced in my joint work with A. Baltag on dynamic quantum logic), but in the context of knowledge-update and belief-revision it links closely to the work on Dynamic-Epistemic Logic.
    First I introduce and compare various notions of dynamic-epistemic/doxastic conditionals, all in terms of the different dynamic-modal laws they satisfy. Next I show that these laws encode different types of information changes, presenting both the differences and some striking similarities between the belief-revision conditional and quantum learning through measurement.
   My comparative discussion aims to show the fruitfulness of the dynamic-epistemic approach to conditionals as a unifying setting for analyzing, comparing and classifying various forms of information flow, in terms of their underlying "logical dynamics".

Embedding the discrete within the continuous: Processing implications
of tensor product representations for linguistic production
Paul Smolensky
Johns Hopkins University, Baltimore