Type-logical grammars

Before introducing Combinatory Categorial Grammar, I want to step back and describe the broader tradition it comes from. Categorial grammars have their roots in logic, and the connection between parsing and logical deduction is not merely an analogy—in the type-logical tradition, parsing is deduction in a formal logic. This perspective deepens the deductive parsing framework we developed earlier: there, I described parsing as operating on items using inference rules; here, the items are formulas and the inference rules are rules of a logical calculus.

The Lambek calculus

The starting point is the Lambek calculus (Lambek 1958), one of the earliest formal systems for natural language syntax. In the Lambek calculus, each word in the lexicon is assigned one or more categories (or types), and a string is grammatical if and only if the categories assigned to its words can be combined—using the rules of the calculus—to yield a designated goal category (typically \(S\)).

Categories are built from a set of atomic types \(\{S, NP, N, \ldots\}\) using two binary connectives:

  • \(A / B\) (right implication or forward slash): “something that, given a \(B\) to its right, yields an \(A\)
  • \(B \backslash A\) (left implication or backward slash): “something that, given a \(B\) to its left, yields an \(A\)

For example, an English intransitive verb like runs might be assigned the category \(NP \backslash S\): given an NP to its left, it yields an \(S\). A transitive verb like loves might be assigned \((NP \backslash S) / NP\): given an NP to its right, it yields something that, given an NP to its left, yields an \(S\).

The key rules of the Lambek calculus are:

Right slash elimination (forward application): \[\frac{\Gamma \vdash A / B \quad \Delta \vdash B}{\Gamma, \Delta \vdash A}\]

Left slash elimination (backward application): \[\frac{\Delta \vdash B \quad \Gamma \vdash B \backslash A}{\Delta, \Gamma \vdash A}\]

Right slash introduction (hypothetical reasoning): \[\frac{\Gamma, B \vdash A}{\Gamma \vdash A / B}\]

Left slash introduction: \[\frac{B, \Gamma \vdash A}{\Gamma \vdash B \backslash A}\]

The elimination rules are just function application: if you have something that needs a \(B\) and you have a \(B\), you can combine them. The introduction rules are more interesting—they correspond to hypothetical reasoning: you can prove \(A/B\) by assuming \(B\) and deriving \(A\).

The Curry-Howard correspondence

One of the most beautiful results in this area is that derivations in the Lambek calculus correspond, via the Curry-Howard correspondence, to terms in the simply typed lambda calculus. This means that a syntactic derivation is a semantic representation: the proof that a string is grammatical simultaneously computes its meaning as a lambda term.

For instance, the derivation of John runs from \(NP\) and \(NP \backslash S\) via backward application corresponds to the lambda term \(\text{runs}(\text{john})\). The compositional semantics is determined by the derivation itself, without any additional machinery. I won’t develop this connection further here (it would take us into formal semantics), but I mention it because it reveals something important about the type-logical tradition: syntax, semantics, and logic are not separate modules but aspects of the same formal system Moortgat (1997).

Substructural logics and their grammars

The Lambek calculus is a substructural logic: it lacks the structural rules that classical logic takes for granted (Morrill 2011). Specifically, it does not have:

  • Weakening: you cannot ignore hypotheses (\(\Gamma, A, \Delta \vdash C\) does not imply \(\Gamma, \Delta \vdash C\))
  • Contraction: you cannot use a hypothesis more than once
  • Exchange (commutativity): the order of hypotheses matters (\(A, B \vdash C\) does not imply \(B, A \vdash C\))

These restrictions are exactly what we want for syntax: the order of words matters (no exchange), every word must be used (no weakening), and no word can be used twice (no contraction).

Different categorial formalisms arise by selectively adding back some of these structural rules or adding controlled versions of them. This is where Combinatory Categorial Grammar enters the picture: CCG extends the Lambek calculus with specific additional rules—composition and type-raising—that permit more flexible constituency while remaining within the MCS boundary. We turn to CCG in the next section.

References

Carpenter, Bob. 1997. Type-Logical Semantics. MIT Press.
Lambek, Joachim. 1958. “The Mathematics of Sentence Structure.” The American Mathematical Monthly 65 (3): 154–70.
Moortgat, Michael. 1997. “Categorial Type Logics.” In Handbook of Logic and Language. Elsevier.
Morrill, Glyn. 2011. Categorial Grammar: Logical Syntax, Semantics, and Processing. Oxford University Press.