Analysis as deduction

In the morphological-patterns module, we implemented CKY and Earley as specific algorithms with specific data structures—charts, chart entries, loops over spans. These implementations work, but they obscure something important: CKY and Earley are instances of a more general pattern. Both can be described as systems of inference rules operating on items, and there is a single, uniform algorithm—agenda-based parsing—that can execute any such system.

This framework, due to Shieber et al. (1995; see also Sikkel 1997), is called analysis as deduction (or parsing as deduction). It will be particularly important for the final project, where you’ll implement an agenda-based parser for Multiple Context-Free Grammars using the inference rules from Kallmeyer (2013).

Parsing as logical deduction

The idea is to describe a parsing algorithm by specifying four things:

Items: the pieces of partial information that the parser manipulates. In CKY, an item is a triple $[A, i, j]$ meaning “nonterminal $A$ spans positions $i$ through $j$.” In Earley, an item is a dotted rule $[A \rightarrow \alpha \bullet \beta, i, j]$ meaning “we’ve recognized $\alpha$ starting at position $i$ and are now at position $j$, expecting $\beta$.”
Axioms: the items that are true without any inference. In CKY, the axioms are $[A, i, i+1]$ for every terminal rule $A \rightarrow w_i$.
Inference rules: ways of combining existing items to produce new ones. In CKY, the key rule is:

\[\frac{[B, i, k] \quad [C, k, j]}{[A, i, j]} \quad \text{if } A \rightarrow B\;C \in R\]

This says: if $B$ spans $(i, k)$ and $C$ spans $(k, j)$, and there’s a rule $A \rightarrow B\;C$, then $A$ spans $(i, j)$.

Goal: the item we’re looking for. In CKY, the goal is $[S, 0, n]$—the start symbol spanning the entire input.

CKY as a deductive system

Let me recast the CKY algorithm we developed for morphology in this framework. Given a grammar $G = (V, \Sigma, R, S)$ in Chomsky Normal Form and an input string $w_1 \ldots w_n$:

Items: $[A, i, j]$ where $A \in V$ and $0 \leq i < j \leq n$

Axioms: $\dfrac{}{[A, i, i+1]}$ for each $A \rightarrow w_{i+1} \in R$

Inference rules:

\[\frac{[B, i, k] \quad [C, k, j]}{[A, i, j]} \quad \text{for each } A \rightarrow B\;C \in R\]

Goal: $[S, 0, n]$

This is exactly the same algorithm as the chart-filling procedure in cky-parsing.qmd—just described differently. The deductive formulation separates the logic of the parser (what items can be derived) from the control (the order in which items are processed). You can change the logic without changing the control, or vice versa.

Earley as a deductive system

Similarly, the Earley algorithm can be recast:

Items: $[A \rightarrow \alpha \bullet \beta, i, j]$ (dotted rules with start and current position)

Axioms (Predict): $\dfrac{}{[S \rightarrow \bullet \gamma, 0, 0]}$ for each $S \rightarrow \gamma \in R$

Inference rules:

Predict: $\dfrac{[A \rightarrow \alpha \bullet B\;\beta, i, j]}{[B \rightarrow \bullet \gamma, j, j]}$ for each $B \rightarrow \gamma \in R$

Scan: $\dfrac{[A \rightarrow \alpha \bullet w_{j+1}\;\beta, i, j]}{[A \rightarrow \alpha\;w_{j+1} \bullet \beta, i, j+1]}$

Complete: $\dfrac{[A \rightarrow \alpha \bullet B\;\beta, i, k] \quad [B \rightarrow \gamma \bullet, k, j]}{[A \rightarrow \alpha\;B \bullet \beta, i, j]}$

Goal: $[S \rightarrow \gamma \bullet, 0, n]$ for some $S \rightarrow \gamma \in R$

Agenda-based parsing

Given a deductive system (items, axioms, inference rules, goal), there is a generic algorithm that finds all derivable items:

Initialize an agenda (a queue or priority queue) with the axioms.
Initialize a chart (a set of derived items), initially empty.
While the agenda is not empty:
1. Remove an item $I$ from the agenda.
2. If $I$ is already in the chart, skip it.
3. Add $I$ to the chart.
4. For each inference rule and each combination of items in the chart that, together with $I$, trigger that rule, add the consequent item to the agenda (if not already in the chart).
If the goal item is in the chart, the input is accepted.

This is the algorithm you will implement in the final project. The reason I’m spending time on it here is that it is completely general: the same agenda-based algorithm works for CKY, Earley, TAG parsing, and MCFG parsing. The only thing that changes between these parsers is the set of items and inference rules; the control structure is identical. For the final project, you’ll use the inference rules from Kallmeyer (2013), which generalize the CKY rules above to handle the multi-component yields that MCFGs allow.

References

Kallmeyer, Laura. 2013. “Linear Context-Free Rewriting Systems.” Language and Linguistics Compass 7 (1): 22–38. https://doi.org/10.1002/lnc3.359.

Shieber, Stuart M., Yves Schabes, and Fernando C. N. Pereira. 1995. “Principles and Implementation of Deductive Parsing.” The Journal of Logic Programming, Computational Linguistics and Logic Programming, vol. 24 (1): 3–36. https://doi.org/10.1016/0743-1066(95)00035-I.

Sikkel, Klaas. 1997. Parsing Schemata. Springer.

--- title: Analysis as deduction bibliography: ../references.bib --- In the [morphological-patterns module](../morphological-patterns/cky-parsing.qmd), we implemented CKY and Earley as specific algorithms with specific data structures—charts, chart entries, loops over spans. These implementations work, but they obscure something important: CKY and Earley are instances of a more general pattern. Both can be described as systems of *inference rules* operating on *items*, and there is a single, uniform algorithm—*agenda-based parsing*—that can execute any such system. This framework, due to @shieber_principles_1995 [see also @sikkel1997parsing], is called *analysis as deduction* (or *parsing as deduction*). It will be particularly important for the [final project](../assignments/final-project.ipynb), where you'll implement an agenda-based parser for Multiple Context-Free Grammars using the inference rules from @kallmeyer_linear_2013. ## Parsing as logical deduction The idea is to describe a parsing algorithm by specifying four things: 1. **Items**: the pieces of partial information that the parser manipulates. In CKY, an item is a triple $[A, i, j]$ meaning "nonterminal $A$ spans positions $i$ through $j$." In Earley, an item is a dotted rule $[A \rightarrow \alpha \bullet \beta, i, j]$ meaning "we've recognized $\alpha$ starting at position $i$ and are now at position $j$, expecting $\beta$." 2. **Axioms**: the items that are true without any inference. In CKY, the axioms are $[A, i, i+1]$ for every terminal rule $A \rightarrow w_i$. 3. **Inference rules**: ways of combining existing items to produce new ones. In CKY, the key rule is: $$\frac{[B, i, k] \quad [C, k, j]}{[A, i, j]} \quad \text{if } A \rightarrow B\;C \in R$$ This says: if $B$ spans $(i, k)$ and $C$ spans $(k, j)$, and there's a rule $A \rightarrow B\;C$, then $A$ spans $(i, j)$. 4. **Goal**: the item we're looking for. In CKY, the goal is $[S, 0, n]$—the start symbol spanning the entire input. ## CKY as a deductive system Let me recast the CKY algorithm we [developed for morphology](../morphological-patterns/cky-parsing.qmd) in this framework. Given a grammar $G = (V, \Sigma, R, S)$ in Chomsky Normal Form and an input string $w_1 \ldots w_n$: **Items:** $[A, i, j]$ where $A \in V$ and $0 \leq i < j \leq n$ **Axioms:** $\dfrac{}{[A, i, i+1]}$ for each $A \rightarrow w_{i+1} \in R$ **Inference rules:** $$\frac{[B, i, k] \quad [C, k, j]}{[A, i, j]} \quad \text{for each } A \rightarrow B\;C \in R$$ **Goal:** $[S, 0, n]$ This is exactly the same algorithm as the chart-filling procedure in [cky-parsing.qmd](../morphological-patterns/cky-parsing.qmd)—just described differently. The deductive formulation separates the *logic* of the parser (what items can be derived) from the *control* (the order in which items are processed). You can change the logic without changing the control, or vice versa. ## Earley as a deductive system Similarly, the [Earley algorithm](../morphological-patterns/earley-parsing.qmd) can be recast: **Items:** $[A \rightarrow \alpha \bullet \beta, i, j]$ (dotted rules with start and current position) **Axioms (Predict):** $\dfrac{}{[S \rightarrow \bullet \gamma, 0, 0]}$ for each $S \rightarrow \gamma \in R$ **Inference rules:** *Predict:* $\dfrac{[A \rightarrow \alpha \bullet B\;\beta, i, j]}{[B \rightarrow \bullet \gamma, j, j]}$ for each $B \rightarrow \gamma \in R$ *Scan:* $\dfrac{[A \rightarrow \alpha \bullet w_{j+1}\;\beta, i, j]}{[A \rightarrow \alpha\;w_{j+1} \bullet \beta, i, j+1]}$ *Complete:* $\dfrac{[A \rightarrow \alpha \bullet B\;\beta, i, k] \quad [B \rightarrow \gamma \bullet, k, j]}{[A \rightarrow \alpha\;B \bullet \beta, i, j]}$ **Goal:** $[S \rightarrow \gamma \bullet, 0, n]$ for some $S \rightarrow \gamma \in R$ ## Agenda-based parsing Given a deductive system (items, axioms, inference rules, goal), there is a generic algorithm that finds all derivable items: 1. Initialize an *agenda* (a queue or priority queue) with the axioms. 2. Initialize a *chart* (a set of derived items), initially empty. 3. While the agenda is not empty: a. Remove an item $I$ from the agenda. b. If $I$ is already in the chart, skip it. c. Add $I$ to the chart. d. For each inference rule and each combination of items in the chart that, together with $I$, trigger that rule, add the consequent item to the agenda (if not already in the chart). 4. If the goal item is in the chart, the input is accepted. This is the algorithm you will implement in the [final project](../assignments/final-project.ipynb). The reason I'm spending time on it here is that it is completely general: the same agenda-based algorithm works for CKY, Earley, TAG parsing, and MCFG parsing. The only thing that changes between these parsers is the set of items and inference rules; the control structure is identical. For the final project, you'll use the inference rules from @kallmeyer_linear_2013, which generalize the CKY rules above to handle the multi-component yields that [MCFGs](mildly-context-sensitive/mcfgs-and-lcfrs.qmd) allow.