# Parsing and Hypergraphs

(Klein and Manning, 2004) at New developments in parsing technology

My favorite things to combine: graph theory and NLP. Today’s paper takes a look at parsing from the lens of Dijkstra’s algorithm for shortest paths. The authors plainly state the **key idea** of this paper:

There is intuitively very little difference between (a) combining subtrees to form a tree, (b) combining hypotheses to form a conclusion, and (c) visiting all tail nodes of a hyperarc before traversing to a head node.

They use this view to propose a parsing algorithm.

**Directed hypergraphs** are like normal directed graphs, except each **edge** (or **arc**) can connect *multiple* source nodes to *multiple* target nodes. (I find this language clearer than ‘head’ and ‘tail’.) Klein and Manning rely only on edges with only multiple sources, which are called **B-arcs**. Now, in the logical deduction view, each **vertex** (or **node**) represents a proposition, so B-arcs represent Horn clauses. The simplest example is a **definite clause**: check off all the boxes and you know the result is true. “Laptop is packed” and “charger is packed” and “teeth are brushed” means “ready to go to work”. Now, a path in the graph encodes a deduction—a sequence of these edges used to reach a result.

Let’s connect this to parsing. Parsing starts off with a set of production rules—a grammar, \(G\)— and a lattice \(L\) saying which words occur at which spans of the sentence. So let’s initialize each of the \(O(n^2)\) spans as a node. We’ll also create an arc for each production rule connecting its descendants to the span it contains. Finally, we create a special source node, which connects to each word. Now, a parse of a sentence becomes a path from the source node to the root edge.

To score these, we associate a score with each edge in, say, the max-plus semiring. Then you run a shortest-path algorithm (like Dijkstra’s) on the graph to get scores.

There’s more to this paper about efficiency tricks they employed and how there are connections to Knuth’s notion of *superior grammars* and do deterministic finite-state automata, but the big idea is the connection. That’s the part that will change how you think about each problem.