Publications

Toward Automated Grammar Extraction via Semantic Labeling of Parser Implementations. Carson Harmon, Bradford Larsen, and Evan Sultanik. LangSec 2020.

Complex data formats such as Microsoft Office documents and PDFs are frequently specified by a reference implementation or an ambiguous natural language document, not by a formal grammar. As a consequence, it is usually not clear precisely what inputs are accepted by a parser program, and the details will vary between implementations. In this paper we combine a dynamic taint analysis of a parser implementation with a novel algorithm that automatically labels functions in the implementation with the parts of an inferred grammar that they are responsible for processing. This work is a first step in achieving automated grammar extraction from arbitrary, non-context-free parsers.

Paper (PDF) / Slides (PDF) / Presentation (YouTube)

Engineering definitional interpreters. Jan Midtgaard, Norman Ramsey, and Bradford Larsen. PPDP '13.

A definitional interpreter should be clear and easy to write, but it may run 4–10 times slower than a well-crafted bytecode interpreter. In a case study focused on implementation choices, we explore ways of making definitional interpreters faster without expending much programming effort. We implement, in OCaml, interpreters based on three semantics for a simple subset of Lua. We compile the OCaml to x86 native code, and we systematically investigate hundreds of combinations of algorithms and data structures. In this experimental context, our fastest interpreters are based on natural semantics; good algorithms and data structures make them 2–3 times faster than naïve interpreters. Our best interpreter, created using only modest effort, runs only 1.5 times slower than a mature bytecode interpreter implemented in C.

Paper (PDF) / Extended Paper (PDF) / Code (tgz)

Simple Optimizations for an Applicative Array Language for Graphics Processors. Bradford Larsen. DAMP '11.

Array programming allows high-level use of fast graphics processors for general-purpose computing. This paper shows a few optimizations, such as array fusion and automatic use of GPU shared memory, that can yield order-of-magnitude reductions in constant factors. These powerful optimizations are especially simple to implement given a functional source language.

Paper (PDF) / Slides (PDF) / Poster (PDF) / Code (tgz)

Searching Without a Heuristic: Efficient Use of Abstraction. Bradford Larsen, Ethan Burns, Wheeler Ruml, and Robert C. Holte. AAAI-10.

The A* algorithm is the grandfather of many heuristic search algorithms that find cheapest paths through a graph. A* must be supplied with an admissible (i.e., non-overestimating) heuristic to find optimal solutions; in many domains such heuristics are hard to invent. This paper presents Switchback, a novel hierarchical heuristic search algorithm (which derives an admissible heuristic from a domain abstraction function as it executes) that avoids inefficiencies of its predecessors, Hierarchical A* and Hierarchical IDA*.

Paper (PDF) / Slides (PDF) / Code (tgz) / Code (GitHub)

A DSM Protocol Aware of both Thread Migration and Memory Constraints. Ronald Veldema, Bradford Larsen, and Michael Philippsen. PDCS '08.

A distributed shared memory system allows a computer cluster to appear to the programmer as a single multicore computer. Such a system must handle access of remote data, typically through either replicating shared data (which costs memory) or through migrating the relevant data or computation (which can take time). This paper shows how data replication and migration of computation can be effectively combined in a distributed shared memory system designed for out-of-core computation in memory-intensive applications.

This work was awarded Best Paper in the “Distributed Information and Systems” category.

Paper (PDF)