Off-campus UMass Amherst users: To download campus access dissertations, please use the following link to log into our proxy server with your UMass Amherst user name and password.

Non-UMass Amherst users: Please talk to your librarian about requesting this dissertation through interlibrary loan.

Dissertations that have an embargo placed on them will not be available to anyone until the embargo expires.

Author ORCID Identifier



Open Access Dissertation

Document Type


Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded


Month Degree Awarded


First Advisor

Lee Spector

Second Advisor

David Jensen

Third Advisor

Yuriy Brun

Fourth Advisor

Adam Porter

Subject Categories

Artificial Intelligence and Robotics


Software developers routinely create tests before writing code, to ensure that their programs fulfill their requirements. Instead of having human programmers write the code to meet these tests, automatic program synthesis systems can create programs to meet specifications without human intervention, only requiring examples of desired behavior. In the long-term, we envision using genetic programming to synthesize large pieces of software. This dissertation takes steps toward this goal by investigating the ability of genetic programming to solve introductory computer science programming problems.

We present a suite of 29 benchmark problems intended to test general program synthesis systems, which we systematically selected from sources of introductory computer science programming problems. This suite is suitable for experiments with any program synthesis system driven by input/output examples. Unlike existing benchmarks that concentrate on constrained problem domains such as list manipulation, symbolic regression, or boolean functions, this suite contains general programming problems that require a range of programming constructs, such as multiple data types and data structures, control flow statements, and I/O. The problems encompass a range of difficulties and requirements as necessary to thoroughly assess the capabilities of a program synthesis system. Besides describing the specifications for each problem, we make recommendations for experimental protocols and statistical methods to use with the problems.

This dissertation's second contribution is an investigation of behavior-based parent selection in genetic programming, concentrating on a new method called lexicase selection. Most parent selection techniques aggregate errors from test cases to compute a single scalar fitness value; lexicase selection instead treats test cases separately, never comparing error values of different test cases. This property allows it to select parents that specialize on some test cases even if they perform poorly on others. We compare lexicase selection to other parent selection techniques on our benchmark suite, showing better performance for lexicase selection. After observing that lexicase selection increases exploration of the search space while also increasing exploitation of promising programs, we conduct a range of experiments to identify which characteristics of lexicase selection influence its utility.