Publication:
General Program Synthesis from Examples Using Genetic Programming with Parent Selection Based on Random Lexicographic Orderings of Test Cases

dc.contributor.advisor: Lee Spector
dc.contributor.advisor: David Jensen
dc.contributor.advisor: Yuriy Brun
dc.contributor.advisor: Adam Porter
dc.contributor.author: Helmuth, Thomas
dc.contributor.department: University of Massachusetts Amherst
dc.date: 2024-03-27T18:50:57.000
dc.date.accessioned: 2024-04-26T16:09:14Z
dc.date.available: 2024-04-26T16:09:14Z
dc.date.submitted: September
dc.date.submitted: 2015
dc.description.abstract: Software developers routinely create tests before writing code, to ensure that their programs fulfill their requirements. Instead of having human programmers write the code to meet these tests, automatic program synthesis systems can create programs to meet specifications without human intervention, only requiring examples of desired behavior. In the long term, we envision using genetic programming to synthesize large pieces of software. This dissertation takes steps toward this goal by investigating the ability of genetic programming to solve introductory computer science programming problems. We present a suite of 29 benchmark problems intended to test general program synthesis systems, which we systematically selected from sources of introductory computer science programming problems. This suite is suitable for experiments with any program synthesis system driven by input/output examples. Unlike existing benchmarks that concentrate on constrained problem domains such as list manipulation, symbolic regression, or Boolean functions, this suite contains general programming problems that require a range of programming constructs, such as multiple data types and data structures, control flow statements, and I/O. The problems encompass a range of difficulties and requirements as necessary to thoroughly assess the capabilities of a program synthesis system. Besides describing the specifications for each problem, we make recommendations for experimental protocols and statistical methods to use with the problems. This dissertation's second contribution is an investigation of behavior-based parent selection in genetic programming, concentrating on a new method called lexicase selection. Most parent selection techniques aggregate errors from test cases to compute a single scalar fitness value; lexicase selection instead treats test cases separately, never comparing error values of different test cases.
This property allows it to select parents that specialize on some test cases even if they perform poorly on others. We compare lexicase selection to other parent selection techniques on our benchmark suite, showing better performance for lexicase selection. After observing that lexicase selection increases exploration of the search space while also increasing exploitation of promising programs, we conduct a range of experiments to identify which characteristics of lexicase selection influence its utility.
dc.description.degree: Doctor of Philosophy (PhD)
dc.description.department: Computer Science
dc.identifier.doi: https://doi.org/10.7275/7408407.0
dc.identifier.orcid: N/A
dc.identifier.uri: https://hdl.handle.net/20.500.14394/19681
dc.relation.url: https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1477&context=dissertations_2&unstamped=1
dc.source.status: published
dc.subject: genetic programming
dc.subject: parent selection
dc.subject: benchmarks
dc.subject: program synthesis
dc.subject: Artificial Intelligence and Robotics
dc.title: General Program Synthesis from Examples Using Genetic Programming with Parent Selection Based on Random Lexicographic Orderings of Test Cases
dc.type: openaccess
dc.type: article
dc.type: dissertation
digcom.contributor.author: isAuthorOfPublication|email:thelmuth@cs.umass.edu|institution:University of Massachusetts Amherst|Helmuth, Thomas
digcom.identifier: dissertations_2/465
digcom.identifier.contextkey: 7408407
digcom.identifier.submissionpath: dissertations_2/465
dspace.entity.type: Publication
Files
Original bundle
Name:
thesis.pdf
Size:
1.06 MB
Format:
Adobe Portable Document Format
Collections