Off-campus UMass Amherst users: To download campus access dissertations, please use the following link to log into our proxy server with your UMass Amherst user name and password.
Non-UMass Amherst users: Please talk to your librarian about requesting this dissertation through interlibrary loan.
Dissertations that have an embargo placed on them will not be available to anyone until the embargo expires.
Author ORCID Identifier
N/A
AccessType
Open Access Dissertation
Document Type
dissertation
Degree Name
Doctor of Philosophy (PhD)
Degree Program
Computer Science
Year Degree Awarded
2014
First Advisor
Erik G. Learned-Miller
Second Advisor
Allen Hanson
Third Advisor
Rui Wang
Subject Categories
Computer Sciences
Abstract
The area of scene text recognition focuses on the problem of recognizing arbitrary text in images of natural scenes. Examples of scene text include street signs, business signs, grocery item labels, and license plates. With the increased use of smartphones and digital cameras, the ability to accurately recognize text in images is becoming increasingly useful and many people will benefit from advances in this area.
The goal of this thesis is to develop methods for improving scene text recognition. We do this by incorporating new types of information into models and by exploring how to compose simple components into highly e_ective systems. We focus on three areas of scene text recognition, each with a decreasing number of prior assumptions. First, we introduce two techniques for character recognition, where word and character bounding boxes are assumed. We describe a character recognition system that incorporates similarity information in a novel way and a new language model that models syllables in a word to produce word labels that can be pronounced in English. Next we look at word recognition, where only word bounding boxes are assumed. We develop a new technique for segmenting text for these images called bilateral regression segmentation, and we introduce an open-vocabulary word recognition system that uses a very large web-based lexicon to achieve state of the art recognition performance. Lastly, we remove the assumption that words have been located and describe an end-to-end system that detects and recognizes text in any natural scene image.
DOI
https://doi.org/10.7275/5474920.0
Recommended Citation
Feild, Jacqueline, "Improving Text Recognition in Images of Natural Scenes" (2014). Doctoral Dissertations. 37.
https://doi.org/10.7275/5474920.0
https://scholarworks.umass.edu/dissertations_2/37