Tracking 19th Century Late Blight from Archival Documents using Text Analytics and Geoparsing
In 1845, Ireland's potato crop was struck by a disease that caused devastation for seven years and led to mass starvation and emigration from the country. The cause of the potato destruction was a fungus-like plant pathogen. There are several theories about the origin of the disease and the source of the 19th century outbreaks. We use historical documents contemporary to that time to investigate spatial information that might shed light on these questions. We present methodologies for automatically extracting information from these voluminous data sources. We identify and map geographic locations that appear close in the text to key terms related to potato blight. Data sources include agricultural documents with extensive discussions of crop yields and failures, seed export and import, and weather conditions, along with location names. We apply natural language processing tools and geoparsing to automate text mining of the data within narrative passages, and we couple these tools to mine the relationships between locations and reports of potato disease. Results are displayed in an interactive Web mapping tool that lets users spatially explore the pertinent data for trends in the emergence of 19th century late blight.
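The idea of finding place names that occur close to blight-related terms can be illustrated with a minimal sketch. This is not the authors' pipeline: the term list, the toy gazetteer with approximate coordinates, the function name `locations_near_terms`, and the token window are all assumptions chosen for illustration, and a real geoparser would use named-entity recognition and a full gazetteer instead of a dictionary lookup.

```python
import re

# Hypothetical key terms; the terms actually used in the study are not listed in the abstract.
BLIGHT_TERMS = {"blight", "rot", "murrain", "disease"}

# Toy gazetteer (assumption): lowercase place name -> approximate (latitude, longitude).
GAZETTEER = {
    "dublin": (53.35, -6.26),
    "cork": (51.90, -8.47),
    "belfast": (54.60, -5.93),
}

def locations_near_terms(text, window=5):
    """Return gazetteer places that occur within `window` tokens of a key term."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    term_positions = [i for i, tok in enumerate(tokens) if tok in BLIGHT_TERMS]
    hits = []
    for i, tok in enumerate(tokens):
        if tok not in GAZETTEER or not term_positions:
            continue
        # Distance in tokens to the nearest key term.
        nearest = min(abs(i - p) for p in term_positions)
        if nearest <= window:
            lat, lon = GAZETTEER[tok]
            hits.append({"place": tok, "lat": lat, "lon": lon, "distance": nearest})
    return hits

sample = ("The potato disease has appeared near Cork this season. "
          "Wheat prices in Belfast remain steady.")
result = locations_near_terms(sample)
print(result)
```

In this sample, "Cork" falls four tokens from "disease" and is kept, while "Belfast" is ten tokens away and is dropped. A plain dictionary lookup cannot tell a place name from an ordinary word with the same spelling, which is one reason a real workflow would need disambiguation.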
Tateosian, Laura; Guenter, Rachael; Yang, Yi-Peng; and Ristaino, Jean
"Tracking 19th Century Late Blight from Archival Documents using Text Analytics and Geoparsing," Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings: Vol. 17, Article 17.
Available at: https://scholarworks.umass.edu/foss4g/vol17/iss1/17
Computer Sciences Commons, Geographic Information Sciences Commons, Plant Pathology Commons