Publication:
Topic Regression

dc.contributor.advisor: Andrew McCallum
dc.contributor.advisor: David Blei
dc.contributor.advisor: David Jensen
dc.contributor.author: Mimno, David
dc.contributor.department: University of Massachusetts Amherst
dc.date: 2023-09-23T06:34:36.000
dc.date.accessioned: 2024-04-26T19:50:36Z
dc.date.available: 2024-04-26T19:50:36Z
dc.date.issued: 2012-02-01
dc.description.abstract: Text documents are generally accompanied by non-textual information, such as authors, dates, publication sources, and, increasingly, automatically recognized named entities. Work in text analysis has often involved predicting these non-text values based on text data for tasks such as document classification and author identification. This thesis considers the opposite problem: predicting the textual content of documents based on non-text data. In this work I study several regression-based methods for estimating the influence of specific metadata elements in determining the content of text documents. Such topic regression methods allow users of document collections to test hypotheses about the underlying environments that produced those documents.
dc.description.degree: Doctor of Philosophy (PhD)
dc.description.department: Computer Science
dc.identifier.doi: https://doi.org/10.7275/2646883
dc.identifier.uri: https://hdl.handle.net/20.500.14394/38970
dc.relation.url: https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1520&context=open_access_dissertations&unstamped=1
dc.source.status: published
dc.subject: Machine Learning
dc.subject: Topic Modeling
dc.subject: Computer Sciences
dc.title: Topic Regression
dc.type: dissertation
dc.type: article
digcom.contributor.author: isAuthorOfPublication|email:david.mimno@gmail.com|institution:University of Massachusetts Amherst|Mimno, David
digcom.identifier: open_access_dissertations/520
digcom.identifier.contextkey: 2646883
digcom.identifier.submissionpath: open_access_dissertations/520
dspace.entity.type: Publication
Files
Original bundle
Name: Mimno_umass_0118D_10907.pdf
Size: 6.4 MB
Format: Adobe Portable Document Format