Publication Date
2001
Abstract
Hierarchies have long been used to organize, summarize, and provide access to information. In this proposal we define summarization in terms of a probabilistic language model and use that definition to explore new techniques for automatically generating topic hierarchies. One technique applies a graph-theoretic algorithm that approximates a solution to the Dominating Set Problem. Another technique uses an entropy-based approach to choose topic terms. Both techniques efficiently select terms according to a language model. We compare the new techniques with previously proposed methods for constructing topic hierarchies, including subsumption and lexical hierarchies, as well as with words found using TF.IDF. Our preliminary results show that the new techniques perform as well as or better than these other techniques. We plan to evaluate the two techniques further through user studies and computer simulations. We will also develop a demo to support better interaction with users.
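To make the dominating-set idea concrete, the following is a minimal sketch and not the proposal's actual algorithm. It assumes a hypothetical graph whose vertices are candidate terms and whose edges link terms that are related (for example, by co-occurrence); the abstract does not say how the graph is built or weighted. It uses the standard greedy heuristic for Dominating Set, which repeatedly picks the term that covers the most terms not yet covered. The function name `greedy_dominating_set` and the sample graph are illustrative only.

```python
from typing import Dict, List, Optional, Set


def greedy_dominating_set(
    graph: Dict[str, Set[str]], max_terms: Optional[int] = None
) -> List[str]:
    """Greedy heuristic for Dominating Set on an undirected graph.

    graph maps each term to the set of terms adjacent to it (assumed
    symmetric). A chosen term covers itself and its neighbours.
    """
    uncovered = set(graph)
    chosen: List[str] = []
    while uncovered and (max_terms is None or len(chosen) < max_terms):
        # Pick the term covering the most still-uncovered terms;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(
            sorted(graph),
            key=lambda v: len(({v} | graph[v]) & uncovered),
        )
        gain = ({best} | graph[best]) & uncovered
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical term graph (illustrative data, not from the proposal).
term_graph = {
    "retrieval": {"query", "document", "index"},
    "query": {"retrieval"},
    "document": {"retrieval", "summary"},
    "index": {"retrieval"},
    "summary": {"document", "sentence"},
    "sentence": {"summary"},
}

topic_terms = greedy_dominating_set(term_graph)
print(topic_terms)
```

In this sketch every term in the graph is either selected or adjacent to a selected term, so the selected terms could serve as one level of a topic hierarchy. A language-model-based variant would presumably weight vertices or edges by probabilities, which is left out here because the abstract gives no details.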
Recommended Citation
Lawrie, Dawn, "LANGUAGE MODELS FOR HIERARCHICAL SUMMARIZATION (PROPOSAL FOR DISSERTATION)" (2001). Computer Science Department Faculty Publication Series. 78.
Retrieved from https://scholarworks.umass.edu/cs_faculty_pubs/78
Comments
This paper was harvested from CiteSeer