Off-campus UMass Amherst users: To download dissertations, please use the following link to log into our proxy server with your UMass Amherst user name and password.

Non-UMass Amherst users, please click the view more button below to purchase a copy of this dissertation from Proquest.

(Some titles may also be available free of charge in our Open Access Dissertation Collection, so please check there first.)

A generative theory of relevance

Victor Lavrenko, University of Massachusetts Amherst


We present a new theory of relevance for the field of Information Retrieval. Relevance is viewed as a generative process, and we hypothesize that both user queries and relevant documents represent random observations from that process. Based on this view, we develop a formal retrieval model that has direct applications to a wide range of search scenarios. The new model substantially outperforms strong baselines on the tasks of ad-hoc retrieval, cross-language retrieval, handwriting retrieval, automatic image annotation, video retrieval, and topic detection and tracking. Empirical success of our approach is due to a new technique we propose for modeling exchangeable sequences of discrete random variables. The new technique represents an attractive counterpart to existing formulations, such as multinomial mixtures, pLSI and LDA: it is effective, easy to train, and makes no assumptions about the geometric structure of the data.

Subject Area

Computer science|Information systems

Recommended Citation

Lavrenko, Victor, "A generative theory of relevance" (2004). Doctoral Dissertations Available from Proquest. AAI3152722.