Graduation Year


Document Type




Degree Granting Department

Information Systems and Decision Sciences

Major Professor

Dr. Richard Will

Co-Major Professor

Dr. Manish Agrawal

Committee Member

Dr. Terry Sincich

Committee Member

Dr. Balaji Padmanabhan


Context, Exploration, Information Retrieval, Learning, Relevance Feedback, Text Mining


This dissertation examines the impact of exploration and learning on eDiscovery information retrieval; it is written in three parts. Part I presents foundational concepts and background on information retrieval and eDiscovery, and introduces the research frameworks, methodologies, data collection, and instruments that guide this dissertation.

Part II contains the foundation, development, and detailed findings of Study One, "The Relationship of Exploration with Knowledge Acquisition." This part of the dissertation reports on experiments designed to measure user exploration of a randomly selected subset of a corpus and the relationship of that exploration with performance in the information retrieval (IR) result. The IR results are evaluated against a set of scales designed to measure behavioral IR factors and individual innovativeness. The findings reported in Study One suggest a new explanation for the relationship between recall and precision, and provide insight into behavioral measures that can be used to predict user IR performance.

Part II also reports on a secondary set of experiments on a technique for filtering IR results by using "elimination terms." These experiments were designed to develop and evaluate the elimination term method as a means of improving precision without loss of recall in the IR result.

Part III contains the foundation and development of Study Three, "A New System for eDiscovery IR Based on Context Learning and Relevance." This part reports on a set of experiments performed on an IT artifact, Legal Intelligence®, developed during this dissertation.

The artifact developed for Study Three uses a learning tool for context and relevance to improve the IR extraction process by allowing the user to adjust the IR search structure based on iterative document extraction samples. The artifact was developed based on the needs of the business community of practitioners in the domain of eDiscovery; it was instantiated and tested during Study Three and produced significant results supporting its feasibility for use. Part III also contains conclusions and steps for future research extending beyond this dissertation.