Graduation Year
2012
Document Type
Dissertation
Degree
Ph.D.
Degree Granting Department
Information Systems and Decision Sciences
Major Professor
Dr. Richard Will
Co-Major Professor
Dr. Manish Agrawal
Committee Member
Dr. Terry Sincich
Committee Member
Dr. Balaji Padmanabhan
Keywords
Context, Exploration, Information Retrieval, Learning, Relevance Feedback, Text Mining
Abstract
This dissertation examines the impact of exploration and learning upon eDiscovery information retrieval; it is written in three parts. Part I contains foundational concepts and background on the topics of information retrieval and eDiscovery. This part informs the reader about the research frameworks, methodologies, data collection, and instruments that guide this dissertation.
Part II contains the foundation, development and detailed findings of Study One, "The Relationship of Exploration with Knowledge Acquisition." This part of the dissertation reports on experiments designed to measure user exploration of a randomly selected subset of a corpus and its relationship with performance in the information retrieval (IR) result. The IR results are evaluated against a set of scales designed to measure behavioral IR factors and individual innovativeness. The findings reported in Study One suggest a new explanation for the relationship between recall and precision, and provide insight into behavioral measures that can be used to predict user IR performance.
Part II also reports on a secondary set of experiments performed on a technique for filtering IR results by using "elimination terms." These experiments have been designed to develop and evaluate the elimination term method as a means to improve precision without loss of recall in the IR result.
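As a rough illustration of the idea only, and not the dissertation's actual implementation, elimination-term filtering can be sketched as dropping any retrieved document that contains a term from a user-supplied exclusion list, then comparing precision and recall before and after. The function names, the whitespace tokenization, and the toy documents below are all assumptions made for this example.

```python
def filter_by_elimination_terms(docs, elimination_terms):
    """Keep only documents that contain none of the elimination terms.

    docs: dict mapping a document id to its text (hypothetical structure).
    elimination_terms: iterable of words treated as markers of irrelevance.
    """
    terms = {t.lower() for t in elimination_terms}
    return {
        doc_id: text
        for doc_id, text in docs.items()
        # Naive whitespace tokenization; a real system would normalize further.
        if not terms & set(text.lower().split())
    }


def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy retrieval result for the query term "merger" (invented data).
retrieved = {
    "d1": "merger agreement draft attached",
    "d2": "fantasy football merger of leagues",
    "d3": "board approved the merger terms",
    "d4": "lunch menu for friday",
}
relevant = {"d1", "d3"}

before = precision_recall(retrieved, relevant)
filtered = filter_by_elimination_terms(retrieved, ["football", "lunch"])
after = precision_recall(filtered, relevant)
```

In this toy case the filter removes only non-relevant documents, so precision rises while recall is unchanged. Recall would fall if an elimination term also appeared in a relevant document, which is the risk the experiments described above are meant to evaluate.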
Part III contains the foundation and development of Study Three, "A New System for eDiscovery IR Based on Context Learning and Relevance." This part reports on a set of experiments performed on an IT artifact, Legal Intelligence®, developed during this dissertation.
The artifact developed for Study Three uses a learning tool for context and relevance to improve the IR extraction process by allowing the user to adjust the IR search structure based on iterative document extraction samples. The artifact has been developed based on the needs of the business community of practitioners in the domain of eDiscovery; it has been instantiated and tested during Study Three and has produced significant results supporting its feasibility for use. Part III contains conclusions and steps for future research extending beyond this dissertation.
Scholar Commons Citation
Hyman, Harvey Stuart, "Learning and Relevance in Information Retrieval: A Study in the Application of Exploration and User Knowledge to Enhance Performance" (2012). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/4083