Graduation Year

2007

Document Type

Dissertation

Degree

Ph.D.

Degree Granting Department

Computer Science and Engineering

Major Professor

Ken Christensen, Ph.D.

Keywords

P2P, Protocols, Networks, Energy efficiency, Performance evaluation

Abstract

Current estimates are that more than nine million PCs in the U.S. are part of peer-to-peer (P2P) file sharing overlay networks on the Internet. These P2P hosts generate about 20% of the traffic on the Internet and consume about 7.8 TWh/yr of electricity, equivalent to about $630 million per year. File search in a P2P network is based on a wasteful paradigm of broadcasting query messages. Reducing P2P overhead traffic to cut bandwidth waste, and enabling power management to cut electricity usage, are therefore of great interest. In this dissertation, two new search paradigms with reduced overhead traffic are investigated. The new Targeted Search method uses statistics from previous searches to target future searches. Targeted Search is shown to reduce query overhead traffic when compared to the broadcast-based search used by Gnutella.

The new Broadcast Updates with Local Look-up Search (BULLS) protocol reduces overhead traffic by enabling a local look-up of shared files, and it enables new capabilities including power management. BULLS hosts periodically broadcast changes in their lists of shared files and build a table of the files shared by all other hosts. Power management in P2P networks is studied as an application of the minimum set cover problem. A reduction in overall energy consumption is achieved by powering down hosts that have all of their shared files fully shared (or covered) by other hosts.
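The coverage idea above can be illustrated with a minimal sketch. It assumes that each host already holds a complete table mapping host names to sets of shared files, as a BULLS host would build from broadcast updates. The function name `power_down_candidates` and the table layout are hypothetical and are not taken from the dissertation.

```python
from collections import Counter

def power_down_candidates(shared):
    """Return hosts whose every shared file is also shared by another host.

    `shared` maps a host name to the set of files it shares (hypothetical
    layout for the table a BULLS host builds from broadcast updates).
    """
    # Count how many hosts share each file.
    copies = Counter(f for files in shared.values() for f in files)
    # A host is a candidate if each of its files has at least one other copy.
    return {h for h, files in shared.items()
            if all(copies[f] >= 2 for f in files)}

example_table = {
    "A": {"f1", "f2"},
    "B": {"f1"},
    "C": {"f2", "f3"},
}
candidates = power_down_candidates(example_table)
print(sorted(candidates))  # prints ['A', 'B']
```

In this example, hosts A and B are each covered individually, but powering down both at once would lose file f1. Choosing which covered hosts may sleep together is what makes this a set cover problem rather than a per-host check.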

A new set cover heuristic, called the Random Map Out (RMO) algorithm, is introduced and compared to the well-known Greedy heuristic. The algorithms are evaluated for minimum set cover size and computational complexity (number of comparisons). The RMO algorithm requires significantly fewer comparisons than Greedy and still achieves a set cover size within a few percent of that of Greedy. Additionally, the RMO algorithm can be distributed and independently executed by each host with reduced complexity per host, whereas the complexity of the Greedy heuristic is not reduced by distributing it. With RMO there is a non-zero probability of a given file being "lost" (not in the set cover). The probability of this event is modeled, and numerical results show that the probability of a file being lost is practically insignificant.
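For reference, the textbook Greedy set cover heuristic used as the baseline can be sketched as follows. This is the standard algorithm, not the dissertation's implementation, and it does not attempt to reproduce RMO, whose details are not given in this abstract. The function name `greedy_set_cover` and the host table are illustrative assumptions.

```python
def greedy_set_cover(shared):
    """Textbook Greedy heuristic: repeatedly pick the host that covers
    the most still-uncovered files. Returns hosts in the order chosen.

    `shared` maps a host name to the set of files it shares (assumed layout).
    """
    uncovered = set().union(*shared.values())
    cover = []
    while uncovered:
        # Iterate hosts in sorted order so ties are broken deterministically.
        best = max(sorted(shared), key=lambda h: len(shared[h] & uncovered))
        cover.append(best)
        uncovered -= shared[best]
    return cover

host_table = {
    "A": {"f1", "f2"},
    "B": {"f1"},
    "C": {"f2", "f3"},
}
cover = greedy_set_cover(host_table)
print(cover)  # prints ['A', 'C']
```

Hosts in the returned cover stay powered on. Every other host (here, B) has all of its files covered and could be powered down without losing any file.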
