Paper #: [5.1 P2P] 15
Title: Making Gnutella-like P2P Systems Scalable by Yatin Chawathe et al.
PROBLEM Current Gnutella systems scale poorly because of their flood-based query approach: the load on each node grows with the total number of queries, and the number of queries grows with the size of the system. The super node concept, popularized by Kazaa, was an improvement but does not solve every issue. Merely replacing flooding with random walks does not solve the problem either, because a query may get stuck in a queue at a busy node and be delayed or, even worse, may die without ever yielding a response. Consequently, a system is needed that can handle a high query rate and that continues to function well as the system grows and as nodes join and leave the network.
NEW IDEAS AND STRENGTHS The authors created Gia, a decentralized and unstructured network. Gia uses random walks instead of Gnutella’s flood-based queries and adapts the network topology to the different amounts of resources available at each node. The design addresses Gnutella’s lack of flow control with a token-based flow control algorithm and directs queries toward high-degree, high-capacity nodes. Gia also takes into account that most queries are for popular items, which the paper refers to as "hay", rather than hard-to-find items, which it refers to as "needles". The authors did a good job of explaining why the solutions proposed by others do not, individually, yield a vast improvement, whereas a combination of improvements can make a difference of a few orders of magnitude. They also did a good job of describing why distributed hash tables are not the answer: P2P systems are mainly used for keyword searches rather than exact-match lookups.
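To make the two mechanisms above concrete, here is a minimal Python sketch of a capacity-biased random walk combined with token-based flow control. It is my own simplification, not the authors' implementation: the `Node` class, `grant_tokens`, `biased_walk`, `demo_network`, the even split of tokens among neighbors, and the capacity numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int                                   # queries this node can accept per round (assumed unit)
    items: set = field(default_factory=set)
    neighbors: list = field(default_factory=list)
    tokens: dict = field(default_factory=dict)      # neighbor name -> tokens that neighbor granted us


def connect(a, b):
    """Create an undirected link between two nodes."""
    a.neighbors.append(b)
    b.neighbors.append(a)
    a.tokens[b.name] = 0
    b.tokens[a.name] = 0


def grant_tokens(node):
    """Hand out tokens to neighbors. Assumption of this sketch: an even
    split of the node's capacity, with at least one token per neighbor."""
    if not node.neighbors:
        return
    share = max(1, node.capacity // len(node.neighbors))
    for nb in node.neighbors:
        nb.tokens[node.name] += share


def biased_walk(start, item, ttl=16):
    """Forward the query hop by hop to the highest-capacity neighbor for
    which the current node still holds a token. Returns the path taken,
    or None if the item was not found within ttl hops."""
    current = start
    path = [current.name]
    for _ in range(ttl):
        if item in current.items:
            return path
        # Flow control: a neighbor only accepts a query if it gave us a token.
        candidates = [n for n in current.neighbors if current.tokens[n.name] > 0]
        if not candidates:
            return None                              # nobody can accept the query right now
        nxt = max(candidates, key=lambda n: n.capacity)
        current.tokens[nxt.name] -= 1                # spend one token on this hop
        current = nxt
        path.append(current.name)
    return path if item in current.items else None


def demo_network():
    """Hypothetical four-node topology: a - b - d, plus a - c."""
    a, b = Node("a", 1), Node("b", 10)
    c, d = Node("c", 2), Node("d", 100, items={"song"})
    connect(a, b)
    connect(a, c)
    connect(b, d)
    for n in (a, b, c, d):
        grant_tokens(n)
    return a, b, c, d
```

Starting a query for "song" at the weakest node `a` in `demo_network()`, the walk prefers `b` over the lower-capacity `c` and then moves on to `d`, spending one token per hop. A node that has run out of tokens for all of its neighbors cannot forward the query, which is the property that keeps a busy node from being flooded.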
WEAKNESSES AND EXTENSIONS The authors proposed taking Internet connection speed, CPU speed, disk latency, and other factors into consideration when deciding which nodes to send queries to. However, the experiment used only bandwidth when configuring node capacities. I would have liked to see an experiment that took more than bandwidth into account when determining the degree of a node. I would imagine that handling a query requires some computation at each node. Consequently, the amount of traffic that a 200 MHz Pentium I can handle would differ significantly from what a 3.0 GHz Pentium 4 can handle.
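One way such an experiment could define capacity is as the rate allowed by the slowest resource. The sketch below is purely my own illustration of that idea, not something taken from the paper: the `Resources` class, the `capacity` function, and both per-query cost constants are hypothetical values chosen only to show the effect.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    bandwidth_kbps: float   # link speed in kilobits per second
    cpu_mhz: float          # clock rate in millions of cycles per second


# Hypothetical per-query costs, chosen only for illustration.
QUERY_SIZE_KBIT = 1.0           # kilobits transferred per query
CPU_MCYCLES_PER_QUERY = 0.5     # millions of CPU cycles spent per query


def capacity(r):
    """Queries per second the node can sustain: the bottleneck resource wins."""
    by_bandwidth = r.bandwidth_kbps / QUERY_SIZE_KBIT
    by_cpu = r.cpu_mhz / CPU_MCYCLES_PER_QUERY
    return min(by_bandwidth, by_cpu)


# Two machines on the same 1 Mbps link but with very different CPUs.
pentium_1 = Resources(bandwidth_kbps=1000, cpu_mhz=200)
pentium_4 = Resources(bandwidth_kbps=1000, cpu_mhz=3000)
```

Under these assumed costs, the two machines have identical bandwidth, yet the slower CPU caps the first one well below its link rate, while the second one is limited by the link. A bandwidth-only configuration would treat them as equal.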