CS6675/4675 Advanced Internet Systems and Application Development

Instructor: Professor Ling Liu

Course Readings

Attention: The information contained in this page is subject to change.


Reading Summary Requirement


You are expected to read the material each week and write 2-3 paragraphs per reading giving your impressions and thoughts. The summaries should be informal and brief, and should consist of your own comments on the readings, NOT a rehash of the content.

You should submit your summaries on T-Square. Late assignments will NOT be accepted unless approved in advance by the instructor.

Reading Summary Guidelines:

The summary for each reading assignment is expected to consist of 1 paragraph on each of the following three aspects: (1) the positive aspect of the paper; (2) the negative aspect of the paper; and (3) a brief discussion on how the idea or method proposed or used in evaluation may be applied to your own project for the course.

You may want to keep these guidelines in mind when reading papers.

You may find the following short article helpful:

Efficient Reading of Papers in Science and Technology. By Michael J. Hanson and updated by D. McNamee.

Areas of Readings

You are expected to read the papers in the required reading list, but to write a summary for only one paper selected from the list associated with each lecture. Please use the Summary Template to write the reading summaries and follow the summary submission suggestions to submit your summaries.

  1. Search Engine Technology
  2. Web Servers
    1. Web Servers Issues
    2. Web Proxies and Web Caching
    3. Web Prefetching
    4. WWW Workloads
  3. Application Servers
  4. Internet Computing System Basics
    1. Performance Issues
    2. Naming Issues
  5. Advanced Internet Systems
    1. Peer to Peer Computing
    2. Mobile Computing
    3. Sensor, Stream, and Continual Query
    4. RFID
    5. Geo-Location Based Services and Applications
    6. Spatial Indexing and Spatial Mining
  6. Big Data Analysis
    1. Big Data Analysis
    2. Collaborative Filtering and Content based Recommendation
    3. Association Mining
    4. Data Clustering
    5. Classification based Machine Learning
  7. Cloud Computing
  8. Security and Privacy for Internet Applications
    1. Security
    2. Privacy
    3. Location Privacy
    4. Trust Management
    5. Web Spam and Denial of Services Attacks

General/Recommended Course Reading List

1.     Search Engine Technology

Question: How do we build a search engine that scales as the Web grows?


1.     Google, The Anatomy of a Large-scale Hypertextual Web Search Engine. Sergey Brin and Lawrence Page. In 7th Int. Conf. WWW, Brisbane, Australia, April 1998.

2.      Inktomi: An Investigation of Documents from the World Wide Web Allison Woodruff, Paul M. Aoki, Eric Brewer, Paul Gauthier, and Lawrence A. Rowe  

3.      Harvest: Scalable Internet Resource Discovery: Research Problems and Approaches C. Mic Bowman (Transarc Corp.), Peter Danzig (Univ. Southern California), Udi Manber (Univ. of Arizona), and Michael Schwartz (Univ. Colorado), Appeared in CACM 1994 (Download Harvest Indexer) (Harvest Papers)

4.      Harvest: The Harvest Information Discovery and Access System C. Mic Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manber and Michael F. Schwartz, Computer Networks and ISDN Systems, 28 (1995) pp. 119-125

5.      Harvest: A Scalable, Customizable Discovery and Access System C. Mic Bowman, Peter B. Danzig, Darren R. Hardy, Udi Manber, Michael F. Schwartz, and Duane P. Wessels, Technical Report CU-CS-732-94, Department of Computer Science, University of Colorado, Boulder, August 1994 (revised March 1995).

6.      Customized Information Extraction as a Basis for Resource Discovery Darren R. Hardy and Michael F. Schwartz, ACM Transactions on Computer Systems.

7.      Indie: Distributed Indexing of Autonomous Internet Services Peter Danzig, Shih-Hao Li, Katia Obraczka. Journal of Computer Systems, 5(4), 1992. Original description of Indie in 1991 ACM SIGIR

8.      Internet resource discovery services Katia Obraczka, Peter Danzig, and Shih-Hao Li, IEEE Computer, Sept. 1993.

9.      Research Problems for Scalable Internet Resource Discovery , C. Mic Bowman, Peter B. Danzig, and Michael F. Schwartz, 1993 IEEE Computer.

10.  GLIMPSE: A Tool to Search Through Entire File Systems Udi Manber and Sun Wu (Univ. of Arizona), Technical Report TR 93-34, Department of Computer Science, University of Arizona, October, 1993. (Glimpse Home Page)

11.  WebGlimpse--Combining Browsing and Searching Udi Manber, Mike Smith, and Burra Gopal (Univ. of Arizona), to appear in the Proceedings of the 1997 Usenix Technical Conference, January 1997.(WebGlimpse Home Page) (WebGLIMPSE Publications)

12.  Mercator: A Scalable, Extensible Web Crawler Allan Heydon and Marc Najork, Compaq Systems Research Center  (Mercator Project) (html)

13.  A technique for measuring the relative size and overlap of public Web search engines Krishna Bharat and Andrei Broder (DIGITAL, Systems Research Center).

14.  The Connectivity Server: fast access to linkage information on the Web Krishna Bharat, Andrei Broder, Monika Henzinger, Puneet Kumar, and Suresh Venkatasubramanian, Proceedings of the 7th International World Wide Web Conference, Brisbane, Australia, pages 469-477. Elsevier Science, April 1998.

15.  Efficient Crawling through URL Ordering Junghoo Cho, Hector Garcia-Molina, and Lawrence Page, Proceedings of the 7th International World Wide Web Conference, pages 161-172, April 1998

16.  Crawling towards Eternity: Building an Archive of the World Wide Web Mike Burner, Web Techniques Magazine, 2(5), May 1997

17.  The Truth about the Web: Crawling towards Eternity Z. Smith, Web Techniques Magazine, 2(5), May 1997 (No available link)

18.  Measuring Index Quality using Random Walks on the Web Monika Henzinger, Allan Heydon, Michael Mitzenmacher, and Marc A. Najork, Proceedings of the 8th International World Wide Web Conference, pages 213-225, May 1999

19.  Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery Soumen Chakrabarti, Martin van den Berg, Byron Dom, Proceedings of the 8th International World Wide Web Conference, May 1999

20.  Finding What People Want: Experiences with the WebCrawler Brian Pinkerton, Proceedings of the 2nd International World Wide Web Conference, 1994

21.  SPHINX: A Framework for Creating Personal, Site-specific Web Crawlers Robert C. Miller and Krishna Bharat, Proceedings of the 7th International World Wide Web Conference, pages 119-130, April 1998

22.  Information Retrieval on the World Wide Web Venkat N. Gudivada, Vijay V. Raghavan, William I. Grosky, and Rajesh Kasangottu, IEEE Internet Computing, vol. 1, number 5, September/October, 1997.

23.  GENVL and WWW: Tools for Taming the Web, Oliver McBryan, Proceedings of the First Int'l World Wide Web Conference, CERN, Geneva, May 1994.

24.  A World Wide Web Resource Discovery System Budi Yuwono, Savio L. Y. Lam, Jerry H. Ying, Dik L. Lee, Proceedings of the 4th World Wide Web Conference, 1995.

25.  A Survey of Information Retrieval and Filtering Methods Christos Faloutsos and Douglas Oard (Univ. of Maryland).

26.  Guidelines for Robot Writers, Martijn Koster, 1993

27.  Robots in the Web: threat or treat? Martijn Koster, NEXOR, April 1995, [1997: Updated links and addresses];  A Standard for Robot Exclusion Martijn Koster.

28.  Authoritative Sources in a Hyperlinked Environment, J. Kleinberg. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46(1999). Also appears as IBM Research Report RJ 10076, May 1997. ( IBM Clever Searching Project)

29.  How Search Engines Rank Web Pages Danny Courtois and Sullivan.

30.  Evaluation of Web search engines and the search for better ranking algorithms. Mildrid Ljosland e-mail: Mildrid.Ljosland@idi.ntnu.no Norwegian University of Science and Technology. the SIGIR99 Workshop on Evaluation of Web Retrieval, August 19, 1999

31.  Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text. S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, P. Raghavan, and S. Rajagopalan. Proceedings of the 7th World-Wide Web conference, 1998. Copyright owned by Elsevier Sciences, Amsterdam.

32.  Inferring Web Communities from Link Topologies. D. Gibson, J. Kleinberg, and P. Raghavan. Proceedings of The Ninth ACM Conference on Hypertext and Hypermedia, 1998. Copyright owned by ACM.

33.  Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies. S. Chakrabarti, B. Dom, R. Agrawal, P. Raghavan. VLDB Journal, 1998 (invited).

34.  Enhanced hypertext categorization using hyperlinks. S. Chakrabarti, B. Dom and P. Indyk. Proceedings of ACM SIGMOD 1998.

35.  Hypersearching the web. S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins. Scientific American, June, 1999.

36.  Mining the link structure of the World Wide Web. S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins. IEEE Computer.

37.  Trawling the Web for emerging cyber-communities. S.R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Eighth World Wide Web conference, Toronto, Canada, May 1999.

38.  Focused crawling: a new approach to topic specific resource discovery. S. Chakrabarti, M. Van den Berg, B. Dom Eighth World Wide Web conference, Toronto, 1999.

39.  The web as a graph: Measurements, models and methods. J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Proceedings of the International Conference on Combinatorics and Computing, 1999; invited paper.

40.  Extracting large scale knowledge bases from the web. S.R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. IEEE International conference on Very Large Databases (VLDB), Edinburgh, Scotland.

41.  Clustering categorical data: an approach based on dynamical systems. D. Gibson, J. Kleinberg and P. Raghavan. Proceedings of the VLDB conference, 1998.

42.  Search and Ranking Algorithms for Locating Resources on the World Wide Web B. Yuwono and D. Lee. IEEE conference on Data Engineering, 1996 (pp391-400).

43.  A Machine Learning Architecture for Optimizing Web Search Engines,  J. Boyan, D. Freitag, and T. Joachims. AAAI Workshop on Internet-based Information Systems, 1996.

44.  SIBRIS: the Sandwich Interactive Browsing and Ranking Information System S. Wade, P. Willett, and D. Bawden. Journal of Information Science, 15, 1989, pp249-260 ( No available link)

45.  Estimating the Usefulness of Search Engines W. Meng, K. Liu, C. Yu, W. Wu and N. Rishe. ICDE 1999. (more details)

46. The effectiveness of GlOSS for the Text Database Discovery Problem L. Gravano, H. Garcia-Molina, A. Tomasic. SIGMOD 1994. (GlOSS)

47. Adaptive methods for the computation of PageRank, Sepandar Kamvar, Taher Haveliwala, Gene Golub, Technical Report, Stanford University, 2003

48. Building a Distributed Full-Text Index for the Web, Melnik, Sergey and Raghavan, Sriram and Yang, Beverly and Garcia-Molina, Hector, ACM Transactions on Information Systems 2003

49. Parallel Crawlers, Junghoo Cho, Hector Garcia-Molina, 11th WWW

50. An adaptive model for optimizing performance of an incremental web crawler, Edwards, J., McCurley, K. S., and Tomlin, J. A., In Proceedings of the Tenth Conference on World Wide Web (2001)

51. Focused crawling using context graphs, 26th International Conference on Very Large Databases

52. Effective page refresh policies for web crawlers. Junghoo Cho, Hector Garcia-Molina, ACM Transactions on Database Systems

53. Self-similarity in the web. Stephen Dill et al. ACM Transactions on Internet Technology (TOIT) (August 2002)

54. Finding replicated web collections. Junghoo Cho, N. Shivakumar, and Hector Garcia-Molina, ACM SIGMOD Record  (June 2000)

55. Hilltop: A search engine based on expert documents. K. Bharat and G. A. Mihaila, 9th WWW Conference (Poster), 2000.

56. Topic-Sensitive PageRank, In Proceedings of the Eleventh International World Wide Web Conference

57. Generalizing PageRank: Damping functions for link-based ranking algorithms, Ricardo Baeza-Yates et al., In Proceedings of SIGIR2002

58. Site Level Noise Removal for Search Engines, Carvalho, Paul-Alexandru Chirita, Edleno Silva de Moura, et al., In 15th WWW

59. Efficient crawling through URL ordering, Junghoo Cho et al., Computer Networks and ISDN Systems, 1998

60. Stuff I've Seen: A System for Personal Information Retrieval and Re-Use, Susan Dumais et al., 26th ACM SIGIR conference on Research and development in information retrieval

61. When experts agree: Using non-affiliated experts to rank popular topics, Krishna Bharat, George A. Mihaila, 10th WWW

62. The stochastic approach for link-structure analysis (SALSA) and the TKC effect, R. Lempel, S. Moran, 9th WWW

63. What is this Page Known for? Computing Web Page Reputations,  Davood Rafiei, Alberto Mendelzon , 9th WWW

64. PicASHOW: Pictorial Authority Search by Hyperlinks on the Web, R. Lempel, A. Soffer, 10th WWW

65. Web Search via Hub Synthesis, Dimitris Achlioptas, Amos Fiat, Anna Karlin, Frank McSherry, 42nd IEEE Symposium on  Foundations of Computer Science

66. Approximating Aggregate Queries about Web Pages via Random Walks, Ziv Bar-Yossef et al., VLDB 2000

67. PMSE: A Personalized Mobile Search Engine, K. Leung, D. Lee, W. Lee, IEEE Transactions on Knowledge and Data Engineering 25.4(2013):820-834

68. How To Use Search Engine Optimization Techniques to Increase Website Visibility, J.B. Kiloran, IEEE Transactions on Professional Communication 56.1(2013):50-56

69. A Crowd Powered Socially Embedded Search Engine, J. Jeong et al, ICWSM 2013

70. Using SKOS Vocabularies for improving Web Search, B. Haslhofer et al, WWW 2013

71. Addressing People's Information Needs Directly in a Web Search Result Page, L.B. Chilton, J. Teevan, WWW 2011


2.     Web Servers

2.1 Web Servers Issues

Question: What are the key technologies for building high-performance, scalable Web servers?


1.  Measuring the Capacity of a Web Server Gaurav Banga and Peter Druschel, Proceedings of the 1997 USENIX Symposium on Internet Technologies and Systems, Monterey, CA, December 1997. 

2.  Internet Web Servers: Workload Characterization and Performance Implications Arlitt and Williamson, IEEE/ACM Transactions on Networking, 5(5):631-645, Oct. 1997. A short version titled "Web Server Workload Characterization: The Search for Invariants" appeared in ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1996.

3.  Locating Nearby Copies of Replicated Internet Servers, James D. Guyton and Michael F. Schwartz. ACM SIGCOMM, 1995. 

4.  HACC: An Architecture for Cluster-Based Web Servers Xiaolan Zhang, Michael Barrientos, J. Bradley Chen, Margo Seltzer (Harvard University). In the Proceedings of the 3rd USENIX Windows NT Symposium, July 1999, Seattle, WA, 155-164.

5.  Trace-Driven Simulation of Document Caching Strategies for Internet Web Servers Martin F. Arlitt and Carey L. Williamson, Simulation, Special Issue: Modeling and Simulation of Computer Systems and Networks. Vol. 68, No. 1, January, 1997. 

6.  Performance Characteristics of Mirror Servers on the Internet Andy Myers, Peter Dinda, Hui Zhang. INFOCOM'98.

7.  TCP Behavior of a Busy Internet Server: Analysis and Improvements Hari Balakrishnan, Venkata Padmanabhan, Srini Seshan, Mark Stemm and Randy H. Katz, INFOCOM'98.

8.  The Content and Access Dynamics of a Busy Web Site: Findings and Implications V. N. Padmanabhan and L. Qiu. Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000. (An earlier version appeared as Microsoft Research Technical Report MSR-TR-2000-13, February 2000)

9.   A Performance Monitoring and Capacity Planning Methodology for Web Servers, Rodney B. Wallace and Tyrone E. McKoy, Jr. (NCR Corporation)

10.  A Self-Scaling and Self-Configuring Benchmark for Web Servers, Stephen Manley (Network Appliance), Michael Courage (Microsoft Co.), and Margo Seltzer (Harvard)

11.  Connection Scheduling in Web Servers M. E. Crovella, R. Frangioso, and M. Harchol-Balter, Proceedings of the 1999 USENIX Symposium on Internet Technologies.

12.  Dynamic Server Selection in the Internet Mark E. Crovella and Robert L. Carter, Computer Science Department, Boston University.

13.  Dynamic Server Selection Using Bandwidth Probing in Wide-Area Networks, R. Carter and M. Crovella, INFOCOM, 1997. extended version (TR-96-007).

14.  NCSA's World Wide Web Server: Design and Performance Tomas T. Kwan, Robert E. McGrath, and Daniel A. Reed, IEEE Computer, Vol. 28, No. 11, pp. 68-74, November 1995. An earlier version titled: User Access Patterns to NCSA's World Wide Web Server. (Recent Pablo Project)

15.  A Scalable and Highly Available Web Server, D.M. Dias, W. Kish, R. Mukherjee, R. Tewari. In Proceedings of the IEEE Computer Conference (COMPCON), Santa Clara, March, 1996.

16.  A Scalable HTTP Server: The NCSA Prototype, R. McGrath. Proc. of the 1st Intl. World-Wide Web Conference, May 1994. (HTML)

17.  The Power of Two Choices in Randomized Load Balancing, M. Mitzenmacher, PhD. Thesis, 1996.

18.  Flash: An Efficient and Portable Web Server Vivek Pai, Peter Druschel and Willy Zwaenepoel, Proceedings of 1999 USENIX Conference, Monterey, CA, June 1999.

19.  The AFS File System in Distributed Computing Environments: White Paper, Transarc Corporation, May 1996.

20.  Apache Server (Apache HTTP Server V1.3 - API notes)


2.2 Web Proxies and Web Caching

Question: How do we coordinate the activities of a geographically distributed application?


1.  World Wide Web Proxies Ari Luotonen (CERN) and Kevin Altis (Intel) (html)

2.  A Hierarchical Internet Object Cache Anawat Chankhunthod, Peter B. Danzig, Chuck Neerdaels (University of Southern California), Michael F. Schwartz, Kurt J. Worrell (University of Colorado, Boulder) (An Implementation of the hierarchical Object Cache at netapp)

3.  Beyond Hierarchies: Design Principles for Distributed Caching on the Internet, R. Tewari, M. Dahlin, H. Vin, and J. Kay, Technical report TR98-04, Dept. of Computer Sciences, Univ. of Texas, 1998.

4.  Main Memory Caching of Web Documents, Evangelos P. Markatos. In Proceedings of the Fifth WWW Conference, 1996.

5.  Design Considerations for Integrated Proxy Servers S. Sahu, P. Shenoy, D. Towsley. Proc. IEEE NOSSDAV'99 (Basking Ridge, NJ, June 1999).

6.  Performance Issues of Enterprise Level Web Proxies C. Maltzahn and K. Richardson and D. Grunwald, Proceedings of ACM SIGMETRICS'97, Seattle, WA, Pages 13-23, June 1997.

7.  World Wide Web Cache Consistency, James Gwertzman and Margo Seltzer, Proceedings of the 1996 USENIX Technical Conference, San Diego, CA, Jan 1996.

8.  Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer, and C. Hauser, In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles (SOSP'95), Copper Mountain Resort, CO, December 1995.

9.  Volume Leases for Consistency in Large-scale Systems, J. Yin, L. Alvisi, M. Dahlin and C. Lin, IEEE Transactions on Knowledge and Data Engineering Special issue on Web Technologies, Jan 1999 .

10.  Web proxy caching: the devil is in the details Ramon Caceres, Fred Douglis, Anja Feldmann, Gideon Glass, Michael Rabinovich, Workshop on Internet Server Performance held with Sigmetrics'98.

11.  Making World Wide Web Caching Servers Cooperate Radhika Malpani, Jacob Lorch, David Berger. Proceedings of the 4th WWW, 1995.

12.  Intelligent Caching for World-Wide Web Objects Duane Wessels (University of Colorado), Proceedings of INET'95, May 1995.

13.  Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol L. Fan, P. Cao, J. Almeida, and A.Z. Broder, SIGCOMM, 1998, pp 254-265.

14.  Improving End-to-End Performance of the Web Using Server Volumes and Proxy Filters Edith Cohen, Balachander Krishnamurthy, Jennifer Rexford, SIGCOMM, 1998.

15.  Internet Cache Protocol (ICP), version 2 D. Wessels, K. Claffy, RFC 2186, May 1997. (ICP Working Group Home Page )

16.  Adaptive Web Caching S. Floyd, V. Jacobson and L. Zhang, Proceedings of the Web Caching Workshop 1997.

17.  An Analysis of Geographical Push Caching J. Gwertzman and M. Seltzer.

18.  A Caching Relay for the World Wide Web Steven Glassman, First International World-Wide Web Conference, pp 69-76, May 1994.

19.  Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes Mark Crovella and Azer Bestavros, Proceedings of SIGMETRICS '96.

20.  Intelligent Caching for World-Wide Web Objects Duane Wessels (University of Colorado), Proceedings of INET'95, May 1995.

21.  The Rio File Cache: Surviving Operating System Crashes Peter M. Chen, Wee Teck Ng, Subhachandra Chandra, Christopher Aycock, Gurushankar Rajamani, and David Lowell, Proceedings of the 1996 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 1996.

22.  Operating System Support for High-Speed Networking, P. Druschel, Communications of the ACM, Vol. 39, No. 9, Pages 41-51, September 1996.

23.  Squid Internet Object Cache; Squid Web Proxy Cache

24.  Performance of Web Proxy Caches, Feldmann, Caceres, Douglis, Glass, and Rabinovich, Workshop on Internet Server Performance (WISP), 1998.

25.  Enhancing the Web's Infrastructure: From Caching to Replication Michael Baentsch, Lothar Baum, Georg Molter, Steffen Rothkugel, and Peter Sturm, IEEE Internet Computing, vol. 1, no. 2, pages 18-27, April 1997 (class handout).

26.  Propagation, Replication and Caching from the W3C

27.  Caching Proxies: Limitations and Potentials Marc Abrams, Charles R. Standridge, Ghaleb Abdulla, Stephen Williams, Edward A. Fox, Proceedings of the Fourth International World Wide Web Conference, pages 119-133, Boston, MA, December 1995.

28.  Improving End to End Performance of the Web using Server Volumes and Proxy Filters, E. Cohen, B. Krishnamurthy and J. Rexford, In Proceedings of ACM SIGCOMM'98, Vancouver, Canada, Pages 241-253, September 1998.

29.  The Measured Access Characteristics of World-Wide-Web Client Proxy Caches, B M. Duska, D. Marwood, and M J. Feeley, In Proceedings of the USENIX Symposium on Internet Technologies and Systems, Monterey, CA, December, 1997

30.  A Survey of Proxy Cache Evaluation Techniques Brian D. Davison

31.  A Tutorial for Network Caching


2.3 Web Prefetching

Question: How much can Prefetching alleviate the latency and bandwidth problems in Web access?


1.  Using Predictive Prefetching to Improve World Wide Web Latency,  Padmanabhan and Mogul, SIGCOMM, 1996.

2.  Alleviating the Latency and Bandwidth Problems in WWW Browsing, Loon and Bharghavan, Usenix Symposium on Internet Technologies and Systems (USITS) 1997.

3.    Determining WWW User’s Next Access and Its Application to Pre-fetching, Carlos R. Cunha and Carlos F.B. Jaccoud, Proceedings of ISCC’97: The Second IEEE Symposium on Computers and Communications. Alexandria, Egypt, 1-3 July 1997.

4.    Potential and Limits of Web Prefetching Between Low-Bandwidth Clients and Proxies Li Fan, Quinn Jacobson and Pei Cao. To appear in SIGMETRICS’99.

5.    Optimal Prefetching via Data Compression, Vitter and Krishnan, FOCS, 1991.

6.    The Network Effects of Prefetching, Crovella and Barford, INFOCOM, 1998.


2.4 WWW Workloads

Questions: What are the main causes of World Wide Web traffic? How do we find idle resources if they exist?


1.    Web Facts and Fantasy, Stephen Manley (Network Appliance), Margo Seltzer (Harvard University), Proceedings of the 1997 USENIX Symposium on Internet Technologies and Systems, Monterey, CA, December 1997.

2.    Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes, Mark E. Crovella and Azer Bestavros IEEE/ACM Transactions on Networking, 5(6):835--846, December 1997.

3.    Measuring Web Performance in the Wide Area, P. Barford and M. E. Crovella, in Performance Evaluation Review, August, 1999.

4.    Changes in Web Client Access Patterns: Characteristics and Caching Implications, P. Barford, A. Bestavros, A. Bradley, and M. E. Crovella, in World Wide Web, Special Issue on Characterization and Performance Evaluation, Vol. 2, pp. 15-28, 1999.

5.    Characterizing Browsing Strategies in the World-Wide Web, L. Catledge and J. Pitkow, Journal of Computer Networks and ISDN Systems, vol. 27, no. 6, 1995, p. 1065.

6.    Measuring the Web, Tim Bray (Open Text Corporation), Fifth International World Wide Web Conference, May 1996, Paris, France.

7.    Generating Representative Web Workloads for Network and Server Performance Evaluation, Barford and Crovella, SIGMETRICS, 1998, pp. 151-160. 

8.    Characterizing Reference Locality in the WWW, Almeida, Bestavros, Crovella, and de Oliveira, International Conference on Parallel and Distributed Information Systems (ICPDIS), 1996. (The OCEANS Project)

9.    Web Traffic Characterization: An Assessment of the Impact of Caching Documents from NCSA's Web Server H. Braun and K. Claffy; Second International Conference on the WWW, Chicago, IL, Oct. 1994, pages 1007-1027.

10.    Open Market's The Internet Index Home Page

11.    System Design Issues for Internet Middleware Services: Deductions from a Large Client Trace, Gribble and Brewer, Usenix Symposium on Internet Technologies and Systems (USITS) 1997.

3.     Application Servers

Questions: How do we build a scalable Internet service located at a single site? Should we replicate to get end-to-end availability? What abstractions should we provide to support scalability to millions of users, and continuous operations 24 hours per day and 7 days per week?

1.    Availability and Latency of World Wide Web Information Servers, Charles L. Viles and James C. French (University of Virginia), USENIX, Computing Systems; vol. 8, no. 1; Winter 1995.


2.     A Quantitative Study of Differentiated Services S. Sahu, D. Towsley, J. Kurose. Proc. IEEE Global Internet'99 (Rio de Janeiro, Brazil, December 1999). A longer version is available as UMass CMPSCI Technical Report 99-09.


3.    A Comparison of Server-Based and Receiver-Based Local Recovery Approaches for Scalable Reliable Multicast S. Kasera, J. Kurose, D. Towsley. Proc. IEEE Infocom'98 (San Francisco, CA, April 1998). A longer version is available as UMass CMPSCI Technical Report 97-69.


4.    A Comparison of Sender-Initiated and Receiver-Initiated Reliable Multicast Protocols D. Towsley, J. Kurose, S. Pingali. IEEE Journal on Selected Areas in Communications (JSAC) (April 1997)


5.    Exploiting Internetwork Multicast Services Nortel White Paper.


6.    "Server-initiated Document Dissemination for the WWW" Azer Bestavros and Carlos Cunha,IEEE Data Engineering Bulletin, September 1996.


7.    Middleware Support for Data Mining and Knowledge Discovery in Large-scale Distributed Information Systems, Azer Bestavros, In Proceedings of ACM SIGMOD'96 Data Mining Workshop, Montreal, Canada, June 1996.


8.    "Speculative Data Dissemination and Service to Reduce Server Load, Network Traffic and Service Time for Distributed Information Systems" , Azer Bestavros, Proceedings of ICDE'96: The 1996 International Conference on Data Engineering, New Orleans, Louisiana. March 1996.


9. Using speculation to reduce server load and service time on the WWW, Azer Bestavros, in Proceedings of CIKM'95: The Fourth ACM International Conference on Information and Knowledge Management, Baltimore, Maryland. November 1995.


10. "Demand-based document dissemination to reduce traffic and balance load in distributed information systems" Azer Bestavros, in Proceedings of the 1995 Seventh IEEE Symposium on Parallel and Distributed Processing, San Antonio, Texas. October 1995.


11. Demand-based Data Dissemination for Distributed Multimedia Applications, Azer Bestavros, in Proceedings of the   ACM/ISMM/IASTED International Conference on  Distributed Multimedia Systems and Applications, Stanford, CA. August 1995.


12. "Information Dissemination and Speculative Service: Two candidate  functionalities for the middleware infrastructure" Azer  Bestavros, in Proceedings of SIGCOMM'95 Workshop on Middleware. Cambridge, MA, August 1995.


13. Personalized Information Environments: An Architecture for Customizable Access  to Distributed Digital Libraries, James C. French and Charles L. Viles, D-Lib Magazine, June 1999, Volume 5 Number 6


14.    Continuous Profiling: Where Have All the Cycles Gone? Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R. Henzinger, Shun-Tak A. Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, and William E. Weihl. SOSP'97.


15.    System Support for Automated Profiling and Optimization, Aolan Zhang, Zheng Wang, Nicholas Gloy, J. Bradley Chen, and Michael D. Smith, SOSP'97.


16.    Cluster-Based Scalable Network Services. Fox, Gribble, Chawathe, and Brewer, Proceedings of SOSP, 1997.

17.    A Case for Networks of Workstations: NOW, T. Anderson, D. Culler, D. Patterson, IEEE Micro Feb. 1995. (The Berkeley NOW Project)

18.    Free Transactions with Rio Vista, David E. Lowell and Peter M. Chen, Proceedings of the 1997 Symposium on Operating Systems Principles (SOSP), October 1997. (The Rio project).


19.    The Case for Application-Specific Benchmarking, Margo Seltzer, David  Krinsky, Keith Smith, Xiaolan Zhang.


20.    Frangipani: A Scalable Distributed File System, C. Thekkath, T. Mann and E. Lee, Proceedings of the 1997 Symposium on Operating Systems Principles  (SOSP), October 1997.

21.    Serverless Network File Systems, Anderson, Dahlin, Neefe, Patterson, Roselli and Wang, SOSP, 1995.

22.    A Note on Distributed Computing, Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall, Sun Microsystems Laboratories Technical Report TR-94-29 (November 1994).

23.    Application-Level Document Caching in the Internet Azer Bestavros et al., Proceedings of SDNE'95: The Second International Workshop on Services in Distributed and Network Environments. Whistler, Canada, June 1995.

24.    WWW Media Distribution via Hopwise Reliable Multicast, James E. (Jed) Donnelley Lawrence Livermore National Laboratory, Livermore, California, USA. WWW'95.

25.    Scalable Reliable Multicast Using Multiple Multicast Channels, S. Kasera, G. Hjalmtysson, D. Towsley, J. Kurose, To appear in IEEE/ACM Transactions on Networking, 2000.

26.    Scalable fair reliable multicast using active services, S.K. Kasera, S. Bhattacharyya, M. Keaton, D. Kiwior, J. Kurose, D. Towsley, S. Zabel. IEEE Networks Magazine.

27.    An Internet Multicast System for the Stock Market, N.F. Maxemchuk and D. H. Shur (AT&T Labs - Research)

28.    Cooperative Reliable Multicast Protocol with Local Recovery, Young-mi Ohk, Steven H. Low

29.    Real-time Applications of the Internet, N. F. Maxemchuk, Johns Hopkins  University, April 8, 1999.

30.    Operational Information Systems - An example from the Airline Industry, Van Oleson, Greg Eisenhauer, Calton Pu, Karsten Schwan, Beth Plale and Dick Amin. Sept., 2000.

31.    Disconnected Operations in the Coda File System, J. Kistler and M. Satyanarayanan, ACM Transactions on Computer Systems, Vol 10, No 1, Pages 3-25, February 1992.

32.    Information Monitoring on the Web: A Scalable Solution. Ling Liu, Wei Tang, David Buttler, and Calton Pu. World Wide Web Journal (by Kluwer Academic Publishers), Volume 5, No. 4.

33.    InfoFilter: Supporting Quality of Service for Fresh Information Delivery, Ling Liu, Calton Pu, Karsten Schwan, Jon Walpole. New Generation Computing Journal (Vol.18, No.4).

34.    Continual Queries for Internet Scale Event-Driven Information Delivery, Ling Liu, Calton Pu, Wei Tang. In: Special issue on Web Technologies, IEEE Transactions on Knowledge and Data Engineering, Vol.11, No.4, July/Aug. 1999. pp610-628.

35.    Methodical Restructuring of Complex Workflow Activities, Ling Liu and Calton Pu. IEEE 14th International Conference on Data egineering, February 23-27, 1998, Orlando, Florida, USA. pp342-350.

36.    Support for Data-intensive Applications in Large-scale Systems, Mike Dahlin, University of Texas at Austin  E-Commerce White Paper Series

38.    "Web content adaptation to improve server overload behavior", T. Abdelzaher and N. Bhatti, International World Wide Web Conference, Toronto, Canada, May 1999.

39.    "Web server QoS management by adaptive content delivery", T. Abdelzaher and N. Bhatti, International Workshop on Quality of Service, London, UK, June 1999.

40.    "Digestor: Device-independent Access to the World Wide Web", T. Bickmore and B. Schilit, The Sixth International World Wide Web Conference, April 1997.

41.    "Adapting the Web: An Adaptive Web Browser", K. Henricksen and J. Indulska, User Interface Conference, 2001.

42.    "HTTP Remote Variant Selection Algorithm --RVSA/1.0", K. Holtman and A. Mutz, RFC2296.

43.    Reducing WWW Latency and Bandwidth Requirements by Real-Time Distillation. A. Fox and E. Brewer. Fifth International World Wide Web Conference (Paris, May 1996).

44.    "Adaptive Delivery of HTML Contents", Y. Yang, J. Chen, and H. Zhang, 9th International World Wide Web Conference, Amsterdam, May 2000.

45.    "Network-Adaptive Control With TCP-Friendly Protocol for Multiple Video Objects", Q. Zhang, Wenwu Zhu, and Y.Q. Zhang, ICME 2000.

46.    Gardmon: A Java-based Monitoring Tool for Gardens Non-dedicated Cluster Computing. R. Buyya, B. Koshy, and R. Mudlapur. In Proceedings of Workshop on Cluster Computing Technologies, Environments, and Applications, PDPTA 99, Monte Carlo Resort, Las Vegas, Nevada, USA, 1999.

47.    The Condor Distributed Processing System. M. Livny, Dr Dobbs Journal, Feb 1995 pp 40-48.

48.    Project Ganglia: Distributed Monitoring and Execution System.

49.     PARMON: A Comprehensive Cluster Monitoring System. Rajkumar et al., Proceedings of the Fifth International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'98), Las Vegas, Nevada, USA, CSREA Press, 1998.

50.    Building a Resources Monitoring System for SMILE Beowulf Cluster. P. Uthayopas, S. Phaisithbenchapol, and K. Chongbarirux. Proceedings of the Third International Conference/Exhibition on High Performance Computing in Asia-Pacific Region (HPC ASIA'99), Singapore, 1998.

  4. Internet Computing System Basics (top^)

4.1 Performance Issues (top^)

Questions: What is the right model for supporting Internet Applications of the future?

1.    WebOS: Operating System Support for Wide Area Applications, Vahdat, Anderson, Dahlin, Bellani, Culler, Eastham, and Yoshikawa, HPDC, 1998.

2.    Operating System Directions for the Next Millennium, Bolosky, Draves, Fitzgerald, Fraser, Jones, Knoblock, and Rashid.

3.    The Architectural Design of Globe: A Wide-Area Distributed System, M. van Steen, P. Homburg, and A.S. Tanenbaum, Technical Report IR-422, Dept. of Computer Science, Vrije University, March 1997. Appeared in IEEE Concurrency.

4.    A Network Architecture for Heterogeneous Mobile Computing, E. A. Brewer, R. H. Katz et al. IEEE Personal Communications Magazine, October 1998.

5.    IO-Lite: A Unified I/O Buffering and Caching System, Peter Druschel, Vivek S. Pai and Willy Zwaenepoel. To appear in the Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI’99), New Orleans, LA, February 1999.

6.    The x-Kernel: An Architecture for Implementing Network Protocols, Hutchinson and Peterson

7.    Making Paths Explicit in the Scout Operating System, Mosberger and Peterson, OSDI, 1996.

8.    Server Operating Systems, Kaashoek, Engler, Ganger, and Wallach, European SOSP Workshop, 1996.

9.    Application Performance and Flexibility on Exokernel Systems, Kaashoek, Engler, Ganger, Briceno, Hunt, Mazieres, Pinckney, Grimm, Jannotti, Mackenzie, SOSP, 1997.


4.2 Naming Issues (top^)

Questions: How should we translate from virtual names to physical addresses? How do we find objects, services, or users on the Internet if they migrate?

1.    Locating Objects in Wide-Area Systems, M. van Steen, F.J. Hauck, P. Homburg, and A.S. Tanenbaum, IEEE Communications Magazine, January 1998.

2.   Scalable Naming in Global Middleware, G. Ballintijn, M. van Steen, A.S. Tanenbaum. Proc. 13th Int’l Conf. on Parallel and Distributed Computing Systems (PDCS-2000), Las Vegas, August 8-10, 2000.

3.   Development of the Domain Name System, P. Mockapetris and K. Dunlap, SIGCOMM 1988, Computer Communications Review, Vol. 18, No. 4, Aug. 1988, pp. 123-133. Dynamic DNS: RFC 2136 and RFC 2137.

4.  Resolution of Uniform Resource Identifiers using the Domain Name System, Ron Daniel and Michael Mealling. Internet Draft, September 1996.

5.    Engineering a Global Resolution Service, Edward Slottow, Nov. 1996.

6.    Host Anycasting Service, C. Partridge, T. Mendez, and W. Milliken, http://andrew2.andrew.cmu.edu/rfc/rfc1546.html

7.    A Directory Service for Configuring High-Performance Distributed Computations, S. Fitzgerald, I. Foster, C. Kesselman, G. von Laszewski, W. Smith, and S. Tuecke, Proc. 6th IEEE Symp. On High-Performance Distributed Computing 1997.

8.    Active Naming: Programmable Location and Transport of Wide-Area Resources, Vahdat, Anderson and Dahlin SITS??, 1997?

9.    Using Smart Clients to Build Scalable Services, Chad Yoshikawa, Brent Chun, Paul Eastham, Amin Vahdat, Thomas Anderson, and David Culler. Proceedings of USENIX ‘97, January 1997.

  5. Advanced Internet Systems (top^)

5.1 Peer to Peer Computing (top^)

1.    Analysis of the Evolution of Peer-to-Peer Systems. David Liben-Nowell, Hari Balakrishnan, and David Karger, ACM Conf. on Principles of Distributed Computing (PODC), Monterey, CA, July 2002.

2.     Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan, IEEE/ACM Transactions on Networking

3.    Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility, A. Rowstron and P. Druschel, 18th ACM SOSP’01, Lake Louise, Alberta, Canada, October 2001.

4.    SCAN: A Dynamic, Scalable, and Efficient Content Distribution Network, Yan Chen, Randy H. Katz, and John D. Kubiatowicz, in Proceedings of the International Conference on Pervasive Computing, August 2002.

5.    PeerCQ: A Decentralized and Self-Configuring Peer-to-Peer Information Monitoring System. Bugra Gedik and Ling Liu. The 23rd International Conference on Distributed Computing Systems (ICDCS 2003).

6.    Search and replication in unstructured peer-to-peer networks, Qin Lv, Pei Cao, Edith Cohen, Kai Li, and Scott Shenker. In the Proceedings of the 16th International Conference on Supercomputing, June 2002, New York, USA.

7.    Freenet: A Distributed Anonymous Information Storage and Retrieval System. Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong, in Designing Privacy Enhancing Technologies: International Workshop on Design Issues in Anonymity and Unobservability, LNCS 2009, ed. by Hannes Federrath. Springer: New York (2001).

8.    Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. A. Rowstron and P. Druschel, IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November, 2001.

9.    Anonymous Publish/Subscribe in P2P Networks, A. K. Datta, M. Gradinariu, M. Raynal and G. Simon, IPDPS 2003.

10.    SCRIBE: A large-scale and decentralised application-level multicast infrastructure, M. Castro, P. Druschel, A-M. Kermarrec and A. Rowstron, IEEE Journal on Selected Areas in Communications (JSAC) (Special issue on Network Support for Multicast Communications). 2002.

11.    Scalable application layer multicast. S. Banerjee, B. Bhattacharjee, and C. Kommareddy, In Proceedings of the 2002 ACM SIGCOMM Conference, 2002.

12.    Enabling Conferencing Applications on the Internet using an Overlay Multicast Architecture. Yang-hua Chu, Sanjay G. Rao, Srinivasan Seshan and Hui Zhang, In Proceedings of ACM SIGCOMM 2001.

13.    A Scalable Content Addressable Network. S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. In Proceedings of the ACM SIGCOMM Conference, 2001.

14.    Gnutella RFC

15.    Making Gnutella-like P2P Systems Scalable. Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker. In Proceedings of the ACM SIGCOMM, 2003.

16.    Phenix: Supporting Resilient Low-diameter Peer-to-Peer Topologies. R. H. Wouhaybi and A. T. Campbell. In Proceedings of IEEE INFOCOM 2004.

17.   Improving Search in Peer-to-Peer Systems. Beverly Yang, Hector Garcia-Molina, In Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS), 2002

18.  “Designing a Super-peer Network.” Beverly Yang, Hector Garcia-Molina, In Proceedings of the 19th International Conference on Data Engineering (ICDE), Bangalore, India, March 2003

19.  “Routing Indices For Peer-to-Peer Systems.” Arturo Crespo, Hector Garcia-Molina. Proceedings of the International Conference on Distributed Computing Systems (ICDCS). July 2002.

20.  Turning Heterogeneity into an Advantage in Overlay Routing. Z. Xu, M. Mahalingam, and M. Karlsson. In Proceedings of INFOCOM, 2003.

21.    Apoidea: A Decentralized Peer-to-Peer Architecture for Crawling the World Wide Web, A. Singh, M. Srivatsa, L. Liu and T. Miller, In the Proceedings of the SIGIR Workshop on Distributed Information Retrieval, August 2003. Also in Lecture Notes in Computer Science (LNCS) series, Springer Verlag.

22. Tracing a large-scale Peer to Peer System: an hour in the life of Gnutella, Evangelos P. Markatos, 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2002.

23. Mapping the Gnutella network. M. Ripeanu, I. Foster, and A. Iamnitchi. IEEE Internet Computing Journal, 6(1), 2002.

24. Scaling Unstructured Peer-to-Peer Networks with Heterogeneity-Aware Topology and Routing. M. Srivatsa and L. Liu. IEEE Transactions on Parallel and Distributed Systems.

25. Constructing a Proximity-aware Power Law Overlay Network. J. Zhang, L. Liu and C. Pu. In the Proceedings of IEEE Global Telecommunications Conference GLOBECOM, 2005.

26. An Analysis of Internet Content Delivery Systems. Stefan Saroiu, Krishna P. Gummadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy. Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI 2002), December 2002.

27. On the Feasibility of Peer-to-Peer Web Indexing and Search, Jinyang Li, Boon Thau Loo, Joseph M. Hellerstein, M. Frans Kaashoek, Lecture Notes in Computer Science, 2003, Issue 2735, pages 207-215.

28. P2P Content Search: Give the Web Back to the People, Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum, Christian Zimmer, Proceedings of the 5th International Workshop on Peer-to-Peer Systems, 2006

29. MINERVA∞: A scalable efficient peer-to-peer search engine, S. Michel, P. Triantafillou, G. Weikum, Lecture Notes in Computer Science, 2005.

30. Routing indices for peer-to-peer systems, A Crespo, H Garcia-Molina, In Proceedings of Distributed Computing Systems, 2002.    

31. An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, S.A. Baset and H.G. Schulzrinne, Infocom, April 2006.

32. An Experimental Study of the Skype Peer-to-Peer VoIP System, Saikat Guha, Neil Daswani, Ravi Jain, IPTPS, February 2006.

33. Characterizing and Detecting Skype-Relayed Traffic, K. Suh, D. R. Figueiredo, J. Kurose, D. Towsley, Infocom, April 2006.

34. Rarest First and Choke Algorithms are Enough, Arnaud Legout, G. Urvoy-Keller, P. Michiardi, IMC 2006

35. The BitTorrent P2P File-sharing System: Measurements and Analysis, J.A. Pouwelse, P. Garbacki, D.H.J. Epema, H.J. Sips, IPTPS, February 2005.

36. Incentives Build Robustness in BitTorrent, Bram Cohen, First Workshop on Economics of Peer-to-peer Systems, June 2003.

37. Dynamics of Bidding in a P2P Lending Service: Effects of Herding and Predicting Loan Success, S. Ceyhan, X. Shi, J. Leskovec, WWW 2011

39. Resilience of Dynamic Overlays through Local Interactions, S. Ferretti, WWW 2013 

Some P2P Links

o    Napster

o    Gnutella (GNU implementation Phex, Jtella)

o    FreeNet

o    Free Haven

o    SETI@Home

o    distributed.net

o    OpenDHT

o    JXTA

o    O’Reilly Open P2P


5.2 Mobile Computing (top^)

1. Fundamental Challenges in Mobile Computing, Satyanarayanan, M., Fifteenth ACM Symposium on Principles of Distributed Computing, May 1996, Philadelphia, PA. Revised version appeared as: “Mobile Computing: Where’s the Tofu?”, Proceedings of the ACM Sigmobile, April 1997, Vol. 1, No. 1.

2. “Multi-Fidelity Algorithms for Interactive Mobile Applications”, Satyanarayanan, M., Narayanan, D. Proceedings of the 3rd International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications, August 1999, Seattle, WA

3. “Mobile Data Access”, Noble, B., School of Computer Science, Carnegie Mellon University, May 1998, CMU-CS-98-118

4.    Energy-aware adaptation for mobile applications, Flinn J., Satyanarayanan, M., Proceedings of the 17th ACM Symposium on Operating Systems Principles, December, 1999, Kiawah Island Resort, SC.

5.  “PowerScope: A Tool for Profiling the Energy Usage of Mobile Applications”, Flinn J., Satyanarayanan, M., Proceedings of the Second IEEE Workshop on Mobile Computing Systems and Applications, February, 1999, New Orleans, LA  

6. System Support for Mobile, Adaptive Applications, Noble, Brian, IEEE Personal Communications, Vol. 7, No. 1, February, 2000

7.    Experience with adaptive mobile applications in Odyssey, Noble, B.D. and Satyanarayanan, M., Mobile Networks and Applications, Vol. 4, 1999

8.    Agile Application-Aware Adaptation for Mobility, Noble, B., Satyanarayanan, M., Narayanan, D., Tilton, J.E., Flinn, J., Walker, K. Proceedings of the 16th ACM Symposium on Operating System Principles, October 1997, St. Malo, France

9.    A Research Status Report on Adaptation for Mobile Data Access, Noble, B., Satyanarayanan, M. SIGMOD Record, Vol. 24, No. 4, December 1995

10.    A Programming Interface for Application-Aware Adaptation in Mobile Computing, Noble, B., Price, M., Satyanarayanan, M., Proceedings of the Second USENIX Symposium on Mobile & Location-Independent Computing, Apr. 1995, Ann Arbor, MI

11.    Application-Aware Adaptation for Mobile Computing, Satyanarayanan, M., Noble, B., Kumar, P., Price, M. Proceedings of the 6th ACM SIGOPS European Workshop, Sep. 1994, Dagstuhl, Germany.

12.    Mobile Information Access, Satyanarayanan, M., IEEE Personal Communications, Vol. 3, No. 1, February 1996

13.    Indexing Techniques for Power Management in Multi-Attribute Data Broadcast, Qinglong Hu, Wang-Chien Lee, and Dik Lun Lee.

14.    Power Conserving and Access Efficient Indexes for Wireless Computing, Dik Lun Lee and Qinglong Hu.

15.    Power Conservative Multi-Attribute Queries on Data Broadcast, Qinglong Hu, Wang-Chien Lee, and Dik Lun Lee, ICDE 2000.

16.    Effects of power conservation, wireless coverage and cooperation on data dissemination among mobile devices, Maria Papadopouli and Henning Schulzrinne, ACM SIGMOBILE Symposium on Mobile Ad Hoc Networking & Computing (MobiHoc) 2001, October 4-5, 2001, Long Beach, California. (Extension of the Sarnoff paper.)

17.    Energy-aware Web Caching for Mobile Terminals. Francoise Sailhan, Valerie Issarny. In Proceedings of the ICDCS Workshop on Web Caching Systems. July 2002, Vienna, Austria.

18.    Power-Controlled Data Prefetching/Caching in Wireless Packet Networks, Savvas Gitzenis and Nicholas Bambos, IEEE Infocom 2002, New York.

19.    Sleepers and Workaholics: Caching Strategies in Mobile Environments. Daniel Barbara, Tomasz Imielinski, VLDB Journal 4(4): 567-602 (1995).

20.    Indexing techniques for data broadcast on wireless channels. D.L. Lee, Q. Hu, and W. C. Lee, Proceedings of the Fifth International Conference on Foundations of Data Organization (FODO ‘98), Kobe, Japan, Nov 11-12, 1998, 175-182.

21.    Indexing Techniques for Wireless Data Broadcast Under Data Clustering and Scheduling, Qinglong Hu, Wang-Chien Lee, and Dik Lun Lee, in Proceedings of ACM International Conference on Information and Knowledge Management (CIKM99), Kansas City, Missouri, Nov. 1999, pp. 351-358.

22.    MobiEyes: Distributed Processing of Continuously Moving Queries on Moving Objects in a Mobile System, Bugra Gedik, Ling Liu. The 9th International Conference on Extending DataBase Technology (EDBT 2004); or Bugra Gedik and Ling Liu. MobiEyes: A Distributed Location Monitoring Service Using Moving Location Queries. IEEE Transactions on Mobile Computing. Vol. 5, No. 10, October 2006. pp. 1-19.

23. Processing Moving Queries over Moving Objects Using Motion Adaptive Indexes, Bugra Gedik, Kun-Lung Wu, Philip Yu, and Ling Liu, To appear in IEEE Transactions on Knowledge and Data Engineering (TKDE).

24. Bugra Gedik, Ling Liu, Kun-Lung Wu, Philip S. Yu. “Lira: Lightweight, Region-aware Load Shedding in Mobile CQ Systems”. IEEE 23rd International Conference on Data Engineering. Istanbul, Turkey; April 17-20, 2007.

25. Bugra Gedik, Kun-Lung Wu, Philip S. Yu, and Ling Liu. “GrubJoin: An Adaptive Multi-Way Windowed Stream Join with Time Correlation-Aware CPU Load Shedding”. IEEE 23rd International Conference on Data Engineering. Istanbul, Turkey; April 17-20, 2007.

26. Jianjun Zhang, Gong Zhang, Ling Liu. “GeoGrid: A Scalable Location Service Network”. Proceedings of 27th IEEE International Conference on Distributed Computing Systems (ICDCS 2007).

27. Kipp Jones and Ling Liu. “Improving Wireless Positioning with Look-ahead Map-Matching”, to appear in Proceedings of the 4th Annual International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (Mobiquitous 2007). August 6-10, 2007, Philadelphia, PA.

28. Anand Murugappan and Ling Liu. “A SpatioTemporal Placement Model for Caching Location Dependent Queries”, Proceedings of the 4th Annual International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (Mobiquitous 2007). August 6-10, 2007, Philadelphia, PA.

29. Mudhakar Srivatsa, Arun Iyengar, Jian Yin and Ling Liu. “A Scalable Access Control in Location-based Broadcast Services”, Proceedings of The 27th IEEE International Conference on Computer Communications (INFOCOM 2008), to be held in Phoenix, Arizona.

30. Peter Pesti, Ling Liu, Bhuvan Bamba, Arun Iyengar, Matt Weber. “RoadTrack: Scaling Location Updates for Mobiles on Road Networks with Query Awareness”, Proceedings of the 36th International Conference on Very Large Data Bases, Singapore, Sept 13-17, 2010.

31. Bhuvan Bamba, Ling Liu, Philip S. Yu, Gong Zhang and Myungcheol Doo. “Scalable Processing of Spatial Alarms”, Proceedings of the 15th Annual IEEE International Conference on High Performance Computing (HiPC 2008), Bangalore, India, December 17-20, 2008.

32. Bhuvan Bamba, Ling Liu, Philip Yu, Arun Iyengar. “Distributed Processing of Spatial Alarms: A Safe Region-based Approach”, Proceedings of IEEE Int. Conf. on Distributed Computing (ICDCS 2009), Montreal, Canada, June 22-26, 2009.

33. A. Murugappan and L. Liu. “Energy-efficient processing of spatial alarms on mobile clients”, In Proc. ICSDE, 2008.

35. ProfileDroid: Multi-layer Profiling of Android Applications, X. Wei et al, MobiCom, 2012

36. Crowdsourcing to Smartphones: Incentive Mechanism Design for Mobile Phone Sensing, D Yang et al, MobiCom 2012

37. Locating in Fingerprint Space: Wireless Indoor Localization with Little Human Intervention, Z. Yang et al, MobiCom 2012


5.3 Sensor, Stream, and Continual Query (top^)

1.    Continuous Queries over Data Streams. S. Babu and J. Widom. In SIGMOD Record, September 2001.

2.    Towards Sensor Database Systems. Philippe Bonnet, J. E. Gehrke, and Praveen Seshadri. In Proceedings of the Second International Conference on Mobile Data Management. Hong Kong, January 2001. 

3.    Querying the Physical World. Philippe Bonnet, J. E. Gehrke, and Praveen Seshadri. IEEE Personal Communications, Vol. 7, No. 5, October 2000, pages 10-15. Special Issue on Smart Spaces and Environments.

4.    Fjording the Stream: An Architecture for Queries over Streaming Sensor Data, Sam Madden and Michael J. Franklin, ICDE Conference, February, 2002, San Jose.

5.    Streaming Queries over Streaming Data, Sirish Chandrasekaran, Michael J. Franklin, VLDB Conference, August 2002, Hong Kong.

6.    Monitoring Streams: A New Class of Data Management Applications. D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, S. Zdonik. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB’02), August 20-23, Hong Kong, China.

7.   Quality-Aware Distributed Data Delivery for Continuous Query Services, by Bugra Gedik and Ling Liu. ACM 2006 International Conference on Management of Data (ACM SIGMOD), Chicago, June 26-29, 2006. AR: 13% (58 out of 446).

8.   GrubJoin: An Adaptive Multi-Way Windowed Stream Join with Time Correlation-Aware CPU Load Shedding. By Bugra Gedik, Kun-Lung Wu, Philip S. Yu, and Ling Liu. IEEE 23rd International Conference on Data Engineering. Istanbul, Turkey; April 17-20, 2007.

9.   CPU Load Shedding for Binary Stream Joins. Springer Knowledge and Information Systems. By Bugra Gedik, Kun-Lung Wu, Philip S. Yu, and Ling Liu. DOI 10.1007/s10115-006-0044-4, September 21, 2006. (will appear in print, ISSN 0219-1377)

10.  CACM Special Issue on Sensor Networks, June 2004 (Online available at ACM digital library, use GT library website to access it)

11. Bugra Gedik, Ling Liu, Philip Yu. ASAP: An Adaptive Sampling Approach to Data Collection in Sensor Networks. IEEE Transactions on Parallel and Distributed Systems, 2007.


5.4 RFID (top^)

1.      Security and Privacy Issues in ePassport, Ari Juels, David Molnar and David Wagner, In Proceedings of Advances in Cryptology, 2005.

2.      Privacy and Security in Library RFID: Issues, Practices, and Architectures, David Molnar and David Wagner, In Proceedings of ACM CCS, 2004.

3.      High Power Proxies for Enhancing RFID Privacy and Utility, In Proceedings of PET, 2005.

4.      RFID Security and Privacy: A Research Survey, Ari Juels, IEEE Journal on Selected Areas in Communications, 2006.


5.5  Geo-Location Based Services and Applications (top^)

1.      The Cricket location support system. N. Priyantha, A. Chakraborty and Hari Balakrishnan. ACM MobiCom 2000.

2.      The active badge location system. Roy Want, Andy Hopper, Veronica Falco and Jonathan Gibbons, ACM transactions on Information Systems (TOIS), 10(1), 1992.

3.      "Geographic Location Tags on Digital Images." Toyama, Kentaro, et al., International Multimedia Conference, Berkeley, CA: 156 - 166, 2003.


4.      Automatic Organization for Digital Photographs with Geographic Coordinates. Naaman, Mor, et al., International Conference on Digital Libraries, Tucson, AZ: 53-62, June 2004


5.      "Hybrid spatio-temporal structuring and browsing of an image collection acquired from a personal camera phone." Pigeau, A. and M. Gelgon, International Symposium on Image and Video Communications over Fixed and Mobile Networks XXII: 53-58, 2004.


6.      "Sharing, Discovering and Browsing Geotagged Pictures on the Web." Torniai, Carlo, Steve Battle, and Steve Cayzer, hpl.hp.com, May 2007.


7.      "The Big Picture: Exploring Cities through Georeferenced Images and RDF Shared Metadata". Torniai, Carlo, Steve Battle, and Steve Cayzer. CHI Conference, 2007


8.      "Project Lachesis: Parsing and Modeling Location Histories". Hariharan, Ramaswamy and Kentaro Toyama. GIScience, 2004.


9.       Context data in geo-referenced digital photo collections. Naaman, Mor, et al. International Multimedia Conference, New York, NY: 196-203, October 2004.


11.  "Eyes on the World." Invisible Computing: Naaman, Mor, et al. 108-111, October 2006.


12.  Flickr.


13.  Picasa.


14.  Evermore GT-800BT Bluetooth GPS EverPhoto.


5.6 Spatial Indexing and Spatial Mining (top^)

1.      R-trees: a dynamic index structure for spatial searching. Antonin Guttman , Proceedings of the 1984 ACM SIGMOD international conference on Management of data, June 18-21, 1984, Boston, Massachusetts

2.      Indexing the positions of continuously moving objects. S. Saltenis, C. S. Jensen, S. T. Leutenegger, and M. A. Lopez. In SIGMOD '00: Proceedings of the 2000 ACM SIGMOD international conference on Management of data, pages 331-342, New York, NY, USA, 2000. ACM Press.

3.      Voronoi Diagram, Franz Aurenhammer, Rolf Klein

4.      Spatial Databases: Accomplishments and Research Needs, S. Shekhar, S. Chawla, S. Ravada, A. Fetterer, X. Liu and C.T. Lu, IEEE Transactions on Knowledge and Data Engineering, Jan.-Feb. 1999.

5.     Discovering Spatial Co-location Patterns: A Summary of Results, S. Shekhar and Y. Huang, In Proc. of 7th International Symposium on Spatial and Temporal Databases (SSTD01), July 2001.

6.      Detecting Graph-based Spatial Outliers: Algorithms and Applications, S. Shekhar, C.T. Lu, P. Zhang, the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001.

7.      Extending Data Mining for Spatial Applications: A Case Study in Predicting Nest Locations, S. Chawla, S. Shekhar, W. Wu and U. Ozesmi, Proc. of the 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2000), Dallas, TX, May 14, 2000.

8.      Modeling Spatial Dependencies for Mining Geospatial Data, S. Chawla, S. Shekhar, W. Wu and U. Ozesmi, First SIAM International Conference on Data Mining, 2001.

9.      Spatial Contextual Classification and Prediction Models for Mining Geospatial Data, S. Shekhar, P.R. Schrater, R. R. Vatsavai, W. Wu, and S. Chawla, IEEE Transactions on Multimedia, 2001.

10.   The Quadtree and Related Hierarchical Data Structures. Finkel and Bentley, ACM Comput. Surv.1974

11.  An introductory tutorial on kd-trees, A. Moore

12.  Building of Trapezoidal Map from a set of non-intersecting lines, Jukka Kaartinen

13.  Spatial data structures for version management of engineering drawings in CAD database. Y. Nakamura and H. Dekihara. In ICIAP '03: Proceedings of the 12th International Conference on Image Analysis and Processing, page 219, Washington, DC, USA, 2003. IEEE Computer Society.

14. S. Prabhakar, Y. Xia, D. V. Kalashnikov, W. G. Aref, and S. E. Hambrusch. "Query indexing and velocity constrained indexing: Scalable techniques for continuous queries on moving objects." In IEEE Transactions on Computers, 2002.

15. B. Seeger and H. P. Kriegel. "Techniques for design and implementation of efficient spatial access methods." In VLDB, 1988.

6.      Big Data Analysis (top^)

6.1 Big Data Analysis (top^)

1.       Seeking Stable Clusters in the Blogosphere. Bansal, F. Chiang, N. Koudas, and F. Wm. Tompa. VLDB 2007.

2.      Improved Annotation of the Blogosphere via Autotagging and Hierarchical Clustering. C. Brooks and N. Montanez. WWW 2006.

3.      J. Zhang, M. Ackerman, and L. Adamic. Expertise Networks in Online  Communities: Structure and Algorithms. WWW 2007.

4.      L. Backstrom et al. Group Formation in Large Social Networks: Membership, Growth, and Evolution. KDD 2006.

5.      X. Wu, L. Zhang, and Y. Yu. Exploring Social Annotations for the Semantic Web. WWW 2006.

6.      DeRose, W. Shen, F. Chen, A. Doan, R. Ramakrishnan. Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach. VLDB 2007.

7.      S. Abiteboul and N. Polyzotis. The Data Ring: Community Content Sharing. CIDR 2007.

8.      M. Dubinko et al. Visualizing Tags Over Time. WWW 2006.

9.      K. Lawrence and M.C. Shraefel. Bringing Communities to the Semantic Web and the Semantic Web to Communities. WWW 2006.

10.  Mei, C. Liu, and H. Su. A Probabilistic Approach to Spatiotemporal Theme Pattern Mining on Weblogs. WWW 2006.

11.  B. Aleman-Meza et al. Semantic Analytics on Social Networks: Experiences in Addressing the Problem of Conflict of Interest Detection. WWW 2006

12.  Y. Matsuo, J. Mori, and M. Hamasaki. POLYPHONET: An Advanced Social  Network Extraction System from the Web. WWW 2006.

13.  Li et al. Towards Effective Browsing of Large Scale Social Annotations. WWW 2007.

14.  X. Ni et al. Exploring in the Weblog Space by Detecting Informative and Affective Articles. WWW 2007.

15.  Q. Mei et al. Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs. WWW 2007.

16.  H. Halpin, V. Robu, and H. Shepherd. The Complex Dynamics of Collaborative Tagging. WWW 2007.

17.  P.-A. Chirita et al. P-TAG: Large Scale Automatic Generation of Personalized Annotation TAGs for the Web. WWW 2007.

18.  S. Bao et al. Optimizing Web Search Using Social Annotations. WWW  2007.

19.  L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore Art Thou R3579X?  Anonymized Social Networks, Hidden Patterns, and Structural  Steganography. WWW 2007.

20.  Chi et al. Structural and temporal analysis of the blogosphere through community factorization. KDD 2007.

21.  Tantipathananandh et al. A Framework For Community Identification in  Dynamic Social Networks. KDD 2007.

22.  Y. Liu et al. ARSA: A Sentiment-Aware Model for Predicting Sales  Performance Using Blogs. SIGIR 2007.

23.  Golder and B. Huberman. The Structure of Collaborative Tagging  Systems.

24.  J. Freyne et al. Collecting Community Wisdom: Integrating Social  Search and Social Navigation. IUI 2007.

25.  A. Sahuguet, R. Hull, D. F. Lieuwen, and M. Xiong. Enter once, share everywhere: User profile management in converged networks. International Conference on Innovative Data Systems Research, 2003.

26.  James Caverlee and Ling Liu. Tamper Resilient Trust Establishment in Online Social Networks. Technical Report, Georgia Institute of Technology, School of Computer Science. July 2007.

27.  Aleksandra Korolova, Rajeev Motwani, and Shubha U. Nabar. Link Privacy in Social Networks. CIKM 2008, Napa Valley, California, USA.

28.  Jure Leskovec, Daniel Huttenlocher, Jon Kleinberg. Predicting Positive and Negative Links in Online Social Networks. WWW 2010.

29.  Jure Leskovec, Kevin Lang, Michael Mahoney. Empirical Comparison of Algorithms for Network Community Detection. WWW 2010.

30.  Jennifer Neville, Timothy La Fond. Randomization Tests for Distinguishing Social Influence and Homophily Effects. WWW 2010.

31.  Yue Lu, Panayiotis Tsaparas, Alex Ntoulas, Livia Polanyi. Exploiting Social Context for Review Quality Prediction. WWW 2010.

32.  Arun Maiya, Tanya Berger-Wolf. Sampling Community Structure. WWW 2010.

33.  Kristina Lerman, Tad Hogg. Using a Model of Social Dynamics to Predict Popularity of News. WWW 2010

34.  Alessandra Sala, Lili Cao, Christo Wilson, Robert Zablit, Haitao Zheng, Ben Zhao. Measurement-calibrated Graph Models for Social Network Experiments. WWW 2010.

35.  Yu Zheng, Lizhu Zhang, Zhengxin Ma, Xing Xie, Wei-Ying Ma. Recommending friends and locations based on individual location history, February 2011, ACM Transactions on the Web (TWEB), Volume 5 Issue 1

36.  Yu Zheng, Xing Xie. Learning travel recommendations from user-generated GPS traces, January 2011 ACM Transactions on Intelligent Systems and Technology (TIST), Volume 2 Issue 1

37.  D. Crandall, L. Backstrom, D. Huttenlocher, J. Kleinberg. Mapping the world's photos, WWW 2009.

38.  T. Zhou, Z. Kuscsik, J.-G. Liu, M. Medo, J. R. Wakeling, and Y.-C. Zhang. Solving the apparent diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of Sciences 107, 10, 4511-4515.

39.  A. Anagnostopoulos, R. Kumar, and M. Mahdian. Influence and correlation in social networks. In KDD, 2008.

40.  I. Anger and C. Kittl. Measuring influence on twitter. In Proceedings of International Conference on Knowledge Management and Knowledge Technologies, 2011.

41.  H. Bao and E. Y. Chang. Adheat: an influence-based diffusion model for propagating hints to match ads. In Proceedings of WWW, 2010.

42.  F. Bass. A new product growth for model consumer durables. Management Science, 1969.

43.  D. Boyd and J. Heer. Profiles as conversation: Networked identity performance on friendster. In Proceedings of HICSS, 2006.

44.  P. Domingos and M. Richardson. Mining the network value of customers. In Proceedings of SIGKDD, 2001.

45.  N. Eagle, A. Pentland, and D. Lazer. Inferring friendship network structure using mobile phone data. PNAS, 106(36):15274-15278, 2009.

46.  M. de Gemmis, P. Lops, G. Semeraro, and P. Basile. Integrating tags in a semantic content-based recommender. In RecSys, 2008.

47.  M. Goetz, J. Leskovec, M. McGlohon, and C. Faloutsos. Modeling blog dynamics. In ICWSM, 2009.

48.  J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters, 2001.

49.  M. Gomez-Rodriguez, J. Leskovec, and A. Krause. Inferring networks of diffusion and influence. In KDD, 2010.

50.  Z. Guan, C. Wang, J. Bu, C. Chen, K. Yang, D. Cai, and X. He. Document recommendation in social tagging services. In WWW, 2010.

51.  M. Gladwell. The tipping point: How little things can make a big difference. Little Brown, 2000.

52.  H. Kelman. Compliance, identification, and internalization: Three processes of attitude change. Journal of Conflict Resolution, 1958.

53.  C. A. Hidalgo and C. Rodriguez-Sickert. The dynamics of a mobile phone network. Physica A: Statistical Mechanics and its Applications, 387(12):3017 - 3024, 2008.

54.  D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In ACM SIGKDD, 2003.

55.  H. Ma, H. Yang, M. R. Lyu, and I. King. Mining social networks using heat diffusion processes for marketing candidates selection. In Proceeding of ACM CIKM, 2008.

56.  P. Massa and P. Avesani. Trust metrics in recommender systems. In Computing with Social Trust, 2009.

57.  M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In Proceedings of SIGKDD, 2002.

58.  M. M. Skeels and J. Grudin. When social networks cross boundaries: a case study of workplace use of facebook and linkedin. In Proceedings of ACM GROUP, 2009.

59.  P. Singla and M. Richardson. Yes, there is a correlation: - from social networks to personal behavior on the web. In WWW, 2008.

60.  J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in large-scale networks. In KDD, 2009.

61.  R. Xiang, J. Neville, and M. Rogati. Modeling relationship strength in online social networks. In WWW, 2010.

62.  E. Zheleva, H. Sharara, and L. Getoor. Co-evolution of social and affiliation networks. In KDD, 2009.

63.  M. Cha, H. Haddadi, F. Benevenuto, K. P. Gummadi. Measuring user influence in Twitter: The million follower fallacy. In ICWSM 2010.

64.  Danny Oosterveer. Influencing and measuring word of mouth on Twitter. Master Thesis, August 2011.

65.  Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis. Show me the money!: deriving the pricing power of product features by mining consumer reviews. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, San Jose, California, USA, August 12 - 15, 2007.


6.2  Collaborative Filtering and Content based Recommendation

1.      Empirical Analysis of Predictive Algorithms for Collaborative Filtering. John S. Breese, David Heckerman, and Carl Kadie, Technical Report 1998.

2.      Collaborative Filtering by Personality Diagnosis: A Hybrid Memory- and Model-Based Approach. David Pennock, Eric Horvitz, Steve Lawrence, and C. Lee Giles, Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, UAI 2000.

3.      Memory-Based Weighted-Majority Prediction For Recommender Systems. Joaquin Delgado and Naohiro Ishii, In Proceedings of the ACM SIGIR’99 Workshop on Recommender Systems: Algorithms and Evaluation, UC Berkeley, CA, USA, August 1999.

4.      Clustering for collaborative filtering applications. Arnd Kohrs and Bernard Merialdo, In Proceedings of CIMCA’99. IOS Press, 1999.

5.      Voting Systems with Trust Mechanisms in Cyberspace: Vulnerabilities and Defenses. Qinyuan Feng, Yan Lindsay Sun, Ling Liu, Yafei Yang, Yafei Dai, IEEE Transactions on Knowledge and Data Engineering, 2010.

6.      Enhancing Personalized Ranking Quality through Multidimensional Modeling of Inter-item Competition. Qinyuan Feng, Ling Liu, Yan Sun, Ting Yu, Yafei Dai, In Proceedings of 2010 International Conference on Collaborative Computing (CollaborateCom 2010), Chicago, Illinois, USA, October 9-12, 2010.

7.      Item-based Collaborative Filtering Recommendation Algorithms. Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. In WWW 2001.

8.      Large-scale Parallel Collaborative Filtering for the Netflix Prize

9.      Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights. In ICDM 2007.

10.  Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model. In KDD 2008.

11.  Collaborative Filtering via Concept Decomposition on the Netflix Dataset

12.  Binary principal component analysis in the netflix collaborative filtering task

13.  Progress Report: Collaborative Filtering Using Bregman Co-clustering

14.  Incremental Matrix Factorization for Collaborative Filtering

15.  Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian. Influence and correlation in social networks. Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining 2008, Las Vegas, Nevada, USA

16.  Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis. Show me the money!: deriving the pricing power of product features by mining consumer reviews. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, San Jose, California, USA, August 12 – 15, 2007.


6.3 Association Mining

1.      R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules, In Proc. 20th Int. Conf. on Very Large Databases (VLDB 1994, Santiago de Chile), pp487-499 Morgan Kaufmann, San Mateo, CA, USA 1994

2.      J. Han, Y. Fu, Discovery of multiple-level association rules from large databases, in: Proc. 21st Int. Conf. on Very Large Data Bases, Zurich, Switzerland, pp. 420-431, 1995.

3.      R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo. Fast Discovery of Association Rules, In: Advances in Knowledge Discovery and Data Mining, 307-328, AAAI Press / MIT Press, Cambridge, MA, USA, 1996.

4.      B. Liu, W. Hsu, Y. Ma, Mining association rules with multiple minimum supports, in: Proc. 1999 Int. Conf. on Knowledge Discovery and Data Mining, San Diego, CA, 1999, pp. 337-341.

5.      Christian Borgelt, Efficient Implementations of Apriori and Eclat, Workshop of Frequent Item Set Mining Implementations (FIMI 2003, Melbourne, FL, USA).

6.      M. J. Zaki, C.J. Hsiao, Efficient Algorithms for Mining Closed Itemsets and Their Lattice Structure, IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No 4, April 2005, pp. 462-478, 2005.

7.      M.C. Tseng, W.Y. Lin, Efficient mining of generalized association rules with non-uniform minimum support, Data & Knowledge Engineering 62, ScienceDirect, pp. 41-64, 2007.

8.      Aggarwal, C.C. and Yu, P.S. 2001, A New Approach to Online Generation of Association Rules, IEEE Transactions on Knowledge and Data Engineering, Volume 13, No 4, pp. 527-540.

9.      Ioannis N. Kouris, Christos H. Makris, Athanasios K. Tsakalidis. An Improved Algorithm for Mining Association Rules Using Multiple Support Values FLAIRS 2003

10.  Katsumi Takahashi, Iko Pramudiono, Masaru Kitsuregawa, Geo-word centric association rule mining Proceedings of the 6th International conference on Mobile data management MDM '05

11.  Xiaoyun Wu, and Rohini Srihari. Incorporating Prior Knowledge with Weighted Margin Support Vector Machines, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, 2004

12.  William DuMouchel and Daryl Pregibon. Empirical Bayes Screening for Multi-Item Associations in Massive Datasets, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining 2003

13.  Dino Pedreschi, Salvatore Ruggieri, Franco Turini. Discrimination-aware data mining. Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining 2008, Las Vegas, Nevada, USA

14.  Raymond K. Pon, Alfonso F. Cardenas, David Buttler, Terence Critchlow. Tracking multiple topics for finding interesting articles. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. 2007, San Jose, California, USA August 12 - 15, 20


6.4 Data Clustering

1.      A. K. Jain, M. N. Murty and P. J. Flynn, Data Clustering: A Review, ACM Computing Surveys, 1999.

2.      Euripides G.M. Petrakis and Christos Faloutsos, Similarity Searching in Medical Image Databases, IEEE Transactions on Knowledge and Data Engineering, Volume 9, No. 3, May/June 1997.

3.      Hinneburg A., Keim D.A. An Efficient Approach to Clustering in Large Multimedia Databases with Noise Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1998.

4.      Guojun Gan, Chaoqun Ma, Jianhong Wu Data Clustering: Theory, Algorithms, and Applications

5.      Keke Chen and Ling Liu. iVIBRATE: Interactive Visualization Based Framework for Clustering Large Datasets. ACM Transactions on Information Systems.

6.      S. Guha, R. Rastogi, and K. Shim, CURE: An efficient clustering algorithm for large databases In Proceedings of ACM SIGMOD International Conference on Management of Data, 1998

7.      Tian Zhang, Raghu Ramakrishnan, and Miron Livny, BIRCH: An Efficient Data Clustering Method for Very Large Databases. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 103-114.

8.      H. Zha and X. He and C. Ding and M. Gu and H. Simon. Bipartite Graph Partitioning and Data Clustering. Proc. of ACM 10th Int'l Conf. Information and Knowledge Management, pp. 25-31, 2001.

9.      A. Patrikainen and H. Mannila. Subspace clustering of high-dimensional binary data - A probabilistic approach. Proc. Workshop on Clustering High Dimensional Data in SIAM International Conference on Data Mining, 2004.

10.  Co-clustering documents and words using bipartite spectral graph partitioning. I.S. Dhillon. Knowledge Discovery and Data Mining, pp. 269--274, 2001.

11.  Frey, B. J. & Dueck, D. Clustering by Passing Messages Between Data Points. Science, 2007, 315, 972-976

12.  Reynold Cheng, Michael Chau, Ben Kao, Jackey Ng. Uncertain Data Mining: An Example in Clustering Location Data. PAKDD 2006

13.  Fosca Giannotti, Mirco Nanni, Fabio Pinelli, Dino Pedreschi, Trajectory pattern mining Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining KDD '07

14.  Jain, Anil K., M. Narasimha Murty, and Patrick J. Flynn. "Data clustering: a review." ACM computing surveys (CSUR) 31, no. 3 (1999): 264-323.

15.  Xu, Rui, and Donald Wunsch. "Survey of clustering algorithms." Neural Networks, IEEE Transactions on 16, no. 3 (2005): 645-678.


6.5 Classification based Machine Learning

1.      Xiaoyun Wu, and Rohini Srihari. Incorporating Prior Knowledge with Weighted Margin Support Vector Machines, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, 2004

2.      William DuMouchel and Daryl Pregibon. Empirical Bayes Screening for Multi-Item Associations in Massive Datasets Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining 2003

3.      Charles Elkan, Keith Noto. Learning classifiers from only positive and unlabeled data. Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining 2008, Las Vegas, Nevada, USA

4.      Venkatesh Ganti, Arnd C. Konig, Rares Vernica. Entity categorization over large document collections. Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining 2008, Las Vegas, Nevada, USA

5.      Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, Michael W. Mahoney.  Feature selection methods for text classification. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. 2007, San Jose, California, USA August 12 - 15, 2007

6.      Ping Luo, Hui Xiong, Kevin Lu, Zhongzhi Shi. Distributed classification in peer-to-peer networks. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. 2007, San Jose, California, USA August 12 - 15, 2007

7.      Safavian, S. Rasoul, and David Landgrebe. "A survey of decision tree classifier methodology." Systems, Man and Cybernetics, IEEE Transactions on 21, no. 3 (1991): 660-674.

7. Cloud Computing

    1. Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung. The Google file system. In 19th ACM Symposium on Operating Systems Principles, Lake George, NY, October, 2003.
    2. Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of OSDI 2004, San Francisco, CA, 2004.
    3. Fay Chang et al. Bigtable: A Distributed Storage System for Structured Data. In Proceedings of OSDI 2006, Seattle, WA, 2006.
    4. http://hadoop.apache.org/
    5. Frank Schmuck, Roger Haskin. GPFS: A Shared-Disk File System for Large Computing Clusters. In Proceedings of the 2002 Conference on File and Storage Technologies (FAST)
    6. Giuseppe DeCandia et al. Dynamo: Amazon's Highly Available Key-value Store. In SOSP '07
    7. Chandramohan A. Thekkath, Timothy Mann, Edward K. Lee. Frangipani: A Scalable Distributed File System. In Proceedings of the 16th ACM Symposium on Operating Systems Principles, 1997
    8. Mike Burrows. The Chubby lock service for loosely-coupled distributed systems. In 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006.
    9. Marcos K. Aguilera, Arif Merchant, Mehul Shah. Sinfonia: A New Paradigm for Building Scalable Distributed Systems. In SOSP 2007
    10. Philip H. Carns, Walter B. Ligon III, Robert B. Ross, Rajeev Thakur, PVFS: A Parallel File System for Linux Clusters. In Proceedings of the 4th Annual Linux Showcase and Conference.
    11. Red Hat Company. Red Hat Global File System
    12. Andrew Pavlo et al. A Comparison of Approaches to Large-Scale Data Analysis. SIGMOD'09.
    13. Cheng-Tao Chu et al. Map-Reduce for Machine Learning on Multicore. NIPS'06.
    14. Azza Abouzeid et al. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. In Proceedings of VLDB, 2009.
    15. Kamil Bajda-Pawlikowski, Daniel J. Abadi, Avi Silberschatz, and Erik Paulson. Efficient Processing of Data Warehousing Queries in a Split Execution Environment. SIGMOD 2011.
    16. Alper Okcan and Mirek Riedewald. Processing Theta-Joins using MapReduce. SIGMOD 2011.
    17. E. Friedman, P. Pawlowski, and J. Cieslewicz. SQL/MapReduce: a practical approach to self-describing, polymorphic, and parallelizable user-defined functions. PVLDB 2009.
    18. C. Yang, C. Yen, C. Tan, and S. Madden. Osprey: Implementing MapReduce-Style Fault Tolerance in a Shared-Nothing Distributed Database. In ICDE '10, 2010.
    19. H.-c. Yang, A. Dasdan, R.-L. Hsiao, and D. S. Parker. Map-reduce-merge: simplified relational data processing on large clusters. In Proc. of SIGMOD, 2007
    20. Hive: A Petabyte Scale Data Warehouse Using Hadoop. In ICDE, 2010.
    21. S. Blanas, J. M. Patel, V. Ercegovac, J. Rao, E. J. Shekita, and Y. Tian. A comparison of join algorithms for log processing in MapReduce. In Proc. of SIGMOD 2010.
    22. R. Vernica, M. J. Carey, and C. Li. Efficient parallel set-similarity joins using mapreduce. In SIGMOD 2010.
    23. F. N. Afrati and J. D. Ullman. Optimizing joins in a map-reduce environment. In EDBT, 2010.
    24. Balaji Palanisamy, Aameek Singh, Ling Liu, Bhushan Jain. Purlieus: Locality-aware Resource Allocation for MapReduce in a Cloud, ACM/IEEE International Conference on SuperComputing (SC2011), Seattle WA, Nov. 12-18, 2011.
    25. Shicong Meng, Ling Liu, and Ting Wang. State Monitoring in Cloud Datacenters, IEEE Transactions on Knowledge and Data Engineering, Special Issue on Cloud Data Management. VOL. 23, NO. 9, SEPTEMBER 2011.
    26. Yiduo Mei, Ling Liu, Xing Pu, Sankaran Sivathanu. Performance Analysis of Network I/O Workloads in Virtualized Data Centers, IEEE Transactions on Service Computing, 2011.
    27. Gong Zhang and Ling Liu. Why Do Migrations Fail and What Can We Do about It?, Proceedings of Usenix 25th Large Installation System Administration Conference (LISA '11), December 4-9, 2011, Boston, MA.
    28. Jeffrey Dean and Sanjay Ghemawat. MapReduce: A Flexible Data Processing Tool, Communications of the ACM, January 2010, Vol. 53, No.1.
    29. Patrick Hunt, Mahadev Konar, Flavio P. Junqueira and Benjamin Reed ZooKeeper: Wait-free coordination for Internet-scale systems, USENIX ATC 2010
    30. P. E. O'Neil, E. Cheng, D. Gawlick, and E. J. O'Neil, The log-structured merge-tree (lsm-tree), Acta Inf., vol. 33, no. 4, pp. 351-385, 1996.
    31. S. Idreos, M. L. Kersten, and S. Manegold, Updating a cracked database, in SIGMOD 2007, pp. 413-424
    32. G. Graefe, Write-optimized b-trees, in VLDB 2004, pp. 672-683.
    33. Matthias Brantner, Daniela Florescu, David Graf, Donald Kossmann, Tim Kraska, Building a database on S3, Proceedings of the 2008 ACM SIGMOD: International Conference on Management of Data, Vancouver, Canada, Pages 251-264, 2008.
    34. J. Baker, C. Bond, J. Corbett, J. J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh, Megastore: Providing scalable, highly available storage for interactive services, in CIDR. 2011, pp. 223-234.
    35. M. Armbrust, K. Curtis, T. Kraska, A. Fox, M. J. Franklin, and D. A. Patterson, Piql: Success-tolerant query processing in the cloud, PVLDB, vol. 5, no. 3, pp. 181-192, 2011.
    36. J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, G. Parulkar, M. Rosenblum, S. Rumble, E. Stratmann, and R. Stutsman, The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM. SIGOPS OSR, 43(4), December 2009, pp. 92-105.
    37. M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The end of an architectural era: (it's time for a complete rewrite) VLDB 2007.
    38. Daniel Abadi, Consistency Tradeoffs in Modern Distributed Database System Design: CAP is Only Part of the Story, IEEE Computer 2012, vol 45, no. 2
    39. Maysam Yabandeh, Daniel Gomez Ferro, A critique of snapshot isolation, EuroSys 2012, 155-168
    40. Daniel Peng, Frank Dabek, Large-scale Incremental Processing Using Distributed Transactions and Notifications, OSDI 2010, 251-264
    41. Lisa Glendenning, Ivan Beschastnikh, Arvind Krishnamurthy, Thomas E. Anderson. Scalable consistency in Scatter, SOSP 2011
    42. Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Ng, and Kunle Olukotun. Map-Reduce for Machine Learning on Multicore. NIPS 2006.
    43. M. Busch, K. Gade, B. Larson, P. Lok, S. Luckenbill, and J. Lin, Earlybird: Real-time search at twitter, in ICDE 2012, pp. 1360-1369.
    44. TripleProv: Efficient Processing of Lineage Queries in A Native RDF Store, M. Wylot, WWW 2014
    45. Optimizing RDF Queries on Cloud Platform, H.Kim et al, WWW 2013
    46. HASBE: A Hierarchical Attribute-based Solution for Flexible and Scalable Access Control in Cloud Computing, Z. Wan, J. Liu, R. Deng, IEEE Transactions on Information Forensics and Security, Vol 7, No. 2, 2012
    47. Toward Secure and Dependable Storage Services in Cloud Computing, W. Cong et al, IEEE Transactions on Services Computing, Vol 5, No. 3, 2012

8 Security and Privacy for Internet Applications

Questions: How can we build secure Internet Applications?

8.1 Security

1.    Extensible Security Architectures for Java, D S. Wallach, D. Balfanz, D. Dean and E W. Felten, Proceedings of the sixteenth ACM symposium on Operating systems principles (SOSP’97), Saint-Malo, France, Pages 116-128, December 1997.

2.    The CRISIS Wide Area Security Architecture, E. Belani, A. Vahdat, T. Anderson, and M. Dahlin, In Proceedings of USENIX Security Symposium, San Antonio, TX, January 1998.

3.    Authentication in Distributed Systems: Theory and Practice, B. Lampson, M. Abadi, M. Burrows and E. Wobber, Proceedings of the thirteenth ACM symposium on Operating systems principles (SOSP’91), Pacific Grove, CA, pages 165-182, October, 1991.

4.    Kerberos: An Authentication Service for Open Network Systems, J. G. Steiner, B. Clifford Neuman, and J.I. Schiller, Proceedings of the USENIX Winter Conference, pages 191-202, 1988.

5.    Kerberos: An Authentication Service for Computer Networks, B. Clifford Neuman and Theodore Ts’o, IEEE Communications Magazine, Volume 32, Number 9, pages 33-38, September 1994. (Kerberos Home, About Kerberos, Kerberos Reference Page, Kerberos for UNIX)

6.    Java Security: From HotJava to Netscape and Beyond, D. Dean, E. W. Felten, and D S. Wallach, Proceedings of 1996 IEEE Symposium on Security and Privacy, May 1996.

7.   Cryptography and the Internet, S. Bellovin, Proceedings of CRYPTO ‘98, August 1998, pp. 46-55.

8.    Toward Acceptable Metrics of Authentication, M. Reiter and S. Stubblebine, IEEE Symposium on Security and Privacy, 1996.

9.    Timing attacks on Implementations of Diffie-Hellman, RSA, DSS and Other Systems, P. Kocher, Advances in Cryptology – CRYPTO 96, pp 104-113, 1996

10. Vulnerabilities and Security Threats in Structured Overlay Networks. M. Srivatsa, L. Liu, Proc of 20th IEEE Annual Computer Security Applications Conf ACSAC, 2004.

11. FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment. A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J. R. Douceur, J. Howell, J. R. Lorch, M. Theimer, R. P. Wattenhofer, Proc of 5th OSDI, 2002.

12. Secure routing for Structured Peer-to-Peer Overlay Networks. M. Castro, P. Druschel, A. Ganesh, A. Rowstron, D. S. Wallach, Proc of 5th OSDI, 2002.

13. The Sybil Attack. J. Douceur, Proc of 2nd Annual IPTPS Workshop, 2002.

14. Security Considerations for Peer-to-Peer Distributed Hash Tables. E. Sit, J. Morris, Proc of 2nd Annual IPTPS Workshop, 2002.

15. SOS: An Architecture For Mitigating DDoS Attacks. A. D. Keromytis, V. Misra, D. Rubenstein, Proc of IEEE Journal on Selected Areas in Communication, Vol 21, 2003.

16. Countering Targeted File Attacks using Location Keys. M. Srivatsa and L. Liu, Proc of USENIX Security Symposium, 2005.

17. Securing Publish-Subscribe Overlay Services With EventGuard. M. Srivatsa and L. Liu, Proc of ACM Conference on Computer and Communication Security (CCS), 2005.


8.2 Privacy

1. A Privacy-Preserving Index for Range Queries, Bijit Hore, Sharad Mehrotra, Gene Tsudik, VLDB 2004

2. Auditing Compliance with a Hippocratic Database, Rakesh Agrawal, Roberto Bayardo, Christos Faloutsos, Jerry Kiernan, Ralf Rantzau, Ramakrishnan Srikant, VLDB 2004

3. Privacy-preserving data mining. R. Agrawal and R. Srikant. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 439–450, 2000.

4. On the design and quantification of privacy preserving data mining algorithms. D. Agrawal and C. C. Aggarwal, In Proceedings of the Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Santa Barbara, California, USA, May 21-23 2001. ACM.