CS4440 Emerging
Database Technologies
Instructor: Professor Ling Liu
Course Readings
Attention:
The information contained in this page is subject to change.
There will be several background readings assigned each week. The readings
will either be handed out a week in advance or listed on the Web page for required
readings. You may also access this information from the course schedule.
Homework/Assignment:
You are expected to read the material each week and write 2-3 paragraphs per
reading giving your impressions and thoughts. The summaries should be informal
and brief, and should consist of your own comments on the readings, NOT a
rehash of the content.
You should email your summaries to the TA, Gong Zhang (gzhang3 AT cc DOT gatech
DOT edu), preferably before each class but no
later than 11:59 pm on Friday each week (unless there are no reading
assignments for the week). Late assignments will NOT be accepted unless approved
in advance by the instructor.
Reading Summary Guidelines:
The summary for each reading assignment is expected to consist of one
paragraph on each of the following three aspects: (1) the positive aspects of
the paper; (2) the negative aspects of the paper; and (3) a brief discussion of
how the idea or method proposed or used in the evaluation may be applied to your
own project for the course.
You may want to keep these guidelines in mind when reading papers.
- Problem Statement: What is the problem area with which the paper is concerned? What are the concrete problems that the authors are trying to solve?
- Contributions/New Ideas: Summarize the authors' arguments. What are the authors proposing: a new architecture, algorithm, or methodology? Are you convinced? Why or why not?
- Evaluation: How did the authors evaluate their new proposals? Did they build a system, run a simulation, collect traces from existing systems, or prove theorems? How was their data collection done? Do you agree with their conclusions and their analysis?
- Weakness: Compared with the state-of-the-art research in the problem area, or according to the related work section in the paper, was the proposed idea new? Was the approach novel? What, in your opinion, should have been evaluated to validate their new proposal but is missing from their evaluation? Are there any alternative ways to conduct the evaluation?
You may find the following short article helpful:
Efficient Reading of Papers in Science and Technology, by Michael J. Hanson, updated by D. McNamee
Areas of Readings
1. Mobile Database Management
2. Spatial Indexing Techniques
3. Data Clustering Algorithms
4. Stream databases
5. RFID data management
6. Web Search and Web IR
7. Data Mining
8. Privacy Preserving Data Mining
9. Workflow Management
10. Role based Access Control
11. Data Warehouse and OLAP
Required Readings and Dates
You are expected to read the papers in the required reading list, but you need to write
a summary for only one paper selected from the list of 2-3 required readings
associated with each lecture. Please use the Summary Template to write the reading
summaries.
NOTE: Most of the papers listed below are from ACM
or IEEE conferences or journals. Online proceedings can be accessed through the
ACM/IEEE online library link provided by the GT library. Your GT ID and password are
required to access the online library.
http://www.library.gatech.edu/research_help/subject/index.php?/computer_science/conferences
General/Recommended Course Reading List
1. Mobile Database Management
- Bugra
Gedik and Ling Liu. MobiEyes: A Distributed Location Monitoring Service Using
Moving Location Queries. IEEE Transactions on Mobile Computing. Vol. 5, No. 10, pp. 1384-1402,
October 2006.
- Kipp Jones and Ling Liu. Map-matching: Towards Improving Wireless Positioning,
to appear in Proceedings of the 4th Annual International Conference on Mobile and
Ubiquitous Systems: Computing, Networking and Services (Mobiquitous 2007). August 6-10, 2007, Philadelphia, PA.
- Anand Murugappan and Ling
Liu. A SpatioTemporal Placement
Model for Caching Location Dependent Queries, Proceedings
of the 4th Annual International Conference on Mobile and Ubiquitous
Systems: Computing, Networking and Services (Mobiquitous
2007). August 6-10, 2007, Philadelphia,
PA.
- Bugra Gedik, Ling Liu,
Kun-Lung Wu, Philip S. Yu. Lira:
Lightweight, Region-aware Load Shedding in Mobile CQ Systems.
Proceedings of the IEEE 23rd International Conference on Data Engineering.
Istanbul, Turkey; April 17-20, 2007.
- Christian
S. Jensen, Dan Lin, Beng Chin Ooi, Rui Zhang: Effective Density Queries on
Continuously Moving Objects. ICDE 2006
- Mindaugas Pelanis, Simonas Saltenis, Christian S. Jensen: Indexing the past, present, and
anticipated future positions of moving objects. ACM Trans. Database Syst. 31(1):
255-298 (2006)
- Hu,
H., Lee, D.L., and Xu, J. Fast Nearest Neighbor Search
on Road Networks. Proceedings of the International
Conference on Extending Database Technology (EDBT 2006), Munich, Germany,
Mar 2006, 186-203.
- Hu, H., Lee, D.L., and Lee, V.C.S. Distance Indexing on Road Networks.
Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB
2006), Seoul, Korea, Sept 2006, 894-905.
- Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah: Change Tolerant Indexing for
Constantly Evolving Data. ICDE 2005: 391-402
- Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, Jeffrey Scott Vitter: Efficient Indexing
Methods for Probabilistic Threshold Queries over Uncertain Data. VLDB 2004: 876-887
- Fosca Giannotti, Mirco Nanni, Fabio Pinelli, Dino Pedreschi.
Trajectory pattern mining, Proceedings of the 13th ACM SIGKDD
international conference on Knowledge discovery and data mining KDD '07.
- Project Lachesis: Parsing and modeling location histories.
Hariharan, Toyama. In GIScience, 2004.
- Extracting places from traces of locations.
Kang, Welbourne, Stewart, Borriello. In Proc. WMASH, pages 110--118, New
York, NY, USA, 2004.
- Xu, Wu, Tang, Lee. Monitoring Top-k Query in Wireless Sensor Networks,
Proc. the 22nd IEEE Int. Conf. on Data Engineering (ICDE '06), Atlanta, GA,
April 2006.
- David Mark. Geographic Information Science: Defining the Field.
- A bibliography of temporal, spatial and spatio-temporal data mining research,
John F. Roddick , Myra Spiliopoulou
, ACM SIGKDD Explorations Newsletter, v.1 n.1, p.34-38, June 1999
- Modeling Transportation Routines using Hybrid Dynamic Mixed Networks,
Vibhav Gogate, Rina Dechter, Bozhena Bidyuk, James Marca and Craig Rindt. In
21st Conference on Uncertainty in Artificial Intelligence (UAI), 2005.
- Sui D.Z. Tobler's First Law of Geography: A Big Idea for a Small World?
Annals of the Association of American Geographers 94 (2), 269–277.
- John Heidemann and Nirupama Bulusu. Using Geospatial Information in Sensor Networks.
USC/Information Sciences Institute. September 20, 2000.
- Building Personal Maps from GPS Data. Lin Liao and Donald J. Patterson and
Dieter Fox and Henry Kautz.
- Y. Xu, W.-C. Lee, J. Xu, and G. Mitchell.
Processing Window Queries in Wireless Sensor Networks, Proc. the 22nd IEEE Int. Conf. on Data
Engineering (ICDE '06), Atlanta, GA, April 2006.
- L. Liao, D. Fox, and H. Kautz. Location-Based
Activity Recognition using Relational Markov Networks. Proc. of the International Joint Conference on Artificial
Intelligence (IJCAI-05).
- D. Ashbrook
and T. Starner, Using GPS to Learn Significant Locations and Predict Movement
Across Multiple Users, Personal and Ubiquitous Computing, Vol. 7, No. 5.
- C. S. Jensen and R. T. Snodgrass. Temporal Data Management. IEEE
TKDE, 11(1): 36--45 (1999).
- Liao, Fox, Kautz, Learning
and Inferring Transportation Routines,
Artificial Intelligence 2007.
- Patterson, Liao, Fox, Kautz, Inferring High-Level Behavior from Low-Level Sensors,
UBICOMP 2003. ICS 280.
- Fundamental Challenges in Mobile Computing,
Satyanarayanan, M., Fifteenth ACM Symposium on
Principles of Distributed Computing , May 1996, Philadelphia,
PA, Revised version appeared as: "Mobile Computing: Where's the
Tofu?", Proceedings of the ACM Sigmobile,
April 1997, Vol. 1, No. 1.
- Multi-Fidelity Algorithms for Interactive Mobile
Applications, Satyanarayanan,
M., Narayanan, D. Proceedings of the 3rd International Workshop on
Discrete Algorithms and Methods for Mobile Computing and Communications,
August 1999, Seattle, WA
- Mobile Data Access, Noble, B. School of Computer Science, Carnegie Mellon
University, May 1998, CMU-CS-98-118
- Energy-aware adaptation for mobile applications, Flinn J., Satyanarayanan,
M., Proceedings of the 17th ACM Symposium on Operating Systems Principles,
December, 1999, Kiawah Island Resort, SC.
- PowerScope: A Tool for Profiling the Energy Usage of Mobile
Applications, Flinn J., Satyanarayanan, M., Proceedings of the Second IEEE
Workshop on Mobile Computing Systems and Applications, February, 1999, New
Orleans, LA
- System Support for Mobile, Adaptive Applications,
Noble, Brian, IEEE Personal Communications, Vol. 7, No. 1, February, 2000
- Experience with adaptive mobile applications in Odyssey
, Noble, B.D. and Satyanarayanan, M., Mobile
Networks and Applications, Vol. 4, 1999
- Agile Application-Aware Adaptation for Mobility,
Noble, B., Satyanarayanan, M., Narayanan, D.,
Tilton, J.E., Flinn, J., Walker, K. Proceedings
of the 16th ACM Symposium on Operating System Principles, October 1997,
St. Malo, France
- A Research Status Report on Adaptation for Mobile Data
Access , Noble, B., Satyanarayanan,
M. SIGMOD Record, Vol. 24, No. 4, December 1995
- A Programming Interface for Application-Aware
Adaptation in Mobile Computing , Noble, B., Price, M., Satyanarayanan, M., Proceedings of the Second USENIX
Symposium on Mobile & Location-Independent Computing, Apr. 1995, Ann
Arbor, MI
- Application-Aware Adaptation for Mobile Computing
, Satyanarayanan, M., Noble, B., Kumar, P.,
Price, M. Proceedings of the 6th ACM SIGOPS European
Workshop, Sep. 1994, Dagstuhl, Germany.
- Mobile Information Access, Satyanarayanan, M. , IEEE Personal Communications,
Vol. 3, No. 1, February 1996
- Indexing Techniques for Power Management in Multi-Attribute Data Broadcast,
Qinglong Hu, Wang-Chien Lee, and Dik Lun Lee.
- Power Conserving and Access Efficient Indexes for Wireless Computing,
Dik Lun Lee and Qinglong Hu.
- Power Conservative Multi-Attribute Queries on Data
Broadcast, Qinglong Hu, Wang-Chien Lee, and Dik Lun Lee, ICDE 2000.
- Effects of power conservation, wireless coverage and
cooperation on data dissemination among mobile devices,
Maria Papadopouli and Henning Schulzrinne, ACM SIGMOBILE Symposium on
Mobile Ad Hoc Networking & Computing (MobiHoc)
2001, October 4-5, 2001, Long Beach, California. (Extension of the Sarnoff
paper.)
- Energy-aware Web Caching for Mobile Terminals.
Francoise Sailhan, Valérie
Issarny. In Proceedings of the ICDCS Workshop on Web Caching Systems.
July 2002, Vienna, Austria.
- Power-Controlled Data Prefetching/Caching
in Wireless Packet Networks, Savvas
Gitzenis and Nicholas Bambos,
IEEE Infocom 2002, New York.
- Sleepers and Workaholics: Caching Strategies in Mobile Environments.
Daniel Barbara, Tomasz Imielinski, VLDB Journal 4(4):
567-602 (1995).
- Indexing techniques for data broadcast on wireless
channels. D.L. Lee, Q. Hu, and W.
C. Lee, Proceedings of the Fifth International Conference
on Foundations of Data Organization (FODO '98), Kobe, Japan, Nov
11-12, 1998, 175-182.
- Indexing Techniques for Wireless Data Broadcast Under
Data Clustering and Scheduling, Qinglong Hu, Wang-Chien Lee, and Dik Lun Lee,
in Proceedings of ACM
International Conference on Information and Knowledge Management (CIKM99), Kansas City, Missouri, Nov.
1999, pp. 351-358.
- Location Privacy in Pervasive Computing,
A. R. Beresford, F. Stajano. In Proc of IEEE
Pervasive Computing 46-55, March 2003
- A Customizable k-Anonymity Model for Protecting
Location Privacy. B. Gedik, L. Liu, Proc of Intl Conf
of Distributed Computing Systems ICDCS, 2005.
- Framework for Security and Privacy in Automotive Telematics. S. Duri,
M. Gruteser, X. Liu, P. Moskowitz,
R. Perez, M. Sing, J. M. Tang. Proc.
of Intl Workshop on Mobile Commerce WMC,
2002.
- Anonymous Usage of Location-Based Services Through Spatial and Temporal Cloaking.
M. Gruteser, D. Grunwald. Proc.
of ACM/USENIX MobiSys, 2003.
2. Spatial Indexing Techniques
- R-trees: a dynamic index structure for spatial
searching, Antonin Guttman
, Proceedings of the 1984 ACM SIGMOD international conference on
Management of data, June 18-21, 1984, Boston, Massachusetts
- Indexing the positions of continuously moving objects.
S. Saltenis, C. S. Jensen, S. T. Leutenegger, and M. A. Lopez.
In SIGMOD ’00: Proceedings of the 2000 ACM SIGMOD international
conference on Management of data, pages 331–342, New York, NY, USA, 2000. ACM Press.
- Voronoi Diagram, Franz Aurenhammer, Rolf Klein
- The Quadtree and Related Hierarchical Data Structures. Finkel and Bentley, ACM Comput. Surv. 1974
- An introductory tutorial on kd-trees,
A. Moore
- Building of Trapezoidal Map from a set of
non-intersecting lines, Jukka
Kaartinen
- Spatial data structures for version management of
engineering drawings in cad database. Y. Nakamura and H. Dekihara. In ICIAP ’03: Proceedings of the 12th
International Conference on Image Analysis and Processing, page 219, Washington, DC,
USA, 2003.
IEEE Computer Society.
3. Data Clustering Algorithms
- Data Clustering: A Review, A. K. Jain,
M.N. Murthy and P.J. Flynn, ACM Computing Reviews, Nov 1999.
- On Line Clustering, Athman
Bouguettaya, IEEE Transaction on Knowledge and
Data Engineering Volume 8, No. 2, April 1996.
- Similarity Searching in Medical Image Databases,
Euripides G.M. Petrakis and Christos Faloutsos,
IEEE Transaction on Knowledge and Data Engineering Volume 9, No. 3,
MAY/JUNE 1997.
- Windows NT Clusters for Availability and Scalability,
Rob Short, Rod Gamache, John Vert
and Mike Massa ,Microsoft Online Research Papers,
Microsoft Corporation.
- Defining Data Mining, The Hows
and Whys of Data Mining, and How It Differs From Other Analytical
Techniques, Bruce Moxon,
Online Addition of DBMS Data Warehouse Supplement, August 1996.
- An Efficient Approach to Clustering in Large
Multimedia Databases with Noise. Hinneburg A., Keim D.A. Proc. 4th Int.
Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1998. http://citeseer.ist.psu.edu/hinneburg98efficient.html
- Data
Clustering: Theory, Algorithms, and Applications, Guojun Gan , Chaoqun Ma , Jianhong Wu
- Chameleon: A Hierarchical Clustering Algorithm Using
Dynamic Modeling, George Karypis,
Eui-Hong Han, and Vipin
Kumar, IEEE Computer, Special Issue on Data Analysis and Mining. Vol. 32, No. 8, August
1999.
- Keke
Chen and Ling Liu. "iVIBRATE: Interactive Visualization Based Framework for
Clustering Large Datasets", ACM Transactions on
Information Systems.
- CURE: An efficient clustering
algorithm for large databases, S. Guha, R. Rastogi, and K. Shim, In Proceedings of ACM
SIGMOD International Conference on Management of Data, pages 73--84, New York, 1998.
- BIRCH: An Efficient Data
Clustering Method for Very Large Databases, Tian Zhang, Raghu Ramakrishnan, and Miron Livny, In Proceedings of the 1996 ACM SIGMOD
International Conference on Management of Data, pages 103--114, Montreal, Canada, 1996.
- Bipartite Graph Partitioning and Data Clustering.
H. Zha and X. He and C. Ding and M. Gu and H. Simon. Proc. of
ACM 10th Int'l Conf. Information and Knowledge Management, pp. 25--31,
2001.
- Spectral biclustering of microarray data: coclustering
genes and conditions. Y. Kluger
and R. Basri and J.T. Chang and M. Gerstein.
Genome Research. 13:703-716, 2003.
- Automatic subspace clustering of high dimensional data
for data mining applications. R. Agrawal,
J. Gehrke, D. Gunopulos, and P. Raghavan. In Proc. 1998 ACM-SIGMOD Int. Conf.
Management of Data, Seattle, Washington, June 1998
- A divisive information-theoretic feature clustering
algorithm for text classification. I.S. Dhillon and S. Mallela and R. Kumar. JMLR, 3:1265-1287,
2003.
- Subspace clustering of high-dimensional binary data --
A probabilistic approach. A. Patrikainen
and H. Mannila. Proc. Workshop on Clustering
High Dimensional Data in SIAM
International Conference on Data Mining, 2004.
- Segmentation using eigenvectors: a unifying view.
Weiss Y. Proceedings IEEE International Conference on Computer Vision p.
975-982 (1999).
- Coupled two-way clustering analysis of gene microarray data. G. Getz and E.
Levine and E. Domany. Proceedings of the
National Academy of Sciences of the United States of America,
94:12079-12084, 2000.
- On clusterings - good, bad
and spectral, S. Vempala R. Kannan
and A. Vetta, in Proc. 41st Symposium on the
Foundation of Computer Science, FOCS, 2000.
- Co-clustering documents and words using bipartite
spectral graph partitioning. I.S. Dhillon. Knowledge Discovery and Data Mining, pp. 269--274,
2001.
- Iterative Double Clustering for Unsupervised and
Semi-Supervised Learning, R. El-Yaniv
and O. Souroujon. NIPS 14, pp. 1025-1032, 2002.
4. Stream databases
- Continuous Queries over Data Streams,
S. Babu and J. Widom. In
SIGMOD Record, September 2001.
- Towards Sensor Database Systems.
Philippe Bonnet, J. E. Gehrke, and Praveen Seshadri. In
Proceedings of the Second International Conference on Mobile
Data Management. Hong Kong, January
2001.
- Querying
the Physical World. Philippe Bonnet, J. E. Gehrke, and
Praveen Seshadri. IEEE Personal Communications, Vol. 7, No. 5, October
2000, pages 10-15. Special Issue on Smart Spaces and Environments.
- Fjording the Stream: An Architecture for Queries over Streaming
Sensor Data, Sam Madden and Michael J. Franklin, ICDE Conference, February, 2002, San Jose.
- Streaming Queries over Streaming Data, Sirish Chandrasekaran,
Michael J. Franklin, VLDB Conference, August 2002, Hong
Kong.
- Monitoring Streams: A New Class of Data Management Applications. D.
Carney, U. Cetintemel, M. Cherniack,
C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, S. Zdonik. In Proceedings of the 28th International
Conference on Very Large Data Bases (VLDB'02), August 20-23, Hong Kong, China.
- Gigascope: a stream database for network applications,
Chuck Cranor, Theodore Johnson, and
Oliver Spatscheck, in
Proceedings of SIGMOD 2003.
- Query Processing, Approximation, and Resource
Management in a Data Stream Management System. R. Motwani et al. CIDR, 2003.
- Aurora: A New Model and Architecture for Data Stream
Management. D. Abadi, D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker,
N. Tatbul, and S. Zdonik.
In VLDB Journal, 2003.
- Issues in Data Stream Management.
Golab, L. and Ozsu, M. T. ACM SIGMOD Record.
32(2). 2003.
- Estimating Clustering Indexes in Data Streams,
Luciana Buriol, Gereon Frahling, Stefano Leonardi,
Christian Sohler, Proc. 15th European Symposium on
Algorithms (ESA), 2007
5. RFID data management
- Security
and Privacy Issues in ePassport,
Ari Juels, David Molnar
and David Wagner, In Proceedings of Advances in Cryptology, 2005.
- Privacy and Security in Library RFID: Issues,
Practices, and Architectures, David Molnar and David
Wagner, In Proceedings of ACM CCS, 2004.
- High Power Proxies for Enhancing RFID Privacy and
Utility, In Proceedings of PET, 2005.
- RFID Security and Privacy: A Research Survey,
Ari Juels, IEEE Journal on Selected Areas in Communications, 2006.
- A Platform for RFID Security and Privacy Administration.
Melanie R. Rieback, Vrije
Universiteit Amsterdam; Georgi
N. Gaydadjiev, USENIX/SAGE Large Installation
System Administration conference - LISA'06, December 2006
- RFID Privacy: An Overview of Problems and Proposed
Solutions, IEEE Security and Privacy, Vol. 3, No. 3, pp. 34-43
- Protocols for RFID tag/reader authentication,
Selwyn Piramuthu,
Decision Support Systems, Volume 43, Issue 3, April 2007, Pages
897-914
6. Web Search and Web IR
1. Bigtable: A Distributed Storage System for Structured Data,
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh,
Deborah A. Wallach, Mike Burrows, Tushar Chandra,
Andrew Fikes, Robert E. Gruber, 7th USENIX
Symposium on Operating Systems Design and Implementation (OSDI), 2006
2. MapReduce: Simplified Data Processing on Large Clusters,
Jeffrey Dean, Sanjay Ghemawat,
OSDI'04: Sixth Symposium on Operating System Design and Implementation, 2004
3. Clustering Billions of Images with Large Scale Nearest
Neighbor Search, Ting Liu, Charles Rosenberg, Henry A. Rowley,
IEEE Workshop on Applications of Computer Vision, 2007
4. Scaling Up All Pairs Similarity Search,
Roberto Bayardo, Yiming Ma, Ramakrishnan Srikant, Proc.
of the 16th Int'l Conf. on the World Wide Web, 2007
5. Adaptive Product Normalization: Using Online Learning for
Record Linkage in Comparison Shopping, Mikhail Bilenko, Sugato Basu, Mehran Sahami,
Proceedings of the 5th IEEE International Conference on Data Mining, 2005
6. Evaluating similarity measures: a large-scale study in the orkut social network, Ellen Spertus, Mehran Sahami, Orkut Buyukkokten,
Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining (KDD-2005), 2005
7. Unweaving a web of documents, R. Guha, Ravi Kumar, D. Sivakumar,
Ravi Sundaram, KDD, 2005
8. Mining Optimized Gain Rules for Numeric Attributes,
Sergey Brin, Rajeev Rastogi, Kyuseok Shim,
IEEE Trans. Knowl. Data Eng., 2003
9. Scalable Techniques for Mining Causal Structures,
Craig Silverstein, Sergey Brin,
Rajeev Motwani, Jeffrey D. Ullman,
VLDB, 1998
10. Query by Semantic Example, Nikhil Rasiwasia, Nuno Vasconcelos, Pedro J. Moreno, CIVR, 2006
11. Indexing Dataspaces,
Xin Dong, Alon Halevy,
Proc. ACM SIGMOD, 2007
12. Query Suspend and Resume, Badrish Chandramouli, Chris Bond,
Shivnath Babu, Jun Yang,
Proc. ACM SIGMOD, 2007
13. Web-scale Data Integration: You can only afford to Pay As
You Go, Jayant Madhavan,
Shawn R. Jeffery, Shirley Cohen, Xin (Luna) Dong,
David Ko, Cong Yu, Alon
Halevy, Proceedings of the Conference on Innovative Data Systems Research
(CIDR), 2007
14. Data integration: the teenage years, Alon Halevy, Anand Rajaraman,
Joann Ordille, Proc. 32nd International Conference on
Very Large Databases, 2006
15. Data
management projects at Google, Wilson Hsieh, Jayant
Madhavan, Rob Pike, SIGMOD Conference, 2006
16. On-the-fly
Sharing for Streamed Aggregation, Sailesh
Krishnamurthy, Chung Wu, Michael J. Franklin, SIGMOD Conference, 2006
17. Principles
of dataspace systems, Alon Y. Halevy, Michael J. Franklin, David Maier, PODS,
2006
18. Structured Data Meets the Web: A Few Observations,
Jayant Madhavan, Alon Halevy, Shirley Cohen, Xin
(Luna) Dong, Shawn R. Jeffery, David Ko, Cong Yu,
Data Engineering Bulletin, 2006
19. ULDBs: databases with uncertainty and lineage,
Omar Benjelloun, Anish Das Sarma, Alon Halevy, Jennifer Widom, Proc. 32nd International Conference on Very Large
Databases, 2006
20. Web Search for a Planet: The Google Cluster Architecture,
Luiz Andre Barroso, Jeffrey
Dean, Urs Hölzle,
IEEE Micro, 2003
21. Finding Near-Duplicate Web Pages: A Large-Scale Evaluation
of Algorithms, Monika Henzinger,
Proc. SIGIR, 2006
22. Indexing Shared Content in Information Retrieval Systems,
Andrei Z. Broder, Nadav Eiron, Marcus Fontoura, Michael Herscovici, Ronny Lempel, John McPherson, Runping Qi, Eugene J. Shekita, EDBT, 2006
23. Introduction to the special issue on XML retrieval,
Ricardo Baeza-Yates, Norbert Fuhr,
Yoelle Maarek, ACM
Transactions on Information Systems, 2006
24. Retroactive Answering of Search Queries, Beverly
Yang, Glen Jeh, Proc. International World Wide Web
Conference, 2006
25. Semantic Search via XML Fragments: A High Precision
Approach to IR, Jennifer Chu-Carroll, John Prager,
Krzysztof Czuba, David Ferrucci,
Pablo Duboue, Proc. 29th ACM SIGIR Conference on
Research and Development in Information Retrieval, 2006
26. Using
annotations in enterprise search, Pavel
A. Dmitriev, Nadav Eiron, Marcus Fontoura, Eugene Shekita, WWW, 2006
27. Web mining with search engines: A web-based kernel function
for measuring the similarity of short text snippets, Mehran Sahami, Timothy D. Heilman, Proc. 15th International World Wide Web
Conference, 2006
28. Concept-based
interactive query expansion, Bruno M. Fonseca, Paulo Braz Golgher, Bruno Possas, Berthier A. Ribeiro-Neto, Nivio Ziviani, CIKM, 2005
29. Information Discovery--Needles and Haystacks,
Carl Lagoze, Amit Singhal, IEEE Internet Computing, 2005
30. Algorithmic Aspects of Web Search Engines, Monika
Rauch Henzinger, ESA, 2004
31. eBizSearch:
a niche search engine for e-business, C. Lee Giles, Yves Petinot, Pradeep B. Teregowda, Hui Han, Steve
Lawrence, Arvind Rangaswamy,
Nirmal Pal, SIGIR, 2003
32. Semantic Associations for Contextual Advertising.
Massimiliano Ciaramita and
Vanessa Murdock and Vassilis Plachouras.
Journal of Electronic Commerce Research Special Issue on Online Advertising and
Sponsored Search.
33. The Impact of Caching on Search Engines.
Ricardo Baeza-Yates, Aristides Gionis,
Flavio Junqueira, Vanessa
Murdock, Vassilis Plachouras,
Fabrizio Silvestri. 2007.
30th Annual International ACM SIGIR Conference.
34. Tree revision learning for dependency parsing.
G. Attardi and M. Ciaramita.
2007. In Proceedings of HLT-NAACL 2007.
35. Know your Neighbors: Web Spam Detection using the Web
Topology. Carlos Castillo and Debora Donato
and Aristides Gionis and Vanessa Murdock and Fabrizio Silvestri. 2007. In Proceedings of SIGIR. ACM Press. (July 2007), Amsterdam, Netherlands,
423--430.
36. James
Caverlee, Steve Webb, and Ling Liu. "Spam-Resilient Web Rankings via Influence Throttling",
Proceedings of the 21st IEEE International Parallel and Distributed Processing
Symposium (IPDPS), Long Beach,
2007.
37. The Self-Organized Web: The Yin to
the Semantic Web’s Yang. Gary William Flake, David M. Pennock, and Daniel C. Fain. 2003. IEEE Intelligent
Systems. 18, 4 75-77
38. A content and structure website mining model.
Barbara Poblete and Ricardo Baeza-Yates.
2006. In WWW '06: Proceedings of the 15th international conference on World
Wide Web (Edinburgh, Scotland). ACM Press. New York, NY,
USA, 957--958
39. Relationship Between Web Links and
Trade. Ricardo Baeza-Yates and Carlos
Castillo. 2006. In WWW '06: Proceedings of the 15th international conference on
World Wide Web (Edinburgh, Scotland). ACM Press. New York, NY,
USA, 927--928.
40. Communities from Seed Sets. Reid Andersen
and Kevin J. Lang. 2006. In WWW '06: Proceedings of the 15th international
conference on World Wide Web (Edinburgh,
Scotland). ACM
Press. New York, NY, USA,
223--232
41. Generating Query Substitutions. Rosie Jones
and Benjamin Rey and Omid Madani and Wiley Greiner. 2006. In WWW '06: Proceedings of
the 15th international conference on World Wide Web. ACM Press. New York, NY,
USA, 387--396.
42. Multi-structural databases. Ronald Fagin, R.
Guha, Ravi Kumar,
Jasmine Novak, D. Sivakumar, Andrew Tomkins. 2005. In
PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on
Principles of database systems. ACM Press. New York, NY, USA, 184--195.
43. Unweaving a web of documents. R. Guha, Ravi Kumar, D. Sivakumar
and Ravi Sundaram.
2005. In KDD '05: Proceedings of the eleventh ACM SIGKDD international
conference on Knowledge discovery in data mining. ACM Press. New York, NY, USA, 574--579.
44. Variable latent semantic indexing. Anirban Dasgupta, Ravi Kumar, Prabhakar Raghavan and Andrew Tomkins. 2005. In KDD '05: Proceeding
of the eleventh ACM SIGKDD international conference on Knowledge discovery in
data mining. ACM Press. New York,
NY, USA,
13--21.
45. Efficient implementation of large-scale multi-structural
databases. Ronald Fagin, Phokion Kolaitis, Ravi Kumar, Jasmine
Novak and Andrew Tomkins. 2005. In VLDB '05: Proceedings of the 31st
international conference on Very Large Data Bases. VLDB Endowment. 958--969.
46. Discovering large dense subgraphs
in massive graphs. David Gibson, Ravi
Kumar and Andrew Tomkins. 2005. In VLDB '05: Proceedings of the 31st
international conference on Very Large Data Bases. VLDB Endowment. 721--732.
47. Query Incentive Networks. Jon M. Kleinberg
and Prabhakar Raghavan.
2005. In FOCS '05: 46th Annual IEEE Symposium on Foundations of Computer
Science. Pittsburgh, PA, 132--141.
- Indie: Distributed Indexing of autonomous Internet Services,
Peter Danzig, Shih-Hao Li, Katia
Obraczka. Journal of
Computer Systems, 5(4), 1992. Original description of Indie
in 1991 ACM SIGIR
- Measuring Index Quality using Random Walks on the Web
Monika Henzinger, Allan Heydon,
Michael Mitzenmacher, and Marc A. Najork, Proceedings of the 8th International World Wide Web Conference,
pages 213-225, May 1999
- Focused Crawling: A New Approach to Topic-Specific Web
Resource Discovery, Soumen Chakrabarti, Martin van den Berg, Byron Dom, Proceedings
of the 8th International World Wide Web Conference, May 1999
- Enhanced hypertext categorization using hyperlinks. S. Chakrabarti, B. Dom and P. Indyk.
Proceedings of ACM SIGMOD 1998.
- Mining the link structure of the World Wide Web. S. Chakrabarti,
B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan,
S. Rajagopalan and A. Tomkins. IEEE
Computer.
- Trawling the Web for emerging cyber-communities. S.R. Kumar, P. Raghavan, S. Rajagopalan,
and A. Tomkins. Eighth
World Wide Web conference, Toronto,
Canada,
May 1999.
- Extracting large scale knowledge bases from the web.
S.R. Kumar, P. Raghavan, S. Rajagopalan,
and A. Tomkins. IEEE International conference on Very Large Databases
(VLDB), Edinburgh,
Scotland.
- Clustering categorical data: an approach based on
dynamical systems. D. Gibson, J. Kleinberg and P. Raghavan. Proceedings of the VLDB conference,
1998.
- The effectiveness of GlOSS
for the Text Database Discovery Problem L. Gravano, H. Garcia-Molina, A. Tomasic.
SIGMOD 1994. (GlOSS)
7. Data Mining
- Uncertain
Data Mining: An Example in Clustering Location Data. Michael Chau,
Reynold Cheng, Ben Kao, Jackey Ng. PAKDD 2006: 199-204.
- Trajectory pattern mining,
Fosca Giannotti, Mirco Nanni,
Fabio Pinelli, Dino Pedreschi, Proceedings of the 13th ACM SIGKDD
international conference on Knowledge discovery and data mining KDD '07
- Geo-word centric association
rule mining, Katsumi Takahashi, Iko
Pramudiono, Masaru
Kitsuregawa, Proceedings of the 6th international conference on Mobile data management MDM '05
- Similarity and matching:
Distributed spatio-temporal similarity search,
Demetrios Zeinalipour-Yazti,
Song Lin, Dimitrios Gunopulos, Proceedings of the 15th ACM
international conference on Information and knowledge management CIKM '06
- Temporal moving pattern mining for location-based
service,
Journal of Systems and Software, Jun Wook
Lee, Ok Hyun Paek,
Keun Ho Ryu,
Volume 73 , Issue 3 (November-December 2004)
- Incorporating Prior Knowledge with Weighted Margin Support
Vector Machines, Xiaoyun Wu, and Rohini Srihari. Proceedings of the tenth ACM SIGKDD international
conference on Knowledge discovery and data mining, 2004
- Query Chains: Learning to Rank from Implicit Feedback,
Filip Radlinski and Thorsten Joachims. Proceeding of the
eleventh ACM SIGKDD international conference on Knowledge discovery in
data mining 2005
- Very Sparse Random Projections, Proceedings of the 12th ACM SIGKDD
international conference on Knowledge discovery and data mining, 2006
- Generating Semantic Annotations for Frequent Patterns
with Context Analysis, Qiaozhu Mei, Dong Xin,
Hong Cheng, Jiawei Han, and ChengXiang
Zhai, Proceedings of
the 12th ACM SIGKDD international conference on Knowledge discovery and
data mining, 2006
- Ongoing Management and Application of Discovered
Knowledge in a Large Regulatory Organization: A Case Study of the Use and
Impact of NASD Regulation's Advanced Detection System,
Ted Senator. Proceedings of the sixth ACM SIGKDD
international conference on Knowledge discovery and data mining 2000
- Empirical Bayes Screening
for Multi-Item Associations in Massive Datasets,
William DuMouchel and Daryl Pregibon.
Proceedings of the seventh ACM SIGKDD international
conference on Knowledge discovery and data mining 2001
- Capturing Best Practice for Microarray
Gene Expression, Gregory Piatetsky-Shapiro,
Tom Khabaza, and Sridhar Ramaswamy. Proceedings of the ninth ACM SIGKDD international
conference on Knowledge discovery and data mining 2003
8. Privacy Preserving Data Mining
1.
A Privacy-Preserving Index for Range Queries,
Bijit Hore, Sharad Mehrotra, Gene Tsudik, VLDB 2004
2.
Auditing Compliance with a Hippocratic Database,
Rakesh Agrawal, Roberto Bayardo, Christos Faloutsos, Jerry
Kiernan, Ralf Rantzau, Ramakrishnan
Srikant, VLDB 2004
3.
Privacy-preserving data mining. R. Agrawal and R. Srikant. In Proceedings
of the 2000 ACM SIGMOD International Conference on
Management of Data, pp. 439--450, 2000.
4.
On the design and quantification of privacy preserving data
mining algorithms. D. Agrawal and C.
C. Aggarwal, In Proceedings of the Twentieth ACM SIGACT-SIGMOD-SIGART
Symposium on Principles of Database Systems, Santa Barbara, California,
USA, May 21-23 2001. ACM.
5.
Privacy Preserving Indexing of Documents on the Network,
Mayank Bawa, Roberto Bayardo Jr., and Rakesh Agrawal. In VLDB, 2003
6.
Top-k Queries Across Multiple Private Databases,
L. Xiong, S. Chitti, L. Liu. In Proc. of Intl. Conf.
on Distributed Computing Systems (ICDCS), 2005
7.
Information Hiding -- A Survey, F. A. P. Petitcolas, R. J. Anderson, M. G. Kuhn. Proceedings of the IEEE,
Special Issue on Protection of Multimedia Content,
87(7):1062-1078, July 1999
8.
L. Xiong, S. Chitti, L. Liu.
"Mining Multiple Private Databases using a kNN Classifier". In ACM Annual
Symposium on Applied Computing (SAC), Data Mining Track, Seoul, Korea,
March, 2007
9.
Keke Chen and Ling Liu. "Towards Attack-Resilient Geometric Data Perturbation",
Proceedings of the 7th SIAM (Society for Industrial and Applied Mathematics)
International Conference on Data Mining (SDM 2007), to be held in Minneapolis,
Minnesota, April 26-28, 2007.
10.
Keke Chen and Ling Liu. "A Random Rotation Perturbation Approach to Privacy
Preserving Data Classification", Proceedings of the
Fifth IEEE International Conference on Data Mining (ICDM'05), New Orleans, Louisiana,
U.S.A.,
November 27-30, 2005. (full paper).
11.
Protecting Privacy when Disclosing Information:
k-Anonymity and its Enforcement Through Generalization and Suppression. P.
Samarati, L. Sweeney, TechReport
SRI-CSL-98-04, SRI Intl., 1998.
9. Workflow Management
1. Joonsoo Bae, Ling Liu,
James Caverlee and William Rouse. Process Mining, Discovery and Integration using Distance
Measures, Proceedings of IEEE Int. Conf. on Web Services. to be held in Chicago,
USA, Sept
18-22.
2. POESIA: An Ontological Workflow Approach for Composing Web
Services in Agriculture, Renato Fileto,
Ling Liu, Calton Pu, Claudia Bauzer Medeiros, Eduardo
Delgado Assad,
International Journal
of Very Large Database Systems, 12(4): 352-367 (2003). Special issue on
Semantic Web, Guest Editors: Vijay Atluri, Anupam
Joshi, Yelena Yesha.
3. A Systematic Approach to Flexible
Specification, Composition, and Restructuring of Workflow Activities, Ling Liu,
Calton Pu, Duncan Dubugras Ruiz, In Journal of
Database Management, Vol. 15, No. 1, Jan/March, 2004, pp. 1-40.
5. Methodical Restructuring of Complex
Workflow Activities. Ling Liu and
Calton Pu. IEEE 14th International Conference on Data
Engineering, February 23-27, 1998, Orlando,
Florida, USA.
pp. 342-350.
10. Role Based Access Control
1.
Role Based Access Control, D.F. Ferraiolo and D.R. Kuhn (1992), 15th National Computer
Security Conference
2.
Role Based Access Control: Features and Motivations,
D.F. Ferraiolo, J. Cugini,
D.R. Kuhn, Computer Security Applications Conference - extends the 1992
model
3.
An Introduction to Role Based Access Control,
NIST CSL Bulletin on RBAC (December, 1995)
4.
Formal Specification for Role Based Access Control
User/Role and Role/Role Relationship Management, S. Gavrila, J. Barkley, Third ACM Workshop on Role-Based
Access Control.
5.
Role Based Access Control,
D.F. Ferraiolo, D.R. Kuhn, R. Chandramouli,
Artech House, 2003.
6.
Mutual
Exclusion of Roles as a Means of Implementing Separation of Duty in Role-Based
Access Control Systems, D.R. Kuhn, Second ACM Workshop on
Role-Based Access Control. 1997
7.
Role
Based Access Control on MLS Systems Without Kernel Changes,
D.R. Kuhn, Third ACM Workshop on Role Based Access
Control, October 22-23, 1998.
8.
Supporting Relationships in Access Control using Role Based
Access Control, J. Barkley, C. Beznosov,
Uppal, Fourth ACM Workshop on Role-Based Access
Control (1999).
9.
Managing Role/Permission Relationships Using Object Access
Types, J.F. Barkley, A.V. Cincotta,
Third ACM Workshop on Role Based Access Control (1998).
10. A Resource Access Decision Service for CORBA-based
Distributed Systems, Beznosov,
Deng, Blakley, Burt, Barkley, ACSAC (Annual Computer
Security Applications Conference) 1999.
11. The Economic Impact of Role Based Access Control.
Research Triangle Institute. NIST Planning Report 02-01. 2002
12. Comparing Simple Role Based Access Control Models and
Access Control Lists, J. Barkley, (1997), Second ACM
Workshop on Role-Based Access Control.
13. The NIST Model for Role Based Access Control: Towards a
Unified Standard, R. Sandhu, D.
Ferraiolo, R. Kuhn, Proceedings, 5th ACM Workshop on
Role Based Access Control, July 26-27, 2000.
14. Role Based Access Control Features in Commercial Database
Management Systems, R. Chandramouli,
R. Sandhu, 21st National Information Systems Security
Conference, October 6-9, 1998, Crystal City, Virginia.
15. Inheritance Properties of Role Hierarchies,
W.A. Jansen, 21st National Information Systems Security Conference, October
6-9, 1998, Crystal City, Virginia.
16. Business Process Driven Framework for defining an Access
Control Service based on Roles and Rules, R. Chandramouli, 23rd National Information Systems Security
Conference, 2000.
17. A Revised Model for Role Based Access Control,
W.A. Jansen, NIST-IR 6192, July 9, 1998
18. Role-Based Access Control Models, R. S. Sandhu, E.J. Coyne, H.L. Feinstein, C.E. Youman, IEEE Computer 29(2): 38-47, IEEE Press, 1996
19. A Proposed Standard for Role Based Access Control,
D. Ferraiolo, R. Sandhu, S.
Gavrila, D.R. Kuhn, R. Chandramouli.
ACM Transactions on Information and System Security,
vol. 4, no. 3 (August, 2001) - draft of a consensus standard for RBAC.
20. Implementing Role Based Access Control Using Object
Technology, J. Barkley, First
ACM Workshop on Role-Based Access Control (1995).
21. Role Based Access Control (book), D.F. Ferraiolo, D.R. Kuhn, R. Chandramouli,
Artech House, 2003.
22. Object Retrieval and Access Management in Electronic
Commerce, S. Wakid,
J.F. Barkley, M. Skall, IEEE
Communications Magazine, September 1999.
23. A Marketing Survey of Civil Federal Government
Organizations to Determine the Need for RBAC Security Product,
(SETA Corporation, 1996).
24. Aameek
Singh, Mudhakar Srivatsa, Ling Liu. "Efficient and Secure Search of Enterprise File Systems",
Proceedings of IEEE International Conference on Web Services (ICWS 2007), July
9-13, 2007, Salt Lake City,
Utah, USA.
11. Data Warehouse and OLAP
- Lineage Tracing for General Data Warehouse
Transformations, Yingwei
Cui and Jennifer Widom, VLDB, 2001.
- Adapting Materialized Views After Redefinitions: Techniques
and a Performance Study, A. Gupta, I. S. Mumick, J. Rao, and K. A. Ross, Information Systems,
2001 (Special issue on Data Warehousing).
- Edited synoptic cloud reports from ships and land
stations over the globe (1982-1991), C. Hahn, S. Warren, and J. London, 2001.
- The UCI KDD archive, S. Hettich and S.
D. Bay.
University of California, Irvine, 2000.
- OLAP over uncertain and imprecise data, Doug
Burdick, Prasad M. Deshpande, T. S. Jayram, Raghu Ramakrishnan, and Shivakumar Vaithyanathan,
The VLDB Journal, 16:1, 123-144, 2006.
- OLAP solutions: building
multidimensional information systems, second edition,
Erik Thomsen, 2002.
- Encoded Bitmap Indexing for Data Warehouses,
M.C. Wu and A.P. Buchmann, ICDE, 220-230, 1998.
- An Alternative Storage Organization for ROLAP
Aggregate Views Based on Cubetrees,
Yannis Kotidis and Nick Roussopoulos, ACM
SIGMOD, 249-258, 1998.
- DocCube: multi-dimensional visualization and exploration of
large document sets, Josiane
Mothe, Claude Chrisment,
Bernard Dousset, and Joel Alaux, Journal
of the American Society for Information Science and Technology, 54:7,
650-659, 2003.
- On the design and evaluation of a multi-dimensional
approach to information retrieval, M. C. McCabe, J.
Lee, A. Chowdhury, D. Grossman, and O. Frieder, SIGIR '00, 363-365, 2000.
- Modeling, querying and reasoning about OLAP databases: a functional approach,
Ken Q. Pu, DOLAP, 1-8, 2005.
- Reconsidering multi-dimensional schemas,
Tim Martyn, SIGMOD Rec., 33:1, 83-88,
2004.
- Privacy preservation for data cubes,
Sam Y. Sung, Yao Liu, Hui Xiong, and Peter A.
Ng, Knowledge and Information Systems, 9:1, 38-61, 2006.
- A Temporal Query Language for OLAP: Implementation and
a Case Study, Alejandro Vaisman
and Alberto Mendelzon, DBPL, 2001.
- Intelligent rollups in multidimensional OLAP data,
Gayatri Sathe and Sunita Sarawagi, The VLDB
Journal, 531-540, 2001.
- Serving Datacube Tuples from Main Memory, K. A.
Ross and K. A. Zaman, SSDBM, 2000.
- Hybrid Query and Data Ordering for Fast and
Progressive Range-Aggregate Query Answering,
International Journal of Data Warehousing and Mining, Cyrus Shahabi, Mehrdad Jahangiri, and Dimitris
Sacharidis, 2005.
- Space-efficient cubes for OLAP
range-sum queries, Decis.
Support Syst., Seok-Ju
Chun, Chin-Wan Chung, and Seok-Lyong Lee, 37:1,
83-102, 2004.
- pCube: update-efficient online aggregation with progressive
feedback and error bounds, Mirek
Riedewald, Divyakant Agrawal, and Amr El Abbadi, SSDBM, 95-108, 2000
- MM-Cubing: computing iceberg cubes by factorizing the
lattice space, Zheng Shao,
Jiawei Han, and Dong Xin,
Proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM), 2004.
- The cgmCUBE project:
Optimizing parallel data cube generation for ROLAP,
Frank Dehne, Todd Eavis,
and Andrew Rau-Chaplin, Distributed and Parallel Databases, 19:1,
29-62, 2006.
- OLAP: Efficient Parallel Generation and Querying of
Terabyte Size ROLAP Data Cubes, Chen, Y.,
Rau-Chaplin, A., Dehne, F., Eavis,
T., Green, D., and Sithirasenan, ICDE'06, 2006.
- Evaluation of top-k OLAP queries using aggregate
R-trees, N. Mamoulis, S. Bakiras, and P. Kalnis, International
Symposium on Spatial and Temporal Databases (SSTD), 2005.
- Efficient OLAP Operations for Spatial Data Using Peano Trees, B. Wang, F. Pan, D.
Ren, Y. Cui, D. Ding, and W. Perrizo,
8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge
Discovery, 2003.
- A Pareto Model for OLAP View Size Estimation,
Thomas P. Nadeau and Toby J. Teorey, CASCON,
2001.