Contents
State-of-the-art algorithmic techniques and models for massive data sets.
Teachers
Inge Li Gørtz, ilg@imm.dtu.dk
Philip Bille, phbi@imm.dtu.dk
When and where
Monday 8.15- 10. Bldg. 208, Aud. 51.
Monday 10-12. Bldg. 210, Rooms 112 and 118.
Weekplan
The weekplan is preliminary. It will be updated during the course. Under each week there is a number of suggestions for reading material regarding that weeks lecture. It is not the intention that you read ALL of the papers. It is merely a list of papers and notes where you can read about the subject discussed at the lecture.
Week 1: Introduction and Hashing: Chained, Universal, and Perfect.
- J. Carter and M. Wegman, Universal Classes of Hash Functions. J. Comp. Sys. Sci., 1977
- M. Fredman, J. Komlos and E. Szemeredi, Storing a Sparse Table with O(1) Worst Case Access Time, J. ACM., 1984
- Scribe notes from MIT
- Peter Bro Miltersen’s notes from Aarhus
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 2: Predecessor Data Structures: x-fast tries and y-fast tries.
- P. van Emde Boas, Preserving Order in a Forest in less than Logarithmic Time, FOCS, 1975
- Dan E. Willard, Log-Logarithmic Worst-Case Range Queries are Possible in Space Θ(n), Inf. Process. Lett., 1983
- Scribe notes from MIT
- Mihai Patrascu and Mikkel Thorup, Time-space trade-offs for predecessor search, STOC 2006
- Cormen, Rivest, Leiserson, Introduction to Algorithms, 3rd edition, Chap. 20
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 3: Decremental Connectivity in Trees: Cluster decomposition, Word-Level Parallelism.
- S. Alstrup, J. P. Secher, M. Spork: Optimal On-Line Decremental Connectivity in Trees, Inf. Process. Lett., 1997
- S. Alstrup, J. P. Secher, M. Thorup: Word encoding tree connectivity works. SODA, 2000
- Scribe notes from MIT
- Slides
- More slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 4: Nearest Common Ancestors: Distributed data structures, Heavy-path decomposition, alphabetic codes.
- S. Alstrup, C. Gavoille, H. Kaplan, T. Rauhe, Nearest Common Ancestors: A Survey and a New Algorithm for a Distributed Environment, Theory of Comput. Sys., 2004
- D. Harel, R. E. Tarjan: Fast Algorithms for Finding Nearest Common Ancestors. SIAM J. Comput., 1984
- Scribe notes from MIT
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 5: Amortized analysis and Union-Find.
Amortized Analysis:
- Rebecca Fiebrink’s notes on amortized analysis from Princeton.
- Pawel Winter’s notes on amortized analysis from DIKU.
- R. E. Tarjan: Amortized Computational Complexity, SIAM. J. on Algebraic and Discrete Methods Volume 6, Issue 2, pp. 306-318 (April 1985)
Union-Find:
- R. E. Tarjan and J. van Leeuwen: Worst-case Analysis of Set Union Algorithms, JACM, 1984 .
- R. E. Tarjan: Efficiency of a Good But Not Linear Set Union Algorithm, JACM, 1975.
- Alstrup et al.: Union-Find with Constant Time Deletions, ICALP 2005.
- R. Seidel: Top-Down Analysis of Path Compression: Deriving the Inverse-Ackermann Bound Naturally (and Easily), SWAT 2006.
Week 6: Range Reporting: Range Trees, Fractional Cascading, and kD Trees.
- M. de Berg, O. Cheong, M. van Kreveld and M. Overmars, Computational Geometry: Algorithms and Applications, 2008
- B. Chazelle and L. Guibas: Fractional cascading: I. A data structuring technique, Algoritmica, 1986
- J. L. Bentley and D. F. Stanat. Analysis of range searches in quad trees. Inf. Process. Lett., 1975
- Scribe notes from MIT
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 7: Persistent data structures.
- H. Kaplan, Persistent Data Structures, In Handbook on Data Structures and Applications, D. Mehta and S. Sahni, editors, CRC Press, 2005.
- N. Sarnak and R. E. Tarjan, Planar Point Location Using Persistent Search Trees, CACM, 1986.
- J. R. Driscoll, N. Sarnak, D. D. Sleator, R. E. Tarjan, Making Data Structures Persistent, JCSS, 1989.
- P.F. Dietz, Fully Persistent Arrays, WADS 1989.
- G. F. Italiano and R.Raman, Topics in Data Strutures.
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 8: String matching.
- D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 1977.
- R. M. Karp and M. O. Rabin. Efficient randomized pattern-matching algorithms. IBM J. Res. Dev., 1987.
- A. V. Aho and M. J. Corasick. Efficient string matching: an aid to bibliographic search. Commun. ACM, 1975
- Cormen, Rivest, Leiserson, Introduction to Algorithms, 3rd edition, Chap. 32
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 9: String Indexing: Dictionaries, Tries, Suffix trees, and Suffix Sorting.
- Martin Farach-Colton, Paolo Ferragina, S. Muthukrishnan: On the sorting-complexity of suffix tree construction, J. ACM, 2000.
- Juha Kärkkäinen, Peter Sanders, Stefan Burkhardt: Linear work suffix array construction. J. ACM, 2006
- Dan Gusfield. Algorithms on Strings, Trees, and Sequences, Chap. 5-9.
- Scribe notes from MIT
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 10: Introduction to approximation algorithms. TSP, k-center, vertex cover.
- David P. Williamson and David Shmoys: The Design of Approximation Algorithms (sections 1.1., 2.2, and 2.4)
- Jeff Erickson: Non-Lecture K: Approximation Algorithms
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
You can also read about some of these problems in Kleinberg and Tardos: “Algorithm Design”, V.V. Vazirani: Approximation Algorithms, Cormen, Leierson, Rivest, and Stein: “Introduction to Algorithms”.
Week 11: Approximation algorithms: Set Cover, stable matching.
David P. Williamson and David Shmoys: The Design of Approximation Algorithms (page 24-28).Jeff Erickson: Non-Lecture K: Approximation Algorithms- Zoltán Király: Better and Simpler Approximation Algorithms for the Stable Marriage Problem (page 3-10)
- Slides
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
- A. Aggarwal and J. Vitter, “The Input/Output Complexity of Sorting and Related Problems“, CACM 31 (9), 1988.
- Erik Demaine, “Cache-Oblivious Algorithms and Data Structures”, Lecture Notes from the EEF Summer School on Massive Data Sets.
- Rezaul Alam Chowdhury and Vijaya Ramachandran, “Cache-oblivious dynamic programming“, SODA 2006.
- Exercises (Please hand in the mandatory one through Campusnet using the template provided)
Week 13: Round up, Questions, and further Perspectives.