Archive 2013

Administrative info


State-of-the-art algorithmic techniques and models for massive data sets.

Official description


Inge Li Gø, office hours Friday 12-13, Building 322, room 018.



When and where

Monday 8.15-10. Bldg. 308/Aud. 11

Monday 10-12. Bldg. 324/Room 040

The course runs in the DTU spring semester (Feb 4th to May 13th). There is no teaching in the Easter break (March 25th and April 1st).

Mandatory exercises

Use the template.tex file to prepare your hand-in exercises. Compile using LaTeX. Upload the resulting pdf file (and only this file) via Campusnet. The finished pdf must be at most 2 pages.

Collaboration policy for mandatory exercises.

  • You may collaborate with fellow students on the hand-in exercises.
  • Collaboration is limited to discussion of ideas only, and you should write up the solutions entirely on your own.
  • Do not use or seek out solutions from previous years of the course, solutions from similar courses, or solutions found on the internet.
  • You should list your collaborators (see the template) as well as cite any references you may have used.


The weekplan is preliminary and will be updated during the course. Under each week there are a number of suggestions for reading material regarding that week's lecture. You are not expected to read ALL of the papers; the list shows where you can read about the subject discussed at the lecture.

Week 1: Introduction and Hashing: Chained, Universal, and Perfect.

Week 2: Predecessor Data Structures: x-fast tries and y-fast tries.

Week 3: Decremental Connectivity in Trees: Cluster decomposition, Word-Level Parallelism.

Week 4: Nearest Common Ancestors: Distributed data structures, Heavy-path decomposition, alphabetic codes.

Week 5: Range Reporting: Range Trees, Fractional Cascading, and kD Trees.

Week 6: Persistent data structures.

Week 7: Union-Find and amortized analysis (potential method).


Week 8: String Indexing: Dictionaries, Tries, Suffix trees, and Suffix Sorting.

Week 9: Introduction to approximation algorithms: TSP, k-center, and vertex cover.

Week 10: Approximation algorithms: Stable matching and TSP.

Week 11: Data Compression: Lempel-Ziv, Entropy, and Burrows-Wheeler Transform.

  • Jacob Ziv and Abraham Lempel: “A Universal Algorithm for Sequential Data Compression”. IEEE Transactions on Information Theory 23 (3): 337–34
  • Jacob Ziv and Abraham Lempel: “Compression of Individual Sequences via Variable-Rate Coding”. IEEE Transactions on Information Theory 24 (5): 530–536.
  • Guy E. Blelloch: Introduction to Data Compression.
  • Slides
  • Exercise

Week 12: External Memory: I/O Algorithms, Cache-Oblivious Algorithms, and Dynamic Programming.

Week 13: Round up, Questions, and Further Perspectives.


How should I write my mandatory exercises?

The ideal writing format for mandatory exercises is classical scientific writing, such as the writing found in the peer-reviewed articles listed as reading material for this course (not textbooks and other pedagogical material). One of the objectives of this course is to practice and learn this kind of writing. A few tips:

  • Write things directly: Cut to the chase and avoid anything that is not essential. Test your own writing by answering the following question: “Is this the shortest, clearest, and most direct exposition of my ideas/analysis/etc.?”
  • Add structure: Don’t mix up description and analysis unless you know exactly what you are doing. For a data structure explain the following things separately: The contents of the data structure, how to build it, how to query/update it, analysis of space, analysis of query/update time, and analysis of preprocessing time. For an algorithm explain separately what it does, analysis of time complexity, and analysis of space complexity.
  • Be concise: Convoluted explanations, excessively long sentences, fancy wording, etc. have no place in scientific writing. Do not repeat the problem statement.
  • Try to avoid pseudocode: Generally, aim for human-readable descriptions of algorithms that can easily and unambiguously be translated into code.
  • Examples only as support: Do not explain your algorithms and data structures with an example. Only use examples as additional illustration of your ideas.
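As a rough illustration of the “Add structure” tip, a data structure solution could be laid out along the following lines. This is only a sketch: it is not the course's template.tex, and the document class, exercise number, and paragraph titles are assumptions chosen for illustration.

```latex
% Hypothetical layout for one data structure exercise.
% Not the official template.tex; all headings are illustrative.
\documentclass[11pt]{article}
\begin{document}

\section*{Exercise 1} % placeholder exercise number

\paragraph{Data structure.} What is stored.

\paragraph{Construction.} How the data structure is built.

\paragraph{Query and update.} How queries and updates are performed.

\paragraph{Space.} Analysis of the space usage.

\paragraph{Query and update time.} Analysis of the query and update time.

\paragraph{Preprocessing time.} Analysis of the construction time.

\end{document}
```

Keeping description (the first three paragraphs) apart from analysis (the last three) follows the separation asked for above; for an algorithm, the same idea gives three parts: what it does, time complexity, and space complexity.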

How much do the mandatory exercises count in the final grade?

The final grade is an overall evaluation of your mandatory exercises and the oral exam combined. Thus, there is no precise division of these parts in the final grade. However, expect that (in most cases, and under normal circumstances) the mandatory exercises account for a large fraction of the final grade, and the oral exam is a “fine tuning” of your scores in the mandatory exercises.

What do I do if I want to do an MSc/BSc thesis or project in Algorithms?

Great! Algorithms is an excellent topic to work on 🙂 and Algorithms for Massive Data Sets is designed to prepare you to write a strong thesis. Some basic tips and points:

  • Let us know well in advance: Identifying an interesting problem in algorithms that matches your interest can take time. With enough time to go over the related literature and study up on relevant topics your project will likely be more successful. It may also be a good idea to do an initial “warm up” project before a large thesis to test ideas or survey an area.
  • Join the community: It is a very good idea to enter the local algorithms community at DTU to get a feel for what kind of stuff you could work on for your thesis and what thesis work in algorithms is about. Talk to other students doing thesis work in algorithms. Join the algolog mailing list and go to algorithms talks. Also, we strongly encourage you to go to the thesis defenses in algorithms announced on the mailing lists.
  • Collaborate: We strongly encourage you to do your thesis in pairs. We think that having a collaborator to discuss with greatly helps in many aspects of thesis work in algorithms. Our experience confirms this.
  • No strings attached: Choosing a topic for your thesis is important. You are welcome to discuss master's thesis topics with us without pressure to actually write your thesis in algorithms. We encourage you to carefully select your topic.
