Part I: Data structures

The precise contents of Part I of algorithms for massive data sets are beginning to finalize. I’ll put some more detailed (but still preliminary) descriptions in the weekplan asap. Part I will be about advanced data structures and will cover a small selection of data structure problems. Collectively, the solutions to these problems will illustrate some of the most important techniques for organizing data efficiently. These are the problems that we plan to cover:

  • Hashing. Mostly universal hashing and perfect hashing. Universal hashing allows us to build efficient hash tables without the immensely unrealistic assumption of a truly random hash function.
  • Predecessor data structures. A balanced search tree supports predecessor queries in O(log n) time. We will dive into faster data structures such as the van Emde Boas data structure and “x-fast” and “y-fast” tries.
  • Nearest Common Ancestor. The nearest common ancestor of two nodes in a rooted tree is their common ancestor of greatest depth. We will attack this problem in a recent model of distributed data structures and look at some of its important applications.
  • Marked Ancestors and Decremental Connectivity. More juicy problems on trees.
  • Temporal Data Structures. Persistent data structures and retroactive data structures add a time dimension to operations on data structures.
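To give a taste of the first topic, here is a minimal sketch of a universal hash family in the Carter–Wegman style (the prime and function names are my own illustrative choices, not from the course notes): pick a prime p larger than any key, draw coefficients a and b at random, and hash into m buckets. For any two distinct keys x ≠ y, the collision probability over the random choice of (a, b) is at most 1/m — no assumption of a random function is needed.

```python
import random

P = (1 << 31) - 1  # a Mersenne prime, larger than any key we will hash

def make_universal_hash(m, rng=random):
    """Return a randomly chosen member of the family h(x) = ((a*x + b) mod p) mod m."""
    a = rng.randrange(1, P)  # a must be nonzero
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m

h = make_universal_hash(16)
assert 0 <= h(42) < 16   # hashes land in [0, m)
assert h(42) == h(42)    # a fixed h is deterministic on a fixed key
```

Perfect hashing builds on exactly this: resample (a, b) until a collision-free (or low-collision) function is found for a static key set.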

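For the predecessor problem, the O(log n) baseline mentioned above can be illustrated with plain binary search over a static sorted key set (a stand-in for a balanced search tree); this is exactly the bound that van Emde Boas and x/y-fast tries improve on for integer keys:

```python
import bisect

def predecessor(sorted_keys, x):
    """Largest key <= x in a sorted list, or None if no such key exists."""
    i = bisect.bisect_right(sorted_keys, x)  # first index with key > x
    return sorted_keys[i - 1] if i > 0 else None

keys = [2, 3, 5, 7, 11, 13]
assert predecessor(keys, 10) == 7    # largest key <= 10
assert predecessor(keys, 11) == 11   # exact hits count as predecessors
assert predecessor(keys, 1) is None  # nothing below the smallest key
```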

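The idea behind persistence in the last bullet can be sketched with a persistent stack using path copying (the class name is illustrative, not from the course): every update returns a new version, and all earlier versions remain queryable.

```python
class PersistentStack:
    """A fully persistent stack: push/pop return new versions, old ones survive."""

    __slots__ = ("head", "tail")

    def __init__(self, head=None, tail=None):
        self.head, self.tail = head, tail

    def push(self, x):
        return PersistentStack(x, self)  # old version is left untouched

    def pop(self):
        return self.head, self.tail      # the tail *is* the previous version

v0 = PersistentStack()
v1 = v0.push(1)
v2 = v1.push(2)
top, v3 = v2.pop()
assert top == 2
assert v2.head == 2  # v2 is unchanged by the pop
assert v1.head == 1  # v1 is still intact: that is persistence
```

Retroactivity is stronger still: it allows inserting or deleting operations in the *past* of the operation sequence, with all later versions updated accordingly.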
Welcome to the homepage for the course: “Algorithms for Massive Data Sets”, Spring 2010. We will use this blog to easily communicate and discuss course topics and issues. The front page contains the basic information for the course, including contents, teachers, schedule, exercises, and pointers to relevant literature.