Feedback on G5BADS formal coursework 2004-2005

The average mark for the coursework was 63 (out of 100). There were 161 submissions. The top mark was 93.

The mark is a composite of the mark for the program (whether it worked, how well it was written, how challenging the solution was) and the essay (mostly, complexity analysis, and how well the solution was justified).

The coursework was tested with the following test harness on a relatively small Scheduler (10000 tasks). The programs which ran in reasonable time then were tested on Schedulers of several millions tasks. The prize for the best ADS coursework goes to Ho Duc Thang , who wrote a very general solution . It is based on a binary heap, but makes it possible to change the branching factor of the tree, or in some cases, sort the underlying array (if there seem to be many consecutive calls to next()). It was the fastest program in most of the tests on large schedulers, and the background design and testing were very impressive.

There were several types of solutions.

  1. The simplest solution is along the lines of It stores tasks in an ArrayList using add() method of ArrayList, so complexity of insert() is O(1). To find the task with the smallest time stamp, it iterates through the ArrayList checking the stamps. Then it removes the task with the smallest time stamp (note that the removal in ArrayList is O(n) because we need to close the gap in the underlying array). So next() is O(n) and quite slow.

    There is a variant of this solution which is even slower. In that variant, insert() adds the task to the ArrayList and then sorts the ArrayList. Depending on the sorting method, this is O(n2) or O(n log n). If the ArrayList is sorted in descending order of tasks, the task with the smallest time stamp is in the last slot in the array and removing it is O(1). However, if the ArrayList was sorted in ascending order, then the task with the smallest time stamp is in slot 0, and removing it is O(n) since all the tasks have to be shifted one place to the left.

  2. Most people did or attempted to do a heap implementation of priority queue. Here is my model solution, which is not particularly fast; it could be speeded up by, for example, using an array to store elements and resizing it myself using System.arraycopy(). You need two files, implementing a heap of Tasks and which uses a TaskHeap. Both insert() and next() here are O(log n).

  3. There were several excellent solutions using binary search trees. insert() adds the task to the tree respecting the order of stamps, and next() finds, removes and returns the leaf on the leftmost branch. In a balanced tree, both operations are O(log n). In an arbitrary binary search tree, if the tree becomes unbalanced, they are O(n). When tasks are created with random time stamps, trees stay balanced and the program runs very fast. AVL and red-black trees of course stay balanced in any case.
  4. Finally, there were several interesting solutions which hashed tasks by time stamps in a huge array. Tasks with the same stamp i were placed in the bucket at slot i. insert() is O(1), however next() is O(n) in the worst case.
We did not mark down good programs which used a fixed size array too much. However, I did not include them in the prize competition.