RED-BLACK TREES From Ch.3 of Chris Okasaki, "Purely Functional Data Structures" > module RedBlackTrees where > data Color = Red | Black > deriving (Eq,Show) Red-Black trees are binary search trees in which all nodes are assigned a color. The leaves are considered black. > data RBTree a = Leaf | Node Color (RBTree a) a (RBTree a) > deriving (Eq,Show) When working with trees, we want to keep the following invariants true: 1) Children of a red node are black; 2) Every path from the root to a leaf has the same number of black nodes. Notice that these conditions imply that a long path can be at most twice as long as a short one. A shortest path has only black nodes, a longest path has alternating black and red nose. This in turn implies that the depth of the tree (maximum length of a path) is logarithmic in the number of elements in the tree. We must guarantee that insertion of a new element preserves the invariants. The balance function needs to fix violations of the first invariant. These happen when there are two consecutive red nodes under the root (the root will be colored black anyway). This can happen if four different ways. > balance :: Color -> RBTree a -> a -> RBTree a -> RBTree a > balance Black (Node Red (Node Red t1 x1 t2) x2 t3) x3 t4 > = Node Red (Node Black t1 x1 t2) x2 (Node Black t3 x3 t4) > balance Black (Node Red t1 x1 (Node Red t2 x2 t3)) x3 t4 > = Node Red (Node Black t1 x1 t2) x2 (Node Black t3 x3 t4) > balance Black t1 x1 (Node Red (Node Red t2 x2 t3) x3 t4) > = Node Red (Node Black t1 x1 t2) x2 (Node Black t3 x3 t4) > balance Black t1 x1 (Node Red t2 x2 (Node red t3 x3 t4)) > = Node Red (Node Black t1 x1 t2) x2 (Node Black t3 x3 t4) > balance c t x t' = Node c t x t' Insertion is done like for the regular binary search trees except that we fix the imbalance at every step. The auxilliary function ins can give an umbalanced tree only by having two red nodes at the top; this property is preserved when we apply balance. > insTree :: Ord a => a -> RBTree a -> RBTree a > insTree a tree = blackRoot (insa tree) > where insa Leaf = Node Red Leaf a Leaf > insa t@(Node color t1 x t2) > | a | a>x = balance color t1 x (insa t2) > | otherwise = t > blackRoot (Node _ t1 a t2) = Node Black t1 a t2 The advantage of using balanced trees is that insertion runs in log n time. One can then show that sorting by turning a list into a tree and viceversa take n.log n time. > listTree :: Ord a => [a] -> RBTree a > listTree [] = Leaf > listTree (a:as) = insTree a (listTree as) > treeList :: RBTree a -> [a] > treeList Leaf = [] > treeList (Node _ t1 x t2) = (treeList t1) ++ x:(treeList t2) > treeSort :: Ord a => [a] -> [a] > treeSort = treeList . listTree As an instance of a dictionary, a balanced tree can be searched in logarithmic time, instead of a worst-case of linear time. > class Dictionary h where > empty :: Ord a => h a > isEmpty :: Ord a => h a -> Bool > insDic :: Ord a => a -> h a -> h a > search :: Ord a => a -> h a -> Bool > instance Dictionary RBTree where > empty = Leaf > isEmpty Leaf = True > isEmpty _ = False > insDic = insTree > search = searchTree > searchTree :: Ord a => a -> RBTree a -> Bool > searchTree a Leaf = False > searchTree a (Node _ t1 x t2) > | a==x = True > | a<=x = searchTree a t1 > | otherwise = searchTree a t2