Marco Benini

A few notes on balanced trees

In the last project of the course “Algoritmi and Strutture Dati”, there was a problem requiring to find all the points containing some objects in a square of given width and centred aroung a given point.

The space is a plane (well, the project was a bit more complex, but it doesn’t matter here), whose points have integer coordinates ranging from 0 to MAX, where MAX is some large value. Any point may contain some object, represented by an integer and the goal was to compute a function depending on the objects inside the square.

Since the space is large, and the objects are rare, it is reasonable to organise them in a balanced tree (of some sort), to improve efficiency of the searching algorithm. So let’s assume that the objects are organised in a tree such that, for every node, its left children are “less” than the node, and the right children are “greater” than the node.

A not-so-obvious improvement (at least for students) in retrieving the objects in the square is as follows. First, we can order the objects by their coordinates: the point (x,y) is less then (u,v) when either x < u or x = u and y < v. This is just the good, old lexicographic ordering of pair of integers. Let’s assume the tree uses this ordering.

Then, calling (cx,cy) the center of the square and D the length of its edge, a point (x,y) lies in the square if and only if, (cx -D/2) < x < (cx + D/2) and (cy – D/2) < y < (cy + D/2). Thus, if a node (x,y) in the tree satisfies these two conditions, it lies inside the square.

But is it really needed to scan the whole tree? The answer is NO! In fact, if (x,y) is in the tree and it falls inside the square but its left child (lx,ly) is outside, then we automatically know that all the points in the left subtree of (lx,ly) are necessarily outside the square. Simmetrically, if the right child (rx,ry) of (x,y) lies outside the square then the right subtree of (rx,ry) lies outside the square.

So, we can cut off potentially large parts of the tree by a simple comparison. Doing it extensively individuates the “area” of the tree we have to consider, and reduces the part we have to scan.

——

Another part of the problem required to find the object which was closest to a given point.

Again, we can use the same “trick” as above. In fact, calling (px,py) the point, we can calculate the distance from (x,y), the root node of the tree. The distance is d = ((px -x)^2 + (py-y)^2)^(1/2).

Now, an object in (x’,y’) can be closer to (px,py) than (x,y) only if (x’,y’) lies in the square centred in (px,py) of edge 2d. Notice that a point in this square is not necessarily closer, but for sure a point outside the square is not!

So we can cut part of the tree when scanning for the closest object. By recursively applying the same idea, we can further restrict our search and test a limited number of candidates.

——

The lesson of this post is that balanced trees with a clever ordering, can save a lot of computational work when we need to scan an area, with no extra data structures devoted to the task.

Leave a comment

Information

This entry was posted on May 3, 2011 by in Computer Science, Courses & Lectures, Students, Theoretical Computer Science.