While doing computational geometry research with Professor Pankaj K. Agarwal and Thomas Mølhave at Duke, we looked at using quadtrees and the well-separated pairs decomposition (WSPD) for one of our algorithms. As I was reading about the algorithms and geometric concepts, I decided to code them in Javascript and HTML5’s canvas to get more intuition about their behavior. **I will describe the basics of the math behind the geometry in this blog post. To just play with the demo go here.**

*[All of what I am describing is based off of my reading of Professor Sariel Har-Peled's book on the subject. I suggest you read that (or other online notes) if you are really interested in the mathematical properties of well-separated pairs decomposition (or a range of other geometric topics).]*

#### Quadtrees

Quadtrees are a fairly common geometric concept and also data structure for storing points in \(d\)-dimensional space. We assume all points are within some large cube or in 2-dimensions, as we will focus on, a square. We split the square into four equal sized smaller squares. We can do this again on each of the smaller squares and do this continuously (recursively). We consider each square to be a cell and the four smaller cells within it to be its “children.” A quadtree is a division of the square into smaller and smaller cells until every point has its own cell.

Therefore, we can search the space fairly easily and quickly. We can even plot non-spacial things in a spacial metric to make use of the data structure. For computer science-minded people, these cells are organized in a tree (why its called a quadtree), compressed for cells that don’t contain any points, and thus each leaf node contains one point. Searching the quadtree is thus fairly fast. The properties and uses of quadtrees, and its many variants, such as how accurate it is for nearest neighbor searches, are extensive and, again, can be read in Professor Har-Peled’s notes among other places.

#### Well-Separated Pairs Decomposition

The well-separated pairs decomposition provides a more specific method than quadtrees of characterizing the general distance between points in a large set of points without storing the distance for each of the \( {n \choose 2} \) combinations. Broadly, it does this by clustering points that are close together and only describing the distance between such clusters. The conditions for a WSPD are relatively simple, but require some mathematical notation. I will try to give a simple description below.

We say that the points are clustered into sets. The diameter of a set \(Q\) is $$\mathrm{diam}(Q) = \max_{p,q \in Q} \| p – q \|$$ where \( \|p-q\| \) is the distance between \(p\) and \(q\). In English, the diameter is the distance between the two points furthest from each other in the set. Two sets \(Q\) and \(R\) are \(1/\epsilon\)-separated if $$\max({\rm diam}(Q),{\rm diam}(R)) \leq \epsilon \cdot d(Q,R).$$Here $$d(Q,R) = \min_{q\in Q, s\in R} \|q-s\|,$$or \(d(Q,R)\) is the smallest distance between any pair of points from \(Q\) and \(R\), intuitively giving how close the two clusters are. Therefore, if two sets are \(1/\epsilon\)-separated then all of the points in one set are roughly the same distance from all of the same points in the other set, with some error relative to \(\epsilon\). Finally, a pair decomposition is a set of set pairs: $$\mathscr{W} = \{ \{A_1,B_1\}, \ldots \{A_s, B_s\} \}$$ such that:

- All points in any \(A_i\) or \(B_i\) are part of the original set of data points, \(P\): $$A_i,B_i \subset P$$
- For any pair \(\{A_i,B_i\}\) there is no overlap in points between the two sets $$A_i \cap B_i = \emptyset$$
- For any pair of
*points*\(p,q \in P\) there is at least one pair \(\{A_i,B_i\}\) such that \(p\) is in \(A_i\) and \(q\) is in \(B_i\)

Therefore, if we have a well-separated pair decomposition, then we have a pair decomposition where each pair of sets \(\{A_i,B_i\}\) are \(1/\epsilon\)-separated. The third criteria for a pair decomposition here is most interesting, simply stating that for any pair of points we can estimate their distance apart with the WSPD.

One algorithm to construct the WSPD climbs down the quadtree and uses the cells of the quadtree as possible set pairs in the WSPD. This is the algorithm I used in my implementation, as is shown in the Javascript [pseudocode] below:

^{?}View Code JAVASCRIPT

function ConstructWSPD(cell1, cell2) { var arr = new Array(); if(cell1.diam() < cell2.diam()) { // Swap cell1 and cell2 var temp = cell1; cell1 = cell2; cell2 = temp; } if(cell1.diam() < epsilon * d(cell1,cell2)) { //The two cell's points are 1/epsilon-separated // This is a theoretical class SetPair which would link the two sets arr.push(new SetPair(cell1.points, cell2.points)); //My code actually uses the following statement to draw the relationship, rather than store it efficiently arr.push(new Dumbell( new PBall(cell1.points), new PBall(cell2.points) )); return arr; } if(cell1.children != null) { for(var i = 0; i < 4; i++ ) { arr = arr.concat(ConstructWSPD(cell1.children[i], cell2)); } } return arr; } // Here QuadTreeHead is the top cell in the quadtree: var wspd = ConstructWSPD(QuadTreeHead,QuadTreeHead); |

The code above recursively steps through the quadtree attempting to use different cells from the quadtree as sets for the WSPD. **Check out my small web application here to get some more intuition about how WSPDs relate to quadtrees as well as the impact of \(\epsilon\) on the WSPD.** There are many interesting properties and uses of quadtrees and WSPDs that I encourage you to read about further online or in Professor Har-Peled’s book.

## 2 responses so far ↓

1

Chad Brewbaker// Jan 16, 2012 at 4:13 pmGreat post. If more people followed Callahan and Kosaraju IMHO the state of the art would be far better than it is today.

You guys using Eppstien et al’s framework?

1. Morton sort the points.

2. Do an all nearest larger values to get the tree structure.

3. Do a rake and compress.

2

Panos Parchas// May 15, 2012 at 9:00 amWell done!

## Leave a Comment