Wednesday, September 10, 2008

Modified Preorder Tree Traversal

How do you represent an organizational structure in a SQL data table? When I ask this question in interviews the answer I always get is with an adjacency list (although no one refers to it as that)

An adjaceny list structure would look like this:


EmployeeId, EmployeeName,ManagerId
----------------------------------
1, Bob,
2, Frank, 1
3, Stan, 1
4, James, 3


This structure shoes that Bob is the boss with no manager. Frank and Stan report to Bob, and James reports to Stan. This structure is very common. If you want to list all of Stan's reports, its easy:

select EmployeeName from Employees where ManagerId = (select EmployeeId from Employees where EmployeeName = 'Stan')


But what if you want to list all of Bob's reports? I have yet had anyone answer this question completely to my satisfaction. The typical response is to use a recursive look up. That is all well and good if your tree isn't very deep, but can we do better? Yes!

We can instead structure our table as follows:


EmployeeName,LeftId, RightId
----------------------------------
Bob, 1, 8
Frank, 2, 3
Stan, 4, 7
James, 5, 6


Huh you ask what did we just do? Let visualize it like this:


1-bob-8
/ \
2-Frank-3 4-Stan-7
/
5 James 6



We start at the left of Bob and assign the first id 1. We move down his org chart and hit Frank and assign him next id of 2. Frank has no children so we go to his right node and assign next id of 3. The next sibling is Stan with id 4. Stan has a child so we go to James with id 5, then work our way back up the tree assigning James Right 6, Stan 7 and Bob 8.

So our algorithm is:

AssignId(Node n)
n.leftid = nextId()
foreach(Node child : n.children)
AssignId(child)
// No more children assign my right
n.rightid = nextId()


So what does this accomplish? Lets answer my orignal question. Display all reports of Bob:

(leftid,rightid) =select LeftId,RightId from Employee where EmployeeName = 'Bob'
select * from Employee where lft between $leftid, $rightid;


That's it. With this tree structure you can easily find all the children of a given node. Of course this method has drawbacks. Inserts are more difficult. But if you have a database that is mostly read only you may consider giving this a try.

Enjoy...

No comments: