Nested set model

The term nested sets (nested sets) denotes a model image of a tree with the aid of quantities that are nested. Here, the " is - child - of " relationship to an " is - subset - of" mapped relationship. The model was originally presented in the book SQL for Smarties by Joe Celko. It is mainly used in the context of database applications. This technique is also known under the name Modified Preorder Tree Traversal ( MPTT ).

Using Nested sets you bought the possibility subtrees or the path to the root to be able to read in constant time for the price of having to work when changing the tree with more complex operations.

  • Insert 3.1 knots
  • 3.2 Transforming a tree structure in nested sets
  • 3.3 Remove subtree
  • 4.1 Number of all nodes of a subtree
  • 4.2 path to a node
  • 4.3 All nodes of a subtree
  • 4.4 All leaves of a subtree
  • 4.5 Depth-first search

Classical tree

The classic approach to implementing a tree structure in a database is the adjacency model. Here, a reference is stored on its parent node to each node. The root thus has no reference to a parent node (the field is created with NULL ); a leaf is a node for which no other node points.

Tree as nested sets

Nested sets the model, the tree structure is transformed into interleaved sets. Each node corresponds to a subset; each subset is a subset of all its parent volumes.

In a database, this nested levels may be represented by two values ​​are determined for each subset, determining the limits of this subset. These values ​​are often referred to as left and right. The value on the left is always less than the right. Both values ​​of a subset are greater than the value of their parents left amount and less than the value of the right.

Example

The graph shows a tree which has been transformed into nested sets. The green numbers on the edges of a quantity, the above described values ​​to the left of the left edge and right on the right edge.

Operations

Insert node

The subset for the first node receives the values ​​1 for left and 2 for right. If another subset for a new node under an existing node inserted, is increased right for the new parents amount to 2. For this to be possible, each value left or right, which is larger than the original value to the right of the new parents quantity, are also increased by 2 before the entire tree. The values ​​to the right minus 2 and minus 1 right amount of parents are then assigned to the newly inserted subset as left and right.

It must be noted that, unlike the conventional received adjacency model, the order of the children of a node remains. The just- described procedure is equivalent to the new node always append to the end of the list of children of the parent node. It is also conceivable to insert a new subset at any point between the other subsets of the parent quantity. In this case, the values ​​left and right of the following on the newly inserted subset of children must also be increased.

Transforming a tree in nested sets

An existing tree structure can be converted into nested sets in which it is run by the depth. The edges are starting at 1 counted. Each time a node is entered, the value of this counter is assigned as the left and the counter is incremented. If the node is left, after all its children have been processed, the counter value is assigned to it as a right and the counter again increased.

Remove subtree

To remove a complete subtree, first the values ​​of the left and right of the root of the subtree to be determined. Then it can be deleted with all its subsets by all nodes are deleted, their values ​​are for left and right within these values ​​or agree with them. Finally, all the left - right values ​​and all values ​​must be reduced by the width of the subtree that are greater than the right value of the node to remove; the width is in this case the difference between the right and left whose root plus 1

Evaluation

Number of all nodes of a subtree

Half the width of a sub-tree corresponding to the number of nodes in the subtree including the root. Thus, the number of nodes are measured in the whole tree by the right value minus the value to the left of the root divided by two, and ( ab ) is rounded.

Path to a node

In contrast to Adjazenzlistenmodell the path to a node must not be determined recursively when nested sets model. It is enough to select all subsets whose left - values ​​are smaller and their right values ​​are simultaneously greater than the considered node. If these subsets in order of their left - value corresponds to their order the way from the root to the node under consideration.

All nodes of a subtree

All nodes below a given node can be determined by all subsets are selected whose values ​​are within the values ​​left and right of the processed node left and right. When Adjanzenzlistenmodell here should also be taken recursively.

All the leaves of a subtree

The query for a complete subtree can be easily modified so that only leaves so nodes without children of their own selected. For this purpose is used in the selection as an additional criterion right = left 1.

Depth-first search

Using a self -join of the tree can be traversed by the depth. Here, all tuples from two nodes are selected, in which the values ​​is to the left and right of the first node between the values ​​of the left and right of the second node. The depth of each node can be determined in which it is counted how often occurs a tuple with that node as the first node. The results are ordered left of the nodes included after the value.

This query might look like in SQL like this:

SELECT ( COUNT ( parent.id ) -1 ) AS depth, node.id   FROM tree AS node, tree AS parent   WHERE node.l BETWEEN parent.l AND parent.r   GROUP BY ORDER BY node.id node.l; Pros and Cons

  • The complexity of the reading is stable in most cases. When Adjazenzlistenmodell the effort is linearly dependent on the depth of the processed node. In exchange for a change in the tree at the Nested Sets model is much more complex than in Adjanzenzlistenmodell. Thus, it is more suitable for structures that are not modified frequently, but are very often read.
  • The representation as sets fits better in the set-oriented database world. The query language SQL can be better operated on quantities than on hierarchical structures.
  • The queries of the direct children of a node is difficult.
  • The order of the children of a node remains the same.
577757
de