Flexible tree structure,
tree builders (materialized path trees, recursive trees, data wrappers),
tree traversal iterators, filter iterator.
💿
composer require dakujem/oliva
This package is a modern reimplementation of oliva/tree.
A tree is a form of a graph, specifically a directed acyclic graph (DAG).
Each node in a tree can have at most one parent and any number of children.
The node without a parent is called the root.
A node without children is called a leaf.
Read more on Wikipedia: Tree (data structure).
Oliva trees consist of node instances implementing TreeNodeContract.
use Dakujem\Oliva\TreeNodeContract;
Oliva provides tree builders, classes that construct structured trees from unstructured data. The data is usually in the form of iterable collections, typically the result of SQL queries, API calls, YAML config, and the like.
$tree = (new TreeBuilder(
node: fn(mixed $data) => new Node($data), // A node factory.
))->build(
input: $anyDataCollection, // An iterable collection to build a tree from.
);
The builders are flexible and allow creating any node instances via the node factory parameter.
The simplest factory may look like this:
fn(mixed $data) => new \Dakujem\Oliva\Node($data);
... but it's really up to the integrator to provide a node factory according to their needs:
fn(mixed $anyItem) => new MyNode(MyTransformer::transform($anyItem)),
Anything callable may be a node factory, as long as it returns a class instance implementing \Dakujem\Oliva\MovableNodeContract.
use Dakujem\Oliva\MovableNodeContract;
💡
The full signature of the node factory callable is fn(mixed $data, mixed $dataIndex): MovableNodeContract.
That means the keys (indexes) of the input collection may also be used for node construction.
Each of the builders also requires extractors (described below): callables that provide the builder with information about the tree structure.
MPT (materialized path tree) refers to the technique of storing the position of nodes relative to the root node (a.k.a. "path") in the data.
There are multiple ways to actually do that and Oliva is agnostic of them.
A typical path of a node may look like the following:
- "1.33.2", "/1/33/2", also known as the delimited variant
- "001033002", known as the fixed (or fixed-width) variant with constant width of 3 characters per level (here 001, 033 and 002)
- [1,33,2], as an array of integers, here also referred to as a vector
Also, the individual references may either mean the position relative to other siblings on a given level, or be direct node references (ancestor IDs).
That is, a path 1.2.3 may mean "the third child of the second child of the first child of the root",
or it may mean that a node's ancestor IDs are [1,2,3], 3 being the parent's ID.
To enable all the different techniques, Oliva MPT TreeBuilder requires the user to pass a vector extractor function, e.g. fn($item) => explode('.', $item->path).
Oliva comes with two common-case extractor factories: Path::fixed() and Path::delimited().
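To make the notion of a vector more concrete, here is a minimal plain-PHP sketch of what a fixed-width and a delimited extractor might compute. The two function names are hypothetical and are not part of Oliva; the sketch also assumes a single-character delimiter.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers, not part of Oliva: they only illustrate what a vector extractor returns.

/** Splits a fixed-width path into its levels. */
function fixedPathToVector(?string $path, int $levelWidth): array
{
    if ($path === null || $path === '') {
        return []; // The root has an empty vector.
    }
    return str_split($path, $levelWidth);
}

/** Splits a delimited path into its levels, ignoring leading and trailing delimiters. */
function delimitedPathToVector(?string $path, string $delimiter): array
{
    // Assumption: $delimiter is a single character, so it can double as a trim() mask.
    $trimmed = trim((string) $path, $delimiter);
    return $trimmed === '' ? [] : explode($delimiter, $trimmed);
}

$fixed = fixedPathToVector('001033002', 3);     // ['001', '033', '002']
$slashed = delimitedPathToVector('/1/33/2', '/'); // ['1', '33', '2']
$dotted = delimitedPathToVector('1.33.2', '.');   // ['1', '33', '2']
```

All three calls describe the same position in a tree; only the storage format differs.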
The following tree will be used in the examples below:
[0] (the root)
|
+--[1]
| |
| +--[4]
| |
| +--[7]
|
+--[6]
| |
| +--[5]
|
+--[2]
|
+--[3]
Simple example of a fixed-length MPT:
use Any\Item;
use Dakujem\Oliva\MaterializedPath\Path;
use Dakujem\Oliva\MaterializedPath\TreeBuilder;
use Dakujem\Oliva\Node;
// The input data may be a result of an SQL query or any other iterable collection.
$collection = [
new Item(id: 0, path: ''), // the root
new Item(id: 1, path: '000'),
new Item(id: 2, path: '001'),
new Item(id: 3, path: '003'),
new Item(id: 4, path: '000000'),
new Item(id: 5, path: '002000'),
new Item(id: 6, path: '002'),
new Item(id: 7, path: '000001'),
];
$builder = new TreeBuilder(
node: fn(Item $item) => new Node($item), // How to create a node.
vector: Path::fixed( // How to extract path vector.
levelWidth: 3,
accessor: fn(Item $item) => $item->path,
),
);
$root = $builder->build(
input: $collection,
);
Same example with an equivalent delimited MPT:
use Any\Item;
use Dakujem\Oliva\MaterializedPath\Path;
use Dakujem\Oliva\MaterializedPath\TreeBuilder;
use Dakujem\Oliva\Node;
$collection = [
new Item(id: 0, path: null), // the root
new Item(id: 1, path: '.0'),
new Item(id: 2, path: '.1'),
new Item(id: 3, path: '.3'),
new Item(id: 4, path: '.0.0'),
new Item(id: 5, path: '.2.0'),
new Item(id: 6, path: '.2'),
new Item(id: 7, path: '.0.1'),
];
$builder = new TreeBuilder(
node: fn(Item $item) => new Node($item), // How to create a node.
vector: Path::delimited( // How to extract path vector.
delimiter: '.',
accessor: fn(Item $item) => $item->path,
),
);
$root = $builder->build(
input: $collection,
);
💡
Since child nodes are added to parents sequentially (i.e. without specific keys) in the order they appear in the source data, sorting the source collection by path prior to building the tree may be a good idea.
If sorting of siblings is needed after a tree has been built, LevelOrderTraversal can be used to traverse and modify the tree.
The same is true for cases where the children need to be keyed by specific keys. Use LevelOrderTraversal, remove the children, then sort them and/or calculate their keys and add them back under the new keys and/or in the new order.
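Sorting by path before building could look like the following sketch. It uses plain arrays with made-up rows instead of Item objects and assumes fixed-width paths, where plain string comparison is enough. Delimited paths with unpadded numbers (such as .10 versus .2) would need a vector-aware comparison instead, and in practice the sorting is often done by the database query.

```php
<?php

declare(strict_types=1);

// Rows as they might come from a database, in arbitrary order (hypothetical data).
$rows = [
    ['id' => 5, 'path' => '002000'],
    ['id' => 1, 'path' => '000'],
    ['id' => 0, 'path' => ''],
    ['id' => 6, 'path' => '002'],
    ['id' => 4, 'path' => '000000'],
];

// For fixed-width paths, string comparison puts every parent before its children
// and orders siblings by their position.
usort($rows, fn(array $a, array $b): int => strcmp($a['path'], $b['path']));

$sortedIds = array_column($rows, 'id'); // [0, 1, 4, 6, 5]
```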
Each node's data has a (recursive) reference to its parent node's data.
Probably the most common and trivial way of persisting trees.
use Any\Item;
use Dakujem\Oliva\Recursive\TreeBuilder;
use Dakujem\Oliva\Node;
$collection = [
new Item(id: 0, parent: null),
new Item(id: 1, parent: 0),
new Item(id: 2, parent: 0),
new Item(id: 3, parent: 0),
new Item(id: 4, parent: 1),
new Item(id: 5, parent: 6),
new Item(id: 6, parent: 0),
new Item(id: 7, parent: 1),
];
$builder = new TreeBuilder(
node: fn(Item $item) => new Node($item), // How to create a node.
selfRef: fn(Item $item) => $item->id, // How to get ID of self.
parentRef: fn(Item $item) => $item->parent, // How to get parent ID.
root: null, // The root node's parent value.
);
$root = $builder->build(
input: $collection,
);
Above, the selfRef and parentRef parameters expect extractors with signature fn(mixed $data, mixed $dataIndex, TreeNodeContract $node): string|int,
while root expects a value of the root node's parent.
The node with this parent value will be returned.
The most natural case for root is null because nodes without a parent are considered to be root nodes. The value can be changed to 0, "" or whatever else is suitable for the particular dataset.
This is useful when building a subtree.
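The role of the root value can be illustrated without the library. The following is a minimal sketch, assuming plain array rows and a hypothetical nestByParent helper; it is not how Oliva's builder is implemented, it only shows how the parent value selects the node that gets returned.

```php
<?php

declare(strict_types=1);

/**
 * Nests rows that reference their parent by ID into plain arrays.
 * Hypothetical helper for illustration only, not Oliva's implementation.
 *
 * @param list<array{id: int, parent: ?int}> $rows
 * @param int|null $root The parent value that marks the root row.
 */
function nestByParent(array $rows, ?int $root = null): ?array
{
    $rootRow = null;
    $childrenOf = []; // parent ID => list of child rows, in input order
    foreach ($rows as $row) {
        if ($row['parent'] === $root) {
            $rootRow = $row; // If several rows match, the last one wins in this sketch.
        } elseif ($row['parent'] !== null) {
            $childrenOf[$row['parent']][] = $row;
        }
    }
    if ($rootRow === null) {
        return null; // No row has the expected root parent value.
    }

    // Recursively attach the children collected above.
    $nest = function (array $row) use (&$nest, $childrenOf): array {
        return [
            'id' => $row['id'],
            'children' => array_map($nest, $childrenOf[$row['id']] ?? []),
        ];
    };

    return $nest($rootRow);
}

$rows = [
    ['id' => 100, 'parent' => 99], // No row with ID 99 exists.
    ['id' => 101, 'parent' => 100],
    ['id' => 102, 'parent' => 100],
    ['id' => 104, 'parent' => 101],
];

$subtree = nestByParent($rows, root: 99);
// $subtree['id'] is 100; its children are 101 (which contains 104) and 102.
```

Calling nestByParent($rows) with the default null root returns null here, because no row in this subtree has a null parent.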
In case the data is already structured as tree data, a simple wrapper may be used to build the tree structure.
use Any\Item;
use Dakujem\Oliva\Simple\TreeWrapper;
use Dakujem\Oliva\Node;
// $json = (new External\ApiConnector())->call('getJsonData');
// $rawData = json_decode($json);
// $rawData = yaml_parse_file('/config.yaml');
$rawData = [
'children' => [
[
'attributes' => [ ... ],
'children' => [
[
'attributes' => [ ... ],
'children' => [],
]
],
],
[
'attributes' => [ ... ],
'children' => [ ... ],
],
],
];
$builder = new TreeWrapper(
node: function(array $item) { // How to create a node.
unset($item['children']); // Note the unset call optimization.
return new Node($item);
},
children: fn(array $item):array => $item['children'] ?? [], // How to extract children.
);
$root = $builder->wrap($rawData);
Above, children expects an extractor with signature fn(mixed $data, TreeNodeContract $node): ?iterable.
💡
Remember, it is up to the integrator to construct the tree nodes. Any transformation can be done with the data, as long as a node implementing the MovableNodeContract is returned.
Using a manual node builder:
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Simple\NodeBuilder;
// Tell the builder how to create a node.
// The `$item` parameter of the node factory
// will contain the first argument to each `NodeBuilder::node` call.
$proxy = new NodeBuilder(
node: fn(mixed $item) => new Node($item),
);
$root =
$proxy->node('root', [
$proxy->node('first child', [
$proxy->node('leaf of the first child node'),
$proxy->node('another leaf of the first child node'),
]),
$proxy->node('second child'),
]);
Using the low-level node interface (MovableNodeContract) for building a tree:
use Dakujem\Oliva\Node;
$root = new Node('root');
$root->addChild($child1 = new Node('first child'));
$root->addChild($child2 = new Node('second child'));
$child1->setParent($root);
$child2->setParent($root);
$leaf1 = new Node('leaf of the first child node');
$child1->addChild($leaf1);
$leaf1->setParent($child1);
$leaf2 = new Node('another leaf of the first child node');
$child1->addChild($leaf2);
$leaf2->setParent($child1);
... yeah, that is not the most convenient way to build a tree.
Using the high-level manipulator (Tree) to do the same:
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Tree;
$root = new Node('root');
Tree::link(node: $child1 = new Node('first child'), parent: $root);
Tree::link(node: $child2 = new Node('second child'), parent: $root);
Tree::link(node: new Node('leaf of the first child node'), parent: $child1);
Tree::link(node: new Node('another leaf of the first child node'), parent: $child1);
... or a more concise way:
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Tree;
Tree::linkChildren(node: $root = new Node('root'), children: [
Tree::linkChildren(node: new Node('first child'), children: [
new Node('leaf of the first child node'),
new Node('another leaf of the first child node'),
]),
new Node('second child'),
]);
TL;DR Use the Tree::link static method.
In some cases it is sufficient to use the low-level interface of nodes (setParent, addChild, removeChild methods),
but in most cases Tree::link works better:
use Dakujem\Oliva\Iterator\Filter;
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Seed;
use Dakujem\Oliva\Tree;
// Create a find-by-id function (see Iterators section below):
$findById = fn(int $id, Node $tree) => Seed::firstOf(
new Filter($tree, fn(Node $node) => $node->data()?->id === $id)
);
// Move node with ID 7 into node ID 14:
$node = $findById(7, $root);
$parent = $findById(14, $root);
Tree::link($node, $parent);
The Tree::link call will take care of all the relations: unlinking the original parent and linking the subtree to the new one.
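To show why both sides of the relation need attention when moving a node, here is a minimal sketch with a stand-in class. SketchNode and relink are hypothetical names and do not reflect how Tree::link is implemented; the sketch also does not guard against linking a node under its own descendant.

```php
<?php

declare(strict_types=1);

// Minimal stand-in node, NOT Oliva's Node class.
final class SketchNode
{
    public ?SketchNode $parent = null;
    /** @var SketchNode[] */
    public array $children = [];

    public function __construct(public string $name)
    {
    }
}

/** Moves $node (with its whole subtree) under $newParent, keeping both directions in sync. */
function relink(SketchNode $node, SketchNode $newParent): void
{
    // 1. Detach from the current parent, if any.
    if ($node->parent !== null) {
        $node->parent->children = array_values(array_filter(
            $node->parent->children,
            fn(SketchNode $child): bool => $child !== $node,
        ));
    }
    // 2. Attach to the new parent in both directions.
    $node->parent = $newParent;
    $newParent->children[] = $node;
}

$root = new SketchNode('root');
$first = new SketchNode('first child');
$second = new SketchNode('second child');
$leaf = new SketchNode('leaf');

relink($first, $root);
relink($second, $root);
relink($leaf, $first);

// Move the leaf from the first child to the second one.
relink($leaf, $second);
```

Forgetting step 1 would leave the leaf listed under two parents, which is the kind of inconsistency a single linking call avoids.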
Oliva provides tree traversal iterators, generators and a filter iterator.
The traversal iterators and generators will iterate over all the tree's nodes in a specific order.
- Depth-first search, pre-order traversal
  - Traversal::preOrder generator
  - PreOrderTraversal iterator
- Depth-first search, post-order traversal
  - Traversal::postOrder generator
  - PostOrderTraversal iterator
- Breadth-first search, level-order traversal
  - Traversal::levelOrder generator
  - LevelOrderTraversal iterator
If unsure what the different traversals mean, read more about Tree traversal.
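As a quick refresher, the three orders can be sketched over plain nested arrays. The functions below are illustrative stand-ins, not Oliva's Traversal methods. Note that iterator_to_array is called with false as the second argument, because yield from preserves the keys of the inner generators and they would otherwise collide.

```php
<?php

declare(strict_types=1);

// Plain-array stand-in for a tree node (not Oliva's Node class).
function node(string $value, array ...$children): array
{
    return ['value' => $value, 'children' => $children];
}

function preOrder(array $node): Generator
{
    yield $node['value'];                 // Visit the node first...
    foreach ($node['children'] as $child) {
        yield from preOrder($child);      // ...then each subtree.
    }
}

function postOrder(array $node): Generator
{
    foreach ($node['children'] as $child) {
        yield from postOrder($child);     // Visit each subtree first...
    }
    yield $node['value'];                 // ...then the node itself.
}

function levelOrder(array $root): Generator
{
    $queue = [$root];                     // FIFO queue of nodes to visit.
    while ($queue !== []) {
        $node = array_shift($queue);
        yield $node['value'];
        foreach ($node['children'] as $child) {
            $queue[] = $child;
        }
    }
}

$tree = node('F',
    node('B', node('A'), node('D', node('C'), node('E'))),
    node('G', node('I', node('H'))),
);

$pre = iterator_to_array(preOrder($tree), false);     // F B A D C E G I H
$post = iterator_to_array(postOrder($tree), false);   // A C E D B H I G F
$level = iterator_to_array(levelOrder($tree), false); // F B G A D I C E H
```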
use Dakujem\Oliva\Iterator\Traversal;
use Dakujem\Oliva\Iterator\PreOrderTraversal;
use Dakujem\Oliva\Iterator\PostOrderTraversal;
use Dakujem\Oliva\Iterator\LevelOrderTraversal;
foreach(Traversal::levelOrder($root) as $node) { /* ... */ }
foreach(new LevelOrderTraversal($root) as $node) { /* ... */ }
💡
The key difference between the iterator classes and generator methods is in the keys. If the keys in the iteration do not matter, use the Traversal's generator methods for slightly better performance.
If the order of traversal is not important, a Node instance can simply be iterated over:
use Dakujem\Oliva\Node;
$root = new Node( ... );
foreach ($root as $node) {
// do something useful with the node
}
The above will iterate over the whole subtree, including the node itself and all its descendants.
💡
Traversals may be used to decorate nodes or even alter the trees.
Be sure to understand how each of the traversals works before altering the tree structure within a traversal, otherwise the results may be unexpected.
Finally, the filter iterator Iterator\Filter may be used for filtering either the input data or tree nodes.
use Dakujem\Oliva\Iterator\Filter;
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Seed;
// Filter the input before building a tree.
$filteredCollection = new Filter($sourceCollection, fn(Item $item): bool => $item->id > 5);
$root = (new TreeBuilder( ... ))->build(
$filteredCollection,
);
// Iterate over leaves only.
$filter = new Filter($root, fn(Node $node): bool => $node->isLeaf());
foreach($filter as $node){
// ...
}
// Find the first node that matches a criterion (data with ID = 42).
$node = Seed::firstOf(new Filter(
input: $root,
accept: fn(Node $node): bool => $node->data()?->id === 42,
));
Oliva does not provide a direct way to search for specific nodes, but it is trivial to create one yourself:
use Dakujem\Oliva\Iterator\Filter;
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Seed;
// Create a find-by-id function.
$findById = fn(int $id, Node $tree): ?Node => Seed::firstOf(
new Filter($tree, fn(Node $node) => $node->data()?->id === $id)
);
// Search for any node by ID.
$node_42 = $findById(42, $root);
A class may be appropriate too...
class TreeFinder
{
public function __construct(
public Node $tree,
) {
}
public function byId(int $id){ ... }
public function byName(string $name){ ... }
}
Normally, the keys will increment during a traversal (using any traversal iterator or generator).
use Dakujem\Oliva\Node;
$root = new Node( ... );
foreach ($root as $key => $node) {
// The keys will increment 0, 1, 2, 3, ... and so on.
}
When using any of the traversal iterators, it is possible to alter the key sequence using a key callable.
This example generates a delimited materialized path:
use Dakujem\Oliva\Iterator\PreOrderTraversal;
use Dakujem\Oliva\Node;
$iterator = new PreOrderTraversal(
node: $root,
key: fn(Node $node, array $vector): string => '.' . implode('.', $vector),
startingVector: [],
);
$result = iterator_to_array($iterator);
//[
// '.' => 'F',
// '.0' => 'B',
// '.0.0' => 'A',
// '.0.1' => 'D',
// '.0.1.0' => 'C',
// '.0.1.1' => 'E',
// '.1' => 'G',
// '.1.0' => 'I',
// '.1.0.0' => 'H',
//];
This example indexes the nodes by an ID found in the data:
use Dakujem\Oliva\Iterator\PreOrderTraversal;
use Dakujem\Oliva\Node;
$iterator = new PreOrderTraversal(
node: $root,
key: fn(Node $node): int => $node->data()->id,
);
The full signature of the key callable is
fn(
Dakujem\Oliva\TreeNodeContract $node,
array $vector, // array<int, string|int>
int $seq,
int $counter
): string|int
where
- $node is the current node
- $vector is the node's vector in a tree
  - it is a path from the root to the node with child indexes being the vector's elements
  - vector of a root is empty [] (or equal to $startingVector if passed to the iterator constructor)
  - current node's index within its parent's children is the last element of the vector
- $seq is the current sibling numerator (first child is 0, second child is 1, and so on)
- $counter is the default iteration numerator that increments by 1 with each node (0, 1, 2, ...)
  - without a key callable, this is the key sequence
All Oliva traversal iterators accept a key callable and a starting vector (a prefix to the $vector).
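How the four arguments evolve during a pre-order walk can be sketched with a plain generator over nested arrays. The keyedPreOrder and makeNode functions are hypothetical stand-ins and are not Oliva's iterators; the sketch assumes that the root is reported with a sibling number of 0.

```php
<?php

declare(strict_types=1);

// Plain-array stand-in for a tree node (not Oliva's Node class).
function makeNode(string $value, array ...$children): array
{
    return ['value' => $value, 'children' => $children];
}

/** Pre-order walk that asks $key for the key of every element it yields. */
function keyedPreOrder(array $root, callable $key, array $startingVector = []): Generator
{
    $counter = 0; // Increments by 1 with each visited node.
    $walk = function (array $node, array $vector, int $seq) use (&$walk, &$counter, $key): Generator {
        yield $key($node, $vector, $seq, $counter++) => $node['value'];
        foreach (array_values($node['children']) as $index => $child) {
            // The child's vector is the parent's vector plus the child's index.
            yield from $walk($child, [...$vector, $index], $index);
        }
    };
    yield from $walk($root, $startingVector, 0); // Assumption: the root gets $seq = 0.
}

$tree = makeNode('F',
    makeNode('B', makeNode('A'), makeNode('D', makeNode('C'), makeNode('E'))),
    makeNode('G', makeNode('I', makeNode('H'))),
);

// Keys built from the vector form a delimited materialized path.
$byPath = iterator_to_array(
    keyedPreOrder($tree, fn(array $node, array $vector): string => '.' . implode('.', $vector)),
);
// ['.' => 'F', '.0' => 'B', '.0.0' => 'A', '.0.1' => 'D', '.0.1.0' => 'C',
//  '.0.1.1' => 'E', '.1' => 'G', '.1.0' => 'I', '.1.0.0' => 'H']

// Returning the counter reproduces the default 0, 1, 2, ... key sequence.
$byCounter = iterator_to_array(
    keyedPreOrder($tree, fn(array $node, array $vector, int $seq, int $counter): int => $counter),
);
```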
💡
Be careful with iterator_to_array when using a key callable, because colliding keys will be overwritten without a warning.
The key callable SHOULD generate unique keys.
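The effect of colliding keys is a property of PHP's iterator_to_array and can be reproduced without the library. In this small demonstration the generator stands in for a traversal whose key callable returns a non-unique value; the data is made up.

```php
<?php

declare(strict_types=1);

// Stand-in for a traversal with a key callable that returns non-unique keys.
function traversalWithCollidingKeys(): Generator
{
    yield 'page' => 'Home';
    yield 'page' => 'About';
    yield 'post' => 'Hello world';
}

// foreach sees all three elements...
$seen = [];
foreach (traversalWithCollidingKeys() as $key => $value) {
    $seen[] = $value;
}

// ...but iterator_to_array() keeps only the last value for each key.
$byKey = iterator_to_array(traversalWithCollidingKeys());
// ['page' => 'About', 'post' => 'Hello world']

// Passing false as the second argument discards the keys and keeps every element.
$all = iterator_to_array(traversalWithCollidingKeys(), false);
// ['Home', 'About', 'Hello world']
```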
There may be situations where the source data does not contain a root.
It may be a result of storing article comments, menus or forum posts
and considering the parent object (the article, the thread or the site) to be the root.
One of the solutions is to prepend an empty data element and then ignore it during iterations if it is not desired to iterate over the root.
Observe the use of the Seed helper class:
use Any\Item;
use Dakujem\Oliva\MaterializedPath;
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Seed;
$source = Sql::getMeTheCommentsFor($article);
// When prepending `null`, care must be taken that both the extractor and the factory are able to cope with `null` values.
// Note the use of `?` nullable type hint indicator and null-safe `?->` operator.
$factory = fn(?Item $item) => new Node($item);
$pathExtractor = fn(?Item $item) => $item?->path;
$builder = new MaterializedPath\TreeBuilder( ... );
$root = $builder->build(
input: Seed::nullFirst($source), // `null` is prepended to the data
);
foreach(Seed::omitNull($root) as $node) { // The node with `null` data is omitted from the iteration
display($node);
}
We could also use Seed::merged to prepend an item with fabricated root data, but then Seed::omitRoot must be used to omit the root instead:
use Any\Item;
use Dakujem\Oliva\MaterializedPath;
use Dakujem\Oliva\Node;
use Dakujem\Oliva\Seed;
$source = Sql::getMeTheCommentsFor($article);
// We need not take care of null values anymore.
$factory = fn(Item $item) => new Node($item);
$pathExtractor = fn(Item $item) => $item->path;
$builder = new MaterializedPath\TreeBuilder( ... );
$root = $builder->build(
input: Seed::merged([new Item(id: 0, path: '')], $source),
);
foreach(Seed::omitRoot($root) as $node) { // The root node is omitted from the iteration
display($node);
}
A similar situation may arise when using the recursive builder on a subtree, when the root node of the subtree has a non-null parent.
This is solved simply by passing the root argument to the tree builder.
use Any\Item;
use Dakujem\Oliva\Recursive\TreeBuilder;
use Dakujem\Oliva\Node;
$collection = [
new Item(id: 100, parent: 99), // Note that no data with ID 99 is present
new Item(id: 101, parent: 100),
new Item(id: 102, parent: 100),
new Item(id: 103, parent: 100),
new Item(id: 104, parent: 101),
new Item(id: 105, parent: 106),
new Item(id: 106, parent: 100),
new Item(id: 107, parent: 101),
];
$builder = new TreeBuilder(
node: fn(Item $item) => new Node($item),
selfRef: fn(Item $item) => $item->id,
parentRef: fn(Item $item) => $item->parent,
root: 99, // Here we indicate what the parent of the root is
);
$root = $builder->build(
input: $collection,
);
If a node's parent matches the value, it is considered the root node.
Builders and iterators
If migrating from the previous library (oliva/tree), the most common problems are caused by
- the new builders not automatically adding an empty root node
- the new iterators iterating over the root node
For both, see "Materialized path tree without root data" and "Recursive tree without root data" sections in the "Cookbook" section above.
Node classes
Neither the magic property proxying nor the array access of Oliva\Utils\Tree\Node\Node is supported.
Migrating to the new Dakujem\Oliva\Node class is recommended instead of attempting to recreate the old behaviour.
Dakujem\Oliva\Node is very similar to Oliva\Utils\Tree\Node\SimpleNode, so migrating from that one should be trivial (migrate the getter/setter usage).
Run unit tests using the following command:
composer test
Ideas, issues and code contributions are welcome. Please send a PR or submit an issue.
And if you happen to like the library, give it a star and spread the word 🙏.