A TreeKEM Library in JavaScript/ECMAScript Focusing on Generalization about Trees and Known Variants, Extensibility, and Modularization
TreeKEM is a continuous group key agreement (CGKA) protocol, which means:
- it aims at establishing agreed/shared keys among users;
- it serves a group (in contrast to just a pair) of users;
- it reacts to continuous, dynamic group membership changes, where forward secrecy (FS) and post-compromise security (PCS) get into consideration.
Find the library source codes in src/
.
src/TreeKEM.js
gives the TreeKEM protocol.
src/trees/
contains various tree data structures.
src/crypto/
contains various cryptography suites, including an insecure counter implementation for profiling efficiency.
TBD.
In this specification, for simplicity, we regard a tree as a child list, despite there being tons of other fields at a tree node.
E.g., a leaf is then the empty list ()
, and ((), ())
gives a simple binary tree.
We denote some handy list operations as follows:
#L
: the length of listL
;L/x/y
: replace itemx
in listL
by itemy
, with the assertion thatx
must exist (only once) inL
; we also writeL/x/
for removing itemx
;L1 ++ L2
: concatenate two listsL1
andL2
.
Each (balanced) tree data structure provides the following interface:
init(n: PosInt) -> Tree
: initialize a tree withn
leaves;add(t: Tree, l: Leaf; h: Leaf?) -> Tree
: add a leafl
to treet
, with an optional "hint" leafh
suggesting that leafl
is added as "close" toh
as possible, where the distance is roughly measured by the graph distance betweenl
andh
on the tree;remove(t: Tree, l: Leaf; h: Leaf?) -> Tree
: remove a leafl
from treet
, with an optional "hint" leafh
suggesting that, roughly speaking, changes to the tree are as "close" toh
as possible.
We also assume that parent links are available, denoted by parent(t: Tree) -> Tree?
, which enables writing algorithms bottom-up.
Note that parent(t) = null
checks whether t
is a root.
We define a handy bottom-up utility replace(t: Tree, t': Tree) -> Tree
:
- if
t
is root then returnt'
- let
p := parent(t)
- let
p' := p/t/t'
- return
replace(p, p')
MLS takes a "lazy remove" approach and introduces "blank" leaf nodes. Here we refer to these nodes as "removed" instead, since the concept of "blank" nodes also occurs in TreeKEM with a different (yet related) meaning. Also, the notion of being "removed" naturally extends to internal nodes: an internal node is "removed" if all of its children are "removed", recursively. To remark, the "lazy remove" approach applies to arbitrary tree data structure; in particular we have the following generic implementations.
Function add(t: Tree, l: Leaf; h: Leaf?) -> Tree
:
- if there is no "removed" leaf in
t
then undefinedThis branch means the generic add operation is infeasible, and a concrete tree data structure must give its own implementation for the add operation at least in this case.
- if
h ≠ null
then letr
be an arbitrary (say, leftmost) closest "removed" leaf toh
- else let
r
be an arbitrary (say, random) "removed" leaf int
- return
replace(r, l)
Function remove(_t: Tree, l: Leaf; _h: Leaf?) -> Tree
:
(ineffective hint)
- let
r
be a new "removed" leaf - return
replace(l, r)
Definition. An LBBT is either a single leaf node, or a tree satisfying all of the following:
- the root has two children (i.e., binary and full);
- the left child is a perfect binary tree;
- the right child is recursively an LBBT, whose height is no larger than the left child.
In some sense LBBT is the append-only binary tree data structure.
Function init(n: PosInt) -> LBBT
:
There is a unique structure for LBBT with n
leaves; namely:
- if
n = 1
then return()
- let
h := floor(log2(n))
- if
n
is a power of 2, i.e., ifn = 2^h
, then return a perfect binary tree with heighth
- let
tl
be a perfect binary tree with heighth
- let
tr := init(n - 2^h)
, recursively. - return
(tl, tr)
Function append(t: LBBT, l: Leaf) -> LBBT
:
- if
t
is perfect then return(t, l)
Note that leaf is always perfect.
- let
(tl, tr) := t
- return
(tl, append(tr, l))
Function truncate(t: LBBT) -> LBBT?
:
This function truncates a tree with "removed" nodes, so that the rightmost leaf is not "removed" (and non-"removed" leaves remain untouched).
- if all leaves in
t
are "removed" then returnnull
- if
t
is leaf then returnt
- let
(tl, tr) := t
- let
tr' := truncate(tr)
- if
tr' ≠ null
then return(tl, tr')
- return
truncate(tl)
Function add(t: LBBT, l: Leaf; h: Leaf?) -> LBBT
:
- if there is "removed" leaf in
t
then returnGenericLazy.add(t, l; h)
- return
append(t, l)
Function remove(t: LBBT, l: Leaf; _h: Leaf?) -> LBBT
:
(ineffective hint)
- let
t' := GenericLazy.remove(t, l)
- return
truncate(t')
If the return value is
null
then it means the last node in the tree is removed and we might raise an error, depending on the use case.
In add
:
instead of GenericLazy.add(t, l; h)
, use GenericLazy.add(t, l; null)
(notice the null hint, which lets GenericLazy.add
choose a random "removed" leaf).
In add
:
- let
m
be the number of "removed" leaf int
- with probability
m/(m+1)
, returnGenericLazy.add(t, l; null)
(notice the null hint, which letsGenericLazy.add
choose a random "removed" leaf) - with probability
1/(m+1)
, returnappend(t, l)
In add
:
do not GenericLazy.add
and always append
.
In add
:
instead of GenericLazy.add(t, l; h)
, use GenericLazy.add(t, l; null)
and inside use the leftmost instead of a random "removed" leaf as r
.
In remove
:
do not truncate
and return t'
directly.
In perfect mode, LBBT is restricted to always be a perfect binary tree (of course allowing "removed" leaves). The algorithms change as follows.
- In
init
: create a perfect binary tree with heighth := ceil(log2(n))
, where the rightmost2^h - n
leaves are "removed" - In
append
: create a perfect binary treetr
with the same height ast
, and with leftmost leafl
(and other leaves intr
are all "removed"); return(t, tr)
- In
truncate
: instead of computingtr'
, if not all leaves intr
are "removed" then returnt
(and otherwise still returntruncate(tl)
recursively)
Our implementation of LBBT above is based on the generic "lazy" operations.
Note that technically speaking, the tree is no longer worst-case balanced (i.e., height is logarithmic in the number of true leaves) by having lazily removed leaves.
It is actually possible to get rid of the generic "lazy" operations and design a balanced mode for LBBT.
In particular, truncate
can only help remove a rightmost leaf, and we need an operation for removing node at arbitrary position.
To better see the idea behind the new algorithm, consider the following view:
an LBBT with n
leaves is a "chain on the right" of perfect subtrees with respectively 2^h[1], …, 2^h[m]
leaves, corresponding to the binary expansion of n
(and we order from the least significant power of 2 to the most);
the perfect subtrees t[1], …, t[m]
are "chained on the right" as (t[m], (t[m-1], … (t[2], t[1]) … ))
.
Under this view, to remove a leaf is then to "split" the perfect subtree for h[i]
containing that leaf into 0, 1, …, h[i]-1
, and then to "merge" these with h[1], …, h[i-1]
.
One (or essentially, the) strategy minimizing further breaking up the perfect subtrees is to borrow h[1]
(note that h[1] < h[i]
and thus h[1] ≤ h[i]-1
), let h[1], h[1], …, h[i]-1
reconstruct h[i]
, and put the remaining 0, 1, …, h[1]-1
to the head.
This strategy leads to the following pop
algorithm.
Function split(t: LBBT, v: LBBT) -> LBBT[]
:
- return the copath of
v
belowt
, bottom-up
Function merge(L: LBBT[]) -> LBBT
:
- if
#L = 1
then return the first item inL
- let
tr, tl
be the first two items inL
, andL'
be the remaining items - return
merge([(tl, tr)] ++ L')
Function pop(t: LBBT, l: Leaf) -> LBBT
:
- let
p
be the highest root of a perfect subtree on the path froml
to the rootp
would be the root of the perfect subtree forh[i]
containingl
. - let
ip := parent(p)
Note that
ip
, if not null, must be a root of an imperfect subtree, by the extremality ofp
. - if
p = l
:- if
p
is root then undefinedThis branch means the last node in the tree is removed and we might raise an error, depending on the use case.
- let
(ps, _p) := ip
Note that
p = l
must be a right child: ifp
were a left child, then the right child must also be a single leaf by the constraint of LBBT, and thenip
would be perfect, contradicting the fact thatip
must be imperfect. - return
replace(ip, ps)
- if
- let
S := split(p, t)
- if
p
is root orp
is the right child ofip
:- return
replace(p, merge(S))
- return
- let
(_p, ps) := ip
- let
r
be the first root of a perfect subtree along the "right child chain" ofip
r
would be the root of the perfect subtree for the smallesth[m]
. - if
r = ps
andr
is a single leaf:- return
replace(ip, merge([r] ++ S))
- return
- let
Sr := split(ps, r)
(ifr = ps
thenSr := []
) - split
S
intoSlt ++ Sge
, where each perfect subtree inSlt
has height less than that ofr
, and each inSge
has height at least that ofr
- let
p' := merge([r] ++ Sge)
Note that
p'
is a perfect (sub)tree. - let
ps' := merge(Slt ++ Sr)
- let
ip' := (p', ps')
- return
replace(ip, ip')
Then in remove
, we can use pop
instead of GenericLazy.remove
and enjoy truly balanced LBBT.
Definition.
A BT with maximum degree/order Δ ≥ 3
and minimum degree δ := ceil(Δ/2)
is a tree satisfying all of the following:
- every node has at most
Δ
children; - the root has at least two children, and every non-root internal node has at least
δ
children; - every leaf has the same depth.
BT with Δ=3
(and δ=2
) is also called 2-3 tree.
BT with Δ=4
(and δ=2
) is also called 2-3-4 tree.
Function init(n: PosInt; h: Int ?= ceil(log_{Δ}(n))) -> BT
:
In contrast to LBBT, the structure of a BT with n
leaves is not unique, and here we just give an arbitrary design (that achieves minimum depth):
- if
n = 1
then return()
- let
k := min(floor(n / δ^{h-1}), Δ)
- let
m := floor(n / k)
,r := n % k
- return
(t[1], …, t[k])
, wheret[i] := init(m+Int(i≤r); h-1)
Function addSibling(t: BT, s: BT; t': BT ?= t) -> BT
:
This function can be interpreted as to "add
l
as a sibling tot
, while replacingt
byt'
".
- if
t
is root then return(t', s)
- let
p := parent(t)
- let
peers := p/t/t'
- if
#p < Δ
then:- let
p' := peers ++ (s)
- return
replace(p, p')
Now we can assert
#p = #peers = Δ
. - let
- split
peers
intop'
andsibs
, described belowThe strategy ensures
#p' = δ
, and hence#sibs = Δ-δ ∈ [δ-1,δ]
. - let
ps := sibs ++ (s)
Note that
#ps ∈ [δ,δ+1] ⊆ [δ,Δ]
. - return
addSibling(p, ps; p')
Split strategy:
At a high level, the strategy splits peers
(length Δ
) into two balanced halves p'
and sibs
, with t'
in the sibs
part, which is no larger than the p'
part; constructively:
- if
Δ
is odd andt'
is at the middle ofpeers
(i.e., has indexδ
if counting from 1) thenpeers := peers/t'/ ++ (t')
(i.e., to movet'
to the end) - if
t'
is not among the firstδ
items ofpeers
then split byp' ++ sibs
, where#p' = δ
- if
t'
is not among the lastδ
items ofpeers
then split bysibs ++ p'
, where#p' = δ
The two cases are now complete due to the fact that
t'
is not at the middle for oddΔ
.
Function removeSelf(t: BT, h: BT?; s: BT? ?= null, s': BT? ?= null)
:
This function can be interpreted as to "remove
t
with 'hint'h
, while replacing siblings
bys'
".
- if
t
is root then undefinedThis branch means the last node in the tree is removed and we might raise an error, depending on the use case.
- let
p := parent(t)
- let
p' := p/s/s'/t/
(ifs = null
thenp' := p/t/
) - if
p
is root or#p' ≥ δ
then:- if
#p' = 1
then returnsib
, where(sib) := p'
- return
replace(p, p')
Now we can assert
#p' = δ-1
. - if
- let
gp := parent(p)
,hp := parent(h)
(ifh = null
thenhp := null
) - choose a sibling
ps
ofp
to borrow from or merge into, described belowThe strategy ensures
#ps > δ
if to borrow from, and#ps ≤ Δ-δ+1
if to merge into. - if to borrow from:
- if
ps = hp
then splithp
intops'
andsibs
, described belowThe strategy ensures
#ps' ≥ δ
, and#sibs ≥ 1
(assibs
at least containsh
). - else split
ps
intops' ++ sibs
, where#ps' = δ
- let
p'' := sibs ++ p'
Note that
#p'' ∈ #ps - [δ,#ps-1] + δ-1 = [δ,#ps-1] ⊆ [δ,Δ-1]
. - let
gp' := gp/p/p''/ps/ps'
- return
replace(gp, gp')
- if
- let
ps' := ps ++ p'
Note that
#ps' ∈ [δ,Δ-δ+1] + δ-1 = [2δ-1,Δ] ⊆ [Δ-1,Δ]
. - return
removeSelf(p, hp; ps, ps')
Borrow-or-merge strategy:
- if
hp
is a sibling ofp
then:- if
#hp ≤ Δ-δ+1
then merge intohp
- else borrow from
hp
Note that in the borrow case we always have valid
#hp > Δ-δ+1 ≥ δ
; similarly for the borrow case below.
- if
- if there exists a sibling
ps
ofp
with#ps ≤ Δ-δ+1
then merge intops
- else borrow from a sibling
ps
ofp
The choice of
ps
in these two cases can be arbitrary; we simply pick the first one (meeting the requirement).
Borrow-hint strategy:
- if
h
is not among the firstδ
items ofhp
then splithp
intops' ++ sibs
, where#ps' = δ
- if
h
is not in the lastδ
items ofhp
then splithp
intosibs ++ ps'
, where#ps' = δ
- else let
ps' := ps/h/
,sibs := (h)
In all cases,
h
is contained in thesibs
part.
Function add(t: BT, l: Leaf; h: Leaf?) -> BT
:
- if
h = null
then use an arbitrary (say, random) leaf int
ash
- return
addSibling(h, l)
Function remove(t: BT, l: Leaf; h: Leaf?) -> BT
:
- return
removeSelf(l, h)
In add
:
use a random leaf instead of the given value (if not null) as h
.
In add
:
use a (say, leftmost) "optimal" leaf, described below, instead of the given value (if not null) as h
.
For every leaf, we define its "imperfect" height to be the minimum h
so that the height-h
subtree containing the leaf is non-full/imperfect, i.e., the subtree has less than Δ^h
leaves.
E.g., a leaf itself is always perfect; a height-1 internal node is imperfect if and only if it has degree less than Δ
, etc.
An "optimal" leaf is a leaf with minimum "imperfect" height.
The purpose of using an "optimal" leaf is to greedily minimize the number of recursions in addSibling
.
The borrow-or-merge strategy above essentially defines a precedence among the 4 choices:
- merge into
hp
(parent of hint) - borrow from
hp
- merge into some
ps
(sibling of parent) - borrow from some
ps
We pick the above particular permutation as it intuitively best respects the hint: to recall, the hint suggests that changes to the tree are as "close" to the hint as possible. This is because:
- If
hp
, parent of hint, is among the siblings of parent then either merge into or borrow (hint) fromhp
helps align the changes to the tree with hint. - In this case we prefer merge than borrow as
hp
after borrowing would be an extra change on the copath of hint, while merging gives no extra change on the copath. - Otherwise, we still prefer merge than borrow as it causes recursion of
removeSelf
, which intuitively gets "closer" to the hint: there might be hope thathp
is among the siblings of parent in future recursions. - Also, if hint is already sibling or even the same node then we still prefer merge than borrow for a similar "copath change" observation.
We emphasize that these justifications are greedy and merely heuristic.
Technically, it is also possible to consider a precedence with any permutation of the 4 choices (so 24 permutations and 24 possible strategies).
In the implementation, we limit the possibilities by only considering whether hint takes precedence and which of borrow/merge takes precedence; i.e., we only consider the size-4 permutation subgroup generated by
Definition. An RBT is a tree satisfying all of the following:
- every node is "colored" either red or black, where the root and leaves are always black, and red nodes can only have black children;
- every internal node has two children (i.e., binary and full);
- every leaf has the same "black depth", defined as the number of black nodes on their path to the root.
An LLRBT is an RBT satisfying the extra constraint that black nodes cannot have black left and red right children nodes.
I.e., red nodes are always "leaning to the left" inside the children of black nodes.
We have the following equivalent definition for LLRBT which is recursive and more explicit: An LLRBT is a tree matching one of the following pattern:
- a single (black) leaf node
()
; - a "black-black" node
(t1, t2)
, wheret1
,t2
are LLRBTs with the same "black depth"; - a "red-black" node
(R(t1, t2), t3)
, wheret1
,t2
,t3
are LLRBTs with the same "black depth";Here
R(…)
marks a red node. Recall that "left-leaning" in LLRBT requires that a "black-red" pattern is forbidden. - a "red-red" node
(R(t1, t2), R(t3, t4))
, wheret1
,t2
,t3
,t4
are LLRBTs with the same "black depth".
To remark, it might be more convenient to view colors as being associated with parent-child links instead of with nodes. In particular, if a node after some tree operation has the same child list but flipped color (well, associated with the node), then it is regarded as an "old" instead of "new" node (as defined in the later "skeleton" section). In this situation, just imagine that the parent of the node changes and the color on the parent-child link changes as well, and then it is totally fine to say that nothing changes about node itself.
With the left-leaning property, red-black tree is isomorphic to 2-3-4 tree, via the following isomorphism iso(t: 234Tree) -> LLRBT
:
- if
t
is a leaf then returnt
- if
(t1, t2) := t
then return(t1, t2)
- if
(t1, t2, t3) := t
then return(R(t1, t2), t3)
- if
(t1, t2, t3, t4) := t
then return(R(t1, t2), R(t3, t4))
We define the "2-3 mode" of LLRBT by further forbidding the "red-red" pattern (i.e., there are only "black-black" and "red-black" patterns).
Similarly, it is easy to see that iso
gives an isomorphism between "2-3 mode" LLRBT and 2-3 tree.
With the isomorphism iso
and its inverse inv
(which is also easy to compute), the operations for LLRBT are immediately induced by the corresponding operations for 2-3-4 tree (and similarly "2-3 mode" operations are induced by 2-3 tree operations):
init(n) := iso(234Tree.init(n))
;add(t, l; h) := iso(234Tree.add(inv(t), inv(l); inv(h)))
;remove(t, l; h) := iso(234Tree.remove(inv(t), inv(l); inv(h)))
.
LLRBT has normal and "2-3 mode" variants. LLRBT also generically inherits all variants of BT.
In a previous section we have described variants of the borrow-or-merge strategy in BT, and elaborated some reasoning behind its design. However, the other strategies in BT might still appear quite mysterious, so here we try to unpack the mysteries.
The high-level idea in designing the split and borrow-hint strategies is to maximum the "reusing" of LLRBT nodes under the isomorphism, while keeping the strategy rules as simple and oblivious as possible.
In split strategy:
- For normal mode,
#peers = Δ = 4
, and it is split into two halves, each of size 2. Under the isomorphism, this means to split(R1(B,B), R2(B,B))
intoR1(B,B)
andR2(B,B)
, successfully reusing the LLRBT nodes. - For "2-3 mode",
#peers = Δ = 3
, and it is split into sizes 1 and 2. The desired behavior is to split(R1(B,B), B2)
intoR1(B,B)
andB2
, but this is not necessarily possible as the hint could occur in theR1(B,B)
part (recall that we need the hint in the no-larger half, i.e., the size-1 half). The only thing we could do is to split the hint alone, which is also what the split strategy does. Here note that when the hint is indeed theB2
part, we successfully reuse the other LLRBT nodeR1(B,B)
.
In borrow-hint strategy:
- For normal mode and
#hp = 4
, we splithp = (R1(B,B), R2(B,B))
intoR1(B,B)
andR2(B,B)
and borrow the half containingh
, successfully reusing the LLRBT nodes. - For normal mode and
#hp = 3
, we have no choice but to borrowh
and leave the other two siblings inhp
. Note that whenh
is indeed theB2
part ofhp = (R1(B,B), B2)
, we successfully reuse the other LLRBT nodeR1(B,B)
. - For "2-3 mode" we have the unique case
#hp = 3
, and it is the same situation as in normal mode. - By the way, the "borrow-non-hint strategy" is to "split
ps
intops' ++ sibs
, where#ps' = δ
". Withδ = 2
, this successfully reuses the LLRBT nodes no matter whether#ps = 4
or#ps = 3
. - Also by the way, the "merge strategy" is to merge by "
ps' := ps ++ p'
", i.e., to merge at the end. With#p' = δ-1 = 1
, this also reuses the LLRBT node inps
(no matter what size#ps ≤ Δ-δ+1
is, i.e., no matter whether#ps = 2
or#ps = 3
).
Each tree operation in the interface gives a new tree, and we define the skeleton to be the subset/subgraph of the new tree consisting of the "new" nodes, i.e., nodes that are not in the old tree (if any).
E.g., the skeleton given by the init
operation is the entire initialized tree, as there is no old tree.
Also see the following examples for add
and remove
.
Example of a skeleton given by add
(new node 8()
, with "hint" 2()
) in 2-3 tree, where nodes are labeled in front of their child list:
d(
a(1(),2(),3()),
b(4(),5()),
c(6(),7()),
) --add(8;2)-> i(
g(
b(4(),5()),
c(6(),7()),
),
h(
e(1(),3()),
f(2(),8()),
),
)
It is easy to see that 6,e,f,g,h,i
are the "new" nodes, and thus form the skeleton;
also note that b,c
are "old" internal nodes that remains in the tree.
Example of a skeleton given by remove
(node 7()
, with "hint" 2()
) in 2-3 tree:
d(
a(1(),2(),3()),
b(4(),5()),
c(6(),7()),
) --remove(7;2)-> g(
e(1(),3()),
b(4(),5()),
f(2(),6()),
)
Here e,f,g
are the "new" nodes that form the skeleton.
Another example of a skeleton given by remove
(node 3()
) in LBBT:
b(
a(1(),2()),
3()
) --remove(3)-> a(1(),2())
Here we have an empty skeleton!
Observe the following properties of a skeleton:
- if a node is in skeleton, then so is its parent node (since "old" node cannot have "new" node as a child);
- as a result, a skeleton is always connected, and contains the root (if nonempty).
We describe the TreeKEM protocol in terms of skeletons. More specifically, the TreeKEM protocol requires to take a path decomposition of the skeleton, i.e., a decomposition of the skeleton into a set of disjoint paths. We implement both notions in implicit ways:
- For skeleton, we label each node by an epoch recording the time/seqno of the operation that generates the node, and by collecting all nodes with the newest epoch one could reconstruct the skeleton.
- For path decomposition, we add a "tracing child" field to each (internal) node, that points to one of its children with the same epoch (or is
null
if no such child exists). Note that the skeleton consists of nodes with the same newest epoch. Hence by following the "tracing child" relationship one could effectively follow a path decomposition of the skeleton.We ensure that "tracing child" is
null
only if there is no child with the same epoch, and as a result the induced path decomposition is "minimal", meaning that there is no pair of paths in the decomposition that could merge into a longer path.
We make the following particular choices of "tracing child":
Most of the choices are just unique by definition. For those with some freedom, the high-level idea is to (heuristically) form a "long master path" in the induced path decomposition.
- generic
replace
:- trace
t'
inp'
- trace
- all
init
: always trace to the first child in every internal node; this is just an arbitrary choice - LBBT
append
:- trace
l
in(t, l)
- trace
append(tr, l)
in(tl, append(tr, l))
- trace
- LBBT
truncate
:- trace
tr'
in(tl, tr')
(iftr' ≠ tr
)
Actually if
tr'
is "new". - trace
- BT
addSibling
:- trace
s
in(t', s)
- trace
s
inp'
- trace
null
in the splitp'
- trace
s
inps
- trace
- BT
removeSelf
:- trace
s'
inp'
- trace
s'
inp''
- trace
null
inps'
- trace
p''
ingp'
- trace
s'
in the mergedps'
- trace
- LLRBT: rather complicated, and for now please refer to the code if interested; to remark, the isomorphism does not completely determine the "tracing child" relationship
MLS has an optimization about reusing (the secrets at) "old" nodes (introduced at version 8). In tree data structures, we accordingly consider the following task: to find a decomposition (if any) of a "new" node into an "old" node (probably no longer existing in the current tree) along with a list of nodes existing in the current tree, so that their leaves disjointly split the leaves of the "new" node.
We figure out the following particular decomposition opportunities:
- generic "lazy"
add
:- for every recursive
replace(t, t')
inreplace(r, l)
, decomposet'
into[t,l]
Note that we only care about true leaves, so here replacing a lazily removed leaf by a true leaf is essentially adding a leaf.
- for every recursive
- LBBT
append
:- decompose
(tl, append(tr, l))
into[t,l]
- decompose
- BT
addSibling
:- for every recursive
replace(t, t')
inreplace(p, p')
, decomposet'
into[t,s*]
, wheres*
is the value ofs
in the first recursion (in the normal case where the first recursion is called for adding a leaf,s*
would simply be the added leaf)
- for every recursive
- BT
removeSelf
:- decompose the merged
ps'
into[ps,…p']
- decompose the merged
- LLRBT: induced by the isomorphism; in particular, if a BT node
t
has a decomposition[t[1], …, t[m]]
, then the LLRBT nodeiso(t)
would have a decomposition[iso(t[1]), …, iso(t[m])]
There is one corner case where
iso(t[1])
is a child node ofiso(t)
(and, e.g.,m = 2
andiso(t[2])
is the other child), and in this case we exclude the decomposition asiso(t[1])
, though being an "old" node, is still existing in the current tree and moreover a direct child of the "new" node.
MLS refers to the list of nodes after the "old" node in the decomposition as "unmerged leaves", and we can see that these nodes are indeed leaf nodes in the case of LBBT. However in general (e.g., see BT) these nodes do not need to be leaf nodes, and thus we refer to them as generally "unmerged nodes".
In TreeKEM, users in a group are placed at the leaf nodes in a tree, and each internal node either is blank or has a secret (more concretely, a public-secret key pair generated from the secret). The high-level idea in the TreeKEM protocol is to preserve the following invariant during group changes: each user knows and only knows the secrets at the (non-blank) nodes on the path from the user's leaf to the root of the tree. As a result, the secret of the root is known by (and only by) every user in the current group, and thus can be used as a shared group secret.
Strictly speaking, it is insecure to use the secret at the root as the shared group secret, and instead the secret at a "hypothetical parent" of the root is used.
TreeKEM uses a tree data structure Tree
.
We assume the state of the TreeKEM protocol includes a tree t
.
For now our implementation provides operations over a global view of the TreeKEM protocol.
Method init((id[1], pk[1], sk[1]), …, (id[n], pk[n], sk[n]))
:
- let
t := Tree.init(n)
(with skeletont
) - write
id[i], pk[i], sk[i]
in thei
-th leaf, fori ∈ [n]
skeletonGen(id[1], t, t)
Here we suppose
id[1]
is the user who initializes the group.
Method add(id, id', pk', sk')
:
- let
l
be the leaf forid
- let
l'
be a new leaf, and writeid', pk', sk'
inl'
- let
t' := Tree.add(t, l'; l)
, with skeletons
skeletonGen(id, t', s)
Method remove(id, id')
:
- let
l
,l'
be the leaves forid
,id'
, respectively - let
t' := Tree.remove(t, l'; l)
, with skeletons
- if
s
is empty thens := {[roof of t']}
skeletonGen(id, t', s)
Method update(id, id' ?= id)
:
- let
l
,l'
be the leaves forid
,id'
, respectively - let
s
be the path froml'
to the root skeletonGen(id, t, s)
All methods above share the subroutine skeletonGen
, described below.
To recall, the skeletons given by the tree operations and their path decompositions are implemented in implicit ways, so they are not really written down and passed around.
Function skeletonGen(id, t, s)
:
- drop leaf nodes from
s
As a corner case, if
s
is empty after dropping all leaf nodes, then one can just (refresh and broadcast byskeletonEnc
the group secret and) return. - let
r
be the path from the leaf forid
to the root oft
- blank
s \ r
- for each path
P[1], …, P[I]
ins ∩ r
, whereP[i] = (v[i,1], …, v[i,h[i]])
:The nodes in
P[i]
are listed bottom-up; also the paths themselves are sorted bottom-up.Also, we remark that the path decomposition of the skeleton
s
naturally induces a path decomposition of any subsets ∩ r
. (Herer
is a path, sos ∩ r
is a path as well and we actually could take the trivial path decomposition of it; we would have more complexr
in general.)- sample
seed[i,1]
uniformly at random - for
j ∈ [h[i]]
:- let
(seed[i,j+1], secret) := PRG(seed[i,j])
- let
(pk, sk) := PKE.Gen(secret)
- write
pk, sk
inv[i,j]
- let
- sample
- for each
seed[i,j]
atv[i,j]
:skeletonEnc(id, seed[i,j], v[i,j], v[i,j-1])
(ifj = 1
thenv[i,j-1] := null
)Note that
v[i,j] ∈ s ∩ r
is always an internal node, as (1)s
never contains the leaf node for userid
(e.g., observe that the part of the skeletons
produced by the tree operations never contains the "old" leaf node for userid
), and (2)r
never contains any leaf node for users other thanid
.
- let
seed[I,h[I]+1]
be the group secretNote that the skeleton
s
passed toskeletonGen
always contains the root oft
. Also note thatv[I,h[I]]
must be the root oft
due to the bottom-up sorting.
Function skeletonEnc(id, seed, v, c*)
:
- for each child
c
ofv
:- if
c = c*
then continue - if
c
is the leaf forid
then continue - if
c
is non-blank, i.e., there ispk
inc
thenPKE.Enc(pk, seed)
- if
c
is blank thenskeletonEnc(id, seed, c, null)
Note that leaf nodes are always non-blank as they hold the long-term keys for the users. Hence the recursion always gets to an internal node.
- if
MLS supports batching operations via a "proposals and commits" mechanism (introduced at version 8). In particular, for each operation (add/remove/update), the user who issues the operation can choose to only announce a "proposal" describing the operation, without changing the state of the secret tree; a new "commit" operation is introduced, where the committing user collects a couple of proposals announced so far by different users and makes corresponding changes to the secret tree as a batch.
To support "proposals and commits", we make the following changes to the algorithms:
- In
add
/remove
/update
: instead of callingskeletonGen
, just record skeletons
- Method
commit(id)
:- let
s
be the union of all recorded skeletons (while dropping "old" nodes that are no longer existing in the current tree) skeletonGen(id, t, s)
- let
Note that here the semantics of
add
/remove
/update
do not correspond to generating proposals in MLS; instead, they roughly mean "to process the operations indicated by all collected proposals in a commit operation", andcommit
roughly means "to finish the commit operation".
Using the decomposition of "new" nodes for reusing "old" nodes, we have an optimization that saves blanks in the protocol.
Function recompose(v)
:
- if
v
has no decomposition for reusing "old" nodes then return - let
[v[1], …, v[m]]
be the decomposition ofv
- if
v[1]
has no secret thenrecompose(v[1])
- if
v[1]
now has secrets:- copy secrets at
v[1]
tov
- set "unmerged nodes" at
v
to be "unmerged nodes" atv[1]
along with[v[2], …, v[m]]
- copy secrets at
We then make the following changes to the algorithms:
- In
skeletonGen
:- for each
v ∈ s \ r
, besides blanking, also applyrecompose(v)
- (set "unmerged nodes" to be
[]
for freshly generated secrets)
- for each
- In
skeletonEnc
:- besides
PKE.Enc(pk, seed)
, if there are "unmerged nodes"[c[1], …, c[m]]
atc
thenskeletonEnc(id, seed, v*, null)
, for a virtual nodev* := (c[1], …, c[m])
collecting the "unmerged nodes" as children
- besides
As a generalization, we replace the paths in the invariant with general secret regions; the generalized invariant is now: each user knows and only knows the secrets at the (non-blank) nodes in the user's secret region. For ease of the (generalized) algorithms, we restrict that a user's secret region must at least include the path from the user's leaf to the root of the tree, and must not include other users' leaves. Under this restriction, it is sometimes more convenient to speak about the additional secret region, which excludes the path.
The actual secret regions in the algorithms are determined by a secret region strategy, which we denote by region(user, node) -> Boolean
, meaning whether user
's additional secret region includes node
;
it is plausible that the actual secret region strategy wants more information from the protocol, and one can easily adapt the interface to the demands.
Moreover, we actually have two independent secret region strategies in the algorithm, and we name the two as regionGen
and regionEnc
; the reason behind the naming would be obvious in the context.
Accordingly, each (internal) node keeps track of the "tainting" users whose additional secret regions include that node. A node is called tainted if it has non-empty "tainting" user set.
The algorithms generalize as follows.
- In
remove
andupdate
:- add the additional secret region of
id'
tos
; moreover, the nodes should be added in a "path-/parent-closed" way, such that for everyv ∈ s
, we haveparent(v) ∈ s
as well (forremove
, only add the nodes that still exist in the current tree)Note that this way, the skeleton
s
after adding the nodes must still be connected.
- add the additional secret region of
- In
skeletonGen
:r
now consists of the pathp
for useruser := id
along with the additional secret region ofuser
- whether
v ∈ s \ r
or is part ofs ∩ r
is decided by whetherv ∈ s \ p ∧ ¬regionGen(user, v)
- (after blanking
v ∈ s \ r
,v
is no longer tainted) - for each
v[i,j] ∈ s ∩ r
with freshly generated secrets, if (and only if)v[i,j] ∈ s \ p
thenv[i,j]
is considered tainted byuser
- for each
v[i,j]
, for each other useruser'
with public keypk'
(at the leaf foruser'
), ifregionEnc(user', v)
thenPKE.Enc(pk', seed[i,j])
, andv[i,j]
is (also) tainted byuser'
It would be inefficient to iterate over all users and check if
regionEnc(user', v)
is satisfied, and one needs to build extra data structure inregionEnc
for (approximately) generating the satisfying users.
- When combining with the "unmerged nodes" optimization, in
recompose
:- inherit the "tainting" users when copying secrets
It can be verified that the above changes become no-op if additional secret region is always empty (i.e., if regionEnc
and regionDec
always return false
).
It is possible (especially when the additional secret region is nonempty) to replace a couple of PKE encryptions by one-time pads (OTPs) in skeletonEnc
, which helps improve efficiency.
To enable OTP, we make the following changes to the algorithms:
- In
skeletonEnc
:- if
c
is the top of a pathP[i]
inskeletonGen
, which means there is a generated but unusedseed[i,h[i]+1]
inskeletonGen
, thenOTP.Enc(seed[i,h[i]+1], seed)
instead ofPKE.Enc(pk, seed)
- if
Following the OTP optimization, every tail value
seed[i,h[i]+1]
will be used (recall that previously only the topmost oneseed[I,h[I]+1]
is used, as the new group secret). In this sense, it seems more natural for the algorithm to include the OTP optimization.
PKE encryptions are commonly orders of magnitudes slower than SKE encryptions. Hence when the space is not a big concern, it would be valuable if one can replace PKE encryptions by SKE encryptions in cost of storing extra SKE keys.
To enable the replacing, we make the following changes to the algorithms:
- In
skeletonGen
:- for each
v[i,j] ∈ s ∩ r
with freshly generated secrets, PRG-expand further by(seed[i,j+1], secret, secret') := PRG(seed[i,j])
instead, generate SKE keyk := SKE.Gen(secret')
, and writek
inv[i,j]
as well
- for each
- In
skeletonEnc
:- before checking for
pk
, ifc
is non-blank and has SKE keyk
andc
is in the secret region ofid
thenSKE.Enc(k, seed)
(and return)Here we write in terms of general secret regions; to remark, this optimization is (almost) useless (and merely wastes space for storing SKE keys) if the secret regions are just paths.
Note that when combining with the "unmerged nodes" optimization, one still needs to recurse in this branch if there are "unmerged nodes" at
c
.
- before checking for
It can be observed that regarding the running time, this optimization will:
- replace some of the PKE encryptions 1:1 by SKE encryptions
- use longer PRG expansions (from double to triple length)
- introduce SKE key generations, the number of which is the same as existing PKE key generations
Alternatively, since the used SKE key is always in the secret region, one could make the SKE key generations lazy and, e.g., derive the SKE key k
from the PKE secret key sk
when k
is needed.
This way there is no longer the space (or PRG) overhead, and the SKE key generation overhead is bound to the number of SKE encryptions instead of PKE key generations.
A caveat for the alternative approach is that it may hurt the forward secrecy (FS) of the protocol. To achieve the strongest notion of FS requires to use updatable PKE (and updatable SKE), and if the lazy SKE key generation does not refresh the underlying
sk
then the strongest FS would break.
The following TreeKEM strategies are obsolete and no longer supported in the latest codes. They are either considered inefficient variants, or superseded by more general optimizations.
- add strategy:
- "sync": normal behavior
- "async": just blank out instead of
skeletonGen
Caveat: for complete functionality, one would need to at least
skeletonGen
at the root in order to get a (new) group secret. (Note that due to the blanks, this new secret at the root would be rather expensive to broadcast inskeletonEnc
though.) (More strictly, as remarked before, we really mean the secret at a "hypothetical parent" of the root.)
- remove strategy:
- "remover": blank out the path from removee, remove from the tree, and then "update" (which is roughly
skeletonGen
, while with a potentially larger skeleton that adds the path from remover); equivalent to normal behavior (as the nodes on the blanked-out path, if still existing in the tree, are always contained in the skeleton of the remove operation), while with the potentially larger skeletonThe strategies were designed before introducing the notion of skeletons, so here the blanked-out path for
skeletonGen
should be interpreted as the "new" nodes that "overlay" the "old" blanked-out path. (Technically, a "new" node created by the functionreplace
is regarded as "overlaying" the corresponding replaced "old" node.) - "remover-before": blank out the path from removee, "update" (roughly
skeletonGen
, for the blanked-out path), and then remove from the tree; equivalent toskeletonGen
while excluding the nodes not "overlaying" some "old" nodes from the (potentially larger) skeletonSame caveat about blanking, in the corner case where all nodes in the skeleton get excluded.
- "removee": "update" (meaninglessly), blank out the path from removee, and then remove from the tree; equivalently, just blank out instead of
skeletonGen
Same caveat about blanking.
- "remover": blank out the path from removee, remove from the tree, and then "update" (which is roughly
- update strategy:
- "LCA": normal behavior
- "root": just blank out instead of
skeletonGen
Same caveat about blanking.
- merge strategy:
- "blank": normal behavior (without the "unmerged nodes" optimization)
- "keep": keep a "partial secret" at certain nodes when "merging" nodes; superseded by the "unmerged nodes" optimization, with in particular the decomposition in BT
removeSelf
- split strategy:
- "blank": normal behavior (without the "unmerged nodes" optimization)
- "keep": keep a "partial secret" at certain grandparent nodes when "splitting" nodes; superseded by the "unmerged nodes" optimization, with in particular the decomposition in BT
addSibling
- MLS: https://messaginglayersecurity.rocks
- Tainted TreeKEM: https://eprint.iacr.org/2019/1489
- Multicast Key Agreement: https://eprint.iacr.org/2021/1570