-
Notifications
You must be signed in to change notification settings - Fork 0
Lower bound on cardinalities for fixed length fixed alphabet
an looks to be the minimal cardinality neighbourhood words. This observation comes from experimentations on different size of alphabet, word length and reachable k values.
Proof needed !
- substitution then deletion on another letter is the same than deletion then substitution
- insert a letter is the same than insert a then substitute it by any other letter
k edits of a word of size n lead to words of size between n-k to n+k. To create the formula, let's count the words by size.
For the size n, we can generate all the words with 0 to k substitution. So the results is a sum over k of exact Hamming neighbourhood.
Ham(n, k) = comb(k, n) * (a-1)k
To reach size m, we will have to perform n-m deletions. As said in the remarks, we can chose to do the deletions first. After doing n-m deletions, we still have k-(n-m) edit operations available for substitutions. So, we can sum hamming distance over i from 0 to k-(n-m). The total is sum of all the possible sizes of Ham(m, i).