De Bruijn Sequence and Universal Cycle Constructions

Recursive Construction Output
Algorithm
Order $n$
Symbols $k$
Constructing DB sequences via recursion

The most important recursive approach applies the inverse of Lempel's D-morphism (Lempel's lift) [Lem70]; given a DB sequence of order $n$ it constructs a DB sequence of order $n+1$. This approach is described in detail below and it has been generalized for $k=2$ [Gam83]. There is also a doubly recursive approach [MEP96]; given a DB sequence of order $n$ it constructs a DB sequence of order $2n$. Knuth provides a nice treatment of both approaches [Knu11, pp.304-305]. When $k=2$, they are combined to obtain an efficiently decodable DB sequence that can be ranked in $O(n \log n)$ steps; the $r$-th bit in the sequence can be found by doing $O(n \log n)$ simple operations on $n$-bit numbers [Knu11, p.317, ex.97-99]. Lempel's lift has also been applied to construct orientable cycles and balanced cut-down DB sequences.

The concatenation-based construction by Ralston [Ral81] applies recursion when $n$ is composite.

Lempel's lift

Lempel's D-morphism [Lem70] maps a cyclic binary string $\alpha = a_1a_2\cdots a_m$ to $\beta = b_1b_2\cdots b_{m}$ where each $b_i = a_{i} \oplus a_{i+1}$, noting $b_m = a_m \oplus a_1$. Both $\alpha$ and its complement $\overline{\alpha}$ map to the same $\beta$. The inverse of this operation, denoted $\mathcal{L}(\beta)$, is known as Lempel's lift. When the weight (number of 1s) of $\beta$ is even $\mathcal{L}(\beta)$ contains two strings (complements); when the weight is odd, it contains one string rotationally equivalent to its complement.

When $\beta = b_1b_2\cdots b_m$ is a universal cycle (UC) of order $n$ for a set $\mathbf{S}$ consisting of $m$ binary strings then:

  1. If $\beta$ has even weight, then $\mathcal{L}(\beta) = \{\alpha_0, \alpha_1\}$ consisting of two disjoint UCs of order $n{+}1$ and length $m$. Each $\alpha_i = a_1a_2\cdots a_m$ where $a_1 = i$ and $a_j = a_{j-1} \oplus b_j$ for $j > 1$.

  2. If $\beta$ has odd weight, then $\mathcal{L}(\beta) = a_1a_2\cdots a_{2m}$ is a single UC of order $n{+}1$ with weight $m$ where $a_1 = 0$ and $a_j = a_{j-1} \oplus b_j$ for $j > 1$ allowing $b_{m+1}\cdots b_{2m} = \beta$.
More generally, for an arbitrary alphabet of size $k>2$, $\mathcal{L}(b_1b_2\cdots b_m) = \{\alpha_0, \alpha_1, \ldots ,\alpha_{k-1}\}$ such that each $\alpha_i = a_1a_2\cdots a_t$ is defined as $a_1 = i$ and $a_j = (a_{j-1} + b_j) \bmod k$ for $j > 1$. Each $t$ is a multiple of $m$ and one or more $\alpha_i$ may correspond to the same cycle (like the odd weight case above).

A punctured DB sequence is a DB sequence with the unique substring $1^n$ replaced with $1^{n-1}$.

Example

Consider DB sequence $\beta_1 = 10111000$ of order $n=3$ with even weight. Then $$\mathcal{L}(\beta_1) = \{01101000, 10010111\}.$$

Consider the punctured DB sequence $\beta_2 = $ 1011000 with odd weight. Then $$\mathcal{L}(\beta_2) = 01101111001000.$$

Below are two methods that apply the above results to recursively construct DB sequences. Like the Euler cycle and greedy approaches, an exponential amount of space is required by these recursive strategies. Each symbol can be generated in $O(1)$-amortized time, but the first symbol requires exponential time delay. Optimized approaches have been considered [Ann97] along with corresponding successor rules [CPKS99,Knu11].

Recursive Construction #1: Lempel's lift for even weight

Let $\mathcal{D}$ be a DB sequence of order $n>1$. Since the weight (number of 1s) of $\mathcal{D}$ is even, let $\mathcal{L}(\mathcal{D}) = \{\alpha_0, \alpha_1\}.$ The UCs $\alpha_0$ and $\alpha_1$ correspond to two edge-disjoint cycles in the underlying DB graph of order $n$ and they both contain the length-$n$ substring (vertex) $1010\cdots$. They can be joined together via this substring (vertex) to obtain a DB sequence of order $n{+}1$.

Example
Consider the DB sequence $\mathcal{D} = 10111000$ of order $n=3$ where $\mathcal{L}(\mathcal{D}) = \{01101000, 10010111\}.$

The DB graph of order $n=3$ on the right highlights the two edge-disjoint cycles 01101000 and 10010111. The two cycles share the common vertex 101 (among others). After rotating the substrings $101$ to the front of each cycle, they can be joined together to obtain $$\mathbf{101}00001~\mathbf{101}11100,$$ which is a DB sequence of order $n{+}1=4$.

     

Recursive Construction #2: : Lempel's lift for odd weight

Let $\mathcal{D}$ be a punctured DB sequence of order $n > 1$. Then $\mathcal{D}$ has odd weight and $\mathcal{L}(\mathcal{D})$ misses only the two alternating substrings $1010\cdots$ and $0101\cdots$ which appear in the cycle 01. The two cycles can be joined to obtain a DB sequence of order $n{+}1$.

Example
Consider the punctured DB sequence $\mathcal{D} = $ 1011000 where $\mathcal{L}(\mathcal{D}) = 01101111001000.$

The DB graph of order 3 on the right highlights the two edge-disjoint cycles 01101111001000 and 01. The two cycles share the common vertex 101. Joining these two cycles together yields $$\mathbf{101}11100100001~\mathbf{10},$$ which is a DB sequence of order $n{+}1=4$.

     

The two recursive constructions generalize naturally to alphabets of size $k>2$ [AA11,Knu11,NW22] where Lempel's lift will create up to $k$ cycles that can be joined together via alternating strings of length $n-1$. As an example for $k=3$ and $n=8$, possible joining strings are 0120120, 1201201, 2012012.

Example ($k$-ary generalization of Recursive #1)

Given DB sequence $\mathcal{D} = 012120200021122210011102201$ of order $n=3$ with $k=3$, then

    $\mathcal{L}(\mathcal{D}) = $ { 010100222212021011120110220, 121211000020102122201221001, 202022111101210200012002112 }

The first two cycles join on 201 to obtain:

    201102200101002222120210111 201221001121211000020102122.

Joining this cycle with the third cycle on 012 yields the following DB sequence:

    012210011212110000201021222011022001010022221202101112 012002112202022111101210200.

Example ($k$-ary generalization of Recursive #2)

Given punctured DB sequence $\mathcal{D} =$ 01201022200202112212100011 of order $n=3$ with $k=3$, then

    $\mathcal{L}(\mathcal{D}) = $ 010011021110020102020000122022002100022120212122220112112210222110121010111120.

Joining this with the small cycle 012 yields the following DB sequence:

    012 012101011112001001102111002010202000012202200210002212021212222011211221022211.

Download source code

» Lempel #1 C program

» Lempel #2 C program

References

  • [AA11] A. Alhakim and M. Akinwande. A recursive construction of nonbinary de Bruijn sequences. Designs Codes and Cryptography, 60:155–169, 08 2011.

  • [Ann97] F. Annexstein. Generating de Bruijn sequences: an efficient implementation. IEEE Transactions on Computers, 46(2):198–200, 1997.

  • [CPKS99] T. Chang, B. Park, Y. H. Kim, and I. Song. An efficient implementation of the D-homomorphism for generation of de Bruijn sequences. IEEE Transactions on Information Theory, 45(4):1280–1283, 1999.

  • [Gam83] R. Games. A generalized recursive construction for de Bruijn sequences. IEEE Transactions on Information Theory, 29(6):843–850, 1983.

  • [Knu11] D. E. Knuth. The Art of Computer Programming. Vol. 4A. Combinatorial Algo- rithms. Part 1. Addison-Wesley, Upper Saddle River, NJ, 2011.

  • [MEP96] C. Mitchell, T. Etzion, and K. Paterson. A method for constructing decodable de Bruijn sequences. IEEE Transactions on Information Theory, 42(5):1472–1478, 1996.

  • [Lem70] A. Lempel. On a homomorphism of the de Bruijn graph and its applications to the design of feedback shift registers. IEEE Transactions on Computers, C-19(12):1204–1209, 1970.

  • [Ral81] A. Ralston. A new memoryless algorithm for de Bruijn sequences. J. Algorithms, 2(1):50–62, 1981.