De Bruijn Sequence and Universal Cycle Constructions

Greedy Construction Output
 Algorithm Pref largest Pref smallest-Granddaddy Pref opposite Pref same Pref XNOR (binary) Order $n$ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 Symbols $k$ 2 3 4 5 6 7 8 9
Constructing DB sequences using greedy algorithms

Perhaps the most surprising DB sequence constructions are the greedy constructions. A greedy construction starts with a seed string, then repeatedly applies some greedy rule to determine the next symbol of a sequence. The algorithm stops when it is impossible to add another symbol without creating a duplicate substring of length $n$, or some termination condition is reached. Such constructions, however, have a major drawback: they require exponential space.

The Prefer-largest [Mar34] (smallest) greedy approach does not use any information about the string currently generated to prioritize the next symbol. There are two approaches whose greedy decision is based on the last symbol generated: the Prefer-opposite [Alh10] and the Prefer-same [EGGR58,Fre82,ASS21]. These sequences can also be efficiently generated by a successor rule [SSA20].

These greedy approaches can be extended when the preferences are based more generally on the last $j$ symbols generated [Alh12]. An example using the last two symbols (in the binary case) is the greedy XNOR [WWW18]. In fact, greedy approaches are special cases of an Euler cycle construction based on the de Bruijn graph.

The Prefer-largest construction

The most well-known greedy construction is the Prefer-largest [Mar34], which generates the lexicographically largest DB sequence.

Prefer-largest Greedy Construction
1. Seed with $0^{n-1}$
2. Repeat until no new symbol can be added: Append the largest symbol in $\{k{-}1, \ldots , 2,1,0\}$ that does not create a duplicate length $n$ substring
3. Remove the seed
For example, applying this construction for $n=4$ and $k=2$ we obtain the string:

000 1111011001010000.

Note the importance of the string $0^{n-1}$ to seed the algorithm. Starting with any other seed will not produce a DB sequence using this greedy approach.

The Prefer-smallest is equivalent to the Prefer-largest construction when the symbol 0 is replaced by $k{-}1$, 1 is replaced by $k{-}2$ and so on. The Prefer-smallest greedy algorithm produces the lexicographically smallest DB sequence, and it can be constructed more efficiently by either a successor rule or by a concatenation approach.

The Prefer-same construction

The original presentations of a binary Prefer-same construction [EGGR58,Fre82] required an additional counting test for certain substrings. However, when seeded with an appropriate string, as described in the related Euler cycle algorithms, these tests are not required [ASS21].

Prefer-same Greedy Construction (Binary)
1. Seed with length $n{-}1$ string $\cdots 01010$
2. Append 1
3. Repeat until no new bit is added: Append the same bit as the last if it does not create a duplicate length $n$ substring; otherwise, append the opposite bit as the last if it does not create a duplicate length $n$ substring
4. Remove the seed

The resulting sequence is lexicographically largest with respect to a run-length encoding. For example, the run-length encoding of the generated DB sequence for $n=5$ is given by: $$11111000001110110011010001001010 = 5531222113121111.$$ Note that this property holds when the roles of the 0 and 1 are interchanged. The binary Prefer-same sequence can also be constructed efficiently in $O(n)$ time per bit via a successor rule. One way to generalize this greedy construction to a larger alphabet of size $k$:

1. Let the preference order for the last symbol $s$ be: $s, s{-}1, \ldots, 0, k{-}1, \ldots ,s{+}1$.
2. Update the seed to be the length $n{-}1$ string: $\cdots 01\cdots (k{-}1)01\cdots(k{-}1)01\cdots (k{-}2)$.
3. Append $k{-}1$ instead of 1 at Step 2.

The Prefer-opposite construction

A strict Prefer-opposite approach does not yield a DB sequence. However, as described in the section on de Bruijn graphs, by modifying the preference for the special vertex $1^{n-1}$ in the underlying Euler cycle algorithm, we indeed obtain a DB sequence. The algorithm presented below produces a shifted instance of the originally presented sequence in [Alh10]. Here, the initial seed of $0^{n-1}$ is rotated to the end so the resulting sequence is lexicographically smallest with respect to a run-length encoding [ASS21].

Prefer-opposite Greedy Construction (binary)
1. Seed with $0^{n-1}$
2. Append 0
3. Repeat until no new bit is added:
• if the current suffix is $1^{n-1}$ then append $1$ if it is the first time $1^{n-1}$ has been seen, and append 0 otherwise.
• else append the opposite bit as the last if it does not create a duplicate length $n$ substring; otherwise, append the same bit as the last
4. Remove the seed

The binary Prefer-opposite sequence can also be constructed efficiently in $O(n)$ time per bit via a successor rule. One way to generalize this greedy construction to a larger alphabet of size $k$:

1. Let the preference order for the last symbol $s$ be: $s{+}1, \ldots, k{-}1, 0, 1, \ldots, s$.
2. The special case vertices are $1^{n-1}, 2^{n-1}, \ldots, (k{-}1)^{n-1}$, where their preference lists interchange the final two symbols $s{-}1$ and $s$. Thus, the special case is applied only after $k-2$ occurrences of a given special suffix have been visited.

These special case strings all occur in the final $kn$ symbols of the resulting sequence. Try it out above!

The Prefer-XNOR construction

Like the Prefer-opposite, the greedy XNOR has one special case. There are three different seeds presented in [WWW18] and here we provide one construction.

Prefer-XNOR Greedy Construction
1. Seed with the length $n{-}1$ string $011011011\cdots$
2. Append the sum of the last two bits in the seed (mod 2)
3. Repeat until no new bit is added:
• if the current suffix is $10^{n-1}$ then append 0
• else, append the complement of the sum of the last two bits (mod 2) if it does not create a duplicate length $n$ substring; otherwise, append the sum of the last two bits (mod 2) if it does not create a duplicate length $n$ substring
4. Remove the seed

References

• [Alh10] A. Alhakim. A simple combinatorial algorithm for de Bruijn sequences. Amer. Math. Monthly, 117(8):728-732, 2010.

• [Alh12] A. Alhakim. Spans of preference functions for de Bruijn sequences. Discrete Appl. Math., 160(7-8):992-998, 2012.

• [ASS21] A. Alhakim, E. Sala, and J. Sawada. Revisiting the Prefer-same and Prefer-opposite de Bruijn sequence constructions. Theoretical Computer Science, 852:73–77, 2021.

• [EGGR58] C. Eldert, H. Gray, H. Gurk, and M. Rubino. Shifting counters. AIEE Trans., 77:70-74, 1958.

• [Fre82] H. Fredricksen. A survey of full length nonlinear shift register cycle algorithms. Siam Review, 24(2):195-221, 1982.

• [Mar34] M. H. Martin. A problem in arrangements. Bull. Amer. Math. Soc., 40(12):859-864, 1934.

• [SSA20] E. Sala, J. Sawada, A. Alhakim. Efficient constructions of the Prefer-same and Prefer-opposite de Bruijn sequences. arXiv preprint arXiv:2010.07960, 2020.

• [WWW18] X. Wang, D. Wong, and Z. WeiGuo. A simple greedy de Bruijn sequence construction. In Sequences and Their Applications (SETA 2018), 2018.