De Bruijn Sequence and Universal Cycle Constructions

Cutdown Output
 Order $n$ Symbols $k$ 2 3 4 5 6 7 8 9 Cycle length $m$
Cut-down DB sequences

For some applications, a universal cycle of arbitrary length $1 \leq m \leq k^n$ is required such that every $k$-ary string of length $n$ appears at most once. For instance, consider trying to apply the de Bruijn Card Trick on a complete deck of 52 cards. We call such cycles cut-down de Bruijn sequences. Such universal cycles are known to exist for all $m$ and $k$ [Lem71,Yoe62], however only a few efficient constructions are known.

A cycle-joining based construction for $k=2$ can produce an exponential number of cut-down DB sequences [Etz86]. The construction can be simplified to construct a single sequence by applying PCR3-alt as the underlying feedback function to join cycles, then generalized for $k\geq 2$ to generate a single cut-down DB sequence that runs in $O(n)$ time per symbol using $O(n)$ space [CGS21]. The resulting algorithms are available for download and are used in the generation above. The cycle-joining based construction can be summarized as follows, assuming $k^{n-1} < m \leq k^n$.

$k$-ary cut-down algorithm (cycle-join)

1. Construct a main cycle $C$ that has length $m+j$ where $0 \leq j < n$.
2. Cut out at most two small substrings (corresponding to small cycles in the related de Bruijn graph) from $C$ whose combined length totals $j$.
By starting with the string $0^{n-1}x$, where in most cases $x=(k{-}1)$, and applying the PCR3-alt as the underlying successor, the main cycle can be constructed by cutting out cycles based on their weight and period to obtain a length that meets the bounds. The small cycle(s) can be cut out by modifying the output given by PCR3-alt for at most two strings. The choice of PCR3-alt as the underlying successor is what leads to the simplicity of this construction. By applying a suitable ranking function, this method can be employed to join specific cycles which allows for a successor-rule based approach to create the corresponding cut-down de Bruijn sequence starting from any string on the cycle.

A simple LFSR based construction, as given by Golomb [Gol17, P.193], is summarized below, assuming $m$ is not a power of 2.

Binary cut-down algorithm (LFSR)

1. Create an LFSR with feedback function $f$ based on a primitive polynomial of order $n$ where $2^{n-1} < m < 2^n$.
2. Seed the LFSR with $\alpha = 0^{n-1}1$ and iterate $L$ times to get string $\beta$.
3. Simultaneously run two copies of the LFSR, one seeded with $\alpha$ and one with $\beta$, until the registers contain a conjugate pair; the registers match in the last $n{-}1$ positions.
4. Output the next $m$ bits produced by LFSR that started with $\alpha$.
Example

Consider $m=21$ and $n=5$. The primitive polynomial $1 + x^2 + x^5$ corresponds to the feedback function $f(a_1a_2a_3a_4a_5) = a_1 + a_4$. When the LFSR based on this feedback function is initialized with $\alpha = 00001$ the following bits are produced in the next 21 iterations: $$\underline{00001}~0101110110001111\underline{10011}$$ which yields $\beta = 10011$. Applying the LFSR starting with both $\alpha$ and $\beta$ respectively: $$\mbox{Start with \alpha: } 0000\underline{1010}$$ $$\mbox{Start with \beta: } 1001\underline{1010}$$ until the final $n{-}1=4$ positions match. Continue iterating the LFSR seeded with $\alpha$; the next 21 symbols form the cut-down de Bruijn sequence of length $m=21$: $$01010~\underline{111011000111110011010}.$$

This construction requires $O(n)$ space and produces the sequence in $O(n)$-amortized time per symbol. However unlike the first algorithm, this algorithm requires an exponential time delay to produce the first symbol of the sequence; a unique primitive polynomial is required for each order $n$ to obtain the feedback function; and we are unable to determine whether or not a length $n$ string is a member of the constructed cut-down de Bruijn sequence without constructing the sequence.

An LFSR approach can also be applied to construct cut-down de Bruijn sequences over an arbitrary alphabet size $k > 2$ [Lan00], though no algorithmic analysis has been done.

A generalized de Bruijn sequence is a cut-down de Bruijn sequence with length $k^{n-1} < m \leq k^n$ with an additional property: every $k$-ary string of length $n{-}1$ appears as a substring. Their existence is known for all $m$ and $k$ [GHS19]. An algorithm based on Lempel's D-morphism can be used to construct these sequences [NW21] that have an even stronger property: every $k$-ary string of length $j\leq L$ appears either $\lfloor L/k^j \rfloor$ or $\lceil L/k^j \rceil$ times as a substring. The algorithm can generate the sequence in $O(1)$-amortized time per symbol, but requires exponential space and has an exponential time delay before outputting the first symbol.

References

• [CGS21] B. Cameron, A. Gündoǧan, and J. Sawada. Cut-down de Bruijn sequences. Manuscript, 2021.

• [Etz86] T. Etzion. An algorithm for generating shift-register cycles. Theoret. Comput. Sci., 44(2):209-224, 1986.

• [GHS19] D. Gabric, S. Holub, and J. Shallit. Maximal state complexity and generalized de Bruijn words. Information and Computation, (in press), 2021.

• [Gol17] S. W. Golomb. Shift Register Sequences. World Scientific, Singapore, 2017.

• [Lan00] M. Landsberg. Feedback functions for generating cycles over a finite alphabet. Discrete Mathematics, 219(1):187–194, 2000.

• [Lem71] A. Lempel. $m$-ary closed sequences. Journal of Combinatorial Theory, Series A, 10(3):253-258, 1971.

• [NW21] A. Nellore and R. Ward. Arbitrary-length analogs to de Bruijn sequences, arXiv:2108.07759, 2021.

• [Yoe62] M. Yoeli. Binary ring sequences. The American Mathematical Monthly, 69(9):852– 855, 1962.