De Bruijn Sequence and Universal Cycle Constructions

Binary Successor Rule Output
Algorithm
Order $n$
Constructing binary DB sequences via successor rules derived from cycle joining

Let $\mathbf{B}(n)$ denote the set of binary strings of length $n$. A function $f : \mathbf{B}(n) \rightarrow \{0,1\}$ is a DB-successor if there exists a DB sequence $\mathcal{D}$ such that each string $\alpha = a_1a_2\cdots a_n \in \mathbf{B}(n)$ is followed by the symbol $f(\alpha)$ in $\mathcal{D}$. Each DB-successor is based on an underlying feedback function which partitions the de Bruijn graph $G(n{-}1,k)$ into a factor of small cycles. Then, simple parent-rules are applied to joining the small cycles together via conjugate pairs (two length $n$ strings differing in the first symbol) to construct a cycle-joining tree. This approach is similar to Hierholzer's algorithm for generating an Euler cycle. As an example, the following figure illustrates four PCR-based cycle-joining trees created via simple parent rules, where the nodes (necklaces) are cyclic.

          

Each corresponding DB successor (outlined in the next section) will run in $O(n)$ time per symbol generated using $O(n)$ space; however, there is also a corresponding concatenation tree approach that allows for a more efficient generation of the corresponding DB sequence. Below, we provide DB-successors derived from the cycle-joining approach applied to simple feedback functions.

DB-successors based on the PCR

Using the PCR as the underlying feedback function to initially partition $\mathbf{B}(n)$ into smaller cycles, the following four simple DB-successors are obtained [GSWW18]. These rules test whether or not a string is a necklace, which is a string that is lexicographically smallest amongst all its rotations. This can be done in $O(n)$ time. The successor rule for the Granddaddy [Fre72] corresponds to the lexicographically smallest binary DB sequence that can also be constructed by the Prefer-smallest greedy approach and a concatenation construction. The Grandmama [DHS+18] can also be constructed via a concatenation construction.

PCR1 Granddaddy DB-successor (Last 0)

      $f(\alpha) = \left\{ \begin{array}{ll} \overline{a}_1 &\ \ \mbox{if $\color{blue}{a_ja_{j+1}\cdots a_n 0 1^{j-2}}$ is a necklace;}\\ {a_1} \ &\ \ \mbox{otherwise,}\end{array} \right.$

where $j$ is the smallest index of $\alpha$ such that $a_j = 0$ and $j > 1$, or $j=n{+}1$ if no such index exists.

PCR2 Grandmama DB-successor (First 1)

      $f(\alpha) = \left\{ \begin{array}{ll} \overline{a}_1 &\ \ \mbox{if $\color{blue}{0^{n-j}1a_2\cdots a_j}$ is a necklace;}\\ {a_1} \ &\ \ \mbox{otherwise,}\end{array} \right.$

where $j$ is the largest index of $\alpha$ such that $a_j = 1$, or $j=0$ if no such index exists.

The following two successors are perhaps the simplest to state DB-successors. The PCR3 DB-successor [SWW16b] is a special case of a more generic DB-successor [JFB91].

PCR3 Granny DB-successor (Last 1)

      $ f(\alpha) = \left\{ \begin{array}{ll} \overline{a}_1 &\ \ \mbox{if $\color{blue}{a_2a_3\cdots a_n1}$ is a necklace;}\\ {a_1} \ &\ \ \mbox{otherwise.}\end{array} \right.$

PCR4 Grandpa DB-successor (First 0)

      $f(\alpha) = \left\{ \begin{array}{ll} \overline{a}_1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ is a necklace;}\\ {a_1} \ &\ \ \mbox{otherwise.}\end{array} \right.$

DB-successors based on the CCR

Using the CCR as the underlying feedback function, the following three simple DB-successors are obtained [GSWW18]. A string $\gamma$ is a co-necklace if $\gamma \overline{\gamma}$ is a necklace.

CCR1 DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 &\ \ \mbox{if $\color{blue}{a_ja_{j+1}\cdots a_n 1 0^{j-2}}$ is a co-necklace;}\\ \overline{a}_1 \ &\ \ \mbox{otherwise.}\end{array} \right.$

where $j$ is the smallest index of $a_2a_3\cdots a_n$ such that $a_j = 0$, or $j=n{+}1$ if no such index exists.

CCR2 DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 &\ \ \mbox{if $\color{blue}{0^{n-j}1\overline{a}_2 \overline{a}_3\cdots \overline{a}_{j}}$ is a co-necklace;}\\ \overline{a}_1 \ &\ \ \mbox{otherwise.}\end{array} \right.$

where $j$ is the largest index of $\alpha$ such that $a_j = 1$, or $j=n$ if no such index exits

CCR3 DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 &\ \ \mbox{if $\color{blue}{a_2 a_3 \cdots a_{n}0}$ is a co-necklace not equal to $0^n$;}\\ \overline{a}_1 \ &\ \ \mbox{otherwise.}\end{array} \right.$

Like PCR3, the CCR3 DB-successor is an efficient application of the more generic rule by [JFB91]. The DB-successor given by Huang [Hua90] is also based on the CCR, however, it is more complicated. Etzion [Etz87] also presented a DB-successor based on the CCR, but it is also more complex and has yet to be implemented.

DB-successors based on the PRR

By considering the PRR as the underlying feedback function, the underlying structure of the greedy Prefer-same and Prefer-opposite constructions have been unveiled [SSA20]. These sequences are the lexicographically largest and smallest sequences, respectively, with respect to their run-length encoding [ASS21]. Below are two additional constructions related to each sequence. They are used as building blocks to defining the successors. Each of the following six DB-successors can be implemented in $O(n)$ time per symbol generated using $O(n)$ space.

Prefer-same: Consider the following three representatives for equivalence classes induced by the PRR, where a string is said to be special if it begins and ends with 0 and has run-length encoding of the form $(21^{2x})^y1^z, \mbox{where $x \geq 0$, $y\geq 2$, and $z\geq 2$}.$

  • RL-rep: The string with the lexicographically largest run-length encoding; if there are two such strings it is the one beginning with 1.

  • LC-rep: The string $\omega$ such that $t{+}1$ applications of the PRR starting with $\omega$ yields the RL-rep, where $t$ is the number of consecutive 1s at the end of the run-length encoding of the RL-rep.

  • same-rep: $\left\{ \begin{array}{ll} \mbox{RL-rep} &\ \ \mbox{if the RL-rep is special}\\ \mbox{LC-rep}\ &\ \ \mbox{otherwise.}\end{array} \right.$

PRR-RunLength DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 \oplus a_2 \oplus a_n \oplus 1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ or $\color{blue}{1a_2 a_3 \cdots a_{n}}$ is an RL-rep;}\\ a_1 \oplus a_2 \oplus a_n\ &\ \ \mbox{otherwise.}\end{array} \right.$

The following successor can be used to construct the lexicographic composition concatenation construction [FK77].

PRR-LexComp DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 \oplus a_2 \oplus a_n \oplus 1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ or $\color{blue}{1a_2 a_3 \cdots a_{n}}$ is an LC-rep;}\\ a_1 \oplus a_2 \oplus a_n \ &\ \ \mbox{otherwise.}\end{array} \right.$

PRR-PrefSame DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 \oplus a_2 \oplus a_n \oplus 1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ or $\color{blue}{1a_2 a_3 \cdots a_{n}}$ is a same-rep;}\\ a_1 \oplus a_2 \oplus a_n \ &\ \ \mbox{otherwise.}\end{array} \right.$


Prefer-opposite: Consider the following three representatives for equivalence classes induced by the PRR, where a string is said to be special2 if it begins with 10 and has run-length encoding of the form $1x^t y, \mbox{where $t$ is odd and $y>x$}.$

  • RL2-rep: The string with the lexicographically smallest run-length encoding; if there are two such strings it is the one beginning with 0.

  • LC2-rep: The string $\omega$ such that $t$ applications of the PRR starting with $\omega$ yields the RL2-rep, where $t$ is last value in the run-length encoding of the RL2-rep.

  • opposite-rep: $\left\{ \begin{array}{ll} \mbox{RL2-rep} &\ \ \mbox{if the RL2-rep is special2}\\ \mbox{LC2-rep}\ &\ \ \mbox{otherwise.}\end{array} \right.$

PRR-RunLength2 DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 \oplus a_2 \oplus a_n \oplus 1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ or $\color{blue}{1a_2 a_3 \cdots a_{n}}$ is an RL2-rep;}\\ a_1 \oplus a_2 \oplus a_n\ &\ \ \mbox{otherwise.}\end{array} \right.$

PRR-LC2 DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 \oplus a_2 \oplus a_n \oplus 1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ or $\color{blue}{1a_2 a_3 \cdots a_{n}}$ is an LC2-rep;}\\ a_1 \oplus a_2 \oplus a_n \ &\ \ \mbox{otherwise.}\end{array} \right.$

PRR-PrefOpposite DB-successor

      $f(\alpha) = \left\{ \begin{array}{ll} a_1 \oplus a_2 \oplus a_n \oplus 1 &\ \ \mbox{if $\color{blue}{0a_2 a_3 \cdots a_{n}}$ or $\color{blue}{1a_2 a_3 \cdots a_{n}}$ is an opposite-rep;}\\ a_1 \oplus a_2 \oplus a_n \ &\ \ \mbox{otherwise.}\end{array} \right.$

References

  • [ASS21] A. Alhakim, E. Sala, and J. Sawada. Revisiting the Prefer-same and Prefer-opposite de Bruijn sequence constructions. Theoretical Computer Science, 852:73–77, 2021.

  • [DHS+18] P. B. Dragon, O. I. Hernandez, J. Sawada, A. Williams, and D. Wong. Constructing de Bruijn sequences with co-lexicographic order: the $k$-ary Grandmama sequence. European J. Combin., 72:1-11, 2018.

  • [Etz87] T. Etzion. Self-dual sequences. J. Combin. Theory Ser. A, 44(2):288-298, 1987.

  • [Fre72] H. Fredricksen. Generation of the Ford sequence of length $2n$; $n$ large. J. Combinatorial Theory Ser. A, 12:153-154, 1972.

  • [FK77] H. Fredricksen and I. Kessler. Lexicographic compositions and de Bruijn sequences. J. Combinatorial Theory Ser. A, 22(1):17-30, 1977.

  • [GSWW18] D. Gabric, J. Sawada, A. Williams, and D. Wong. A framework for constructing de Bruijn sequences via simple successor rules. Discrete Math., 341(11):2977-2987, 2018.

  • [Hua90] Y. J. Huang. A new algorithm for the generation of binary de Bruijn sequences. J. Algorithms, 11(1):44-51, 1990.

  • [JFB91] C. J. A. Jansen, W. G. Franx, and D. E. Boekee. An efficient algorithm for the generation of de Bruijn cycles. IEEE Trans. Inform. Theory, 37(5):1475-1478, 1991.

  • [SSA20] E. Sala, J. Sawada, A. Alhakim. Efficient constructions of the Prefer-same and Prefer-opposite de Bruijn sequences. arXiv preprint arXiv:2010.07960, 2020.

  • [SWW16b] J. Sawada, A. Williams, and D. Wong. A surprisingly simple de Bruijn sequence construction. Discrete Math., 339(1):127-131, 2016.