De Bruijn Sequence and Universal Cycle Constructions

Maximum discrepancy Output
Order $n$
Properties of binary DB sequences =

As articulated in [Gol81], binary DB sequences:

  • are balanced: they contain the same number of 0s and 1s;
  • satisfy a run property: there are an equal number of contiguous runs of 0s and 1s of the same length in the sequence,
  • satisfy a span-$n$ property: they contain every distinct length $n$ binary string as a substring.
Despite these properties, many DB sequences display other properties that are far from random. For instance, a DB sequence known to be generated by a linear feedback shift register can be completely determined after reading only $2n$ bits [Mas69]. In the greedy Prefer-largest construction, as one would expect, the resulting DB sequence has a much higher ratio of 1s to 0s at the start of the sequence. One measure that accounts for this is the discrepancy: the maximum absolute difference between the number of 0s and 1s in any substring of a given sequence. For instance, the discrepancy in the DB sequence \[\underline{1111110111100111010111}000110110100110010110000101010001001000000 \] is $|17-5|=12$ as witnessed by the underlined substring. The sequences generated by this binary Prefer-largest greedy construction are known to have discrepancy $\Theta(\frac{2^n\log n}{n})$ [CH10].

Discrepancies of binary DB sequences

The DB sequences with the smallest discrepancy are based on the complementing cycling register (CCR). It is easy to prove that the two CCR concatenation constructions in [GS18], which correspond to the shift-rules CCR2 and CCR3, have discrepancy less than $2n$. This is because the maximum discrepancy in any single CCR cycle is $n$ [GS17]. Experimentally, the shift rule by Huang [H90] demonstrates slightly smaller discrepancy, though no formal analysis has been given.

The greedy Prefer-same and Prefer-opposite, along with the lexicographic composition concatenation approach all appear to have discrepancies of $O(n^2)$. The table below summarizes their discrepancies along with the four shift rules based on the CCR.

          

The DB sequences with the maximal discrepancy can be obtained by considering weight (density) ranges. By taking a universal cycle $U_1$ for all binary strings with weight from 0 to $\lceil n/2 \rceil -1$ and joining it with a universal cycle $U_2$ for all binary strings with weights from $\lceil n/2 \rceil$ to $n$, the discrepancy will be close to the discrepancy of $U_1$, which is ${n-1 \choose \lfloor n/2 \rfloor}$. It is conjectured that the maximal discrepancy for any DB sequence is ${n-1 \choose \lfloor n/2 \rfloor} + \lfloor n/2 \rfloor$, which by Stirling's approximation is $\Theta(\frac{2^n}{\sqrt{n}})$. This is exactly the discrepancy obtained by this construction, which is available for generation and download above [GS20]. The table below shows the discrepancy of this sequence, along with the cool-lex concatenation construction which is also based on weight ranges. The table also illustrates the discrepancies for the shift rules PCR1 (Granddaddy), PCR2 (Grandmama), PCR3, and PCR4.

          

The "Random" column is obtained by taking the average discrepancy of 10000 random sequences of length $2^n$. Such a random sequence is expected to have discrepancy $\Theta(2^{n/2} \sqrt{\log n})$. It remains an open problem to define/construct a DB sequence that is provably close to these random values. Experimentally, the shift-rule PCR4 constructs a DB sequence that is closest to the expected discrepancy of a random sequence.

Download source code

» Weight range C program

References

  • [CH10] J. Cooper and C. Heitsch. The discrepancy of the lex-least de Bruijn sequence. Discrete Math., 310(6-7):1152-1159, 2010.

  • [Gol81] S. W. Golomb. Shift Register Sequences. Aegean Park Press, Laguna Hills, CA, USA, 1981.

  • [GS17] D. Gabric and J. Sawada. A de Bruijn sequence construction by concatenating cycles of the complemented cycling register. In Combinatorics on words, volume 10432 of Lecture Notes in Comput. Sci., pages 49-58. Springer, Cham, 2017.

  • [GS18] D. Gabric and J. Sawada. Constructing de Bruijn sequences by concatenating smaller universal cycles. Theoret. Comput. Sci., 743:12-22, 2018.

  • [GS20] D. Gabric and J. Sawada. Investigating the discrepancy property of de Bruijn sequences. Manuscript, 2020.

  • [Mas69] J. Massey. Shift-register synthesis and BCH decoding. IEEE Transactions on Information Theory, 15(1):122–127,1969.