%I #32 Aug 21 2017 03:09:44
%S 0,1,3,5,8,12,16,21,27,34,42,50,59,69,80,92,105,119,134,150,166,183,
%T 201,220,240,261,283,306,330
%N Maximal number of distinct nonempty substrings of any binary string of length n.
%C It would be more natural to include the empty substring, which would result in the sequence b(n)=a(n)+1; all values computed so far confirm the conjecture that a(n)+1 = A006697(n). - _M. F. Hasler_, Dec 17 2007
%C In addition to the example given, computation finds 17 other binary strings of length 6 which contain the maximal number a(6)=16 of distinct substrings. Interestingly, each of the 18 such instances gives the same numbers of substrings of each possible length, in this case 2,4,4,3,2,1 substrings of lengths 1 through 6, respectively.
%C Calculations suggest that, for any n>0, binary strings of length n exist such that the number of distinct substrings of length k, 1<=k<=n, is as large as possible consistent with basic counting principles, i.e., n-k+1 substrings of length k starting at each of the n-k+1 possible starting positions, subject to the condition, however, that there cannot be more than 2^k distinct binary strings of length k. This suggests the following conjecture. Conjecture: The maximum number a(n) of distinct substrings of a binary string of length n is given by a(n) = Sum_{k=1..n} t(k) where t(k) is the smaller of 2^k and n-k+1.
%C Conjecture: a(n) = A006697(n)-1. - _Vladeta Jovovic_, Sep 19 2005
%C Conjecture: a(n) = 2^(m+1)-2 + (n-m)(n-m+1)/2, where m = floor(n+1-LambertW( 2^(n+1) * log(2) ) / log(2) ) = integer part of the solution to 2^x = n+1-x. This is based on the reasoning that you can construct the word of length n so that it contains the maximal possible number, max( n-k+1, 2^k ), of different substrings of length k. - _M. F. Hasler_, Dec 17 2007
%C From Peter Pein (petsie(AT)dordos.net), Jan 18 2008: (Start)
%C The following heuristic seems to work for computing the maximal number of distinct substrings of a binary string of length n.
%C Start with the empty list and at each step try to insert a zero or a one at each possible position. Then pick the first list with the maximal number of sublists and start over.
%C Say we have had {0,0,1,1,0} as one of the lists with the maximal number of sublists. Then my candidates for the next test are:
%C With[{lastbest = {0, 0, 1, 1, 0}}, Union[Flatten[ Outer[Insert[lastbest, #2, #1] &, Range[1 + Length[lastbest]], {0, 1}, 1], 1]]]
%C {{0, 0, 0, 1, 1, 0}, {0, 0, 1, 0, 1, 0}, {0, 0, 1, 1, 0, 0}, {0, 0, 1, 1, 0, 1}, {0, 0, 1, 1, 1, 0}, {0, 1, 0, 1, 1, 0}, {1, 0, 0, 1, 1, 0}}
%C See http://freenet-homepage.de/Peter_Berlin/Mathe/heuristicA094913.nb for the Mathematica notebook with the complete (simple) algorithm. There's a screenshot too (same URL but with .png instead of .nb).
%C If this works correctly, the first 100 values of A094913 are calculated in 30 secs.
%C (End) [See also the remarks in the Fuller link in A134457.]
%C From _Jon Perry_, Nov 16 2010: (Start)
%C Conjecture: column sums of:
%C 1 3 5 7 9 11 13 ...
%C 1 3 5 7 ...
%C 1 ...
%C ....................
%C --------------------
%C 1 3 5 8 12 16 21 ... (End)
%H The history of this problem is given in a <a href="http://www.combinatorics.org/ojs/index.php/eljc/article/view/v11i1r8/comment">comment</a> in the Electronic Journal of Combinatorics. Some of the conjectures above are proven in the papers cited.
%H J.-P. Allouche, J. Shallit, <a href="http://arxiv.org/abs/1605.02361">On the subword complexity of the fixed point of a -> aab, b -> b, and generalizations</a>, arXiv preprint arXiv:1605.02361 [math.CO], 2016.
%H Abraham Flaxman, Aram W. Harrow, Gregory B. Sorkin, <a href="http://www.combinatorics.org/ojs/index.php/eljc/article/view/v11i1r8">Strings with Maximally Many Distinct Subsequences and Substrings</a>, Electronic J. Combinatorics 11 (1) (2004), Paper R8.
%H Antal Iványi, <a href="http://compalg.inf.elte.hu/~tony/Kutatas/PerfectArrays/On_the_d-complexity.pdf">On the d-complexity of words</a>, Ann. Univ. Sci. Budapest. Sect. Comput. 8 (1987), 69-90 (1988).
%H Jeffrey Shallit, <a href="https://cs.uwaterloo.ca/~shallit/Papers/m0.ps">On the maximum number of distinct factors in a binary string</a>, Graphs Combin. 9 (1993), no. 2, 197-200.
%e The binary string 000110 of length 6 contains the 16 distinct substrings 0, 1, 00, 01, 11, 10, 000, 001, 011, 110, 0001, 0011, 0110, 00011, 00110, 000110 and a computer search shows that no other binary string of length 6 contains more than 16. Thus a(6)=16.
%e G.f. = x + 3*x^2 + 5*x^3 + 8*x^4 + 12*x^5 + 16*x^6 + 21*x^7 + 27*x^8 + 34*x^9 + ...
%t A103354[n_] := Floor[ FullSimplify[ ProductLog[ 2^n*Log[2]]/Log[2]]]; Accumulate[ Table[ A103354[n], {n, 1, 29}]] - 1 (* _Jean-François Alcover_, Dec 13 2011, after _M. F. Hasler_ *)
%Y Cf. A006697, A134457, A134466.
%K nonn,nice
%O 0,3
%A _John W. Layman_, Jun 17 2004
%E a(19)-a(28) from _David W. Wilson_, Dec 17 2007
%E More references (some proving some conjectures) added by _Jeffrey Shallit_, Nov 21 2015