Counting Words that are in "Standard Order" Joerg Arndt, Technische Hochschule Nürnberg, Fakultät EFI, Rothenberger Strasse 174, 90439 Nürnberg, Germany and Neil J. A. Sloane, The OEIS Foundation, 11 South Adelaide Avenue, Highland Park, NJ 08904, USA Dec 06 2016 Abstract. --------- A word of length n over an alphabet B = {1,2,3,...,b} is in "standard order" if whenever a letter r (2 <= r <= b) appears, the letter r-1 has already appeared in the word. In this note we determine the number of words that are in standard order, the number that are in standard order and and have at least one repeated letter, and the number that are in standard order and in which every letter that is present appears at least twice. This answers some questions raised by Daniel Devatman Hromada. 0. The three problems. ---------------------- We study words W = (w_1, w_2, ..., w_n) of length n made up of letters from an alphabet B = {1,2,3,...,b} of size b, where b >= 1. There are b^n possible words of length n. We say that a word W = (w_1, w_2, ..., w_n) is in "standard order" if it has the properties that w_1 = 1, and for 2 <= i <= n, if w_i = x then w_j = x-1 for some j < i. In other words, x > 1 can only appear in a word if x-1 has already appeared. We determine the answers to the following questions. Q1: How many words of length n over an alphabet of size b are in standard order? Q2: Same as Q1, but with the additional restriction that some letter in the word appears more than once Q3: Same as Q1, but with the additional restriction that every letter that appears in the word must appear more than once. These questions were raised by Daniel Devatman Hromada in [1] and in two sequences (A273977 and A273978) submitted to the OEIS [2] in November 2016. The answers will involve many other sequences from [2], several of them new. Ther answers are given below in Sections 1 through 3, and Sections 4 through 6 contain tables of the results. 1. The answer to question Q1. ----------------------------- Recall that a "set partition" [3] of n things labeled 1 though n is a way of splitting up these numbers into heaps. For example, a set partition of {1,2,3,4,5,6} might be {1,5}, {2,3,6}, {4}. In this example there are three heaps. The number of set partitions of n things into b heaps is the Stirling number of the second kind, Stirling_2(n,b). See entry A008277 in the OEIS , where there is a great deal of information about these numbers. There is a simple way to change a set partition into a word. Put the heaps in lexicographic order, as we already did here: {1,5}, {2,3,6}, {4}. This means we sort the contents of each heap into increasing order, and then sort the heaps according to the smallest number in each heap. We get a word of length n over an alphabet of size b by writing 1 in the positions listed in the first heap, 2's in the positions listed in the second heap, 3's in the positions listed in the 3rd heap, and so on. In the example the alphabet has size b = 3, and the word is 122312, of length n = 6. Because the set partition has been sorted, the word is automatically in standard order. So the answer to Q1 is The number of words of length n over an alphabet of size b that are in standard order is Sum_{j = 1..b} Stirling_2(n,j). 2. The answer to question Q2. ----------------------------- The calculation in Section 1 over-counts the answer to Q2, because it includes those words in standard order in which no symbol is repeated. These are the words 1, 12, 123, 1234, ..., 1234...b. There is one of each length from 1 to b, and we must exclude them from the total. So the answer to Q2 is The number of words of length n over an alphabet of size b that are in standard order and in which at least one symbol appears more than once is Sum_{j = 1..b} Stirling_2(n,j), except we must subtract 1 if and only if n <= b. 3. The answer to question Q3. ----------------------------- Now we require that all heaps must have size at least 2. To answer this, we make use of what are called the "associated Stirling numbers of the second kind", which are described in entry A008299 in the OEIS. The mapping between heaps and words is exactly the same as in Section 1. The number of set partitions of n things into b heaps all of size at least 2 is the associated Stirling number of the second kind, AStirling_2(n,b) (or A008299(n,b)). So the answer to Q3 is The number of words of length n over an alphabet of size b that are in standard order and in which every symbol that appears in a word must appear more than once is Sum_{j = 1..b} AStirling_2(n,j). This answers the three questions. There are also two important related sequences. The sum of all Stirling numbers Stirling_2(n,j) for j = 1 through n is the Bell number A000110(n). The sum of all associated Stirling numbers AStirling_2(n,j) for j = 1 through n is the "set partitions without singletons" number A000296(n). 4. Numerical results for question Q1. ------------------------------------- The answers to Q1 are summarized in the array A278984. Maple code: f1(n,b) is the number of solution for given n and b. Q1(b) gives a list of the numbers of solutions for given value of b. with(combinat); f1:=proc(L,b) local t1;i; t1:=add(stirling2(L,i),i=1..b); end: Q1:=b->[seq(f1(L,b), L=1..20)]; for b from 1 to 6 do lprint(Q1(b)); od: 1,.1,..1,...1,...1,...1,...1,....1..; b=1, A000012 1,.2,..4,...8,..16,..32,..64,..128..; b=2, A000079 1,.2,..5,..14,..41,.122,.365,.1094..; b=3, A007051 1,.2,..5,..15,..51,.187,.715,.2795..; b=4, A007581 1,.2,..5,..15,..52,.202,.855,.3845..; b=5, A056272 1,.2,..5,..15,..52,.203,.876,.4111..; b=6, A056273 ... Rows 1 through 10 of this array are A000012, A000079, A007051, A007581 (or A124303), A056272, A056273, A099262, A099263, A164863, A164864. As b increases, the rows converge to sequence A000110, the Bell numbers. In fact the first b terms of row b are the same as the first b terms of A000110. The words with b=3 are listed in A278985. 5. Numerical results for question Q2. ------------------------------------- The answers to Q2 are summarized in the array A278986. Maple code: f2(n,b) is the number of solution for given n and b. Q2(b) gives a list of the numbers of solutions for given value of b. with(combinat); f2:=proc(L,b) local t1;i; t1:=add(stirling2(L,i),i=1..b); if L <= b then t1:=t1-1; fi; t1; end; Q2:=b->[seq(f2(L,b), L=1..20)]; for b from 1 to 6 do lprint(Q2(b)); od: 0,.1,..1,...1,...1,...1,...1,....1..; b=1, 0,.1,..4,...8,..16,..32,..64,..128..; b=2, 0,.1,..4,..14,..41,.122,.365,.1094..; b=3, 0,.1,..4,..14,..51,.187,.715,.2795..; b=4, 0,.1,..4,..14,..51,.202,.855,.3845..; b=5, 0,.1,..4,..14,..51,.202,.876,.4111..; b=6, ... The words with b=9 are listed in Daniel Devatman Hromada's sequence A273977. 6. Numerical results for question Q3. ------------------------------------- The answers to Q3 are summarized in the array A278987. Maple code: f3(n,b) is the number of solution for given n and b. Q3(b) gives a list of the numbers of solutions for given value of b. A008299 := proc(n,k) local i,j,t1; if k<1 or k>floor(n/2) then t1:=0; else t1 := add( (-1)^i*binomial(n, i)*add( (-1)^j*(k - i - j)^(n - i)/(j!*(k - i - j)!), j = 0..k - i), i = 0..k); fi; t1; end; f3:=proc(L,b) global A008299; local i; add(A008299(L,i),i=1..b); end; Q3:=b->[seq(f3(L,b),L=1..40)]; for b from 1 to 6 do lprint(Q3(b)); od: 0,.1,.1,.1,..1,..1,...1,...1,....1,.....1,.....1,......1,.......1,........1,.... b=1 0,.1,.1,.4,.11,.26,..57,.120,..247,...502,..1013,...2036,....4083,.....8178,.... b=2 0,.1,.1,.4,.11,.41,.162,.610,.2165,..7327,.23948,..76352,..239175,...739909,.... b=3 0,.1,.1,.4,.11,.41,.162,.715,.3425,.16777,.80928,.379347,.1726375,..7654817,.... b=4 0,.1,.1,.4,.11,.41,.162,.715,.3425,.17722,.98253,.569922,.3363010,.19776927,.... b=5 0,.1,.1,.4,.11,.41,.162,.715,.3425,.17722,.98253,.580317,.3633280,.23876022,.... b=6 Rows 1 through 4 of the array are A000012, A000295 (or A130103), A278988, A278989. As b increases, the rows approach A000296, the "set partitions without singletons" numbers. In fact the first 2b+1 terms of row b are the same as the first 2b+1 terms of A000296. The words with b=9 are listed in Daniel Devatman Hromada's sequence A273978. References. ----------- [1] D. D. Hromada, Integer-based nomenclature for the ecosystem of repetitive expressions in complete works of William Shakespeare, submitted to special issue of Argument and Computation on Rhetorical Figures in Computational Argument Studies, 2016. [2] The On-Line Encyclopedia of Integer Sequences (or OEIS), https://oeis.org/ [3] R. P. Stanley, Enumerative Combinatorics, Volume 1, 2nd ed., Cambridge 2012. -------------- Concerned with sequences A000012, A000079, A000110, A000295, A000296, A007051, A007581, A008277, A008299, A056272, A056273, A099262, A099263, A124303, A130103, A164863, A164864, A273977, A273978, A278984, A278985, A278986, A278987, A278988, A278989