This site is supported by donations to The OEIS Foundation.

User:Charles R Greathouse IV/Keywords/difficulty

From OeisWiki
Jump to: navigation, search

The OEIS has two keywords marking the difficulty of the sequence: easy and hard. Generally, an easy sequence is one where one could make 'as many terms as desired' without difficulty, while a hard sequence is one that cannot reasonably be extended at all (but which is not known to be complete). No easy sequence is hard; no hard sequence is easy. Most are neither.

What makes a sequence "easy"?

It's hard to decide what the cutoff should be between easy and noneasy sequences. I think that there are some classes of sequences that are inherently easy: constant sequences like A000012, or more generally sequences with rational generating functions (for which the g.f. is known, of course). Decimal (or continued fraction, etc.) expansions for algebraic constants probably also fall into this category.

My working definition for others sequences: a sequence is easy if it is possible to produce 1 MB of the sequence "quickly", say at most a minute, on a reasonable system. (I test with gp's sizebyte; other methods and metrics are of course possible.) This allows fast-growing sequences like A002416 to be classified as easy, even though generating a 10,000-term b-file might actually be time-consuming (it could fill a small hard drive).

More abstractly, a sequence which can be produced in time polynomial in the number of terms requested is probably easy, as is a sequence whose output can be produced in time linear in the output length.

It's also worth noting that this keyword relates not just to the sequence itself but the present level of (primarily mathematical) technology: A004434 (for example) would not have been considered easy before Bateman-Hildebrand-Purdy. For sequences that are borderline, being able to efficiently test numbers for membership and/or calculate large isolated values tips the scales in favor of easy. On the other hand, if generating barely a megabyte requires very sophisticated programs to get down to a minute than the sequence may not be easy.

This test has limitations: For example, the finite sequence "a(1) = 10^1048576-1, a(2) = 1 if the Riemann hypothesis is true or 0 otherwise" should not be easy (in fact it should have keyword:hard) even though the first megabyte is easy to generate. In all cases common sense should decide unusual cases; the megabyte test is useful only to make a quick decision for cases that seem like they might be easy or not.

What makes a sequence "hard"?

Any sequence which can be extended only by new ideas, rather than more computation deserves keyword:hard. Similarly, if computing a term of the sequence would probably merit a paper in a peer-reviewed journal (discussing the result, the algorithm, etc.); for example, A007508 or A011541. A sequence like A000043 which is the target of a distributed search or computation should of course have the keyword as well. Sequences which can be extended only by resolving open conjectures published in peer-reviewed journals are hard as well.

The usage of the keyword has changed over time. Many sequences now marked "hard" are primes in exponential sequences or the like, where serious brute force could extend the sequence by a term or two. I think that this is a positive change: it's worthwhile to mark these sequences this way. For example, it would make a good cross-check to a Plouffe-type program attempting to find formulas for sequences. A purported closed-form formula for a "hard" sequence is most likely wrong (or at least difficult to prove, like A186684). Following that it would be interesting to determine how much work merits the keyword: surely at least a GHz-day, but probably 100× more.

Just because a sequence is exponential in some way or counts objects in an exponential range does not mean that the sequence is hard. For example, if the terms can be computed more efficiently (with dynamic programming, modular forms, theta series, or other techniques) the sequence should not be considered hard. Sequences with this keyword are supposed to be really hard, almost beyond reach; see N. J. A. Sloane, [1], posting to SeqFan on Mar 25 2012.

Like with keyword:easy this is technology-dependent. If an amazing breakthrough makes a previously hard problem routine the keyword can be lost. If a sufficiently powerful (perhaps 50,000 clean qubits) quantum computer was created, perhaps many factorization-related sequences should not be considered hard any more.

More

The keyword "more" represents a request for more terms of a sequence. Generally, this should be applied to neither sequences with keyword:easy (if it's easy the author should find and add the terms) nor keyword:hard (for which extension is always desired). But many hard sequences are tagged with more, which is fine.