Re: integer sequence periodicity

*To*: mathgroup at smc.vnet.net*Subject*: [mg39846] Re: [mg39744] integer sequence periodicity*From*: Daniel Lichtblau <danl at wolfram.com>*Date*: Sat, 8 Mar 2003 02:51:30 -0500 (EST)*References*: <200303050505.AAA01294@smc.vnet.net>*Sender*: owner-wri-mathgroup at wolfram.com

Michal Kvasnicka wrote: > > I am looking for best, in the statistical sense, basic (longest but shorter > than length of the whole sequence, of course) period estimation of the > integer sequence. The integer sequence may be corrupted by fading (randomly > missing sequence elements). > The integer sequence with the basic period P has length L and contain every > integer numbers i = 1,2, ..., N at least once. Where N<=P<L. > > In the real application will be L ~ 100-10000, N ~ 3-100 and P ~ 3-100 > > Are there any suitable, robust and efficient algorithms? > > Thanks for any relevant hints, > Michal I don't know about robust but the ideas below are reasonably efficient and may help you to start on this. The quality of these various approaches will very likely depend alot on how the quantity of fading, as well as on statistics of the sequences themselves (e.g. is a basic period highly structured?). (i) You could try matching a piece of the sequence to later pieces and see how far apart they tend to lie. I do not pursue this method below but I would suspect one could use it to advantage. (ii) You could work with Fourier transforms. I illustrate with an example. We create a random period of length 20. data = Table[Random[Integer,2^4-1], {20}]; Form 100 of these. new = Flatten[Table[data, {100}]]; Make the end "ragged" so we do not have an exact multiple of periods. new2 = Join[new, Take[new,4]]; Pick some elements at random to be removed, to simulate fading. leaveout = Table[Random[Integer,Length[new]], {10}]; new3 = Part[new2, Complement[Range[Length[new]], leaveout]]; Now we will take the Fourier transform, then take absolute values. As it happened, I did not much like the result (too many small frequencies, perhaps some form of "smearing"), so I repeated the process. nn = Abs[Fourier[Abs[Fourier[new3]]]]; Let's see how big is the largest component. In[19]:= Max[nn] Out[19]= 100.577 Arbitrarily I threshhold at about 25% of the maximum. The idea is to have large components become 1, small become zero, and check the "run length" at the ends (we will see below that the frequencies in the middle tend to be too small and homogenized). thresh = Map[If[Abs[#]>25,1,0]&, nn]; In[21]:= lens = Map[Length,Split[thresh]] Out[21]= {1, 19, 1, 19, 1, 19, 1, 19, 1, 1829, 1, 19, 1, 19, 1, 19, 1, 19} At this point we just need to pair off the large and small frequencies, look at the ends, sum the pairs, and average. newsplit = Partition[lens,2]; ends = Join[Take[newsplit,2],Take[newsplit,-2]]; <<Statistics` In[24]:= Ceiling[Mean[Map[Apply[Plus,#]&, ends]]] Out[24]= 20 (iii) Count the number of each integer that occurs in your sequence. We expect (due to fading) that the fractions will not be quite correct, but we can get good approximations with low precision that yield small denominators. I illustrate with the same example. In[25]:= lens = Map[Length,Split[Sort[new3]]] Out[25]= {199, 99, 200, 398, 495, 100, 99, 100, 100, 100, 100} We have 199 of one integer, 99 of another, 200 of a third, etc. We replace these with much smaller numbers that yield approximately the same quotients. <<NumberTheory`Rationalize` In[9]:= fracts = ProjectiveRationalize[SetPrecision[lens,2]] Out[9]= {2, 1, 2, 4, 5, 1, 1, 1, 1, 1, 1} To recover what may be the period we now do simple arithmetic. In[10]:= Round[Length[new3]/Mean[lens/fracts]] Out[10]= 20 Note that this can actually give a proper multiple of the actual period. Say, for example, the signal arose from sampling a function that was symmetric e.g. cosine but with a constant dc component to make all values positive. Then we might get the same fractions from only half a period. All the same this method can give a start in obtaining the correct result. Daniel Lichtblau Wolfram Research

**References**:**integer sequence periodicity***From:*"Michal Kvasnicka" <michal.kvasnicka@NoSpam.quick.cz>