MathGroup Archive 2010

Re: Best method to break apart a data set

  • To: mathgroup at smc.vnet.net
  • Subject: [mg113888] Re: Best method to break apart a data set
  • From: Ray Koopman <koopman at sfu.ca>
  • Date: Wed, 17 Nov 2010 05:29:01 -0500 (EST)
  • References: <ibtl20$5lm$1@smc.vnet.net>

On Nov 16, 2:05 am, EliL <elan... at gmail.com> wrote:
> For
> stuff = {{0, 3290}, {0, 8576}, {0, 12081}, {4569828, 3336}, {4569828,
>     8581}, {4569828, 12109}, {9139656, 3468}, {9139656,
>    8600}, {9139656, 12193}, {13709484, 3671}, {13709484,
>    8637}, {13709484, 12328}, {18279312, 3924}, {18279312,
>    8698}, {18279312, 12513}, {22849141, 4205}, {22849141,
>    8791}, {22849141, 12741}, {22849141, 15220}, {27418969,
>    4494}, {27418969, 8925}, {27418969, 13009}, {27418969,
>    15637}, {31988797, 4774}, {31988797, 9106}, {31988797,
>    13312}, {31988797, 15995}, {36558625, 5032}, {36558625,
>    9342}, {36558625, 13646}, {36558625, 16320}, {41128453,
>    5259}, {41128453, 9633}, {41128453, 14008}, {45698281,
>    5453}, {45698281, 9979}, {45698281, 14394}, {50268109,
>    5612}, {50268109, 10377}, {50268109, 14802}, {54837937,
>    5742}, {54837937, 10819}, {54837937, 15230}, {59407765,
>    5846}, {59407765, 11298}, {59407765, 15675}, {63977593,
>    5929}, {63977593, 11809}, {63977593, 16135}, {68547422,
>    5995}, {68547422, 12345}, {73117250, 6048}, {73117250,
>    12902}, {77687078, 6091}, {77687078, 13475}, {82256906,
>    6125}, {82256906, 14062}, {86826734, 6153}, {86826734,
>    14660}, {91396562, 6176}, {91396562, 15268}, {95966390,
>    6195}, {95966390, 15884}, {100536218, 6210}, {105106046,
>    6223}, {109675875, 6233}}
>
> If you ListPlot it you'll see 4 distinct curves. I'd love to have
> Mathematica break them apart into four separate sets.  I've tried
> using FindClusters, but the default doesn't work. I've also tried
> using DistanceFunction -> (Norm[#1[[2]] - #2[[2]]]^2 &) to pull only
> the closeness in the y-axis. This successfully separates the bottom
> curve, but doesn't break the 3 upper curves up in the right way.
>
> Any other ideas for how to break this up? Or different Distance
> Functions to use?
> Thanks so much,
> Eli.

(* Center the data, then take the left singular vectors as new coordinates. *)
u = Transpose@First@SingularValues@N[#-Mean@stuff&/@stuff];
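
(* Note, not part of the original post: SingularValues was superseded by
SingularValueDecomposition in Mathematica 5.0. If SingularValues is not
available in your version, the sketch below should give an equivalent u.
The signs of its columns may differ from those above, in which case the
angle and cutpoints used later would need to be rechecked by eye. *)

u = SingularValueDecomposition[N[# - Mean[stuff] & /@ stuff]][[1, All, 1 ;; 2]];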

(* In the next cell, the angle was chosen by visual trial & error
to get the four curves separated vertically. The cutpoints were
then chosen by eye. *)

With[{t = 20 Degree}, ListPlot[u.{{Cos[t],Sin[t]},{-Sin[t],Cos[t]}},
PlotRange->All,Frame->True,Axes->None,AspectRatio->Automatic,
Epilog->(Line[{{-1,#},{1,#}}]&)/@{-.05,.11,.18}]]
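
(* A sketch of one way to do the trial and error interactively. It is not
part of the original post and needs Mathematica 6 or later. Drag the
slider until the four curves separate vertically, then read the cutpoints
off the frame. The 0 to 90 degree slider range is an assumption. *)

Manipulate[
 With[{r = t Degree},
  ListPlot[u.{{Cos[r], Sin[r]}, {-Sin[r], Cos[r]}},
   PlotRange -> All, Frame -> True, Axes -> False]],
 {{t, 20, "angle (degrees)"}, 0, 90}]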

group = Which[# < -.05, 1,
              # <  .11, 2,
              # <  .18, 3,
               True   , 4]& /@
(u.{Sin[20 Degree],Cos[20 Degree]})

{1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3,
 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1,
 2, 3, 1, 2, 3, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1}

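(* As a check, label each point with its group number. The explicit
AspectRatio keeps the plot readable in versions where Graphics
defaults to AspectRatio->Automatic. *)
Show[Graphics[MapThread[Text,{group,stuff}],AspectRatio->1/GoldenRatio]]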
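
(* To get the four separate sets the question asked for, the labels can
be used to split stuff. This is a sketch, not part of the original post,
and the name curves is only illustrative. *)

curves = Table[Pick[stuff, group, k], {k, 4}];
Length /@ curves   (* {25, 22, 15, 4} for the labels listed above *)
ListPlot[curves]   (* plots each set as its own series in Mathematica 6 or later *)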

