Re: Best method to break apart a data set
*To*: mathgroup at smc.vnet.net
*Subject*: [mg113890] Re: Best method to break apart a data set
*From*: Patrick Scheibe <pscheibe at trm.uni-leipzig.de>
*Date*: Wed, 17 Nov 2010 05:29:24 -0500 (EST)
Hi,
your data has a common *linear* trend, so it may be better to
cluster on some kind of distance from that linear function. The middle
data curve runs approximately from {0,8000} to {10^8,16000}, which gives
the linear function
y = 8000 + 8000/10^8 x
Using the slope of this line (the constant offset is left out) you
could try something like
f[{x_, y_}] := (8000/10^8*x - y)^2;
ListPlot[FindClusters[stuff, 4,
   DistanceFunction -> (EuclideanDistance[f[#1], f[#2]] &),
   Method -> "Agglomerate"],
  PlotStyle -> {PointSize[0.03]}]
Some local method that iteratively picks out neighboring elements
of a curve until no further element is reachable may be successful too.
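A minimal sketch of such a local method follows. It is only one possible
reading of the idea, not a tested recipe: the names reachableQ and
splitCurves and the thresholds maxDx and maxDy are hypothetical, and the
threshold values are guesses read off the posted data by eye. Two points
count as neighbors when they are at most one x step apart and their y
values are close; a curve is grown from a seed point until no remaining
point is reachable, then the next curve is seeded.

```mathematica
(* Assumed thresholds, chosen by eye for the posted data *)
maxDx = 5*10^6; (* a bit more than the x spacing of about 4.57*10^6 *)
maxDy = 1000;   (* above the y step along one curve, below the gap between curves *)

(* Two points are neighbors if both coordinate differences are small *)
reachableQ[{x1_, y1_}, {x2_, y2_}] :=
  Abs[x1 - x2] <= maxDx && Abs[y1 - y2] <= maxDy;

splitCurves[points_] :=
  Module[{rest = points, curves = {}, curve, frontier, new},
   While[rest =!= {},
    (* seed a new curve with the first unassigned point *)
    frontier = {First[rest]};
    curve = frontier;
    rest = Rest[rest];
    (* absorb every point reachable from the newest members *)
    While[frontier =!= {},
     new = Select[rest,
       Function[p, Or @@ (reachableQ[#, p] & /@ frontier)]];
     rest = Complement[rest, new];
     curve = Join[curve, new];
     frontier = new];
    AppendTo[curves, Sort[curve]]];
   curves]
```

With stuff defined as in the quoted message, ListPlot[splitCurves[stuff]]
plots each group as its own series. The result depends entirely on the two
thresholds, so they would need adjusting for data with a different spacing.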
Cheers
Patrick
On Tue, 2010-11-16 at 05:05 -0500, EliL wrote:
> For
> stuff = {{0, 3290}, {0, 8576}, {0, 12081}, {4569828, 3336}, {4569828,
> 8581}, {4569828, 12109}, {9139656, 3468}, {9139656,
> 8600}, {9139656, 12193}, {13709484, 3671}, {13709484,
> 8637}, {13709484, 12328}, {18279312, 3924}, {18279312,
> 8698}, {18279312, 12513}, {22849141, 4205}, {22849141,
> 8791}, {22849141, 12741}, {22849141, 15220}, {27418969,
> 4494}, {27418969, 8925}, {27418969, 13009}, {27418969,
> 15637}, {31988797, 4774}, {31988797, 9106}, {31988797,
> 13312}, {31988797, 15995}, {36558625, 5032}, {36558625,
> 9342}, {36558625, 13646}, {36558625, 16320}, {41128453,
> 5259}, {41128453, 9633}, {41128453, 14008}, {45698281,
> 5453}, {45698281, 9979}, {45698281, 14394}, {50268109,
> 5612}, {50268109, 10377}, {50268109, 14802}, {54837937,
> 5742}, {54837937, 10819}, {54837937, 15230}, {59407765,
> 5846}, {59407765, 11298}, {59407765, 15675}, {63977593,
> 5929}, {63977593, 11809}, {63977593, 16135}, {68547422,
> 5995}, {68547422, 12345}, {73117250, 6048}, {73117250,
> 12902}, {77687078, 6091}, {77687078, 13475}, {82256906,
> 6125}, {82256906, 14062}, {86826734, 6153}, {86826734,
> 14660}, {91396562, 6176}, {91396562, 15268}, {95966390,
> 6195}, {95966390, 15884}, {100536218, 6210}, {105106046,
> 6223}, {109675875, 6233}}
>
> If you ListPlot it you'll see 4 distinct curves. I'd love to have
> Mathematica break them apart into four separate sets. I've tried
> using FindClusters, but the default doesn't work. I've also tried
> using DistanceFunction -> (Norm[#1[[2]] - #2[[2]]]^2 &) to pull only
> the closeness in the y-axis. This successfully separates the bottom
> curve, but doesn't break the 3 upper curves up in the right way.
>
> Any other ideas for how to break this up? Or different Distance
> Functions to use?
> Thanks so much,
> Eli.
>