MathGroup Archive 2008

Re: Intersection of lists of lists based on the first term

  • To: mathgroup at smc.vnet.net
  • Subject: [mg87492] Re: Intersection of lists of lists based on the first term
  • From: Szabolcs Horvát <szhorvat at gmail.com>
  • Date: Fri, 11 Apr 2008 05:56:52 -0400 (EDT)
  • Organization: University of Bergen
  • References: <ftmu64$50i$1@smc.vnet.net>

Stern wrote:
> There is something I've been doing inefficiently that I think might be done
> much better, and was hoping folks here could help.
> 
> I am working with time series data in the form
> seriesA={{timestamp1A,data1A},{timestamp2A,data2A}....{timestampNA,dataNA}};
> seriesB={{timestamp1B,data1B},{timestamp2B,data2B}....{timestampNB,dataNB}};
> etc.
> where many of the timestamps will be in common between the series, but
> not all.
> 
> I frequently need to work with only those data which are available in all
> the time series, and I have written a function that does this, in the
> simplest case, by first working out the timestamps that all the series share
> in common
> 
> timestampsincommon=Intersection @@ Map[Transpose[#][[1]] &,
> {seriesA,seriesB....seriesN}];

Here's a slightly simpler way to do this:

timeStampsInCommon =
  Intersection @@ ({seriesA, ..., seriesN}[[All, All, 1]])
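
As a small illustration, here is the same idea on two made-up series (the
sample values are hypothetical, not from the original post). The Part
specification [[All, All, 1]] takes the first element of every pair in
every series, and the series do not need to have the same length:

```mathematica
(* Hypothetical sample data: lists of {timestamp, value} pairs *)
seriesA = {{1, 10.}, {2, 20.}, {3, 30.}, {5, 50.}};
seriesB = {{2, 0.2}, {3, 0.3}, {4, 0.4}, {5, 0.5}, {6, 0.6}};

(* First element of each pair, for each series *)
timeStamps = {seriesA, seriesB}[[All, All, 1]]
(* {{1, 2, 3, 5}, {2, 3, 4, 5, 6}} *)

(* Timestamps present in every series *)
timeStampsInCommon = Intersection @@ timeStamps
(* {2, 3, 5} *)
```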

> 
> then running Select based on MemberQ
> 
> Map[Select[#, MemberQ[timestampsincommon, #[[1]]] &] &,
> {seriesA,seriesB....seriesN}];
> 
> This works, but the completion time increases roughly linearly with the
> number of datapoints involved, and is slow when the number of points gets
> large, and it seems inelegant. I was hoping somebody on the list might
> suggest a better way.

This is actually worse than linear time: the running time is proportional
to (total number of datapoints) * (number of common timestamps), because
each MemberQ call scans timeStampsInCommon.  We can make the membership
test faster than proportional to Length[timeStampsInCommon] by storing
each common timestamp as a definition of a lookup function:

(isCommon[#] = True) & /@ timeStampsInCommon

Select[seriesA, isCommon@First[#] &]
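
Put together, a minimal sketch of the whole procedure might look like the
following.  The sample data and the names allSeries and commonData are
made up for illustration.  The default rule isCommon[_] = False is my
addition: Select only keeps elements for which the test gives exactly
True, so the code also works without it, but an explicit default keeps
the test from returning unevaluated isCommon[...] expressions.

```mathematica
(* Hypothetical sample data: lists of {timestamp, value} pairs *)
seriesA = {{1, 10.}, {2, 20.}, {3, 30.}, {5, 50.}};
seriesB = {{2, 0.2}, {3, 0.3}, {4, 0.4}, {5, 0.5}, {6, 0.6}};
allSeries = {seriesA, seriesB};

(* Timestamps present in every series *)
timeStampsInCommon = Intersection @@ allSeries[[All, All, 1]];

(* Build the lookup: one definition per common timestamp *)
ClearAll[isCommon];
isCommon[_] = False;  (* assumed default for all other timestamps *)
Scan[(isCommon[#] = True) &, timeStampsInCommon];

(* Keep only the datapoints whose timestamp is common to all series *)
commonData =
  Function[series, Select[series, isCommon[First[#]] &]] /@ allSeries
(* {{{2, 20.}, {3, 30.}, {5, 50.}}, {{2, 0.2}, {3, 0.3}, {5, 0.5}}} *)
```

Scan is used instead of /@ only because the list of True values that Map
would build is not needed.  Remember to call ClearAll[isCommon] before
reusing the lookup with a different set of series, otherwise old
definitions remain in effect.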


Szabolcs

