Re: Using Select
- To: mathgroup at smc.vnet.net
- Subject: [mg76264] Re: Using Select
- From: Bill Rowe <readnewsciv at sbcglobal.net>
- Date: Fri, 18 May 2007 06:28:41 -0400 (EDT)
On 5/17/07 at 5:54 AM, mark at markscoleman.com (Mark Coleman) wrote:
>I'm working on a small application and I'm searching for a way to
>Select a subset of rows from a list based upon a list of criteria
>chosen by the user. Here is small (hypothetical) example. Say one
>has a data set consisting of a list as follows:
>{ {1,A,Blue,"Hello",10.5},{7,D,Green,"Goodbye",9.4},
>{6,S,Yellow,"Hello",6.9},{3,A,Blue,"Hello",8.0}....}
>The user will specify a letter and a color and the program should
>Select[ ] the appropriate rows, e.g., if I pick Color=Blue and
>Letter = A, then it will return
>{ {1,A,Blue,"Hello",10.5},{3,A,Blue,"Hello",8.0}....}, etc.
I've created a function for my own use that does something
reasonably close to what you want, but it makes one assumption
about the first row of the data matrix: the first row is reserved
for a set of undefined symbols that serve as names for the fields
in each data record. Here, every row after the first is a data
record, and each column is a field.
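As a rough sketch of that layout, here is a made-up data set and a plain Select that looks up a column by its header symbol. The field names (id, letter, color, greeting, score) are hypothetical, and the letter and color values are written as strings rather than symbols so that nothing evaluates (Blue, for instance, is a built-in color).

```mathematica
(* Hypothetical data: the first row holds undefined symbols naming the fields *)
data = {
  {id, letter, color, greeting, score},
  {1, "A", "Blue", "Hello", 10.5},
  {7, "D", "Green", "Goodbye", 9.4},
  {6, "S", "Yellow", "Hello", 6.9},
  {3, "A", "Blue", "Hello", 8.0}};

(* Find a field's column index by locating its symbol in the header row *)
col = Position[First[data], color][[1, 1]]

(* Plain Select over the data records only, skipping the header row *)
Select[Rest[data], #[[col]] === "Blue" &]
```

With this data the Select keeps the records whose id is 1 and 3.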
I've used this with data files with 20-50 fields and on the
order of 100,000 records. As far as I know, the only limit on
the size of the data set is available memory.
Code follows:
SelectRows::usage = "SelectRows[data,rows] returns the specified
rows retaining the first row. Rows can either be specified as a
list of row numbers to be returned or a list of {symbol,values}
pairs. If this syntax is used, the rows returned have the
specified values in the column headed by symbol. This form is
intended to select on more than one column and does a Boolean
AND. For example, SelectRows[data,{{a,2},{b,1}}] selects rows of
data where the value of column a is 2 AND the value of column b
is 1.
SelectRows[data,column,values] returns a list of rows containing
values in column, retaining the first row.";
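(* Rows whose value in column name is any member of the list runs. *)
SelectRows[data : {__List}, name_Symbol, runs : {__}] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]}, Select[Rest[data], MemberQ[runs, #[[c]]] &]]]

(* Rows given by position; the header row is prepended. *)
SelectRows[data : {__List}, runs : {___Integer}] :=
  data[[Flatten[{1, runs}]]]

(* Numeric value: compare with Equal. *)
SelectRows[data : {__List}, name_Symbol, value_?NumericQ] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]}, Select[Rest[data], (value == #[[c]]) &]]]

(* Any other value: compare with SameQ. *)
SelectRows[data : {__List}, name_Symbol, value_] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]}, Select[Rest[data], (value === #[[c]]) &]]]

(* List of {symbol, values} pairs: apply each in turn (Boolean AND). *)
SelectRows[data : {__List}, items : {__List}] :=
  Fold[SelectRows[#1, Sequence @@ #2] &, data, items]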
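For your example, the calls might look something like the sketch below. It assumes the SelectRows definitions above have already been evaluated. The header symbols (id, letter, color, greeting, score) are names I made up for illustration, and the letters and colors are entered as strings so that built-in symbols such as Blue do not evaluate.

```mathematica
(* Hypothetical data with a header row of undefined symbols *)
data = {
  {id, letter, color, greeting, score},
  {1, "A", "Blue", "Hello", 10.5},
  {7, "D", "Green", "Goodbye", 9.4},
  {6, "S", "Yellow", "Hello", 6.9},
  {3, "A", "Blue", "Hello", 8.0}};

(* letter is "A" AND color is "Blue": header plus the records with id 1 and 3 *)
SelectRows[data, {{letter, "A"}, {color, "Blue"}}]

(* color is any of several values: header plus the records with id 1, 7 and 3 *)
SelectRows[data, color, {"Blue", "Green"}]

(* by row number, counting the header as row 1: header plus rows 2 and 5 *)
SelectRows[data, {2, 5}]
```

Note that row numbers in the last form count the header as row 1, so the first data record is row 2.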
--
To reply via email subtract one hundred and four