Re: Using Select
- To: mathgroup at smc.vnet.net
- Subject: [mg76264] Re: Using Select
- From: Bill Rowe <readnewsciv at sbcglobal.net>
- Date: Fri, 18 May 2007 06:28:41 -0400 (EDT)
On 5/17/07 at 5:54 AM, mark at markscoleman.com (Mark Coleman) wrote:

>I'm working on a small application and I'm searching for a way to
>Select a subset of rows from a list based upon a list of criteria
>chosen by the user. Here is small (hypothetical) example. Say one
>has a data set of consisting of a list of as follows:

>{ {1,A,Blue,"Hello",10.5},{7,D,Green,"Goodbye",9.4},
>{6,S,Yellow,"Hello",6.9},{3,A,Blue,"Hello",8.0}....}

>The user will specify a letter and a color and the program should
>Select[ ] the appropriate rows, e.g., if I pick Color=Blue and
>Letter = A, then it will return

>{ {1,A,Blue,"Hello",10.5},{3,A,Blue,"Hello",8.0}....}, etc.

I've created a function for my own use that does something
reasonably close to what you want, but it makes one assumption
about the first row of the data matrix: the first row is reserved
for a set of undefined symbols that serve as names for the fields
in each data record. Here, I treat every row after the first as a
data record and every column as a field.

I've used this with data files with 20-50 fields and on the order
of 100,000 records. As far as I know, the only limit on the size
of the data set that can be handled is available memory.

Code follows:

SelectRows::usage = "SelectRows[data,rows] returns the specified
rows retaining the first row. Rows can either be specified as a
list of row numbers to be returned or a list of {symbol,values}
pairs. If this syntax is used, the rows returned have the
specified values in the column headed by symbol. This form is
intended to select on more than one column and does a Boolean AND.
For example, SelectRows[data,{{a,2},{b,1}}] selects rows of data
where the value of column a is 2 AND the value of column b is 1.
SelectRows[data,column,values] returns a list of rows containing
values in column retaining the first row.";

SelectRows[data : {__List}, name_Symbol, runs : {__}] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]},
      Select[Rest[data], MemberQ[runs, #[[c]]] &]]]

SelectRows[data : {__List}, runs : {___Integer}] :=
  data[[Flatten[{1, runs}]]]

SelectRows[data : {__List}, name_Symbol, value_?NumericQ] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]},
      Select[Rest[data], (value == #[[c]]) &]]]

SelectRows[data : {__List}, name_Symbol, value_] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]},
      Select[Rest[data], (value === #[[c]]) &]]]

SelectRows[data : {__List}, items : {__List}] :=
  Fold[SelectRows[#1, Sequence @@ #2] &, data, items]

--
To reply via email subtract one hundred and four
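
As a usage sketch, the definitions can be applied to the sample rows from the question. Two things here are assumptions made for illustration and are not part of the original post: the header symbols (id, letter, color, greeting, score) are hypothetical names, and the letters and colors are entered as strings rather than symbols, because Blue, Green and Yellow are built-in color directives and D is a built-in function in Mathematica. The definitions are repeated, with the list-of-values form selecting from Rest[data], so that the block runs on its own in a fresh kernel.

```mathematica
(* SelectRows definitions, repeated so this block is self-contained *)
SelectRows[data : {__List}, name_Symbol, runs : {__}] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]}, Select[Rest[data], MemberQ[runs, #[[c]]] &]]]
SelectRows[data : {__List}, runs : {___Integer}] :=
  data[[Flatten[{1, runs}]]]
SelectRows[data : {__List}, name_Symbol, value_?NumericQ] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]}, Select[Rest[data], (value == #[[c]]) &]]]
SelectRows[data : {__List}, name_Symbol, value_] :=
  Module[{c = Position[First[data], name][[1, 1]]},
    Join[{First[data]}, Select[Rest[data], (value === #[[c]]) &]]]
SelectRows[data : {__List}, items : {__List}] :=
  Fold[SelectRows[#1, Sequence @@ #2] &, data, items]

(* Sample data: row 1 holds hypothetical, undefined field-name symbols *)
data = {
   {id, letter, color, greeting, score},
   {1, "A", "Blue", "Hello", 10.5},
   {7, "D", "Green", "Goodbye", 9.4},
   {6, "S", "Yellow", "Hello", 6.9},
   {3, "A", "Blue", "Hello", 8.0}};

(* Boolean AND over two columns: letter is "A" and color is "Blue".
   Returns the header row plus the records with id 1 and id 3. *)
byPairs = SelectRows[data, {{letter, "A"}, {color, "Blue"}}]

(* One column, several acceptable values: id is 1 or 7 *)
byIds = SelectRows[data, id, {1, 7}]

(* One column, one value: the record whose greeting is "Goodbye" *)
byValue = SelectRows[data, greeting, "Goodbye"]

(* By row number; the header counts as row 1, so rows 2 and 5
   are the first and last data records *)
byRows = SelectRows[data, {2, 5}]
```

Because every form returns the header row along with the selected records, the result of one call can be passed straight to another, which is what the Fold in the {symbol,values} form relies on.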