How to sort and extract from matrix based on separate key-pair
- To: mathgroup at smc.vnet.net
- Subject: [mg122140] How to sort and extract from matrix based on separate key-pair
- From: Tyler <hayes.tyler at gmail.com>
- Date: Sun, 16 Oct 2011 07:06:58 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
Hi Everyone:
I'm looking for some thoughts on the best way to approach a problem of
selecting and sorting/grouping data based on some key variables. Let
me explain. The easiest analogy would be this.
I have a list of two columns: {varnames, percentweights}
and in another matrix, "mydata", I have varnames on the top row, and
below each varname is a series of observations. My typical use case
will be to:
[1] Sort columns of mydata based on percent weight
[2] Extract data based on that sorting (e.g., all varnames' observations
whose weights, from smallest to largest, cumulatively reach 50%)
[3] Or, sort, highest to lowest, and extract multiple submatrices from
mydata based on a desired percentage and mixture method. For
example, extract two matrices, each with a sum of percent weights as
close as possible to 30%, where the first matrix gets the top-weighted
variable, the second matrix gets the second, and so on. I'm not sure how
to deal with the ending values, but you get the idea.
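To make use cases [1] and [2] concrete, here is a rough sketch. It is written in Python rather than Mathematica, purely as an illustration. The data, the function names sort_columns and take_until, and the choice to include the column that crosses the cutoff are all assumptions, not part of the original problem statement.

```python
# Hypothetical example data: names, weights and observations are made up.
weights = [("a", 40.0), ("b", 10.0), ("c", 30.0), ("d", 20.0)]
mydata = [
    ["a", "b", "c", "d"],   # top row: varnames
    [1.0, 2.0, 3.0, 4.0],   # observations
    [5.0, 6.0, 7.0, 8.0],
]

def sort_columns(data, weights, descending=False):
    """Return a copy of data with its columns reordered by percent weight."""
    header = data[0]
    w = dict(weights)
    # Column indices ordered by the weight of the variable in that column.
    order = sorted(range(len(header)), key=lambda j: w[header[j]],
                   reverse=descending)
    return [[row[j] for j in order] for row in data]

def take_until(data, weights, cutoff):
    """Keep columns, smallest weight first, until the running total of
    weights reaches cutoff (the column that crosses it is included)."""
    w = dict(weights)
    ordered = sort_columns(data, weights)
    total, keep = 0.0, 0
    for name in ordered[0]:
        if total >= cutoff:
            break
        total += w[name]
        keep += 1
    return [row[:keep] for row in ordered]

print(sort_columns(mydata, weights)[0])   # ['b', 'd', 'c', 'a']
print(take_until(mydata, weights, 50.0))
```

The header row travels with the data through both steps, so the variable names stay attached to whatever submatrix comes out.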
I could play with the cutoff algorithm in [3], but the real question
is, what's the best way to store/extract these?
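For the cutoff in [3], one possible reading is a round-robin deal: rank the variables from highest to lowest weight, offer them to the groups in turn, and let a group accept a variable only if that moves its total closer to the target. The sketch below (Python, for illustration only) assumes that reading. The function name deal_round_robin, the acceptance rule, and the sample weights are hypothetical, and variables that no group accepts are simply left out.

```python
# Hypothetical weights (they sum to 100); names are made up for illustration.
weights = [("a", 20.0), ("b", 18.0), ("c", 15.0), ("d", 12.0),
           ("e", 10.0), ("f", 9.0), ("g", 8.0), ("h", 8.0)]

def deal_round_robin(weights, n_groups, target):
    """Deal variables, highest weight first, to n_groups in turn.

    The variable at rank r is offered to group r % n_groups and accepted
    only if it brings that group's total closer to target.
    """
    ranked = sorted(weights, key=lambda pair: pair[1], reverse=True)
    groups = [[] for _ in range(n_groups)]
    totals = [0.0] * n_groups
    for rank, (name, w) in enumerate(ranked):
        i = rank % n_groups
        if abs(totals[i] + w - target) < abs(totals[i] - target):
            groups[i].append(name)
            totals[i] += w
    return groups, totals

groups, totals = deal_round_robin(weights, n_groups=2, target=30.0)
print(groups)   # [['a', 'c'], ['b', 'd']]
print(totals)   # [35.0, 30.0]
```

The resulting lists of names can then be used to pull the matching columns out of mydata. Other ways of handling the leftover variables are equally plausible.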
The matrices would subsequently be used for time series analysis, but
the variable names are important to the subsequent interpretation.
I've thought about pre-pending a row on top of mydata that contains
the percentweights values, and then working on that vector to
manipulate the positions I extract. This just seems like a lot of
index work and really inefficient.
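As a point of comparison with the prepended-row idea, one alternative is to key everything by variable name so that each weight and its observations stay together and no positional bookkeeping is needed. The sketch below is in Python only for illustration. The layout of table and the helpers by_weight and submatrix are assumptions, and whether this is actually more efficient than index work would need to be measured.

```python
# Hypothetical data: each variable keeps its weight and observations together.
table = {
    "a": {"weight": 40.0, "obs": [1.0, 5.0]},
    "b": {"weight": 10.0, "obs": [2.0, 6.0]},
    "c": {"weight": 30.0, "obs": [3.0, 7.0]},
}

def by_weight(table, descending=True):
    """Variable names ordered by percent weight."""
    return sorted(table, key=lambda name: table[name]["weight"],
                  reverse=descending)

def submatrix(table, names):
    """Rows of observations for the chosen variables, in the given order."""
    return [list(row) for row in zip(*(table[n]["obs"] for n in names))]

top_two = by_weight(table)[:2]
print(top_two)                     # ['a', 'c']
print(submatrix(table, top_two))   # [[1.0, 3.0], [5.0, 7.0]]
```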
Any thoughts or ideas how best to do this?
Cheers,
t.