How to sort and extract from matrix based on separate key-pair
- To: mathgroup at smc.vnet.net
- Subject: [mg122140] How to sort and extract from matrix based on separate key-pair
- From: Tyler <hayes.tyler at gmail.com>
- Date: Sun, 16 Oct 2011 07:06:58 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
Hi Everyone:

I'm looking for some thoughts on the best way to approach a problem of selecting and sorting/grouping data based on some key variables. Let me explain.

The easiest analogy would be this. I have a list of two columns, {varnames, percentweight}, and in another matrix, "mydata", I have varnames on the top row, and below each varname is a series of observations.

My typical use case will be to:

[1] Sort the columns of mydata based on percent weight.

[2] Extract data based on that sorting (e.g., the observations of all varnames whose weights, taken from smallest to highest, reach 50%).

[3] Or sort, highest to lowest, and extract multiple submatrices from mydata based on a desired percentage and mixture method. For example, extract two matrices, each with a sum of percent weights closest to 30%, where the first matrix gets the top weight (rank 1), the second matrix gets rank 2, and so on. I'm not sure how to deal with the ending values, but you get the idea.

I could play with the cutoff algorithm in [3], but the real question is: what's the best way to store and extract these? The matrices would subsequently be used for time series analysis, and the variable names are important to the subsequent interpretation.

I've thought about prepending a row on top of mydata that contains the percentweight values, and then working on that vector to manipulate the positions I extract. This just seems like a lot of index work and really inefficient.

Any thoughts or ideas on how best to do this?

Cheers,
t.
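
For concreteness, use cases [1] and [2] can be sketched as follows. This is only an illustration under assumptions, not a recommended answer: it is written in Python rather than Mathematica, the variable names "a" to "d", their weights, and the observations are made up, and the helper names sorted_names, take_until, and extract are hypothetical. The point is that keeping the data keyed by variable name avoids positional index work. In Mathematica, the same steps would presumably map onto Ordering, Accumulate, and Part.

```python
# Hypothetical example data: names, weights (in percent), and observations
# are invented for illustration only.
weights = {"a": 10, "b": 40, "c": 20, "d": 30}

# mydata stored column-wise and keyed by variable name, so each name
# travels with its observations.
mydata = {
    "a": [1.0, 1.1, 1.2],
    "b": [2.0, 2.1, 2.2],
    "c": [3.0, 3.1, 3.2],
    "d": [4.0, 4.1, 4.2],
}


def sorted_names(weights, descending=False):
    """Use case [1]: variable names ordered by percent weight."""
    return sorted(weights, key=weights.get, reverse=descending)


def take_until(weights, target, descending=False):
    """Use case [2]: names in weight order until the running total reaches target."""
    chosen, total = [], 0
    for name in sorted_names(weights, descending):
        if total >= target:
            break
        chosen.append(name)
        total += weights[name]
    return chosen


def extract(mydata, names):
    """Pull the selected columns, preserving the requested order."""
    return {name: mydata[name] for name in names}


# Smallest weights first, until the cumulative weight reaches 50%.
low_half = take_until(weights, 50)
sub = extract(mydata, low_half)
print(low_half)
```

Use case [3] would reuse sorted_names with descending=True and apply whatever cutoff rule is chosen to the ordered names; that rule is left open here, as in the question.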