Speeding up hierarchical sorting
- To: mathgroup at smc.vnet.net
- Subject: [mg34455] Speeding up hierarchical sorting
- From: "Johannes Ludsteck" <johannes.ludsteck at wiwi.uni-regensburg.de>
- Date: Wed, 22 May 2002 02:46:23 -0400 (EDT)
- Organization: Universitaet Regensburg
- Sender: owner-wri-mathgroup at wolfram.com
Dear MathGroup Members, I would like to sort a list of lists (of integers) hierarchically. For example assume we have a list of records with the sex (encoded as 0 = male, 1 = female), number of childs and age of persons: {{0,0,15},{1,2,30},{0,0,10},{0,1,19},{1,3,65},...}. The task is to sort the dataset with respect to sex and then to sort each subset of people with equal sex with respect to the number of childs and so on... I have programmed a simple (recursive) function doing this. HSort[m,si] takes a matrix m containing the records and a list si containing the indices specifying the hierarchical structure of the sorting procedure, i.e. if si = {1,3,2} then m is first sorted with respect to the first elements, then with respect to the third element and finally with respect to the second element of the records. Here is the implementation. I have two problems with it: first it is rather slow and second, Mathematica refuses to compile it. HSort[m_List,si_?VectorQ]:= Block[{mi,f}, If[si==={},Return[m],f=First[si]]; mi=Sort[m,(#1[[f]]<#2[[f]])&]; Flatten[HSort[#,Rest[si]]&/@ Split[mi, (#1[[f]]==#2[[f]])&], 1]]; The strategy is to sort the matrix with respect to a certain element in si, split the matrix into subsets with ties and to call HSort recursively in order to sort each subset. application example: t= Table[Random[Integer,{0,4}],{100},{3}]; HSort[t,{1,3,2}]//MatrixForm As regards my first problem, I am wondering whether there are potential efficiency gains or not. As regards the second, I don't understand the reason why Mathematica refuses to compile the stuff. Mathematica's warning message is Compile::cret: The type of return values in <<1>> are different. Evaluation will use the uncompiled function. If I am right, the warning message means that A) Return[m] and the returned value of B) Flatten[...] are not the same. But (A) and (B) have exactly the same structure: They are both regular matrices (arrays of dimension 2) with integer elements. So why is Mathematica not willing to cooperate? Thanks for any suggestions, Johannes Ludsteck