Re: How to select unique elements in a list?
- To: mathgroup at smc.vnet.net
- Subject: [mg8307] Re: How to select unique elements in a list?
- From: Robert Villegas <villegas>
- Date: Sun, 24 Aug 1997 04:46:52 -0400
- Organization: Wolfram Research
- Sender: owner-wri-mathgroup at wolfram.com
About my order-restoring function that I used on the singletons:
RestoreOrder[subset_, list_] :=
Last /@
Sort[ {Position[list, #, {1}, 1][[1, 1]], #}& /@
subset
]
I just remembered that Todd Gayley and I realized a far faster indexing
method for a somewhat similar problem several months ago: build a rule
table mapping each element to its position. Here is the new, faster
RestoreOrder:
RestoreOrder[subset_, list_] :=
With[{elemIndex = Dispatch @ Thread[list -> Range @ Length[list]]},
Last /@
Sort[{Replace[#, elemIndex], #}& /@ subset]
]
Here's a timing comparison on a 15,000-element list with lots of singletons:
In[10]:=
bigList = Table[Random[Integer, {1, 10*^3}], {15*^3}];
In[11]:=
Timing[oldResult = Singletons[bigList];]
Out[11]=
{50.93 Second,Null}
In[12]:=
Timing[newResult = SingletonsNew[bigList];]
Out[12]=
{0.96 Second,Null}
In[13]:=
oldResult === newResult
Out[13]=
True
In[14]:=
Length[oldResult]
Out[14]=
3230
What you sacrifice with the big hash table, of course, is memory.
MaxMemoryUsed[] increases by a couple megabytes or more when this
is run on a list of 10,000 integers. MemoryInUse[] doesn't change
that much, which means the memory used by the temporary hash table
is available for future Mathematica use, but not for other
applications.
If you're going to re-order many times with respect to the same master
list, you might want RestoreOrder to remember the hash table so it
doesn't have to regenerate it each time. Generating a hash table is
pretty fast, so don't bother unless you need it quite a few times or
you're on a slow machine. Perhaps something like the following:
ClearAll[RestoreOrder]
Options[RestoreOrder] = {Remember -> False};
RestoreOrder[subset_, list_, opts___] :=
With[{record = $RestoreOrderRecord[list]},
With[{elemIndex =
Replace[record,
_$RestoreOrderRecord :>
Dispatch @ Thread[list -> Range @ Length[list]]
]
},
If[(Remember /. Flatten[{opts, Options[RestoreOrder]}]) &&
Head[record] === $RestoreOrderRecord,
$RestoreOrderRecord[list] = elemIndex
];
Last /@
Sort[{Replace[#, elemIndex], #}& /@ subset]
]
]
Now use the option Remember -> True for individual cases or do
SetOptions[RestoreOrder, Remember -> True] to make it remember all
future hash tables (then you'll be sacrificing MemoryInUse[] as well
as MaxMemoryUsed[]).
Maybe there's a more elegant way to implement the remembering mechanism,
but this works.
Robby Villegas