MathGroup Archive: February 2012 [00333]

[Date Index] [Thread Index] [Author Index]

How best to implement a hash table in Mathematica

To: mathgroup at smc.vnet.net
Subject: [mg125019] How best to implement a hash table in Mathematica
From: Joseph Gwinn <joegwinn at comcast.net>
Date: Sat, 18 Feb 2012 06:23:53 -0500 (EST)
Delivered-to: l-mathgroup@mail-archive0.wolfram.com

I have an application that resembles dictionary lookup, with significant 
rates of collision.  

The current approach is to initialize the hash table, a 10^4 element 
list of empty sublists, and then build the table by inserting document 
object id numbers into the correct sublists as indexed by some important 
property of the object.  The choice of property varies with application.  
The property may be quantized geographic location, or some attribute of 
the object itself.

When done building the hash table, query consists of reading the sublist 
using an index derived in the same way as for the original building of 
the list.  Each sublist may have many entries, and the entries need not 
be numbers.

Anyway, I have a workable approach using 

HashList[[index]]=
Union[Flatten[Join[HashList[[index]],{new object ID}]]]]

as the core update function, but I'm wondering how well this will scale 
to say 10^6 sublists (which may be needed to keep the collisions under 
control with large datasets), and of course if there is a better way in 
Mathematica.

I did look at SparseArrays, but they don't support elements that are 
sublists.

Although this hash table is one dimensional, I have uses for 2D and 
maybe 3D tables, which are very common when one is hashing physical 
location.

Thanks in advance,

Joe Gwinn

Prev by Date: Manipulate

Next by Date: How to change individual colors of an image

Previous by thread: Re: Function Exp[x^2]*Erfc[x]

Next by thread: Re: How best to implement a hash table in Mathematica