Re: Dynamic evaluation of layered networks
- To: mathgroup at smc.vnet.net
- Subject: [mg109424] Re: Dynamic evaluation of layered networks
- From: David Bailey <dave at removedbailey.co.uk>
- Date: Mon, 26 Apr 2010 07:32:57 -0400 (EDT)
- References: <hqmok8$4ch$1@smc.vnet.net>
OmniaOpera wrote:
> I want to implement a multilayer feedforward network in such a way that
> changing value(s) in the input layer automatically causes reevaluation of
> only those parts of the network that are involved in the display of a
> result, and it seems to me that Dynamic does what I want.
>
> A simple 3-layer example would be:
>
> f1[x_] = x^2;
> f2[x_] := f1[x] + 1;
> f3a[x_] := f2[x] + 1;
> f3b[x_] := f2[x] - 1;
>
> Dynamic[{f3a[2], f3b[2]}]
>
> Any subsequent change to the f1[x_] = definition in the input layer
> automatically causes the above Dynamic to reevaluate.
>
> That's fine, except that this causes f2[x] to be evaluated twice, once for
> f3a[x] and once for f3b[x], which would be inefficient when generalised to
> very large layered networks in which the f functions are costly to evaluate.
> Unfortunately, the f2[x]:=f2[x]=... trick for memoising previously
> evaluated results doesn't help us here, because it prevents the Dynamic from
> being sensitive to changes in the f1[x_] = definition in the input layer.
>
> There are messy ways of programming around this problem (e.g. using the
> memoisation trick, but modified so that you forget memoised results that are
> "out of date"), but is there any solution that finesses the problem by
> cleverly using Mathematica's evaluation engine?
>
> OO
>
>
I imagine that you want the layers of your network to be very large
(otherwise you would not be interested in efficiency issues), so I would
not use the memoisation trick: it will probably use too much memory.
I'd represent each layer as an array:
layerSize = 1000000;
f1 = RandomReal[1., layerSize]; (* placeholder values for each layer *)
f2 = RandomReal[1., layerSize];
f3 = RandomReal[1., layerSize];
With that setup, the k-th element of layer 2 (say) would be f2[[k]].
Layer 2 is derived from layer 1 using a matrix of weights that is in
principle of size layerSize x layerSize, but which could be represented
as a sparse array, M12. Layer 2 would then be computed from layer 1 using
f2 = Dot[M12, f1];
followed by some sort of nonlinear thresholding of f2, as required.
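For concreteness, here is a minimal sketch of that step, assuming a
random sparse connectivity and a logistic threshold (nConnections and
threshold are illustrative names of mine, so adjust to taste):

nConnections = 5000; (* assumed; scale with layerSize in practice *)
M12 = SparseArray[
   RandomInteger[{1, layerSize}, {nConnections, 2}] ->
    RandomReal[{-1, 1}, nConnections], {layerSize, layerSize}];
threshold[v_] := 1/(1 + Exp[-v]); (* one possible nonlinearity *)
f2 = threshold[Dot[M12, f1]];

(SparseArray keeps the first value when a random position happens to be
duplicated, which is harmless in a sketch like this.)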
I think I'd start with that implementation, even though you would need
to recalculate every element after a change, because you definitely want
to benefit from the raw speed of Mathematica's built-in linear algebra.
Any other scheme for partial updates only makes sense if it turns out to
be faster than just recomputing the lot with this basic approach! A
sketch of such a full forward pass follows.
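As a rough sketch of the full-recomputation engine (M23 is an assumed
second weight matrix, built in the same way as M12 above):

forward[input_] := threshold[Dot[M23, threshold[Dot[M12, input]]]]

Every call recomputes both layers from the input, but each step is a
single fast matrix-vector operation.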
As others have said, Dynamic[] is not the way to go. I'd forget about
any visual display until you have the basic engine running. If you do
decide to try selective recalculation, it is obviously important to make
sure it gives the same answers. I'd crank down layerSize to 10 (say) and
use arrays of fixed numbers for the testing phase. That way, you can
compare partial updating with total recalculation and make sure the
result is the same!
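A test harness might look something like this, where partialUpdate
stands for whatever selective scheme you come up with (it is purely
hypothetical here):

layerSize = 10;
M12 = RandomReal[{-1, 1}, {layerSize, layerSize}];
M23 = RandomReal[{-1, 1}, {layerSize, layerSize}];
input = RandomReal[1., layerSize];
reference = forward[input]; (* total recalculation *)
updated = partialUpdate[input]; (* your hypothetical selective scheme *)
Max[Abs[reference - updated]] < 10^-10 (* allow for rounding noise *)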
One other tip: although small layers are useful for testing, you need to
check your code with full-sized arrays as early as possible in order to
get a realistic idea of how expensive your code will be. Timing very
short calculations produces meaningless results.
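For example, assuming you have rebuilt M12 and f1 at the full layerSize,
something like this gives a stable figure:

First[AbsoluteTiming[Do[f2 = threshold[Dot[M12, f1]], {10}]]]/10

Averaging over ten repeats irons out the noise you get from timing a
single short run.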
Sorry I haven't answered your question exactly, but you are starting on
a fair-sized project.
BTW, a Google search turns up some existing neural network packages, so
you might want to start with one of those.
David Bailey
http://www.dbaileyconsultancy.co.uk