Re: Speed: Inner MUCH slower than dot product??
- To: mathgroup at smc.vnet.net
- Subject: [mg65546] Re: [mg65523] Speed: Inner MUCH slower than dot product??
- From: Sseziwa Mukasa <mukasa at jeol.com>
- Date: Fri, 7 Apr 2006 06:14:22 -0400 (EDT)
- References: <200604061052.GAA19474@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
On Apr 6, 2006, at 6:52 AM, Lee Newman wrote:

> Dear Group,
>
> In the process of performance tweaking some code, I came upon the
> following result: using Inner is MUCH slower (orders of magnitude) than
> using Dot product (see code below). Why is this?

This is a question best answered by a Wolfram employee, but I'd guess it's because Inner is more general than Dot.

Dot can be implemented by computing an appropriate stride length and then running across the lists with multiply and accumulate (MAC) instructions. If your array can be packed, as in this case, there is probably a single hardware instruction to do the MAC on the appropriate types.

Inner, on the other hand, must first thread its first function (the one playing the role of Times) over the lists and explicitly store the result. That is probably the most expensive step, since it requires allocating memory. It must then apply its second function (the one playing the role of Plus) and allocate memory again to store that result.

Regards,

Ssezi
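The difference guessed at above can be sketched in a few lines. This is an illustration in Python of the two strategies, not a description of how Mathematica actually implements Dot or Inner; the names dot_mac and inner_general are hypothetical, and the sketch handles flat vectors only. In Mathematica terms, the two calls correspond roughly to a . b and Inner[Times, a, b, Plus].

```python
from functools import reduce
import operator


def dot_mac(a, b):
    """Specialized path: one pass of multiply-accumulate.

    No intermediate list is built; only a running sum is kept.
    """
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return acc


def inner_general(f, a, b, g):
    """Generalized path: thread f over the pairs, store the result, then combine with g.

    f and g are arbitrary two-argument functions, so the intermediate
    list has to be materialized before g can be applied to it.
    """
    threaded = [f(x, y) for x, y in zip(a, b)]  # explicit intermediate storage
    return reduce(g, threaded)


a = [1, 2, 3]
b = [4, 5, 6]

print(dot_mac(a, b))
print(inner_general(operator.mul, a, b, operator.add))
```

Both calls give the same value for these inputs, but the general version cannot assume anything about f and g, so it pays for a function call per element and for the intermediate list. That is the kind of overhead the guess above points to.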
- References:
- Speed: Inner MUCH slower than dot product??
- From: Lee Newman <leenewm@umich.edu>