MathGroup Archive 2005

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: Re: How to quickly find number of non-zero elements in sparse matrix rows?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg56916] Re: [mg56894] Re: [mg56846] Re: How to quickly find number of non-zero elements in sparse matrix rows?
  • From: Chris Chiasson <chris.chiasson at gmail.com>
  • Date: Tue, 10 May 2005 03:42:18 -0400 (EDT)
  • References: <d5hum0$k03$1@smc.vnet.net> <200505071935.PAA26945@smc.vnet.net> <200505090546.BAA13870@smc.vnet.net>
  • Reply-to: Chris Chiasson <chris.chiasson at gmail.com>
  • Sender: owner-wri-mathgroup at wolfram.com

Paul's last version of the code works correctly. I have modified it to
return the results in the format that Matt Pharr requested, which is a
vector indicating the number of nonzero entries per row in the "sa"
matrix. As an added bonus, the vector is sparse. On my machine, the
total runtime for the 200kx200k test matrix is 2.5 seconds, a vast
improvement over 6.5 months... Hooray for Split, which I am beginning
to understand.

sa=SparseArray[Append[Table[{i+20,i+20}\[Rule]1,{i,200*1000}],{21,22}->40]];
SparseArray[({#[[1,1,1]]}->Length[#])&/@
    Split[ArrayRules[sa],#1[[1,1]]\[Equal]#2[[1,1]]&]]

On 5/9/05, Chris Chiasson <chris.chiasson at gmail.com> wrote:
> Paul,
> Your function is really quick. I wish I understood the way the Split
> commands works. Unfortunately, your code does not work, in its present
> form, if the array has nonzero rows. Here is a test case that is
> 2020*2020
> sa=SparseArray[Table[{i+20,i+20}\[Rule]1,{i,200*10}]];
> The first 20 rows will be blank.
> With that as input, your code produces a list of length 2000, with all
> 1s as output.
> 
> My code does produce the correct answer. Unfortunately, with my
> algorithm, the number of nonzero entries (d) of the sa matrix is a
> power regression function of the time (t) required to compute the
> number of nonzero items in each row.
> 
> d==a*t^b
> 
> On my computer, {a->2924.8,b->0.434112}. The ~4 million nonzero
> entries in Pharr's matrix would take around 6.5 months to classify
> using my procedure... lol.
> 
> Regards,
> On 5/7/05, Paul Abbott <paul at physics.uwa.edu.au> wrote:
> > In article <d5hum0$k03$1 at smc.vnet.net>, Matt Pharr <matt at pharr.org>
> > wrote:
> >
> > > I have a sparse matrix, roughly 200k by 200k, with about .01% of the
> > > entries non zero, represented with SparseArray.  I'd like to reasonably
> > > efficiently generate a 200k-long list where each element is the number of
> > > non-zero entries in the corresponding row of the matrix.  I haven't been
> > > able to figure out a quick way to do this.
> > >
> > > One approach I've tried is the following (using a made up SparseArray of
> > > an identity matrix to illustrate the point):
> > >
> > > In[76]:= sa = SparseArray[Table[{i,i}\[Rule]1, {i, 200000}]];
> >
> > Alternatively,
> >
> >   sa = SparseArray[{{i_, i_} -> 1}, {200000, 200000}]
> >
> > > In[77]:= rowLen[sa_, r_] :=  Length[ArrayRules[Take[sa, {r}]]]-1
> > >
> > > However, it's quite slow--about 1/10 of a second for each value computed
> > > (on a 1GHz Mac G4)
> > >
> > > In[80]:= Table[rowLen[sa,i], {i,100}] // Timing
> > > Out[80]=
> > > {12.4165 Second,{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> > >     1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> > >     1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}}
> > >
> > > I've got to assume that there's an efficient way to do this--does anyone
> > > have any suggestions?
> >
> > Here is one suggestion: For the first 100 entries, the following is
> > essentially instantaneous (only one call to ArrayRules).
> >
> >   Most[Length /@ Split[ArrayRules[sa[[Range[100]]]],
> >       #1[[1,1]] == #2[[1,1]] & ]] // Timing
> >
> > The test #1[[1,1]] == #2[[1,1]] & is used by Split to start a new List
> > whenever the row index changes. Most is required to drop the default
> > case, {_, _} -> 0.
> >
> > For the full array, the following is quite reasonable:
> >
> >   Most[Length /@ Split[ArrayRules[sa], #1[[1,1]]==#2[[1,1]]&]]; //Timing
> >
> > In the advanced documentation for Linear Algebra -- there is a link to
> > this from the Help Browser entry for SparseArray -- you will find that
> > SparseArray uses the Compressed Sparse Row (CSR) format as an internal
> > storage format. You can view this information using InputForm. For
> > example,
> >
> >   SparseArray[{{1, 1} -> 1, {2, 3} -> 2, {2, 2} -> 1}, {3, 4}]] //
> >     InputForm
> >
> > The 4-th part of the data structure records the cumulative sum of
> > non-zero entries. However, there seems to be no simple way to access
> > this information. Usually, for special formats you can use Part to
> > extract such information but Part is interpreted by SparseArray,
> > circumventing this.
> >
> > Cheers,
> > Paul
> >
> > --
> > Paul Abbott                                      Phone: +61 8 6488 2734
> > School of Physics, M013                            Fax: +61 8 6488 1014
> > The University of Western Australia         (CRICOS Provider No 00126G)
> > AUSTRALIA                               http://physics.uwa.edu.au/~paul
> >         http://InternationalMathematicaSymposium.org/IMS2005/
> >
> >
> 
> --
> Chris Chiasson
> http://chrischiasson.com/
> 1 (810) 265-3161
> 
> 


-- 
Chris Chiasson
http://chrischiasson.com/
1 (810) 265-3161


  • Prev by Date: Re: Hexagonal Spiral
  • Next by Date: Representation and Simulation of Dynamic Systems
  • Previous by thread: Re: Re: How to quickly find number of non-zero elements in sparse matrix rows?
  • Next by thread: Re: How to quickly find number of non-zero elements in sparse matrix rows?