RE: final results: creating adjacency matrices

*To*: mathgroup at smc.vnet.net*Subject*: [mg36934] RE: [mg36584] final results: creating adjacency matrices*From*: "Moliterno, Thomas" <TMoliter at gsm.uci.edu>*Date*: Wed, 2 Oct 2002 03:32:50 -0400 (EDT)*Sender*: owner-wri-mathgroup at wolfram.com

First thanks to all, and in particular Bobby Treat, for your help with this question. The best solution was as follows: lst = ReadList["c:\\data.txt", {Number, Number}] adjacenceMatrix[ x:{{_, _}..}] := Module[{actors, events}, {actors, events} = Union /@ Transpose[x]; Array[If[MemberQ[x, {actors[[#1]], events[[#2]]}], 1, 0] & , {Length[actors], Length[events]}]] a = adjacenceMatrix[lst]; b = a . Transpose[a]; c = b (1 - IdentityMatrix[Length[b]]) C is the desired symmetric matrix with off diagonal values of >=0, indicating the number of times two actors participate in the same event. The diagonal is set to 0. A few items in response to Bobby's message, below. While c is, in fact, a huge matrix with lots of cells equal to zero, that is exactly how we need it structured for our analysis and research question (not relevant to the list, but I'd be happy to discuss off list). Processing time is actually not too bad!! I'm running a PIII 900 with 512 SDRAM, and the code ran a 177 x 3669 matrix in under 90 seconds. MatrixForm [c] presented no problems in viewing in the front end, but then it's only 177 x 177. Thanks again to all, Tom ********************************************** Thomas P. Moliterno Graduate School of Management University of California, Irvine tmoliter at uci.edu ********************************************** -----Original Message----- From: DrBob [mailto:drbob at bigfoot.com] To: mathgroup at smc.vnet.net Subject: [mg36934] RE: [mg36584] Re: creating adjacency matrices Thomas, That was Jens-Peer's algorithm for computing actor-event connections, so I won't take responsibility for its failure. Mine would have failed too with that data, though!! Jens-Peer's algorithm would have worked if your inputs included all the actors from 1 to n, but it didn't. Mine assumed that 1 was one of them, and it wasn't. Instead, I'm now assuming that you don't want rows for actors that don't appear in the observations, and similarly for events. The rows and columns will be numbered from 1 to the number of observed actors or events, and will correspond to actors and events in sorted order. That said, you're asking for a VERY large matrix, and most of its entries will be zero. I'll suggest another way, later. The following indicates AT MOST 13.4% of the entries could be non-zero: lst = ReadList["moliterno-test1996.txt", {Number, Number}]; {actors, events} = Union /@ Transpose[lst]; N[Length[lst]/(Length[actors]*Length[actors])] 0.13350994338800987 However, a random sample shows that less than 1% will be non-zero: Timing[ Count[(MemberQ[lst, {actors[[Random[Integer, Length[actors]]]], events[[Random[Integer, Length[events]]]]}] & ) /@ Range[10000], True]/ 10000.] {7.515999999999998*Second, 0.008} Nevertheless, the following code should build the matrices you want. I'm using a 2.2GHz P4 and 1024MB RDRAM, so if you have a slower machine, be warned. adjacenceMatrix[ x:{{_, _}..}] := Module[{actors, events}, {actors, events} = Union /@ Transpose[x]; Array[If[MemberQ[x, {actors[[#1]], events[[#2]]}], 1, 0] & , {Length[actors], Length[events]}]] Timing[a = adjacenceMatrix[lst]; ] Dimensions[a] {5.671999999999997*Second, Null} {166, 1778} Timing[b = a . Transpose[a]; ] {0.5309999999999988*Second, Null} You don't want to display a or b in MatrixForm. It will crash your FrontEnd, if not your Kernel. Save the file before trying to display anything at all from the result, and use something like b[[Range[20],Range[4]]]//MatrixForm {{47, 0, 0, 0}, {0, 3, 0, 0}, {0, 0, 7, 1}, {0, 0, 1, 59}, {0, 0, 0, 3}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 1}, {0, 0, 0, 0}, {0, 0, 0, 1}, {0, 0, 0, 0}, {0, 0, 0, 0}, {2, 0, 0, 1}, {0, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 0, 1}, {0, 0, 0, 2}, {0, 0, 0, 0}, {0, 0, 0, 0}} to get a glimpse at some of it. It's the 'a' matrix that's terribly sparse -- the 'b' matrix isn't unreasonable. Elements of the 'a' matrix can be quickly computed by the function aFunction = If[MemberQ[lst, {actors[[#1]], events[[#2]]}], 1, 0] &; That stores a line of code rather than all those ones and zeroes. The b matrix (call it bb this time) can be computed as: Timing[bb = (#1 . Transpose[#1] & ) [Array[aFunction, {Length[actors], Length[events]}]]; ] bb == b {6.1569999999999965*Second, Null} True Bobby Treat -----Original Message----- From: Moliterno, Thomas [mailto:TMoliter at gsm.uci.edu] To: mathgroup at smc.vnet.net Subject: [mg36934] RE: [mg36584] Re: creating adjacency matrices Hi Bobby, First, let me say "thank you" for your several replies on the mathematica list re: my query about creating adjacency matrices. I've lately dispatched some teaching commitments, and can now dig down into what you and others wrote. I've found your reply, below, to be the most helpful. Indeed, you anticipated something that I actually needed, but forgot to mention in my original posting (off-diagonal values of >1 to show the number of events that 2 actors co-participated in). I plan on posting a "this was the most helpful solution" message to the list, but first I hoped to ask you a follow up question, if I may (and I'll capture your off-line response here in my final posting to the list). I've run the code (copied from you) below, and get the correct output for "made-up" data, but when I import in "real" data, I get an error message. Here's the input I'm running: lst = ReadList["c:\\test1996.txt", {Number, Number}] AdjacenceMatrix[lst : {{_, _} ..}] := Module[{actors, events, adj}, {actors, events} = Union /@ Transpose[lst]; adj = Table[0, {Length[actors]}, {Length[events]}]; Scan[(Part[adj, Sequence @@ #] = 1) &, lst /. MapIndexed[Rule[#1, First[#2]] &, events]]; adj] MatrixForm[a = AdjacenceMatrix[lst]] MatrixForm[b = a.Transpose[a]] And here's what I get for output: Set::partw : Part 300007 of <<1>> does not exist. Set::partw : Part 300007 of <<1>> does not exist. Set::partw : Part 300007 of <<1>> does not exist. General::stop : Further output of Set::partw will be suppressed during this calculation." Then I get the two matrices (a & b as per your code), but they are just filled with zeros. So it gets to about the 4th line of your code, but then doesn't "fill-in" from my data. Finally, I should note that "30007" is one of the "actors" in the data that I've read in. In case you want to run this yourself, I've attached the raw data file. There are 166 actors and 1778 events: both actors and events are coded with 6-digit numbers, actors begin with 3's, events with 2's. I'm sure this is a "silly" question, and that there is an easy answer ... But I sure can't find it. So I really appreciate your help and interest!!!! Tom ********************************************** Thomas P. Moliterno Graduate School of Management University of California, Irvine tmoliter at uci.edu ********************************************** -----Original Message----- From: DrBob [mailto:drbob at bigfoot.com] To: mathgroup at smc.vnet.net Thomas Subject: [mg36934] RE: [mg36584] Re: creating adjacency matrices I believe that's not the adjacency matrix Thomas asked for. It doesn't have zeroes on the diagonal (it isn't square) and it doesn't have ones to indicate that two actors are associated with the same event. Instead, it shows connections between actors and events, which is actually more useful, as I'll demonstrate. (In addition -- though it doesn't really matter -- Jens-Peer switched the 'actors' and 'events' nomenclature within the function.) If you multiply it by its transpose, you get something else that's useful: lst = {{1, A}, {1, B}, {2, B}, {3, C}, {3, D}, {1, D}, {1, C}}; AdjacenceMatrix[lst : {{_, _} ..}] := Module[{actors, events, adj}, {actors, events} = Union /@ Transpose[lst]; adj = Table[0, {Length[actors]}, {Length[events]}]; Scan[(Part[adj, Sequence @@ #] = 1) &, lst /. MapIndexed[Rule[#1, First[#2]] &, events]]; adj] MatrixForm[a = AdjacenceMatrix[lst]] MatrixForm[b = a.Transpose[a]] Matrix 'b' records how many events two actors have in common. On the diagonal, it shows the total number of events each actor is connected to. It's easy to put zeroes on the diagonal: MatrixForm[c = b (1 - IdentityMatrix[Length[b]])] To get the originally intended incidence matrix, this works: d = c /. {_?Positive -> 1} However, I think matrices 'a' and 'b' are actually more useful, and 'a' easily leads to all the others. Bobby -----Original Message----- From: Jens-Peer Kuska [mailto:kuska at informatik.uni-leipzig.de] To: mathgroup at smc.vnet.net Subject: [mg36934] [mg36584] Re: creating adjacency matrices Hi, with In[]:=lst = {{1, A}, {1, B}, {2 , B}, {3, C}, {3, D}, {1, D}}; and In[]:= AdjacenceMatrix[lst : {{_, _} ..}] := Module[ {actors,events adj}, {events, actors} = Union /@ Transpose[lst]; adj = Table[0, {Length[events]}, {Length[actors]}]; Scan[(Part[adj, Sequence @@ #] = 1) &, lst /. MapIndexed[Rule[#1, First[#2]] &, actors]]; adj ] you get In[]:=AdjacenceMatrix[lst] Out[]={{1, 1, 0, 1}, {0, 1, 0, 0}, {0, 0, 1, 1}} Regards Jens "Moliterno, Thomas" wrote: > > I need to create an adjacency matrix from my data, which is currently in > the form of a .txt file and is basically a two column incidence list. > For example: > > 1 A > 1 B > 2 B > 3 C > . . > . . > . . > m n > > Where 1 to m represent actors and A to n represent events. My goal is to > have an (m x m) matrix where cell i,j equals 1 if two actors are > incident to the same event (in the sample above, 1 and 2 are both > incident to B) and 0 otherwise (w/ zeros on the diagonal). > > I'm new to Mathmatica, and so I'm on the steep part of the learning > curve ... All I've been able to figure out so far is how to get my > incidence list into the program using Import["filename.txt"]. But then > what? How do I convert to the adjacency matrix? I've found the > ToAdjacencyMatrix[] command in DiscreteMath`Combinatorica`, but I can't > seem to get it to work ... > > Thanks to any and all in advance. > > Tom > > ********************************************** > Thomas P. Moliterno > Graduate School of Management > University of California, Irvine > tmoliter at uci.edu > **********************************************

**Follow-Ups**:**Re: RE: final results: creating adjacency matrices***From:*Daniel Lichtblau <danl@wolfram.com>