 
 
 
 
 
 
Re: matrix matching
- To: mathgroup at smc.vnet.net
- Subject: [mg91839] Re: matrix matching
- From: dh <dh at metrohm.ch>
- Date: Tue, 9 Sep 2008 07:00:02 -0400 (EDT)
- References: <g9r4ds$d41$1@smc.vnet.net>
Hi Sophie,
you can easily do this computation in Mathematica by hand and get some number.
But it does not make much sense to directly compare a quantity that varies
between 0 and 20000 with another one that varies between 0 and 1.
Therefore, one has to assume something about the distribution. We assume
that we have enough data points to get a coarse estimate of the variance.
If all quantities are equally important, it makes sense to normalize
their variance before comparing. Often it is also advantageous (e.g. for
plotting) to subtract the column mean from each column.
For technical reasons, it is easier to have the quantities in rows instead
of columns:
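A1 = Transpose[A]
to subtract the mean of each quantity (now a row):
A2 = (# - Mean[#])& /@ A1
to normalize the variance to 1, divide by the standard deviation:
A3 = (#/StandardDeviation[#])& /@ A2
Do the same with B to get B3. The distance is then the sum of the squared
differences:
Total[Flatten[(A3 - B3)^2]]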
We may divide by the number of rows and columns to make the distance
independent of the dimensions, e.g. if you add another data tuple.
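As a minimal sketch of the whole procedure, assuming the name/ID column has
already been dropped so that A and B contain only numeric descriptors (the
sample numbers and the helper name standardize below are made up for
illustration):

```mathematica
(* Hypothetical sample data: 4 items (rows) x 3 numeric descriptors (columns). *)
A = {{12000., 0.2, 1.}, {500., 0.9, 0.}, {8000., 0.5, 1.}, {15000., 0.1, 0.}};
B = {{11000., 0.3, 1.}, {700., 0.8, 1.}, {9000., 0.4, 0.}, {14000., 0.2, 0.}};

(* Transpose so each descriptor is a row, then subtract its mean and
   divide by its standard deviation (unit variance). *)
standardize[m_] := Map[(# - Mean[#])/StandardDeviation[#] &, Transpose[m]];

A3 = standardize[A];
B3 = standardize[B];

(* Sum of squared differences over all entries of the two matrices. *)
dist = Total[(A3 - B3)^2, 2];

(* Divide by the number of entries so the value does not grow with the data size. *)
dist/(Times @@ Dimensions[A])
```

Here each matrix is scaled with its own means and standard deviations; whether
a common scaling for both would suit your data better is something you have to
decide. A descriptor that is constant over all items has zero standard
deviation and would have to be removed first.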
hope this helps, Daniel
Sophie D. Yip wrote:
> i was trying to attend the Mathematica seminars last month to see if can
> build a matrix matching model using this new tool (after searching the group
> archive and could not find a close one). i had to reschedule the seminars in
> the next 10 days because some technical issue kept refusing me into the classroom...
> Before I can get a better sense with Mathematica, can anyone tell me if a
> basic matrix matching model close to described below does exist in public or
> can be built handily?
> 
> Scenario 1:
> 
> Two m X n matrices A and B, where first column are items (represented by
> names or IDs, e.g. computer engineering or 88888888), last column are
> importance level (0~10), and rest of the columns are descriptors
> (interpreted as numbers, e.g. 0%~100%, 0~ 20,000 km, 0 or 1, which can be
> standardized).
> 
> Matching these two matrices A and B to determine their closeness (or level
> of matching), in one way, by comparing the descriptors of each item and
> calculating their overall distance, adjusted by the levels of importance:
> 
> D1 = SUM{SQUARE[L1(B) - L1(A)]+ SQUARE[L2(B) -L2(A)] +...+
> SQUARE[Ln(B)-Ln(A)]}*IM1^2
> 
> D2 = SUM{SQUARE[L1(B) - L1(A)]+ SQUARE[L2(B) -L2(A)] +...+
> SQUARE[Ln(B)-Ln(A)]}*IM2^2
> 
> Dm = SUM{SQUARE[L1(B) - L1(A)]+ SQUARE[L2(B) -L2(A)] +...+
> SQUARE[Ln(B)-Ln(A)]}*IMm^2
> 
> D = (D1 + D2 +... + Dm)/m
> 
> D: overall distance
> L: quantified and standardized level of descriptor
> IM: level of importance
> 
> Alternative Scenario:
> 
> Two m X n matrices A and B, where first column are items (represented by
> names or IDs, e.g. computer engineering or 88888888) and rest of the columns
> are descriptors (interpreted as numbers, e.g. 0~10, 0%~100%, 0~ 20,000 km, 0
> or 1, which can be standardized).
> 
> Matching these two matrices A and B to determine their closeness (or level
> of matching) by comparing the descriptors of each item and calculating their
> overall distances:
> 
> D1 = SUM{SQUARE[L1(B) - L1(A)]+ SQUARE[L2(B) -L2(A)]
> +...+ SQUARE[Ln(B)-Ln(A)]}^2
> 
> D2 = SUM{SQUARE[L1(B) - L1(A)]+ SQUARE[L2(B) -L2(A)]
> +...+ SQUARE[Ln(B)-Ln(A)]}^2
> 
> Dm = SUM{SQUARE[L1(B) - L1(A)]+ SQUARE[L2(B) -L2(A)]
> +...+ SQUARE[Ln(B)-Ln(A)]}^2
> 
> D = (D1 + D2 +... + Dm)/m
> 
> D: overall distance
> L: quantified and standardized level of descriptor
> 
> Many thanks,
> 
> Sophie
> 
> 
-- 
Daniel Huber
Metrohm Ltd.
Oberdorfstr. 68
CH-9100 Herisau
Tel. +41 71 353 8585, Fax +41 71 353 8907
E-Mail:<mailto:dh at metrohm.com>
Internet:<http://www.metrohm.com>

