MathGroup Archive 2012

Re: reverse engineering principal components...

  • To: mathgroup at smc.vnet.net
  • Subject: [mg128421] Re: reverse engineering principal components...
  • From: Ray Koopman <koopman at sfu.ca>
  • Date: Sun, 14 Oct 2012 23:41:21 -0400 (EDT)
  • References: <k5854v$io4$1@smc.vnet.net>

On Oct 11, 9:10 pm, Richard Palmer <rhpal... at gmail.com> wrote:
> I would like to be able to take a large dataset, compute the principal
> components on a sufficient subset, and use the results to compute principal
> components on the remaining observations.  So far, I haven't been able to
> figure out how it is done.  Here is sample code (computed as a notebook
> expression).  Can anyone tell me where I am going wrong?
>
> Notebook[{
>
> Cell[CellGroupData[{
> Cell["Reverse Engineering Principal Components", "Section",
>  CellChangeTimes->{{3.558966707926651*^9, 3.5589667244925985`*^9}}],
>
> Cell["\<\
> make a table of data and a table of the principal components using \
> the Correlation method.  Check to see that they have the requisite \
> properties\
> \>", "Text",
>  CellChangeTimes->{{3.5589667304749403`*^9, 3.558966775500516*^9}, {
>   3.558966830268648*^9, 3.5589668439964333`*^9}}],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[{
>  RowBox[{
>   RowBox[{"t", "=",
>    RowBox[{"Table", "[",
>     RowBox[{
>      RowBox[{"RandomReal", "[", "]"}], ",",
>      RowBox[{"{", "5", "}"}], ",",
>      RowBox[{"{", "3", "}"}]}], "]"}]}], ";"}], "\n",
>  RowBox[{
>   RowBox[{
>    RowBox[{"princomponentst", "=",
>     RowBox[{"PrincipalComponents", "[",
>      RowBox[{"t", ",",
>       RowBox[{"Method", "\[Rule]", "\"\<Correlation\>\""}]}], "]"}]}],
>     ";"}], " "}], "\n",
>  RowBox[{"Print", "[",
>   RowBox[{"\"\<The mean of the set is \>\"", ",",
>    RowBox[{
>     RowBox[{"Mean", "[", "princomponentst", "]"}], "//", "Chop"}]}],
>   "]"}], "\n",
>  RowBox[{"Print", "[",
>   RowBox[{"\"\<The variance of the set is \>\"", ",",
>    RowBox[{"Variance", "[", "princomponentst", "]"}]}], "]"}]}], "Input",
>  CellChangeTimes->{{3.558943696585477*^9, 3.5589437436731706`*^9}, {
>    3.5589448723167253`*^9, 3.558944889147688*^9},
>    3.5589452714125524`*^9, {3.558965740525318*^9,
>    3.558965743957515*^9}, 3.558966338325511*^9, {
>    3.5589667822939043`*^9, 3.558966817373911*^9}, {
>    3.558966862013464*^9, 3.5589669321494756`*^9}, {
>    3.558967927711418*^9, 3.558967942303253*^9}}],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[
>  InterpretationBox[
>   RowBox[{"\<\"The mean of the set is \"\>", "\[InvisibleSpace]",
>    RowBox[{"{",
>     RowBox[{"0", ",", "0", ",", "0"}], "}"}]}],
>   SequenceForm["The mean of the set is ", {0, 0, 0}],
>   Editable->False]], "Print",
>  CellChangeTimes->{{3.558966925069071*^9, 3.558966932823514*^9}, {
>   3.558967933532751*^9, 3.5589679478705716`*^9}}],
>
> Cell[BoxData[
>  InterpretationBox[
>   RowBox[{"\<\"The variance of the set is \"\>", "\[InvisibleSpace]",
>    RowBox[{"{",
>     RowBox[{
>     "1.4974734615741159`", ",", "0.9657686960146733`", ",",
>      "0.5367578424112112`"}], "}"}]}],
>   SequenceForm[
>   "The variance of the set is ", {1.4974734615741159`,
>    0.9657686960146733, 0.5367578424112112}],
>   Editable->False]], "Print",
>  CellChangeTimes->{{3.558966925069071*^9, 3.558966932823514*^9}, {
>   3.558967933532751*^9, 3.558967947872572*^9}}]
>
> }, Open  ]]
> }, Open  ]],
>
> Cell["\<\
> Standardize the observations and compute a correlation matrix.  \
> Compute the eigenvectors.\
> \>", "Text",
>  CellChangeTimes->{{3.558966974924922*^9, 3.558967006484727*^9}}],
>
> Cell[BoxData[{
>  RowBox[{
>   RowBox[{"standardizet", "=",
>    RowBox[{"Standardize", "[", "t", "]"}]}], ";"}], "\n",
>  RowBox[{
>   RowBox[{
>    RowBox[{"corrt", "=",
>     RowBox[{"Correlation", "[", "standardizet", "]"}]}], ";"}],
>   " "}], "\n",
>  RowBox[{
>   RowBox[{
>    RowBox[{"eigenvectors", "=",
>     RowBox[{"Eigenvectors", "[", "corrt", "]"}]}], ";"}],
>   " "}]}], "Input",
>  CellChangeTimes->{{3.5589449260758*^9, 3.5589449510202265`*^9}, {
>    3.5589454403162127`*^9, 3.5589454525239115`*^9},
>    3.5589670144291816`*^9, 3.5589670498292065`*^9, {
>    3.5589711280694685`*^9, 3.5589711551900196`*^9}}],
>
> Cell["\<\
> I think this is the multiplication.  However, the variances are not \
> correct since they do not decrease.\
> \>", "Text",
>  CellChangeTimes->{{3.5589677050376825`*^9, 3.5589677222526665`*^9}, {
>   3.558971033573064*^9, 3.558971044228673*^9}, {3.558971199412549*^9,
>   3.558971208188051*^9}}],
>
> Cell[CellGroupData[{
>
> Cell[BoxData[{
>  RowBox[{
>   RowBox[{"mypc2", "=",
>    RowBox[{"standardizet", ".", "eigenvectors"}]}], ";"}], "\n",
>  RowBox[{"Mean", "[", "mypc2", "]"}], "\n",
>  RowBox[{"Variance", "[", "mypc2", "]"}]}], "Input",
>  CellChangeTimes->{{3.5589661581992083`*^9, 3.5589662056929245`*^9},
>    3.5589670948777833`*^9, 3.5589671263245816`*^9, {
>    3.558967191557313*^9, 3.558967192101344*^9}, {
>    3.5589677375095396`*^9, 3.558967776636778*^9},
>    3.5589710607416177`*^9}],
>
> Cell[BoxData[
>  RowBox[{"{",
>   RowBox[{
>    RowBox[{"-", "2.4424906541753446`*^-16"}], ",",
>    "3.108624468950438`*^-16", ",",
>    RowBox[{"-", "3.7192471324942745`*^-16"}]}], "}"}]], "Output",
>  CellChangeTimes->{{3.55897113613293*^9, 3.5589711629244623`*^9}}],
>
> Cell[BoxData[
>  RowBox[{"{",
>   RowBox[{
>   "1.1977733239835728`", ",", "0.7727628961600694`", ",",
>    "1.0294637798563568`"}], "}"}]], "Output",
>  CellChangeTimes->{{3.55897113613293*^9, 3.558971162927462*^9}}]}, Open  ]]
> }, Open  ]]
> },
>
> WindowSize->{707, 787},
> WindowMargins->{{Automatic, 228}, {49, Automatic}},
> ShowSelection->True,
> FrontEndVersion->"8.0 for Microsoft Windows (64-bit) (October 6, \
> 2011)",
> StyleDefinitions->"Default.nb"
> ]
>
> --
> Richard Palmer

Eigenvectors returns a matrix in which each row is an eigenvector,
so you need to transpose it before multiplying: use
standardizet.Transpose[eigenvectors] instead of
standardizet.eigenvectors.
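
For what it's worth, here is a minimal sketch of how the whole thing
might look, including the original goal of scoring the remaining
observations. The names train, rest, mu, sd, vals, vecs, pcTrain and
pcRest are my own, the data are random stand-ins, and I am assuming the
remaining rows should be standardized with the subset's mean and
standard deviation rather than their own.

```mathematica
(* Sketch only: all names and the random data are illustrative assumptions. *)
SeedRandom[1];                       (* arbitrary seed, for repeatability *)
train = RandomReal[1, {50, 3}];      (* stand-in for the "sufficient subset" *)
rest  = RandomReal[1, {10, 3}];      (* stand-in for the remaining observations *)

mu = Mean[train];                    (* column means of the subset *)
sd = StandardDeviation[train];       (* column standard deviations of the subset *)

(* Each ROW of vecs is a unit eigenvector of the correlation matrix. *)
{vals, vecs} = Eigensystem[Correlation[train]];

(* Scores for the subset: transpose so the eigenvectors become columns. *)
pcTrain = Standardize[train] . Transpose[vecs];
Chop[Variance[pcTrain] - vals]       (* differences should be numerically zero *)

(* Scores for the remaining rows, reusing the subset's mu, sd and vecs. *)
pcRest = Map[(# - mu)/sd &, rest] . Transpose[vecs];
```

With the transpose in place the variances of pcTrain should equal the
eigenvalues, which come out in decreasing order. The sign of each
eigenvector is arbitrary, so individual columns may be flipped relative
to what PrincipalComponents returns.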


