Re: Modeling of NFL game results
- To: mathgroup at smc.vnet.net
- Subject: [mg129285] Re: Modeling of NFL game results
- From: Scott Hemphill <hemphill at hemphills.net>
- Date: Sat, 29 Dec 2012 15:10:06 -0500 (EST)
- References: <kbbitr$9oc$1@smc.vnet.net> <kbh6da$iie$1@smc.vnet.net>
- Reply-to: hemphill at alumni.caltech.edu
Ray Koopman <koopman at sfu.ca> writes:

[snip]

> Some suggestions. Keep a 32 x 32 data matrix in which
> data[[m,n]] = the # of times m beat n, m,n = 1...32.
> The diagonals don't matter. The easiest way to eliminate infinite
> solutions is to initialize it as data = Table[eps,{32},{32}],
> where eps is some small positive value. (Use reals.)
> Update every week:
>   win[m_,n_] := data[[m,n]]++;
>   tie[m_,n_] := (data[[m,n]] += .5; data[[n,m]] += .5);
> Maximize Tr[LL[x,#]&/@Subsets[Range@32,{2}]].
> Fix x[32] = 0, maximize with respect to x[1]...x[31].
> Or define x[32] := -Tr[x/@Range@31], to fix the mean at 0.
> If[logit === True,
>   LL[x_,{m_,n_}] := -(data[[m,n]]*Log[1. + Exp[x@n - x@m]] +
>                       data[[n,m]]*Log[1. + Exp[x@m - x@n]]),
>   LL[x_,{m_,n_}] := data[[m,n]]*Log[.5 + .5*Erf[.42(x@m - x@n)]] +
>                     data[[n,m]]*Log[.5 + .5*Erf[.42(x@n - x@m)]] ]
> The .42 puts the two solutions on approximately the same scale
> for .10 < p < .90 .
>
> Caveat user: I haven't tried any of that.

I haven't tried any of that yet, either, but it makes sense.  At least,
it makes sense when you know what Tr does to a List. :-)  You really do
try to reduce keystrokes, don't you: "x@m" instead of "x[m]", "Tr"
instead of "Total", ....  I'll remember the value ".42" in connection
with the meaning of "life, the universe, and everything". :-)

I'm not so interested in using eps, because Mathematica's optimizer is
pretty well-behaved.  I would be interested in seeing if I could improve
the model's predictive ability in the early season.  The additive nature
of log-likelihood suggests using weights, and I could seed "data" with
last year's results with small weights.  It might even make sense to
weight results from week to week with Exp[a(w-w0)], where "w0" is the
current week, to model a kind of a running average of a team's ability.
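
A rough sketch of that week-to-week weighting, for the logit form only.
The toy results in "games", the decay rate a = 0.3, and the names
weight, strength, vars and ll are all made up for illustration; none of
them come from the thread, and the rate is simply held fixed here rather
than fitted.

```mathematica
(* Hypothetical toy data, NOT real NFL results: {week, winner, loser} *)
games = {{1, 1, 2}, {1, 3, 4}, {2, 2, 3}, {2, 4, 1}, {3, 1, 3}, {3, 2, 4}};
nTeams = 4;
w0 = 3;     (* current week *)
a = 0.3;    (* assumed decay rate, held fixed in this sketch *)

(* Weight of a game played in week w: 1 for the current week, less for older weeks *)
weight[w_] := Exp[a (w - w0)];

(* Team strengths; the last team is pinned at 0, as in "Fix x[32] = 0" *)
vars = Array[x, nTeams - 1];
strength[i_] := If[i == nTeams, 0, x[i]];

(* Weighted logit log-likelihood: each game adds weight*Log[P(winner beats loser)].
   Total is used here; Tr applied to a flat list returns the same sum. *)
ll = Total[-weight[#[[1]]] Log[1 + Exp[strength[#[[3]]] - strength[#[[2]]]]] & /@ games];

(* Maximize over the free strengths, starting every one at 0 *)
{llMax, sol} = FindMaximum[ll, Transpose[{vars, ConstantArray[0., nTeams - 1]}]];
```

Afterwards vars /. sol gives the fitted strengths relative to the last
team.  One caution: "a" cannot be chosen by maximizing this same weighted
sum, because every term is negative and shrinking the weights always
raises it.  It would need some other criterion, for example how well
earlier weeks predict later ones.
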
This weighting won't track a sudden change, such as an important player
being out for the season with an injury, but the additional parameter
"a" might be useful for tracking the general rise or fall of a team's
ability through the season.

Scott
-- 
Scott Hemphill	hemphill at alumni.caltech.edu
"This isn't flying.  This is falling, with style."  -- Buzz Lightyear