标签:
对于解带约束(线性约束)的线性回归通常的办法是,将约束直接表示在线性回归中,也就是减少一个变量(即回归到线性约束本身的自由变量数目)。然而今天由于要解一个问题发现了另一种思路的解法,是比变换变量更为通用的办法,就是用二次规划法解带约束的线性回归。
二次规划法有如下一般形式:
各个部分为:
特别地如果约束条件为等号,则可以用拉格朗日乘子法直接求解。如果Q是不定矩阵,甚至有一个特征值是负数的,问题就是NP难问题。。
解决一个带约束的线性回归问题,形如:
需要求解最小化:
这个式子展开来写成矩阵形式就是:
其中:
##我不明白下面三个约束怎么实现了回归参数都大于0, 求解答。。。
因为Y是一个常量,因此目标函数等价于求解:
等价于:
因此带有约束的线性回归就变成了一个二次规划问题。
用R解决二次规划问题的方法是 library("quadprog") 中的 solve.QP(Dmat, dvec, Amat, bvec, meq=0, factorized=FALSE) 。解决的是形如
min(-d^T b + 1/2 b^T D b) with the constraints A^T b >= b_0.
的问题。方法中各个变量的含义如下:
Dmat |
matrix appearing in the quadratic function to be minimized. |
dvec |
vector appearing in the quadratic function to be minimized. |
Amat |
matrix defining the constraints under which we want to minimize the quadratic function. |
bvec |
vector holding the values of b_0 (defaults to zero). |
meq |
the first |
factorized |
logical flag: if |
返回值的含义为:
solution |
vector containing the solution of the quadratic programming problem. |
value |
scalar, the value of the quadratic function at the solution |
unconstrained.solution |
vector containing the unconstrained minimizer of the quadratic function. |
iterations |
vector of length 2, the first component contains the number of iterations the algorithm needed, the second indicates how often constraints became inactive after becoming active first. |
Lagrangian |
vector with the Lagragian at the solution. |
iact |
vector with the indices of the active constraints at the solution. |
具体的用法给出一段代码:
1 library("quadprog"); 2 X <- matrix(runif(300), ncol=3) 3 Y <- X %*% c(0.2,0.3,0.5) + rnorm(100, sd=0.2) 4 Rinv <- solve(chol(t(X) %*% X)); 5 #returns the inverse of the input 6 #chol( is the cholesky factorization ) 7 C <- cbind(rep(1,3), diag(3)) 8 b <- c(1,rep(0,3)) 9 d <- t(Y) %*% X 10 solve.QP(Dmat = Rinv, factorized = TRUE, 11 dvec = d, Amat = C, bvec = b, meq = 1) 12 #pass factorized = TRUE for the chol()
references:
[2] https://en.wikipedia.org/wiki/Quadratic_programming
标签:
原文地址:http://www.cnblogs.com/manqing/p/4998182.html