
Machine Learning Techniques -5-Kernel Logistic Regression


5-Kernel Logistic Regression

Last class, we learned about the soft margin and its applications. Now a new idea comes to us: could we apply the kernel trick to our old friend, logistic regression?

First, let's review the four formulations of margin handling (hard and soft margin, each in primal and dual form):

[Slide: the four SVM formulations: hard-margin and soft-margin, primal and dual]

As the slide shows, the difference between the hard-margin and soft-margin forms comes from the constant C, which plays a role quite similar to regularization.

Since we defined the slack variable ξ, we can express the margin violation directly with a max function:

$\xi_n = \max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big)$

This gives the unconstrained form of the soft-margin SVM:

$\min_{b,\mathbf{w}}\ \frac{1}{2}\mathbf{w}^T\mathbf{w} + C\sum_{n=1}^{N}\max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big)$

We can easily see that this objective has the same form as L2 regularization: a w^T w term plus a sum of pointwise errors.

However, this form is no longer a QP, and the max function makes the objective non-differentiable at some points.
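As a quick illustration, here is a minimal sketch (my own, not from the lecture) that evaluates this unconstrained objective with NumPy, working directly in x-space for simplicity:

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Unconstrained soft-margin SVM objective:
    1/2 * w^T w + C * sum_n max(1 - y_n (w^T x_n + b), 0)."""
    margins = y * (X @ w + b)               # y_n (w^T x_n + b)
    hinge = np.maximum(1.0 - margins, 0.0)  # the margin violations xi_n
    return 0.5 * (w @ w) + C * hinge.sum()

# Tiny usage example with made-up data:
X = np.array([[1.0, 2.0], [-1.0, -1.5], [2.0, 0.5]])
y = np.array([1.0, -1.0, 1.0])
print(soft_margin_objective(np.array([0.5, -0.2]), 0.1, X, y, C=1.0))
```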

This motivates viewing SVM as a regularization model:

[Slide: SVM as a regularization model: regularization minimizes E_in subject to w^T w ≤ C, hard-margin SVM minimizes w^T w subject to E_in = 0, and soft-margin SVM minimizes w^T w + C·Ê_in]

For the regularization-form SVM, a larger C means a smaller influence of w^T w, the regularization term.

Next, a comparison of error measures: the SVM error, err_SVM(s, y) = max(1 − ys, 0), has a different shape from the 0/1 error (its kink sits at ys = 1 rather than ys = 0), and we call it the hinge error measure.

[Plots: the 0/1 error, the hinge error err_SVM, and the scaled cross-entropy error err_SCE as functions of ys; both err_SVM and err_SCE upper-bound the 0/1 error]
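A quick numeric check of this upper-bound relationship, my own sketch rather than lecture code:

```python
import numpy as np

ys = np.linspace(-3.0, 3.0, 13)            # the "label times score" axis
err_01 = (ys <= 0).astype(float)           # 0/1 error: [[ys <= 0]]
err_svm = np.maximum(1.0 - ys, 0.0)        # hinge error
err_sce = np.log2(1.0 + np.exp(-ys))       # scaled cross-entropy error

# Both convex measures upper-bound the 0/1 error everywhere:
assert np.all(err_svm >= err_01) and np.all(err_sce >= err_01)
```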

Now, for binary classification, can logistic regression and SVM be combined?

We know SVM's advantage: the kernel trick simplifies the computation. Logistic regression, on the other hand, has benefits of its own, such as producing probability estimates.

[Slide: naive ideas for soft binary classification with SVM, and their drawbacks]

Here we apply Platt's scaling (https://en.wikipedia.org/wiki/Platt_scaling), which is known to be a good way to obtain probability estimates for the binary problem. We first run SVM (with its kernel transform) to get w and b, and then use a second optimization step, essentially one-dimensional logistic regression on the SVM scores, to find the best A and B: g(x) = θ(A · (w^T Φ(x) + b) + B).

[Slide: probabilistic SVM via Platt scaling, with A and B fitted by logistic regression on the SVM scores]
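To make this concrete, here is a minimal sketch using scikit-learn, my own illustration with made-up toy data, not code from the lecture:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Toy data: two Gaussian blobs (made-up example data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

# Step 1: kernel SVM gives the raw score w^T Phi(x) + b.
svm = SVC(kernel="rbf", C=1.0).fit(X, y)
scores = svm.decision_function(X).reshape(-1, 1)

# Step 2 (Platt scaling): fit A and B with one-dimensional
# logistic regression on the SVM scores. In practice this is
# better done on held-out data to avoid optimistic bias.
platt = LogisticRegression().fit(scores, y)

# g(x) = theta(A * (w^T Phi(x) + b) + B)
probs = platt.predict_proba(scores)[:, 1]
print(probs[:3])
```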

In conclusion, the chain of requirements looks like this:

we want to use a KERNEL -> we need w^T z (to package into the KERNEL) -> we need w to be a linear combination of the z_n

[Slide: the key to kernelization: the optimal w must be a linear combination of the z_n]

 

The optimal w can be represented by the z_n:

$\mathbf{w}_* = \sum_{n=1}^{N} \beta_n \mathbf{z}_n$

Decompose w = w∥ + w⊥, where w∥ lies in the span of the z_n and w⊥ is orthogonal to that span. The error term depends on w only through w^T z_n = w∥^T z_n, while w^T w = w∥^T w∥ + w⊥^T w⊥, so any nonzero w⊥ only increases the objective; at the optimum w⊥ = 0 and w lies entirely in the span of the z_n.

It can be proved this way that any L2-regularized linear model can be kernelized.
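Written out in the lecture's notation, the standard argument is:

```latex
\begin{aligned}
&\text{Goal: } \min_{\mathbf{w}} \;
  \frac{\lambda}{N}\,\mathbf{w}^{T}\mathbf{w}
  + \frac{1}{N}\sum_{n=1}^{N}\operatorname{err}\bigl(y_n,\ \mathbf{w}^{T}\mathbf{z}_n\bigr). \\
&\text{Write } \mathbf{w} = \mathbf{w}_{\parallel} + \mathbf{w}_{\perp},\quad
  \mathbf{w}_{\parallel} \in \operatorname{span}\{\mathbf{z}_n\},\quad
  \mathbf{w}_{\perp} \perp \operatorname{span}\{\mathbf{z}_n\}. \\
&\text{Then } \mathbf{w}^{T}\mathbf{z}_n = \mathbf{w}_{\parallel}^{T}\mathbf{z}_n
  \quad\text{and}\quad
  \mathbf{w}^{T}\mathbf{w}
    = \mathbf{w}_{\parallel}^{T}\mathbf{w}_{\parallel}
    + \mathbf{w}_{\perp}^{T}\mathbf{w}_{\perp}
    \ \ge\ \mathbf{w}_{\parallel}^{T}\mathbf{w}_{\parallel}, \\
&\text{so the optimum has } \mathbf{w}_{\perp} = \mathbf{0}
  \ \Longrightarrow\ \mathbf{w}_{*} = \textstyle\sum_{n=1}^{N}\beta_n\mathbf{z}_n .
\end{aligned}
```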

So here we obtain a new representation called Kernel Logistic Regression (KLR): solving logistic regression in terms of the β variables.

There are a few things we should pay attention to:

1. The dimension of this problem is the number of samples N: there is one variable β_n per training example.

2. Each β_n can be seen as describing the relationship between x_n and the other points in the X space.

3. The β_n are generally all nonzero (unlike the sparse α_n of SVM), which means a larger computational cost than solving for a good w directly; see the code sketch below.

$\min_{\boldsymbol{\beta}}\ \frac{\lambda}{N}\sum_{n=1}^{N}\sum_{m=1}^{N}\beta_n\beta_m K(\mathbf{x}_n,\mathbf{x}_m) + \frac{1}{N}\sum_{n=1}^{N}\log\Big(1+\exp\big(-y_n \sum_{m=1}^{N}\beta_m K(\mathbf{x}_m,\mathbf{x}_n)\big)\Big)$
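To make the β-space formulation concrete, here is a minimal sketch of KLR by plain gradient descent, my own illustration with an RBF kernel and made-up hyperparameters, not code from the lecture:

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def train_klr(X, y, lam=0.1, gamma=1.0, lr=0.5, steps=2000):
    """Minimize  lam/N * beta^T K beta
                 + 1/N * sum_n log(1 + exp(-y_n (K beta)_n))  over beta."""
    N = len(y)
    K = rbf_kernel(X, X, gamma)
    beta = np.zeros(N)
    for _ in range(steps):
        s = K @ beta                        # scores sum_m beta_m K(x_m, x_n)
        sig = 1.0 / (1.0 + np.exp(y * s))   # theta(-y_n * s_n)
        grad = (2 * lam / N) * (K @ beta) - (K @ (y * sig)) / N
        beta -= lr * grad
    return beta

def predict_proba(beta, X_train, X_new, gamma=1.0):
    # g(x) = theta(sum_n beta_n K(x_n, x))
    s = rbf_kernel(X_new, X_train, gamma) @ beta
    return 1.0 / (1.0 + np.exp(-s))

# Usage on toy data: note all beta_n come out nonzero (point 3 above).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (20, 2)), rng.normal(1, 0.5, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])
beta = train_klr(X, y)
print(np.count_nonzero(beta), "of", len(beta), "coefficients are nonzero")
```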

 


Original source: http://www.cnblogs.com/windniu/p/4759951.html
