
Machine Learning Techniques -5-Kernel Logistic Regression


5-Kernel Logistic Regression

Last class, we learned about the soft margin and its applications. Now a new idea comes to us: could we apply the kernel trick to our old friend, logistic regression?

First, let's review the four formulations of margin handling (hard and soft margin, each in primal and dual form):

[Slide: the four SVM formulations: hard-margin and soft-margin, primal and dual]

As the slide shows, the difference between the hard-margin and soft-margin forms comes from the constant C, which plays a role quite similar to regularization.

Since we defined the slack variable ξ, we can express the margin violation directly with a max function:

$\xi_n = \max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big)$

This gives the unconstrained form of the soft-margin SVM:

$\min_{b,\mathbf{w}}\ \frac{1}{2}\mathbf{w}^T\mathbf{w} + C\sum_{n=1}^{N}\max\big(1 - y_n(\mathbf{w}^T\mathbf{z}_n + b),\ 0\big)$

We can easily see that this objective has the same form as L2 regularization: a w^T w term plus a sum of pointwise errors.

However, this form is no longer a QP, and the max function makes the objective non-differentiable at some points.
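As a quick illustration, here is a minimal sketch (my own, not from the lecture) that evaluates this unconstrained objective with NumPy, working directly in x-space for simplicity:

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Unconstrained soft-margin SVM objective:
    1/2 * w^T w + C * sum_n max(1 - y_n (w^T x_n + b), 0)."""
    margins = y * (X @ w + b)               # y_n (w^T x_n + b)
    hinge = np.maximum(1.0 - margins, 0.0)  # the margin violations xi_n
    return 0.5 * (w @ w) + C * hinge.sum()

# Tiny usage example with made-up data:
X = np.array([[1.0, 2.0], [-1.0, -1.5], [2.0, 0.5]])
y = np.array([1.0, -1.0, 1.0])
print(soft_margin_objective(np.array([0.5, -0.2]), 0.1, X, y, C=1.0))
```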

This motivates viewing SVM as a regularization model:

[Slide: SVM as a regularization model: regularization minimizes E_in subject to w^T w ≤ C, hard-margin SVM minimizes w^T w subject to E_in = 0, and soft-margin SVM minimizes w^T w + C·Ê_in]

For the regularization-form SVM, a larger C means a smaller influence of w^T w, the regularization term.

Next, a comparison of error measures: the SVM error, err_SVM(s, y) = max(1 − ys, 0), has a different shape from the 0/1 error (its kink sits at ys = 1 rather than ys = 0), and we call it the hinge error measure.

[Plots: the 0/1 error, the hinge error err_SVM, and the scaled cross-entropy error err_SCE as functions of ys; both err_SVM and err_SCE upper-bound the 0/1 error]
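A quick numeric check of this upper-bound relationship, my own sketch rather than lecture code:

```python
import numpy as np

ys = np.linspace(-3.0, 3.0, 13)            # the "label times score" axis
err_01 = (ys <= 0).astype(float)           # 0/1 error: [[ys <= 0]]
err_svm = np.maximum(1.0 - ys, 0.0)        # hinge error
err_sce = np.log2(1.0 + np.exp(-ys))       # scaled cross-entropy error

# Both convex measures upper-bound the 0/1 error everywhere:
assert np.all(err_svm >= err_01) and np.all(err_sce >= err_01)
```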

Now, for binary classification, can logistic regression and SVM be combined?

We know SVM's advantage: the kernel trick simplifies the computation. Logistic regression, on the other hand, has benefits of its own, such as producing probability estimates.

[Slide: naive ideas for soft binary classification with SVM, and their drawbacks]

Here we apply Platt's scaling (https://en.wikipedia.org/wiki/Platt_scaling), which is known to be a good way to obtain probability estimates for the binary problem. We first run SVM (with its kernel transform) to get w and b, and then use a second optimization step, essentially one-dimensional logistic regression on the SVM scores, to find the best A and B: g(x) = θ(A · (w^T Φ(x) + b) + B).

[Slide: probabilistic SVM via Platt scaling, with A and B fitted by logistic regression on the SVM scores]
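To make this concrete, here is a minimal sketch using scikit-learn, my own illustration with made-up toy data, not code from the lecture:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Toy data: two Gaussian blobs (made-up example data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

# Step 1: kernel SVM gives the raw score w^T Phi(x) + b.
svm = SVC(kernel="rbf", C=1.0).fit(X, y)
scores = svm.decision_function(X).reshape(-1, 1)

# Step 2 (Platt scaling): fit A and B with one-dimensional
# logistic regression on the SVM scores. In practice this is
# better done on held-out data to avoid optimistic bias.
platt = LogisticRegression().fit(scores, y)

# g(x) = theta(A * (w^T Phi(x) + b) + B)
probs = platt.predict_proba(scores)[:, 1]
print(probs[:3])
```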

In conclusion, the chain of requirements looks like this:

we want to use a KERNEL -> we need w^T z (to package into the KERNEL) -> we need w to be a linear combination of the z_n

[Slide: the key to kernelization: the optimal w must be a linear combination of the z_n]

 

The optimal w can be represented by the z_n:

$\mathbf{w}_* = \sum_{n=1}^{N} \beta_n \mathbf{z}_n$

Decompose w = w∥ + w⊥, where w∥ lies in the span of the z_n and w⊥ is orthogonal to that span. The error term depends on w only through w^T z_n = w∥^T z_n, while w^T w = w∥^T w∥ + w⊥^T w⊥, so any nonzero w⊥ only increases the objective; at the optimum w⊥ = 0 and w lies entirely in the span of the z_n.

It can be proved this way that any L2-regularized linear model can be kernelized.
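Written out in the lecture's notation, the standard argument is:

```latex
\begin{aligned}
&\text{Goal: } \min_{\mathbf{w}} \;
  \frac{\lambda}{N}\,\mathbf{w}^{T}\mathbf{w}
  + \frac{1}{N}\sum_{n=1}^{N}\operatorname{err}\bigl(y_n,\ \mathbf{w}^{T}\mathbf{z}_n\bigr). \\
&\text{Write } \mathbf{w} = \mathbf{w}_{\parallel} + \mathbf{w}_{\perp},\quad
  \mathbf{w}_{\parallel} \in \operatorname{span}\{\mathbf{z}_n\},\quad
  \mathbf{w}_{\perp} \perp \operatorname{span}\{\mathbf{z}_n\}. \\
&\text{Then } \mathbf{w}^{T}\mathbf{z}_n = \mathbf{w}_{\parallel}^{T}\mathbf{z}_n
  \quad\text{and}\quad
  \mathbf{w}^{T}\mathbf{w}
    = \mathbf{w}_{\parallel}^{T}\mathbf{w}_{\parallel}
    + \mathbf{w}_{\perp}^{T}\mathbf{w}_{\perp}
    \ \ge\ \mathbf{w}_{\parallel}^{T}\mathbf{w}_{\parallel}, \\
&\text{so the optimum has } \mathbf{w}_{\perp} = \mathbf{0}
  \ \Longrightarrow\ \mathbf{w}_{*} = \textstyle\sum_{n=1}^{N}\beta_n\mathbf{z}_n .
\end{aligned}
```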

So here we obtain a new representation called Kernel Logistic Regression (KLR): solving logistic regression in terms of the β variables.

There are a few things we should pay attention to:

1. The dimension of this problem is the number of samples N: there is one variable β_n per training example.

2. Each β_n can be seen as describing the relationship between x_n and the other points in the X space.

3. The β_n are generally all nonzero (unlike the sparse α_n of SVM), which means a larger computational cost than solving for a good w directly; see the code sketch below.

$\min_{\boldsymbol{\beta}}\ \frac{\lambda}{N}\sum_{n=1}^{N}\sum_{m=1}^{N}\beta_n\beta_m K(\mathbf{x}_n,\mathbf{x}_m) + \frac{1}{N}\sum_{n=1}^{N}\log\Big(1+\exp\big(-y_n \sum_{m=1}^{N}\beta_m K(\mathbf{x}_m,\mathbf{x}_n)\big)\Big)$
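To make the β-space formulation concrete, here is a minimal sketch of KLR by plain gradient descent, my own illustration with an RBF kernel and made-up hyperparameters, not code from the lecture:

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def train_klr(X, y, lam=0.1, gamma=1.0, lr=0.5, steps=2000):
    """Minimize  lam/N * beta^T K beta
                 + 1/N * sum_n log(1 + exp(-y_n (K beta)_n))  over beta."""
    N = len(y)
    K = rbf_kernel(X, X, gamma)
    beta = np.zeros(N)
    for _ in range(steps):
        s = K @ beta                        # scores sum_m beta_m K(x_m, x_n)
        sig = 1.0 / (1.0 + np.exp(y * s))   # theta(-y_n * s_n)
        grad = (2 * lam / N) * (K @ beta) - (K @ (y * sig)) / N
        beta -= lr * grad
    return beta

def predict_proba(beta, X_train, X_new, gamma=1.0):
    # g(x) = theta(sum_n beta_n K(x_n, x))
    s = rbf_kernel(X_new, X_train, gamma) @ beta
    return 1.0 / (1.0 + np.exp(-s))

# Usage on toy data: note all beta_n come out nonzero (point 3 above).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (20, 2)), rng.normal(1, 0.5, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])
beta = train_klr(X, y)
print(np.count_nonzero(beta), "of", len(beta), "coefficients are nonzero")
```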

 


Original source: http://www.cnblogs.com/windniu/p/4759951.html
