Logistic Regression: Cost Function

Date: 2017-11-04 00:26:29


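1. Binary classification

Sample: $(x, y)$; the training set contains $m$ samples.
Here $x \in \mathbb{R}^{n_x}$, meaning each sample $x$ has $n_x$ features;
$y \in \{0, 1\}$, so the target value belongs to one of two classes, 0 or 1.
Training data: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$

Shape of the sample data when it is fed into the neural network:

$X = [x^{(1)}, x^{(2)}, \ldots, x^{(m)}] \in \mathbb{R}^{n_x \times m}$

Shape of the target data:

$Y = [y^{(1)}, y^{(2)}, \ldots, y^{(m)}] \in \mathbb{R}^{1 \times m}$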
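As a small illustration of these shapes, here is a plain-Python sketch. It assumes the convention in which each sample is one column, so the input has $n_x$ rows and $m$ columns and the targets form a single row of length $m$. The toy values and the names `samples`, `labels`, `X`, and `Y` are made up for this example; in practice an array library would normally be used.

```python
# Toy data: m = 3 samples, each with n_x = 2 features (values are made up).
samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
labels = [0, 1, 1]

n_x = len(samples[0])
m = len(samples)

# Stack the samples as columns: X has n_x rows and m columns.
X = [[samples[j][i] for j in range(m)] for i in range(n_x)]
# The targets form a single row of length m.
Y = [labels]

print(len(X), len(X[0]))  # rows and columns of X
print(len(Y), len(Y[0]))  # rows and columns of Y
```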

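2. Logistic regression

In logistic regression, the predicted value is

$\hat{y} = P(y = 1 \mid x)$

that is, the probability that the label is 1, so it lies in $[0, 1]$. Introducing the sigmoid function, the prediction becomes

$\hat{y} = \sigma(w^T x + b)$

where

$\sigma(z) = \frac{1}{1 + e^{-z}}$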
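The prediction step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `sigmoid` and `predict` and the toy parameter values are assumptions made here, and the two-branch form of `sigmoid` is just one common way to avoid overflow for large negative inputs.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # safe: z < 0, so exp(z) < 1
    return ez / (1.0 + ez)

def predict(w, b, x):
    """Return y_hat = sigmoid(w^T x + b) for a single sample x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

print(sigmoid(0.0))  # 0.5
# Toy parameters (made up): z = 0.5*2.0 - 0.25*4.0 + 0.1 = 0.1
print(predict([0.5, -0.25], 0.1, [2.0, 4.0]))
```

The output is always between 0 and 1, which is what allows it to be read as a probability.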

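Note: the first derivative of the sigmoid can be written in terms of the function itself,

$\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$

This helps explain the vanishing gradient problem. The derivative is largest at $z = 0$, but even that maximum is only $\sigma'(0) = 0.25$, and it decays toward 0 as $|z|$ grows. Wherever the gradient-descent update contains $\sigma'(z)$ as a factor, the update shrinks as $\sigma'(z)$ gets smaller; each iteration moves the parameters by a smaller step, eventually close to 0, which produces the vanishing gradient phenomenon.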
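To make the size of the derivative concrete, the following sketch evaluates $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ at a few points. The function names and the chosen values of `z` are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid, expressed through the sigmoid itself."""
    s = sigmoid(z)
    return s * (1.0 - s)

# The derivative peaks at z = 0 and becomes tiny once |z| is large.
for z in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(z, sigmoid_prime(z))
```

The value at `z = 0.0` is exactly 0.25, and the values at `z = -10.0` and `z = 10.0` are several orders of magnitude smaller, which is the saturation behaviour behind vanishing gradients.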

3. Logistic regression loss function

Loss function

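As a rule of thumb, squared error is a common choice of loss function:

$L(\hat{y}, y) = \frac{1}{2}(\hat{y} - y)^2$

For logistic regression, however, squared error is generally not used as the loss function. Combined with the sigmoid, it is generally non-convex in the parameters, so gradient descent can end up in a local optimum rather than the global optimum. A convex function is chosen instead.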
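The convexity claim can be checked numerically on a one-parameter toy problem. The sketch below assumes a single sample with $x = 1$, $y = 1$ and no bias term, so the loss depends on one weight `w` only; it estimates the second derivative with a central finite difference. The setup and the names `squared_error`, `cross_entropy`, and `second_difference` are assumptions made for illustration, and the cross-entropy expression used for comparison is the standard logistic regression loss for $y = 1$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def squared_error(w):
    # Single toy sample x = 1, y = 1, no bias: y_hat = sigmoid(w).
    return 0.5 * (sigmoid(w) - 1.0) ** 2

def cross_entropy(w):
    # Same toy sample with the cross-entropy loss: -log(y_hat).
    return -math.log(sigmoid(w))

def second_difference(f, w, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at w."""
    return (f(w + h) - 2.0 * f(w) + f(w - h)) / (h * h)

for w in (-3.0, 1.0):
    print(w, second_difference(squared_error, w), second_difference(cross_entropy, w))
```

For the squared error the estimated curvature is negative at `w = -3.0` and positive at `w = 1.0`, so the function is not convex in `w`; for the cross-entropy loss the curvature is positive at both points.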

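The logistic regression loss function:

$L(\hat{y}, y) = -\left(y \log \hat{y} + (1 - y) \log(1 - \hat{y})\right)$

When $y = 1$, $L(\hat{y}, y) = -\log \hat{y}$. The closer $\hat{y}$ is to 1, the closer $L(\hat{y}, y)$ is to 0, meaning a better prediction; the closer $\hat{y}$ is to 0, the more $L(\hat{y}, y)$ grows toward $+\infty$, meaning a worse prediction.
When $y = 0$, $L(\hat{y}, y) = -\log(1 - \hat{y})$. The closer $\hat{y}$ is to 0, the closer $L(\hat{y}, y)$ is to 0, meaning a better prediction; the closer $\hat{y}$ is to 1, the more $L(\hat{y}, y)$ grows toward $+\infty$, meaning a worse prediction.
Our goal is to minimize the loss; note that the loss function is defined for a single sample.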
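The two cases can be tried out directly. This is a minimal sketch of the cross-entropy loss for one sample; the function name `loss` and the sample predictions are illustrative assumptions, and the code assumes `0 < y_hat < 1` so that both logarithms are defined.

```python
import math

def loss(y_hat, y):
    """Cross-entropy loss for one sample; y is 0 or 1 and 0 < y_hat < 1."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))

# Columns: prediction, loss if the true label is 1, loss if the true label is 0.
for y_hat in (0.99, 0.5, 0.01):
    print(y_hat, loss(y_hat, 1), loss(y_hat, 0))
```

A confident correct prediction gives a loss near 0, while a confident wrong prediction gives a large loss; at `y_hat = 0.5` the loss is $\log 2$ for either label.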

Cost function

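The cost function of the training set is the average of the loss function over all training samples.

$J(w, b) = \frac{1}{m}\sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)}) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)} \log \hat{y}^{(i)} + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)})\right]$

The cost function is a function of the unknown coefficients $w$ and $b$.
Our goal is to iteratively find the values of $w$ and $b$ that minimize the cost function, bringing it as close to 0 as possible.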
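A rough sketch of this iterative minimization is given below. It is not taken from the original notes: the gradient expressions $\partial J / \partial w_j = \frac{1}{m}\sum_i (\hat{y}^{(i)} - y^{(i)})\,x_j^{(i)}$ and $\partial J / \partial b = \frac{1}{m}\sum_i (\hat{y}^{(i)} - y^{(i)})$ are the standard result for the cross-entropy cost, and the toy data, the learning rate `lr`, the iteration count, and the function names are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, xs, ys):
    """J(w, b): mean cross-entropy loss over all m samples."""
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        total += -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))
    return total / len(xs)

def gradient_step(w, b, xs, ys, lr):
    """One batch gradient-descent update of (w, b)."""
    m = len(xs)
    dw = [0.0] * len(w)
    db = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        for j, xj in enumerate(x):
            dw[j] += err * xj / m
        db += err / m
    return [wj - lr * dj for wj, dj in zip(w, dw)], b - lr * db

# Toy one-feature data set (made up): small x -> class 0, large x -> class 1.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0, 0, 1, 1]

w, b = [0.0], 0.0
print(cost(w, b, xs, ys))  # log(2): every prediction starts at 0.5
for _ in range(200):
    w, b = gradient_step(w, b, xs, ys, lr=0.5)
print(cost(w, b, xs, ys))  # lower than the starting cost
```

Starting from `w = [0.0]`, `b = 0.0`, every prediction is 0.5 and the cost equals $\log 2$; after the updates the cost is lower, which is the behaviour the text describes.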

################################################################################################################################################

Logistic Regression: Cost Function

To train the parameters $w$ and $b$, we need to define a cost function. Recap:

$\hat{y}^{(i)} = \sigma(w^T x^{(i)} + b)$, where $\sigma(z^{(i)}) = \frac{1}{1 + e^{-z^{(i)}}}$ and $x^{(i)}$ is the $i$-th training example.

Given $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, we want $\hat{y}^{(i)} \approx y^{(i)}$.

Loss (error) function:

The loss function measures the discrepancy between the prediction ($\hat{y}^{(i)}$) and the desired output ($y^{(i)}$). In other words, the loss function computes the error for a single training example.

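$L(\hat{y}^{(i)}, y^{(i)}) = \frac{1}{2}(\hat{y}^{(i)} - y^{(i)})^2$

$L(\hat{y}^{(i)}, y^{(i)}) = -\left(y^{(i)} \log \hat{y}^{(i)} + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)})\right)$

If $y^{(i)} = 1$: $L(\hat{y}^{(i)}, y^{(i)}) = -\log \hat{y}^{(i)}$, so $\log \hat{y}^{(i)}$ should be close to 0, that is, $\hat{y}^{(i)}$ should be close to 1.
If $y^{(i)} = 0$: $L(\hat{y}^{(i)}, y^{(i)}) = -\log(1 - \hat{y}^{(i)})$, so $\log(1 - \hat{y}^{(i)})$ should be close to 0, that is, $\hat{y}^{(i)}$ should be close to 0.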

Cost function

The cost function is the average of the loss function over the entire training set. We are going to find the parameters $w$ and $b$ that minimize the overall cost function.

$J(w, b) = \frac{1}{m}\sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)}) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)} \log \hat{y}^{(i)} + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)})\right]$

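Notes:

1) The purpose of defining the cost function is to train the parameters $w$ and $b$ of the logistic regression model.

2) The loss function is defined on a single training example, whereas the cost function is defined over the entire training set.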


Original article: http://www.cnblogs.com/Bella2017/p/7780586.html
