码迷,mamicode.com
首页 > 编程语言 > 详细

Softmax Regression及Python代码

时间:2018-03-04 23:56:03      阅读:265      评论:0      收藏:0      [点我收藏+]

标签:res   turn   ida   ict   coding   post   scale   des   参数   

  Softmax Regression是逻辑回归在多分类问题上的推广,主要用于处理多分类问题,其中任意两个类别之间都是线性可分的。

  假设有$k$个类别,每个类别的参数向量为${\theta}_j $,那么对于每个样本,其所属类别的概率为:

\[P({{y}_{i}}|X,{{\theta }_{j}})=\frac{{{e}^{{{\theta }_{j}}X}}}{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}}\]

  相比如逻辑回归的损失函数,Softmax的损失函数引入了指示函数,其损失函数为:

$J\left( \theta  \right)=-\frac{1}{m}\left[ \sum\limits_{i=1}^{m}{\sum\limits_{j=1}^{k}{I\left\{ {{y}_{i}}=j \right\}\log \frac{{{e}^{{{\theta }_{j}}X}}}{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}}}} \right]$

   该损失函数的意义是对每一个样本判断其属于哪个类别,并进行相应计算。对该损失函数,可以使用梯度下降法求解,梯度计算过程如下:

${{\nabla }_{{{\theta }_{j}}}}J(\theta )=-\frac{1}{m}\sum\limits_{i=1}^{m}{\left[ {{\nabla }_{{{\theta }_{j}}}}\sum\limits_{l=1}^{k}{I\{{{y}_{i}}=j\}\log \frac{{{e}^{{{\theta }_{j}}X}}}{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}}} \right]}$

$ =-\frac{1}{m}\sum\limits_{i=1}^{m}{[I\{{{y}_{i}}=j\}\frac{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}}{{{e}^{{{\theta }_{j}}X}}}\cdot \frac{{{e}^{{{\theta }_{j}}X}}\cdot X\cdot \sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}-{{e}^{{{\theta }_{j}}X}}\cdot {{e}^{{{\theta }_{j}}X}}\cdot X}{{{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}}^{2}}}]}$

$ =-\frac{1}{m}\sum\limits_{i=1}^{m}{I\{{{y}_{i}}=j\}\frac{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}-{{e}^{{{\theta }_{j}}X}}}{\sum\limits_{l=1}^{k}{{{e}^{{{\theta }_{l}}X}}}}}\cdot X $

$=-\frac{1}{m}\sum\limits_{i=1}^{m}{\left[ (I\{{{y}_{i}}=j\}-P({{y}_{i}}=j||X,{{\theta }_{j}}))\cdot X \right]} $

  对每个类别,分别求其${\theta}_j$的梯度并计算,Python代码如下:

# -*- coding: utf-8 -*-
"""
Created on Sun Jan 28 15:32:44 2018

@author: zhang
"""

import numpy as np
from sklearn.datasets import load_digits
from sklearn.cross_validation import train_test_split
from sklearn import preprocessing

def load_data():
     digits = load_digits()
     data = digits.data
     label = digits.target
     return np.mat(data), label

def gradient_descent(train_x, train_y, k, maxCycle, alpha):
# k 为类别数
     numSamples, numFeatures = np.shape(train_x)
     weights = np.mat(np.ones((numFeatures, k)))
     
     for i in range(maxCycle):
          value = np.exp(train_x * weights)  
          rowsum = value.sum(axis = 1)   # 横向求和
          rowsum = rowsum.repeat(k, axis = 1)  # 横向复制扩展
          err = - value / rowsum  #计算出每个样本属于每个类别的概率
          for j in range(numSamples):     
               err[j, train_y[j]] += 1
          weights = weights + (alpha / numSamples) * (train_x.T * err)
     return weights
    
     

def test_model(test_x, test_y, weights):
     results = test_x * weights
     predict_y = results.argmax(axis = 1)
     count = 0
     for i in range(np.shape(test_y)[0]):
          if predict_y[i,] == test_y[i,]:
               count += 1   
     return count / len(test_y), predict_y 


if __name__ == "__main__":

     data, label = load_data()
#     data = preprocessing.minmax_scale(data, axis = 0)
#    数据处理之后识别率降低了
     train_x, test_x, train_y, test_y = train_test_split(data, label, test_size = 0.25, random_state=33)
     k = len(np.unique(label))
     
     weights = gradient_descent(train_x, train_y, k, 800, 0.01)
     accuracy, predict_y = test_model(test_x, test_y, weights)
     print("Accuracy:", accuracy)

 

Softmax Regression及Python代码

标签:res   turn   ida   ict   coding   post   scale   des   参数   

原文地址:https://www.cnblogs.com/PowerTransfer/p/8506440.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!