I had barely touched any code before; a while ago a friend mentioned caffe. I wanted to see how caffe is used, but I was too much of a beginner to manage it... Since I had never really studied this area, I decided to start from the basics. The code in this post comes from the UFLDL tutorial.
MATLAB code, corresponding to the UFLDL Tutorial. The code calls minFunc for the optimization, so you may want to read the second part first and then come back here.
minFunc: an unconstrained optimizer using a line search strategy. (Note: although it is not stated here, the solver only handles unconstrained problems; the notes below treat the convex case.)
The function finds a minimum using descent methods; see Section 3 of the notes on convex optimization and machine learning, though Boyd's Convex Optimization is clearly the better choice if time allows.
Inputs: funObj, x0, options, varargin
funObj — provides the cost function and its gradient
x0 — the initial point for the iteration
options — passes the solver parameters
varargin — extra arguments needed by funObj
Outputs: x, f, exitflag, output
x — the result of the iteration, i.e. the minimizer
f — the value of the cost function at the minimum
exitflag — the status on exit
output — information about the run
Input parameters (options)
DerivativeCheck — numerically checks the gradient returned by funObj before optimizing.
verbose & verboseI & debug & doPlot — controlled via DISPLAY; they determine how much information is printed during the run.
method — controlled via METHOD; specifies which descent method is used. The values of LS_init, LS_type, LS_interp, LS_multi, Fref, Damped, HessianIter and c2 depend on the chosen method, and some of them can also be specified directly.
Other parameters, including maxFunEvals, maxIter, optTol, progTol, ..., can be left at their defaults or set explicitly (a minimal calling sketch follows).
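As a rough illustration of how these pieces fit together, here is a minimal sketch of a minFunc call on a toy quadratic objective. The objective and all values are made up for illustration; only the option names (Method, MaxIter, MaxFunEvals, Display) follow minFunc's conventions, and the minFunc directory is assumed to be on the path.

addpath(genpath('minFunc'));                     % assumes minFunc is on this path

% toy objective: f(x) = 0.5*||x - b||^2, gradient g = x - b
b = [1; 2; 3];
funObj = @(x) deal(0.5*sum((x - b).^2), x - b);  % returns [f, g]

options = struct();
options.Method      = 'sd';      % steepest descent; 'lbfgs', 'newton', ... are also possible
options.MaxIter     = 200;
options.MaxFunEvals = 400;
options.Display     = 'iter';

x0 = zeros(3,1);
[x, f, exitflag, output] = minFunc(funObj, x0, options);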
Solving convex problems: descent methods
See Section 3 of the notes on convex optimization and machine learning; again, Boyd's Convex Optimization is the better choice if time permits.
Processing flow
(Steepest descent is used as the example since it is the simplest; frankly, it is also the only one I know well enough to explain...)
Variable meanings
x: the current point
d: the descent direction
t: the step size
1. Preprocessing.
2. Since SD < NEWTON, the Hessian matrix does not need to be computed. funObj is called to compute f (the cost function value) and g (the gradient at x).
3. Loop until the iteration limit is reached, or the step size / change in the cost function falls below the tolerance.
For steepest descent the descent direction is the negative gradient, so d = -g.
A line search strategy is then used to find a suitable step size; by default, backtracking line search is used.
Initialize t = 1 and call ArmijoBacktrack to compute it.
4. Check and validate the result. (A minimal sketch of this loop is given below.)
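To make the loop above concrete, here is a minimal hand-written sketch of steepest descent with backtracking. This illustrates the idea only and is not minFunc's actual code; the function name and tolerances are made up.

function [x, f] = steepest_descent_sketch(funObj, x0, maxIter, optTol)
  x = x0;
  c1 = 1e-4;                               % Armijo constant (alpha)
  for iter = 1:maxIter
      [f, g] = funObj(x);                  % cost and gradient at the current point
      if norm(g) < optTol
          break;                           % gradient small enough: stop
      end
      d = -g;                              % steepest descent direction
      t = 1;                               % initial step size
      while funObj(x + t*d) > f + c1*t*(g'*d)
          t = t/2;                         % backtrack until the Armijo condition holds
      end
      x = x + t*d;
  end
end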
ArmijoBacktrack
A standard backtracking line search. The meaning of the parameters can be found in the comments in the file.
In this program, c1 plays the role of alpha (default 1e-4), and each loop iteration halves the step size, t = t/2.
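For reference, the acceptance test being backtracked against is the Armijo (sufficient decrease) condition, which in the notation above reads

  f(x + t*d) <= f(x) + c1 * t * g'*d

so a step is accepted only if the cost decreases by at least a fraction c1 of what the linear model at x predicts.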
Note: it is best to put the following lines at the top of a MATLAB script (not a function) to clear the command window, clear the workspace variables, and close all figures.
clc;
clear all;
close all;
Linear Regression
ex1a_linreg.m line 47
theta = minFunc(@linear_regression, theta, options, train.X, train.y);
Here options is a struct that sets minFunc's parameters, train.X and train.y are the training data, and linear_regression is the function we write ourselves; the theta passed in is the initial value, and the returned theta holds the learned linear-regression weights. In other words, minFunc solves

  min over theta of (1/2) * sum_j (theta' * x(j) - y(j))^2,

the least-squares objective implemented below.
linear_regression (comments omitted)
function [f,g] = linear_regression(theta, X, y)
  f = 0;
  g = zeros(size(theta));
  yEst = theta'*X;                  % predictions, 1 x m
  f = (y - yEst)*(y - yEst)'/2;     % half the sum of squared errors (the 1/2 makes g below exact)
  g = X*(yEst - y)';                % gradient
end
During its computation minFunc needs the cost function value and the gradient; the code above provides exactly that.
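Since it is easy to get a gradient wrong, a quick finite-difference check is worth doing before handing the function to minFunc. The sketch below was written for this post (the data are random, generated only for the check); the UFLDL starter code also ships its own checker.

% compare the analytic gradient with central finite differences on random data
n = 5; m = 20;
X = randn(n, m); y = randn(1, m);
theta = randn(n, 1);

[f, g] = linear_regression(theta, X, y);    % analytic gradient
delta = 1e-6;
gNum = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta)); e(i) = delta;
    gNum(i) = (linear_regression(theta + e, X, y) - linear_regression(theta - e, X, y)) / (2*delta);
end
disp(norm(g - gNum) / norm(g + gNum));      % should be tiny, e.g. below 1e-8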
Logistic Regression
function [f,g] = logistic_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A column vector containing the parameter values to optimize.
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The label for each example. y(j) is the j'th example's label.
  %
  m = size(X,2);

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

  %%% YOUR CODE HERE %%%
  h = 1./(1 + exp(-theta'*X));          % sigmoid hypothesis, 1 x m
  f = -y*log(h') + (y-1)*log(1-h');     % negative log-likelihood
  g = X*(h - y)';                       % gradient
end
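For context, this function is used the same way as in the linear-regression exercise: it is handed to minFunc, and the learned theta is then used to classify. The lines below mirror the pattern from ex1a_linreg.m; the exact script lines from ex1b are not quoted here, so treat this as an assumed sketch.

theta = minFunc(@logistic_regression, theta, options, train.X, train.y);

% classify by thresholding the sigmoid at 0.5, i.e. theta'*X at 0
predicted = double(theta'*train.X > 0);
accuracy = mean(predicted == train.y);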
Softmax Regression (for reference)
function [f,g] = softmax_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %       In minFunc, theta is reshaped to a long vector. So we need to
  %       resize it to an n-by-(num_classes-1) matrix.
  %       Recall that we assume theta(:,num_classes) = 0.
  %
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The label for each example. y(j) is the j'th example's label.
  %
  m = size(X,2);
  n = size(X,1);

  % theta is a vector; need to reshape to n x num_classes.
  theta = reshape(theta, n, []);
  num_classes = size(theta,2) + 1;

  % initialize objective value and gradient.
  %
  % TODO: Compute the softmax objective function and gradient using vectorized code.
  %       Store the objective function value in 'f', and the gradient in 'g'.
  %       Before returning g, make sure you form it back into a vector with g=g(:);
  %
  %%% YOUR CODE HERE %%%
  f = 0;
  g = zeros(size(theta));

  a = theta'*X;                                   % scores, (num_classes-1) x m
  a = [a; zeros(1,size(a,2))];                    % append the fixed last class (theta = 0)
  a = exp(a);
  aSum = sum(a);

  compareMatrix = 1:10;                           % class indices (hard-coded for 10 classes)
  compareMatrix = repmat(compareMatrix', 1, m);

  h = log(a./repmat(aSum, num_classes, 1));       % log class probabilities

  judMatrix = abs(compareMatrix - repmat(y, num_classes, 1));
  A = judMatrix;
  A(judMatrix > 0) = 0;
  A(judMatrix == 0) = 1;                          % indicator: A(k,j) = 1 iff y(j) == k

  B = A*h';
  f = -sum(diag(B));                              % negative log-likelihood

  g = -X*(A - a./repmat(aSum, num_classes, 1))';
  g = g(:,1:9);                                   % drop the fixed last class (hard-coded 9 = num_classes-1)
  g = g(:);                                       % make gradient a vector for minFunc
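Once theta is learned, predictions can be read off by taking the class with the largest score, remembering that the last class has its parameters fixed at zero. This is a brief assumed sketch, not quoted from the exercise script:

theta = reshape(theta, size(train.X,1), []);           % back to n x (num_classes-1)
scores = [theta'*train.X; zeros(1, size(train.X,2))];  % append the fixed last class
[~, predicted] = max(scores, [], 1);                   % predicted class per example
accuracy = mean(predicted == train.y);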
PCA Whitening (for this one, just follow the tutorial)
%%================================================================
%% Step 0a: Load data
%  Here we provide the code to load natural image data into x.
%  x will be a 784 * 60000 matrix, where the kth column x(:, k) corresponds to
%  the raw image data of the kth 28x28 MNIST image.
%  You do not need to change the code below.

clear all; close all; clc;
x = loadMNISTImages('train-images-idx3-ubyte');
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.

%%% YOUR CODE HERE %%%
xMeanRow = mean(x);
x = x - repmat(xMeanRow, size(x,1), 1);

%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.

%%% YOUR CODE HERE %%%
xCorr = x*x'/size(x,2);
[U, S, V] = svd(xCorr);
xRot = U'*x;

%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar.
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).

%%% YOUR CODE HERE %%%
covar = xRot*xRot'/size(xRot,2);

% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.

%%% YOUR CODE HERE %%%
var = sum(diag(covar));
varMin = 0.99*var;
varSum = 0;
k = 0;
A = diag(covar);
for i = 1:length(A)
    varSum = varSum + A(i);
    if (varSum >= varMin && k == 0)
        k = i;
    end
end

%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 784, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
%
%  Following the dimension reduction, invert the PCA transformation to produce
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.

%%% YOUR CODE HERE %%%
xRot = U'*x;
xTilde = U(:,1:k)'*x;
xHat = U*[xTilde; zeros(size(x,1)-k, size(x,2))];

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.

figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite.

epsilon = 1e-1;
%%% YOUR CODE HERE %%%
xPCAWhite = diag(1./sqrt(diag(S) + epsilon)) * xRot;
covar = xPCAWhite*xPCAWhite'/size(xPCAWhite,2);

%% Step 4b: Check your implementation of PCA whitening
%  Check your implementation of PCA whitening with and without regularisation.
%  PCA whitening without regularisation results in a covariance matrix
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar.
%
%  Without regularisation (set epsilon to 0 or close to 0),
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.

%%% YOUR CODE HERE %%%

% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite.
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.

%%% YOUR CODE HERE %%%
xZCAWhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;

% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));
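One relation worth noting (not spelled out in the script): since xRot = U'*x, the ZCA-whitened data is simply the PCA-whitened data rotated back into the original basis, i.e.

xZCAWhite = U * xPCAWhite;   % equivalent to U * diag(1./sqrt(diag(S) + epsilon)) * U' * x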
That's all for now; I'll add notes on the rest of the code here after I've read and implemented it...
Original post: http://www.cnblogs.com/sea-wind/p/4042939.html