Source Code Analysis of the Eigenfaces Face Recognition Algorithm in OpenCV

Date: 2016-07-07 19:40:58


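1 Theoretical Background

Understanding the Eigenfaces recognition algorithm requires a few pieces of theory, which are summarized below.

1.1 Covariance Matrix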

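First, recall the formulas for the mean, the standard deviation, and the variance of a sample set X_1, …, X_n:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2$$

As the formulas show, the mean describes the average of the sample set, while the standard deviation describes the typical distance between the sample points and the mean. Take national income as an example: the mean reflects the average income, while the standard deviation (or variance) reflects the gap between rich and poor. If two countries have the same mean income, the one with the larger standard deviation has the more unevenly distributed income. These formulas all describe one-dimensional data. Extending the variance formula to two dimensions gives the covariance formula:

$$\mathrm{cov}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)$$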

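Covariance indicates how two random variables are related: a positive value means they are positively correlated, a negative value means they are negatively correlated, and zero means they are uncorrelated. As a simple example, suppose a person is described by four dimensions: height, weight, distance from the roof, and time spent drawing each day. Let the height samples be X=[1 2 3 4 5 6 7 8 9], the weight samples Y=[11 12 13 14 15 16 17 18 19], the distance-from-roof samples Z=[9 8 7 6 5 4 3 2 1], and the daily drawing time L=[1 1 1 1 1 1 1 1 1]. Then cov(X,Y)=7.5, cov(X,Z)=-7.5, and cov(X,L)=0: X and Y have positive covariance and are positively correlated, X and Z have negative covariance and are negatively correlated, and X and L have zero covariance and are uncorrelated. In this example each random variable represents one dimension, and we computed the covariance between some pairs of dimensions. Computing the covariance between all pairs of dimensions and arranging the results as a matrix gives the covariance matrix: C_nxn=[c_i,j]=[cov(Dim_i,Dim_j)], i,j=1,2,…,n, where Dim_i is the vector of samples of the i-th dimension. Following the MATLAB convention and taking X, Y, Z, L as dimensions 1, 2, 3, 4, we get c_1,1=7.5, c_1,2=7.5, c_1,3=-7.5, c_1,4=0, and so on, so the covariance matrix is:

$$C = \begin{bmatrix} 7.5 & 7.5 & -7.5 & 0 \\ 7.5 & 7.5 & -7.5 & 0 \\ -7.5 & -7.5 & 7.5 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$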

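In MATLAB, if each row of a matrix is treated as one sample of the four random variables and each column as one dimension, the covariance matrix of the four dimensions can be obtained directly with the cov function (for the vectors above, cov([X' Y' Z' L'])).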
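The covariance example above can be checked with a few lines of plain C++. This is only a sketch: the function names covariance and covarianceMatrix are made up for illustration, and the 1/(n-1) normalisation is assumed because it reproduces the value 7.5 quoted in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sample covariance of two equally long sample vectors, using the
// 1/(n-1) normalisation that reproduces cov(X,Y)=7.5 from the text.
double covariance(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() > 1);
    const std::size_t n = x.size();
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= static_cast<double>(n);
    meanY /= static_cast<double>(n);

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += (x[i] - meanX) * (y[i] - meanY);
    }
    return sum / static_cast<double>(n - 1);
}

// Covariance matrix of several dimensions: dims[k] holds every sample of
// dimension k, and entry (i, j) of the result is cov(dims[i], dims[j]).
std::vector<std::vector<double>> covarianceMatrix(
        const std::vector<std::vector<double>>& dims) {
    const std::size_t d = dims.size();
    std::vector<std::vector<double>> c(d, std::vector<double>(d, 0.0));
    for (std::size_t i = 0; i < d; ++i) {
        for (std::size_t j = 0; j < d; ++j) {
            c[i][j] = covariance(dims[i], dims[j]);
        }
    }
    return c;
}
```

Calling covarianceMatrix({X, Y, Z, L}) with the four sample vectors from the text yields a 4x4 matrix whose last row and column are zero, since L never varies.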

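1.2 Jacobi Iteration for the Eigenvalues and Eigenvectors of a Symmetric Matrix

The basic idea of the Jacobi method is to transform a symmetric matrix A into a diagonal matrix through a sequence of plane rotations (orthogonal similarity transformations), which yields the eigenvalues and eigenvectors of A. From linear algebra we know that if A is a real symmetric matrix, there exists an orthogonal matrix U such that U^T*A*U=D, where D is a diagonal matrix whose diagonal entries λ_i are the eigenvalues of A, and the i-th column of U is an eigenvector of A for the eigenvalue λ_i. Finding the eigenvalues of a symmetric matrix A therefore reduces to finding an orthogonal matrix U that makes U^T*A*U diagonal. The difficulty lies in constructing U, so we first look at rotations in the plane: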

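$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix}, \qquad P = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Then:

$$P^T A P = \begin{bmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{bmatrix}$$

where:

$$\begin{aligned} b_{11} &= a_{11}\cos^2\theta + a_{22}\sin^2\theta + a_{12}\sin 2\theta \\ b_{22} &= a_{11}\sin^2\theta + a_{22}\cos^2\theta - a_{12}\sin 2\theta \\ b_{12} &= a_{12}\cos 2\theta + \tfrac{1}{2}(a_{22}-a_{11})\sin 2\theta \end{aligned}$$

Choosing θ so that tan 2θ = 2a_12/(a_11-a_22) (θ = π/4 when a_11 = a_22) makes b_12 = 0, so P^T*A*P is diagonal.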

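The derivation above in fact gives a way to construct an orthogonal matrix P such that P^T*A*P is diagonal. The method can be extended to n×n symmetric matrices. To do so, we first introduce the n-th order plane rotation matrix (Givens matrix):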

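$$U_{pq}(\theta) = [u_{ij}], \qquad u_{pp} = u_{qq} = \cos\theta, \quad u_{pq} = -\sin\theta, \quad u_{qp} = \sin\theta, \quad u_{ii} = 1 \ (i \neq p, q), \quad u_{ij} = 0 \ \text{otherwise}$$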

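A plane rotation matrix has the following properties:

(1) U_pq is an orthogonal matrix, i.e. U_pq^T*U_pq=E

(2) If A is symmetric, then B=U_pq^T*A*U_pq is also symmetric, and B has the same eigenvalues as A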

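Each Jacobi iteration applies the transformation in (2) once, where p and q are the row and column indices of the off-diagonal element with the largest absolute value in the matrix A produced by the previous iteration. The elements after the transformation are given by: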

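$$\begin{aligned} b_{pp} &= a_{pp}\cos^2\theta + a_{qq}\sin^2\theta + a_{pq}\sin 2\theta \\ b_{qq} &= a_{pp}\sin^2\theta + a_{qq}\cos^2\theta - a_{pq}\sin 2\theta \\ b_{pq} = b_{qp} &= a_{pq}\cos 2\theta + \tfrac{1}{2}(a_{qq}-a_{pp})\sin 2\theta \\ b_{pi} = b_{ip} &= a_{pi}\cos\theta + a_{qi}\sin\theta \quad (i \neq p, q) \\ b_{qi} = b_{iq} &= -a_{pi}\sin\theta + a_{qi}\cos\theta \quad (i \neq p, q) \\ b_{ij} &= a_{ij} \quad (i, j \neq p, q) \end{aligned}$$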

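The formulas show that the transformed matrix differs from the original only in rows and columns p and q. The rotation angle is computed as in the 2-dimensional case and is chosen so that a_pq and a_qp become zero. Each iteration therefore zeroes the off-diagonal element with the largest absolute value. Elements zeroed earlier may become non-zero again, but the sum of squares of the off-diagonal elements decreases at every step, so the iteration as a whole drives the off-diagonal elements toward zero, at which point the diagonal elements are the eigenvalues λ_i of the original symmetric matrix. To obtain the eigenvectors, let U_i be the rotation matrix of the i-th iteration and multiply the identity matrix I on the right by each U_i in turn. When the iteration ends this gives the matrix D=I*U_1*U_2…U_k, where k is the number of iterations (here D denotes the accumulated rotation, not the diagonal matrix D used earlier). The columns of D are the eigenvectors corresponding to the eigenvalues λ_i, which can be shown as follows:

$$D^T A D = U_k^T \cdots U_1^T \, A \, U_1 \cdots U_k = \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n) \;\Rightarrow\; A D = D \Lambda \;\Rightarrow\; A d_i = \lambda_i d_i$$

In this derivation d_i is the column vector formed by the i-th column of D. From the last equality and the definition of an eigenvalue, λ_i is an eigenvalue of A and d_i is the corresponding eigenvector.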
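The iteration described above can be sketched in plain C++ as follows. This is an illustrative sketch, not the OpenCV implementation: the names EigenResult and jacobiEigen, the rotation limit, and the tolerance are assumptions, and the rotation uses the sign convention u_pp = u_qq = cos θ, u_pq = -sin θ, u_qp = sin θ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct EigenResult {
    std::vector<double> values;  // eigenvalues (diagonal of the final matrix)
    Matrix vectors;              // column i is the eigenvector for values[i]
};

// Classical Jacobi iteration for a real symmetric matrix (hypothetical helper).
EigenResult jacobiEigen(Matrix a, int maxRotations = 100, double eps = 1e-12) {
    const std::size_t n = a.size();
    for (const std::vector<double>& row : a) {
        assert(row.size() == n);
    }
    Matrix v(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) v[i][i] = 1.0;  // start from the identity

    for (int rotation = 0; rotation < maxRotations; ++rotation) {
        // Locate the off-diagonal element with the largest absolute value.
        std::size_t p = 0, q = 0;
        double maxAbs = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = i + 1; j < n; ++j) {
                if (std::fabs(a[i][j]) > maxAbs) {
                    maxAbs = std::fabs(a[i][j]);
                    p = i;
                    q = j;
                }
            }
        }
        if (maxAbs < eps) break;  // the matrix is numerically diagonal

        // tan(2*theta) = 2*a_pq / (a_pp - a_qq) makes the new a_pq zero.
        const double theta = 0.5 * std::atan2(2.0 * a[p][q], a[p][p] - a[q][q]);
        const double c = std::cos(theta), s = std::sin(theta);

        for (std::size_t k = 0; k < n; ++k) {  // A <- A * U (columns p and q)
            const double akp = a[k][p], akq = a[k][q];
            a[k][p] = c * akp + s * akq;
            a[k][q] = -s * akp + c * akq;
        }
        for (std::size_t k = 0; k < n; ++k) {  // A <- U^T * A (rows p and q)
            const double apk = a[p][k], aqk = a[q][k];
            a[p][k] = c * apk + s * aqk;
            a[q][k] = -s * apk + c * aqk;
        }
        for (std::size_t k = 0; k < n; ++k) {  // V <- V * U (accumulate rotations)
            const double vkp = v[k][p], vkq = v[k][q];
            v[k][p] = c * vkp + s * vkq;
            v[k][q] = -s * vkp + c * vkq;
        }
    }

    EigenResult result;
    result.vectors = v;
    for (std::size_t i = 0; i < n; ++i) result.values.push_back(a[i][i]);
    return result;
}
```

For the matrix [[2, 1], [1, 2]] a single rotation with θ = π/4 is enough, and the sketch returns the eigenvalues 3 and 1 (up to rounding). The eigenvalues are not sorted; a real implementation would typically sort them in descending order together with their vectors.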

2 OpenCV Source Code Walkthrough

2.1 Key Functions

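(1) void reduce(InputArray src, OutputArray dst, int dim, int rtype, int dtype=-1)

Its English doc comment: transforms 2D matrix to 1D row or column vector by taking sum, minimum, maximum or mean value over all the rows.

This comment is not very precise. What the function actually does is reduce a 2D matrix to a 1D row or column vector. When reducing to a row vector, each element is the sum, minimum, maximum, or mean of the corresponding column of the source matrix; when reducing to a column vector, each element is the sum, minimum, maximum, or mean of the corresponding row. The parameters are:

src: the source matrix

dst: the destination vector

dim: whether the result is a row vector or a column vector; 0 reduces the source matrix to a single row, any other value reduces it to a single column

rtype: the reduction operation, one of CV_REDUCE_SUM, CV_REDUCE_MAX, CV_REDUCE_MIN, CV_REDUCE_AVG

dtype: the type of the destination vector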
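To make the meaning of dim concrete, here is a plain C++ model of the averaging case only. It is a sketch of the behaviour described above, not OpenCV code, and the name reduceAvg is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// dim == 0: collapse all rows into a single row (one mean per column).
// dim != 0: collapse all columns into a single column (one mean per row).
std::vector<double> reduceAvg(const Matrix& m, int dim) {
    assert(!m.empty() && !m[0].empty());
    const std::size_t rows = m.size(), cols = m[0].size();
    std::vector<double> out(dim == 0 ? cols : rows, 0.0);
    for (std::size_t r = 0; r < rows; ++r) {
        assert(m[r].size() == cols);
        for (std::size_t c = 0; c < cols; ++c) {
            out[dim == 0 ? c : r] += m[r][c];
        }
    }
    // Divide each accumulated sum by the number of values that went into it.
    const double count = static_cast<double>(dim == 0 ? rows : cols);
    for (double& value : out) value /= count;
    return out;
}
```

With one flattened face per row, the dim == 0 case is exactly the per-pixel mean face that the Eigenfaces method subtracts from every sample.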

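(2) void gemm(InputArray src1, InputArray src2, double alpha, InputArray src3, double gamma, OutputArray dst, int flags=0)

Its English doc comment: implements generalized matrix product algorithm GEMM from BLAS.

What the function does: performs generalized matrix multiplication. Only the last parameter is explained here:

flags: one of GEMM_1_T, GEMM_2_T, GEMM_3_T, or a combination of them. For example, GEMM_1_T transposes src1 before the multiplication. The function as a whole can be described by the following formula:

dst=alpha*op(src1)*op(src2)+gamma*op(src3), where flags determines whether op(X) is X or X^T.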

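(3) void mulTransposed( InputArray src, OutputArray dst, bool aTa, InputArray delta=noArray(), double scale=1, int dtype=-1 )

Its English doc comment: multiplies matrix by its transposition from the left or from the right.

What the function does: multiplies a matrix by its own transpose, from the left or from the right. The parameters are:

src: the source matrix

dst: the destination matrix

aTa: the multiplication order; true computes A^T*A, false computes A*A^T

delta: an array that is subtracted from src before the multiplication

scale: a factor by which the product is scaled

dtype: the type of the destination matrix

When aTa is true, the function can be described by the formula dst=(src-delta)^T*(src-delta)*scale. Internally this function calls function (2).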
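The formula can be modelled in plain C++ as below. This is a sketch rather than the OpenCV implementation: the name mulTransposedSketch is hypothetical, and delta is assumed to be either empty or a single row that is subtracted from every row of src.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// aTa == true:  dst = scale * (src - delta)^T * (src - delta)
// aTa == false: dst = scale * (src - delta) * (src - delta)^T
// Assumption for this sketch: delta is empty (nothing is subtracted) or a
// single row that is subtracted from every row of src.
Matrix mulTransposedSketch(const Matrix& src, bool aTa,
                           const std::vector<double>& delta = {},
                           double scale = 1.0) {
    assert(!src.empty() && !src[0].empty());
    const std::size_t rows = src.size(), cols = src[0].size();
    assert(delta.empty() || delta.size() == cols);

    Matrix centered = src;
    if (!delta.empty()) {
        for (std::size_t r = 0; r < rows; ++r) {
            for (std::size_t c = 0; c < cols; ++c) {
                centered[r][c] -= delta[c];
            }
        }
    }

    const std::size_t n = aTa ? cols : rows;      // side length of the result
    const std::size_t inner = aTa ? rows : cols;  // summation length
    Matrix dst(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t k = 0; k < inner; ++k) {
                const double left = aTa ? centered[k][i] : centered[i][k];
                const double right = aTa ? centered[k][j] : centered[j][k];
                dst[i][j] += left * right;
            }
            dst[i][j] *= scale;
        }
    }
    return dst;
}
```

Passing the per-column mean as delta and a suitable scale turns this product into a covariance matrix, which is how it ties in with the covariance computation discussed in this article.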

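(4) void calcCovarMatrix( InputArray samples, OutputArray covar, OutputArray mean, int flags, int ctype=CV_64F)

Its English doc comment: computes covariation matrix of a set of samples

What the function does: computes the covariance matrix of the row vectors or column vectors of a matrix. It calls function (3) to do the actual work.

(5) bool eigen(InputArray src, OutputArray eigenvalues, OutputArray eigenvectors, int lowindex=-1, int highindex=-1)

Its English doc comment: finds eigenvalues and eigenvectors of a symmetric matrix

What the function does: computes the eigenvalues and eigenvectors of a symmetric matrix, using the Jacobi method internally.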

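2.2 Main Procedure

The idea of Eigenfaces is to transform faces from pixel space into another space and to compute similarity in that space. The transformation Eigenfaces uses is PCA, the well-known principal component analysis. The method uses PCA to obtain the principal components of the face distribution: it computes the eigenvalues of the covariance matrix of all face images in the training set, and the eigenvectors corresponding to those eigenvalues are the so-called "eigenfaces". Each eigenvector describes one kind of variation or feature of the faces, so every face can be expressed as a linear combination of these eigenfaces. The procedure below uses the AT&T face database (40 people with 10 images of different expressions each, 400 face images in total, each with a resolution of 92x112). Taking 399 of the faces as the sample set and the last one as the face to be recognized, Eigenfaces recognition proceeds as follows: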

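(1) Flatten the image data of each face in the training set into a single row and stack the rows into one large matrix A. A then has size 399x10304, i.e. 399 rows and 10304 columns.

(2) Sum the 399 faces dimension by dimension and divide by 399 to obtain the mean vector Mean_1x10304, then subtract the mean vector from every row of A to obtain the difference matrix B.

(3) Compute the matrix C=B*B^T, which has size 399x399, then compute its eigenvalues λ_i and eigenvectors e_i, 0<=i<399.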

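(4) The matrix in the previous step is not actually the covariance matrix of the face samples. Each face sample has 10304 dimensions, and a covariance matrix describes the correlation between dimensions, so the true covariance matrix of the face samples is C'=B^T*B (of size 10304x10304, up to a constant scale factor). If e_i is an eigenvector of C with eigenvalue λ_i, it can be shown that λ_i is also an eigenvalue of C', with eigenvector v_i=B^T*e_i (a column vector with 10304 rows). The proof is:

C*e_i=λ_i*e_i => B*B^T*e_i=λ_i*e_i => B^T*B*B^T*e_i=λ_i*B^T*e_i => C'*v_i=λ_i*v_i

The eigenvectors v_i are the "eigenfaces". Together they form the eigenvector matrix V_10304x399. For any face row vector α, multiplying it by V gives the projection of α onto each eigenvector, i.e. each element of α*V is the projection of α onto the corresponding eigenface. For recognition, first compute the projection of the face to be recognized onto the eigenfaces, then compare it with the projection of every sample face. The sample with the smallest distance (that is, the highest similarity) is the best match.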
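The identity proved in step (4) can be checked numerically on a tiny matrix. The sketch below uses plain C++ with hypothetical helper names (liftEigenvector, applyLargeMatrix); it assumes B is already mean-subtracted and that e is a known eigenvector of B*B^T.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Maps an eigenvector e of the small matrix B*B^T to v = B^T*e, an
// eigenvector of the large matrix B^T*B, and scales v to unit length.
std::vector<double> liftEigenvector(const Matrix& B, const std::vector<double>& e) {
    assert(!B.empty() && B.size() == e.size());
    const std::size_t dims = B[0].size();
    std::vector<double> v(dims, 0.0);
    for (std::size_t s = 0; s < B.size(); ++s) {
        for (std::size_t d = 0; d < dims; ++d) {
            v[d] += B[s][d] * e[s];
        }
    }
    double norm = 0.0;
    for (double x : v) norm += x * x;
    norm = std::sqrt(norm);
    assert(norm > 0.0);  // fails for eigenvalue 0, where B^T*e is the zero vector
    for (double& x : v) x /= norm;
    return v;
}

// Computes (B^T*B)*v as B^T*(B*v), so the large matrix is never formed.
std::vector<double> applyLargeMatrix(const Matrix& B, const std::vector<double>& v) {
    assert(!B.empty() && B[0].size() == v.size());
    std::vector<double> bv(B.size(), 0.0);
    for (std::size_t s = 0; s < B.size(); ++s) {
        for (std::size_t d = 0; d < v.size(); ++d) {
            bv[s] += B[s][d] * v[d];
        }
    }
    std::vector<double> out(v.size(), 0.0);
    for (std::size_t s = 0; s < B.size(); ++s) {
        for (std::size_t d = 0; d < v.size(); ++d) {
            out[d] += B[s][d] * bv[s];
        }
    }
    return out;
}
```

For B = [[1, 2, 0], [2, 1, 0]] the small matrix B*B^T is [[5, 4], [4, 5]], with eigenvector (1, 1) and eigenvalue 9. Lifting that vector and applying B^T*B returns 9 times the lifted vector, as the proof predicts. In the face example this replaces an eigen-decomposition of a 10304x10304 matrix by one of a 399x399 matrix.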

2.3 Core Source Code

The code is taken from OpenCV 2.4.9.

 1 void Eigenfaces::train(InputArrayOfArrays _src, InputArray _local_labels) {
 2     if(_src.total() == 0) {
 3         string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
 4         CV_Error(CV_StsBadArg, error_message);
 5     } else if(_local_labels.getMat().type() != CV_32SC1) {
 6         string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _local_labels.type());
 7         CV_Error(CV_StsBadArg, error_message);
 8     }
 9     // make sure data has correct size
10     if(_src.total() > 1) {
11         for(int i = 1; i < static_cast<int>(_src.total()); i++) {
12             if(_src.getMat(i-1).total() != _src.getMat(i).total()) {
13                 string error_message = format("In the Eigenfaces method all input samples (training images) must be of equal size! Expected %d pixels, but was %d pixels.", _src.getMat(i-1).total(), _src.getMat(i).total());
14                 CV_Error(CV_StsUnsupportedFormat, error_message);
15             }
16         }
17     }
18     // get labels
19     Mat labels = _local_labels.getMat();
20     // observations in row
21     Mat data = asRowMatrix(_src, CV_64FC1);
22 
23     // number of samples
24    int n = data.rows;
25     // assert there are as much samples as labels
26     if(static_cast<int>(labels.total()) != n) {
27         string error_message = format("The number of samples (src) must equal the number of labels (labels)! len(src)=%d, len(labels)=%d.", n, labels.total());
28         CV_Error(CV_StsBadArg, error_message);
29     }
30     // clear existing model data
31     _labels.release();
32     _projections.clear();
33     // clip number of components to be valid
34     if((_num_components <= 0) || (_num_components > n))
35         _num_components = n;
36 
37     // perform the PCA
38     PCA pca(data, Mat(), CV_PCA_DATA_AS_ROW, _num_components);
39     // copy the PCA results
40     _mean = pca.mean.reshape(1,1); // store the mean vector
41     _eigenvalues = pca.eigenvalues.clone(); // eigenvalues by row
42     transpose(pca.eigenvectors, _eigenvectors); // eigenvectors by column
43     // store labels for prediction
44     _labels = labels.clone();
45     // save projections
46     for(int sampleIdx = 0; sampleIdx < data.rows; sampleIdx++) {
47         Mat p = subspaceProject(_eigenvectors, _mean, data.row(sampleIdx));
48         _projections.push_back(p);
49     }
50 }

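Face sample training procedure

The PCA class on line 38 implements the core functionality, namely computing the covariance matrix of the sample matrix and the eigenvectors of that covariance matrix. On line 47, _mean is the mean face vector; that line subtracts the mean vector from each face vector and computes the projection of the result onto the set of eigenfaces.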
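The projection on line 47 amounts to y = (x - mean) * W, where the columns of W are the eigenfaces. A plain C++ model of that computation for a single sample might look as follows; the name subspaceProjectSketch is hypothetical and the code does not use OpenCV types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Projects one sample x onto the columns of W after subtracting the mean:
// y = (x - mean) * W. W has one row per pixel and one column per eigenface.
std::vector<double> subspaceProjectSketch(const Matrix& W,
                                          const std::vector<double>& mean,
                                          const std::vector<double>& x) {
    assert(!W.empty() && W.size() == x.size() && mean.size() == x.size());
    const std::size_t components = W[0].size();
    std::vector<double> y(components, 0.0);
    for (std::size_t d = 0; d < x.size(); ++d) {
        assert(W[d].size() == components);
        const double centered = x[d] - mean[d];
        for (std::size_t k = 0; k < components; ++k) {
            y[k] += centered * W[d][k];
        }
    }
    return y;
}
```

Each training face is reduced to such a short coefficient vector once, during training, so prediction only has to compare coefficient vectors instead of full images.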

 1 PCA& PCA::operator()(InputArray _data, InputArray __mean, int flags, int maxComponents)
 2 {
 3     Mat data = _data.getMat(), _mean = __mean.getMat();
 4     int covar_flags = CV_COVAR_SCALE;
 5     int i, len, in_count;
 6     Size mean_sz;
 7 
 8     CV_Assert( data.channels() == 1 );
 9     if( flags & CV_PCA_DATA_AS_COL )
10     {
11         len = data.rows;
12         in_count = data.cols;
13         covar_flags |= CV_COVAR_COLS;
14         mean_sz = Size(1, len);
15     }
16     else
17     {
18         len = data.cols;
19         in_count = data.rows;
20         covar_flags |= CV_COVAR_ROWS;
21         mean_sz = Size(len, 1);
22     }
23 
24     int count = std::min(len, in_count), out_count = count;
25     if( maxComponents > 0 )
26         out_count = std::min(count, maxComponents);
27 
28     // "scrambled" way to compute PCA (when cols(A)>rows(A)):
29     // B = A'A; B*x=b*x; C = AA'; C*y=c*y -> AA'*y=c*y -> A'A*(A'*y)=c*(A'*y) -> c = b, x=A'*y
30     if( len <= in_count )
31         covar_flags |= CV_COVAR_NORMAL;
32 
33     int ctype = std::max(CV_32F, data.depth());
34     mean.create( mean_sz, ctype );
35 
36     Mat covar( count, count, ctype );
37 
38     if( _mean.data )
39     {
40         CV_Assert( _mean.size() == mean_sz );
41         _mean.convertTo(mean, ctype);
42         covar_flags |= CV_COVAR_USE_AVG;
43     }
44 
45     calcCovarMatrix( data, covar, mean, covar_flags, ctype );
46     eigen( covar, eigenvalues, eigenvectors );
47 
48     if( !(covar_flags & CV_COVAR_NORMAL) )
49     {
50         // CV_PCA_DATA_AS_ROW: cols(A)>rows(A). x=A'*y -> x'=y'*A
51         // CV_PCA_DATA_AS_COL: rows(A)>cols(A). x=A''*y -> x'=y'*A'
52         Mat tmp_data, tmp_mean = repeat(mean, data.rows/mean.rows, data.cols/mean.cols);
53         if( data.type() != ctype || tmp_mean.data == mean.data )
54         {
55             data.convertTo( tmp_data, ctype );
56             subtract( tmp_data, tmp_mean, tmp_data );
57         }
58         else
59         {
60             subtract( data, tmp_mean, tmp_mean );
61             tmp_data = tmp_mean;
62         }
63 
64         Mat evects1(count, len, ctype);
65         gemm( eigenvectors, tmp_data, 1, Mat(), 0, evects1,
66             (flags & CV_PCA_DATA_AS_COL) ? CV_GEMM_B_T : 0);
67         eigenvectors = evects1;
68 
69         // normalize eigenvectors
70         for( i = 0; i < out_count; i++ )
71         {
72             Mat vec = eigenvectors.row(i);
73             normalize(vec, vec);
74         }
75     }
76 
77     if( count > out_count )
78     {
79         // use clone() to physically copy the data and thus deallocate the original matrices
80         eigenvalues = eigenvalues.rowRange(0,out_count).clone();
81         eigenvectors = eigenvectors.rowRange(0,out_count).clone();
82     }
83     return *this;
84 }

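Core code of the PCA class

Line 45 computes the covariance matrix of the sample matrix, and line 46 computes the eigenvalues and eigenvectors of that covariance matrix. In the face example the number of dimensions exceeds the number of samples (10304 > 399), so CV_COVAR_NORMAL is not set on lines 30~31 and line 45 computes the small 399x399 matrix from step (3) of section 2.2. Lines 48~75 then map its eigenvectors back to the 10304-dimensional space, as in step (4), and normalize them.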

 1 void Eigenfaces::predict(InputArray _src, int &minClass, double &minDist) const {
 2     // get data
 3     Mat src = _src.getMat();
 4     // make sure the user is passing correct data
 5     if(_projections.empty()) {
 6         // throw error if no data (or simply return -1?)
 7         string error_message = "This Eigenfaces model is not computed yet. Did you call Eigenfaces::train?";
 8         CV_Error(CV_StsError, error_message);
 9     } else if(_eigenvectors.rows != static_cast<int>(src.total())) {
10         // check data alignment just for clearer exception messages
11         string error_message = format("Wrong input image size. Reason: Training and Test images must be of equal size! Expected an image with %d elements, but got %d.", _eigenvectors.rows, src.total());
12         CV_Error(CV_StsBadArg, error_message);
13     }
14     // project into PCA subspace
15     Mat q = subspaceProject(_eigenvectors, _mean, src.reshape(1,1));
16     minDist = DBL_MAX;
17     minClass = -1;
18     for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
19         double dist = norm(_projections[sampleIdx], q, NORM_L2);
20         if((dist < minDist) && (dist < _threshold)) {
21             minDist = dist;
22             minClass = _labels.at<int>((int)sampleIdx);
23         }
24     }
25 }

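Core face recognition code

Line 15 computes X, the projection onto the set of eigenfaces of the face to be recognized minus the mean face vector. Line 19 computes the Euclidean distance between X and the projection of each face sample (this distance serves as the face similarity measure), and lines 20~23 take the sample with the smallest distance below the threshold as the recognition result.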
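The matching loop on lines 16~24 is a nearest-neighbour search with a distance threshold. The following plain C++ sketch models it without OpenCV types; the names Match and nearestSample are hypothetical, and the default threshold is assumed to be the largest double so that every distance passes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Match {
    int label;        // -1 when no sample is closer than the threshold
    double distance;  // Euclidean distance to the best sample
};

// Returns the label of the stored projection closest to query, considering
// only samples whose distance is below threshold.
Match nearestSample(const std::vector<std::vector<double>>& projections,
                    const std::vector<int>& labels,
                    const std::vector<double>& query,
                    double threshold = std::numeric_limits<double>::max()) {
    assert(projections.size() == labels.size());
    Match best{-1, std::numeric_limits<double>::max()};
    for (std::size_t i = 0; i < projections.size(); ++i) {
        assert(projections[i].size() == query.size());
        double squared = 0.0;
        for (std::size_t k = 0; k < query.size(); ++k) {
            const double diff = projections[i][k] - query[k];
            squared += diff * diff;
        }
        const double dist = std::sqrt(squared);
        if (dist < best.distance && dist < threshold) {
            best = Match{labels[i], dist};
        }
    }
    return best;
}
```

With a threshold of 0.0 no distance can qualify, so the label stays at -1. This matches the behaviour of the original predict code, where minClass keeps its initial value of -1 if no sample passes the threshold test.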

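3 Example Code

Finally, here is example code for Eigenfaces recognition. It again uses the AT&T face database; see the previous post for the download link.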

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/contrib/contrib.hpp"

#define CV_VERSION_ID       CVAUX_STR(CV_MAJOR_VERSION) CVAUX_STR(CV_MINOR_VERSION) CVAUX_STR(CV_SUBMINOR_VERSION)

#ifdef _DEBUG
#define cvLIB(name) "opencv_" name CV_VERSION_ID "d"
#else
#define cvLIB(name) "opencv_" name CV_VERSION_ID
#endif

#pragma comment( lib, cvLIB("core") )
#pragma comment( lib, cvLIB("imgproc") )
#pragma comment( lib, cvLIB("highgui") )
#pragma comment( lib, cvLIB("flann") )
#pragma comment( lib, cvLIB("features2d") )
#pragma comment( lib, cvLIB("calib3d") )
#pragma comment( lib, cvLIB("gpu") )
#pragma comment( lib, cvLIB("legacy") )
#pragma comment( lib, cvLIB("ml") )
#pragma comment( lib, cvLIB("objdetect") )
#pragma comment( lib, cvLIB("ts") )
#pragma comment( lib, cvLIB("video") )
#pragma comment( lib, cvLIB("contrib") )
#pragma comment( lib, cvLIB("nonfree") )

#include <cstdlib>
#include <iostream>
#include <fstream>
#include <sstream>

using namespace cv;
using namespace std;

static Mat toGrayscale(InputArray _src) {
    Mat src = _src.getMat();
    // only allow one channel
    if(src.channels() != 1) {
        CV_Error(CV_StsBadArg, "Only Matrices with one channel are supported");
    }
    // create and return normalized image
    Mat dst;
    cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
    return dst;
}

static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(CV_StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}

int main(int argc, const char *argv[]) {
    // Check for valid command line arguments, print usage
    // if no arguments were given.
    if (argc != 2) {
        cout << "usage: " << argv[0] << " <csv.ext>" << endl;
        exit(1);
    }

    // Get the path to your CSV.
    string fn_csv = string(argv[1]);
    // These vectors hold the images and corresponding labels.
    vector<Mat> images;
    vector<int> labels;
    // Read in the data. This can fail if no valid
    // input filename is given.
    try {
        read_csv(fn_csv, images, labels);
    } catch (cv::Exception& e) {
        cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    // Quit if there are not enough images for this demo.
    if(images.size() <= 1) {
        string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(CV_StsError, error_message);
    }
    // Get the height from the first image. We'll need this
    // later in code to reshape the images to their original
    // size:
    int height = images[0].rows;
    // The following lines simply get the last images from
    // your dataset and remove it from the vector. This is
    // done, so that the training data (which we learn the
    // cv::FaceRecognizer on) and the test data we test
    // the model with, do not overlap.
    Mat testSample = images[images.size() - 1];
    int testLabel = labels[labels.size() - 1];
    images.pop_back();
    labels.pop_back();
    // The following lines create an Eigenfaces model for
    // face recognition and train it with the images and
    // labels read from the given CSV file.
    // This here is a full PCA, if you just want to keep
    // 10 principal components (read Eigenfaces), then call
    // the factory method like this:
    //
    //      cv::createEigenFaceRecognizer(10);
    //
    // If you want to create a FaceRecognizer with a
    // confidence threshold, call it with:
    //
    //      cv::createEigenFaceRecognizer(10, 123.0);
    //
    Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
    model->train(images, labels);
    // The following line predicts the label of a given
    // test image:
    int predictedLabel = model->predict(testSample);
    //
    // To get the confidence of a prediction call the model with:
    //
    //      int predictedLabel = -1;
    //      double confidence = 0.0;
    //      model->predict(testSample, predictedLabel, confidence);
    //
    string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
    cout << result_message << endl;
    // Sometimes you'll need to get/set internal model data,
    // which isn't exposed by the public cv::FaceRecognizer.
    // Since each cv::FaceRecognizer is derived from a
    // cv::Algorithm, you can query the data.
    //
    // First we'll use it to set the threshold of the FaceRecognizer
    // to 0.0 without retraining the model. This can be useful if
    // you are evaluating the model:
    //
    model->set("threshold", 0.0);
    // Now the threshold of this model is set to 0.0. A prediction
    // now returns -1, as it's impossible to have a distance below
    // it
    predictedLabel = model->predict(testSample);
    cout << "Predicted class = " << predictedLabel << endl;
    // Here is how to get the eigenvalues of this Eigenfaces model:
    Mat eigenvalues = model->getMat("eigenvalues");
    // And we can do the same to display the Eigenvectors (read Eigenfaces):
    Mat W = model->getMat("eigenvectors");
    // From this we will display the (at most) first 10 Eigenfaces:
    for (int i = 0; i < min(10, W.cols); i++) {
        string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
        cout << msg << endl;
        // get eigenvector #i
        Mat ev = W.col(i).clone();
        // Reshape to original size & normalize to [0...255] for imshow.
        Mat grayscale = toGrayscale(ev.reshape(1, height));
        // Show the image & apply a Jet colormap for better sensing.
        Mat cgrayscale;
        applyColorMap(grayscale, cgrayscale, COLORMAP_JET);
        imshow(format("%d", i), cgrayscale);
    }
    waitKey(0);

    return 0;
}