Conventional methods rely on gradient-based features such as HOG, EOH, and edgelets.
Human occlusions
Complex backgrounds
Real-time processing, which is hard to achieve because the detection window is raster-scanned while its scale is varied
Fix the detection window size at 64×128 pixels so that it can be divided into 8×16 cells (each cell is 8×8 pixels).
Varying the size of the rectangular region from 1×1 to 8×8 cell units yields 492 rectangular regions.
Calculate an RDSF for every pair of the 492 rectangular regions, giving 492 × 491 / 2 = 120,786 features.
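The counts above can be checked with a short sketch. It assumes the regions are squares of 1 to 8 cells per side, slid one cell at a time over the 8×16 grid, which reproduces the 492 figure. The similarity measure is also an assumption: these notes do not say how two regions are compared, so the sketch uses the Bhattacharyya coefficient between normalized depth histograms. The names `enumerate_regions`, `depth_histogram`, and `rdsf`, and the `bins` and `max_depth` parameters, are hypothetical.

```python
import math
import numpy as np

CELLS_X, CELLS_Y = 8, 16  # 64x128 window split into 8x8-pixel cells


def enumerate_regions(cells_x=CELLS_X, cells_y=CELLS_Y, max_size=8):
    """Return (x, y, size) in cell units for every square region in the window."""
    regions = []
    for size in range(1, max_size + 1):
        for y in range(cells_y - size + 1):
            for x in range(cells_x - size + 1):
                regions.append((x, y, size))
    return regions


def depth_histogram(depth, region, cell_px=8, bins=16, max_depth=8.0):
    """Normalized depth histogram of one region (bins/max_depth are assumed values)."""
    x, y, size = region
    patch = depth[y * cell_px:(y + size) * cell_px,
                  x * cell_px:(x + size) * cell_px]
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, max_depth))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)


def rdsf(p, q):
    """Assumed similarity: Bhattacharyya coefficient, 1.0 for identical histograms."""
    return float(np.sum(np.sqrt(p * q)))


regions = enumerate_regions()
n_pairs = math.comb(len(regions), 2)
print(len(regions), n_pairs)  # 492 120786
```

Two regions lying on the same surface give a value near 1, while regions at clearly different depths give a value near 0.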
Perform a raster scan of the detection window in 3D space
Compute the RDSFs from the detection windows
Judge whether there are occlusions in the calculated features
Use Real AdaBoost to classify each detection window as human or non-human.
Conventional human detection methods repeat the raster scan while varying the scale of the detection window, so many of the windows do not match the dimensions of humans.
With depth information, a window of fixed physical size can be placed at different depths to detect humans at different scales. The process is as follows:
A window at a given depth can be projected onto the 2D image using a projection matrix (equation 7 in the paper).
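A minimal sketch of the fixed-size 3D window idea. The paper's equation 7 is not reproduced in these notes, so a generic pinhole intrinsic matrix `K` stands in for it. The intrinsics, the 0.6 m × 1.8 m window size, the scan step, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical pinhole intrinsics (focal lengths and principal point in pixels).
K = np.array([[200.0, 0.0, 88.0],
              [0.0, 200.0, 72.0],
              [0.0, 0.0, 1.0]])


def project_window(center, width_m=0.6, height_m=1.8, K=K):
    """Project a fixed-size 3D window centred at (X, Y, Z) to an image rectangle."""
    X, Y, Z = center
    corners = np.array([[X - width_m / 2, Y - height_m / 2, Z],
                        [X + width_m / 2, Y + height_m / 2, Z]])
    uvw = corners @ K.T              # homogeneous image coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]    # perspective division by depth
    (u0, v0), (u1, v1) = uv
    return u0, v0, u1, v1


def raster_scan_3d(x_range, z_range, y=0.0, step=0.1):
    """Yield window centres on a regular grid in the X-Z plane (assumed scan order)."""
    for z in np.arange(z_range[0], z_range[1], step):
        for x in np.arange(x_range[0], x_range[1], step):
            yield (float(x), y, float(z))


near = project_window((0.0, 0.0, 2.0))
far = project_window((0.0, 0.0, 4.0))
print(near[2] - near[0], far[2] - far[0])  # window width in pixels at 2 m and 4 m
```

With these assumed intrinsics the same physical window is about 60 pixels wide at 2 m and about 30 pixels wide at 4 m, so the image-space scale follows from the depth and does not have to be searched.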
Use the Real AdaBoost algorithm to classify the extracted features.
AdaBoost combines a number of weak classifiers into a strong classifier.
H(x) is the final strong classifier, and h(x) is a weak classifier.
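A minimal sketch of Real AdaBoost in its standard domain-partitioning form, assuming each feature lies in [0, 1] and is quantized into bins. Each weak classifier h(x) is a lookup table of 0.5 · ln(W+ / W−) per bin, and H(x) is the sign of the sum of the selected weak classifiers. The bin count, number of rounds, and function names are hypothetical, and this is not necessarily the exact training setup used in the paper.

```python
import numpy as np


def bin_index(values, bins):
    """Quantize features in [0, 1] into integer bin indices."""
    return np.clip((np.asarray(values) * bins).astype(int), 0, bins - 1)


def train_real_adaboost(X, y, rounds=5, bins=10, eps=1e-6):
    """X: (n, d) features in [0, 1]; y: labels in {+1, -1}. Returns [(feature, table)]."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)          # sample weights
    model = []
    for _ in range(rounds):
        best = None
        for j in range(d):
            idx = bin_index(X[:, j], bins)
            w_pos = np.bincount(idx[y == 1], weights=w[y == 1], minlength=bins)
            w_neg = np.bincount(idx[y == -1], weights=w[y == -1], minlength=bins)
            z = 2.0 * np.sum(np.sqrt(w_pos * w_neg))   # smaller z = better separation
            if best is None or z < best[0]:
                table = 0.5 * np.log((w_pos + eps) / (w_neg + eps))
                best = (z, j, table)
        _, j, table = best
        model.append((j, table))
        # Re-weight samples: misclassified ones gain weight for the next round.
        w *= np.exp(-y * table[bin_index(X[:, j], bins)])
        w /= w.sum()
    return model


def predict(model, X):
    """H(x) = sign(sum of weak classifier outputs h_t(x))."""
    score = sum(table[bin_index(X[:, j], len(table))] for j, table in model)
    return np.where(score > 0, 1, -1)


X = np.array([[0.9, 0.5], [0.8, 0.5], [0.1, 0.5], [0.2, 0.5]])
y = np.array([1, 1, -1, -1])
model = train_real_adaboost(X, y)
print(predict(model, X))
```

In this toy data only the first feature separates the classes, so every round selects it and the second feature is ignored.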
Depth information is useful in cluttered scenes where several people overlap. The occlusion information is incorporated into the classifier in a simple way. The process is as follows:
Define occlusion: any object region that is closer to the camera than the detection window
Extraction of occlusion regions:
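One possible way to sketch the occlusion handling, following the definition above: a pixel counts as occluding when its depth is smaller than the depth of the detection window by some margin. The margin value, the per-region occlusion ratio, and the idea of down-weighting each weak classifier response by (1 − occlusion ratio) are assumptions made for illustration; these notes do not spell out the exact rule. All function names are hypothetical.

```python
import numpy as np


def occlusion_mask(depth_patch, window_depth, margin=0.3):
    """True where a pixel is closer to the camera than the window (assumed margin in metres).

    Assumes invalid depth readings have already been filtered out.
    """
    return depth_patch < (window_depth - margin)


def occlusion_ratio(mask, region, cell_px=8):
    """Fraction of occluded pixels inside one (x, y, size) region given in cell units."""
    x, y, size = region
    sub = mask[y * cell_px:(y + size) * cell_px,
               x * cell_px:(x + size) * cell_px]
    return float(sub.mean())


def occlusion_aware_score(weak_outputs, occlusion_ratios):
    """Sum weak classifier outputs, each scaled by (1 - occlusion ratio)."""
    h = np.asarray(weak_outputs, dtype=float)
    r = np.asarray(occlusion_ratios, dtype=float)
    return float(np.sum(h * (1.0 - r)))


# A 64x128 window at 3 m whose top-left cell is covered by an object at 1 m.
patch = np.full((128, 64), 3.0)
patch[:8, :8] = 1.0
mask = occlusion_mask(patch, window_depth=3.0)
print(occlusion_ratio(mask, (0, 0, 1)), occlusion_ratio(mask, (0, 0, 2)))
```

A weak classifier whose region is fully occluded then contributes nothing to the final score, while unoccluded regions keep their full vote.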
Mean-shift clustering merges the detection windows that respond to the same object into a single window. In image space, detection windows can be merged erroneously when the humans in them overlap. In 3D space the overlapping humans are separated by depth, so this problem is easily avoided.
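A small sketch of why clustering in 3D helps, using a plain Gaussian-kernel mean shift over window centres. The bandwidth, iteration count, merge tolerance, and the sample coordinates are hypothetical values chosen for illustration, not parameters from the paper.

```python
import numpy as np


def mean_shift(points, bandwidth=0.5, iters=30, merge_tol=0.1):
    """Shift every point to a density mode, then merge modes closer than merge_tol."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(iters):
        for i in range(len(modes)):
            d2 = np.sum((points - modes[i]) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * bandwidth ** 2))   # Gaussian kernel weights
            modes[i] = (w[:, None] * points).sum(axis=0) / w.sum()
    merged = []
    for m in modes:
        if not any(np.linalg.norm(m - c) < merge_tol for c in merged):
            merged.append(m)
    return np.array(merged)


# Window centres (X, Y, Z) in metres for two people who overlap in the image:
# nearly the same X and Y, but one stands at about 2 m and the other at about 4 m.
windows_3d = np.array([[0.00, 0.0, 2.00], [0.05, 0.0, 2.05], [-0.05, 0.0, 1.95],
                       [0.10, 0.0, 4.00], [0.12, 0.0, 4.05], [0.08, 0.0, 3.95]])

print(len(mean_shift(windows_3d)))          # clusters found using depth
print(len(mean_shift(windows_3d[:, :2])))   # clusters found without depth
```

Dropping the depth coordinate collapses all six windows into one cluster, which is the erroneous integration described above; keeping it separates the two people.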
There are two experiments:
Comparison of three feature extraction methods:
Comparison of feature extraction methods with and without occlusion adjustment:
HOG without occlusion adjustment
HOG with occlusion adjustment
RDSF without occlusion adjustment
RDSF with occlusion adjustment
Real-Time Human Detection using Relational Depth Similarity Features