Image Normalization for Cumulative Foot Pressure Images

Analysis of cumulative foot pressure images can be used, like other biometrics, for personal recognition. A cumulative foot pressure image is 2-D data that records the spatial and temporal changes of the ground reaction force during one gait cycle. This biometric faces challenges such as walking at different speeds. In this paper, we present a new approach based on image normalization, feature extraction by Gabor filters, and dimension reduction via a combination of Eigenfoot and LDA. Different classifiers, namely nearest neighbor (NN), nearest subspace (NS) and the sparse representation classifier (SRC), are used for recognition. We evaluate the proposed approach on the CASIA Gait-Footprint dataset, which contains the cumulative foot pressure images of 88 persons. The experimental results show that the proposed method achieves higher accuracy than other similar methods and is robust to changes in walking speed.

In recent years, biometrics has become one of the most attractive fields in digital image processing and pattern recognition. During the past decades, biometric technologies such as iris, face and fingerprint recognition have been used in various security applications. Among the biometrics introduced in recent years, the foot has become a popular subject of study. The ground reaction force, i.e., the pressure the foot exerts on the ground while walking, can be used like other biometrics for recognition. This type of biometric can be used in places where there are no cameras, such as prisons and residential entrances, as well as in shoe design software, medicine, sports, etc. Previous works have been based on footprints [1-5] (without socks and shoes), the ground reaction force [6], and cumulative foot pressure images. Using the footprint as a biometric is of limited value because it only contains a rough foot pressure distribution. Recently, footprints were used for recognition in [1-2], but the person must be barefoot, which restricts this method to places where people walk barefoot (without socks and shoes), such as pools and water parks. The ground reaction force is a 1-D signal that records only the temporal information of gait, so it is easily affected by noise. As mentioned above, the cumulative foot pressure image is the 2-D cumulative ground reaction force during one gait cycle. Unlike the ground reaction force and the footprint, the cumulative foot pressure image provides both the temporal and the spatial distribution information of the gait. Each person has a unique gait, so each person's cumulative foot pressure image is also unique. Zhang et al. fused gait and cumulative foot pressure images for recognition [8]. Fig. 1 shows some sample cumulative foot pressure images of two persons.

In this paper, we present a new approach based on image normalization, feature extraction by Gabor filters, dimension reduction via combining Eigenfoot and LDA. Different classifiers of nearest neighbor (NN), nearest subspace (NS) and sparse representation classifier (SRC) are also used for recognition. In the SRC, we use two different sparse recovery approaches of Basis Pursuit (BP) and smoothed L0 norm (SL0). Finally, we discuss the results of these classifiers.

The remainder of this paper is organized as follows. Section II presents the proposed method. In Section III, we give a brief mathematical explanation of the sparse representation classifier (SRC). Experiments and conclusions are discussed in Sections IV and V, respectively.

The block diagram of our proposed approach, consisting of feature extraction, dimension reduction and classification, is shown in Fig. 2. In the following subsections, we explain each component of the proposed method.

For these two blocks, we use the CASIA Gait-Footprint database [9]. To create this database, 88 subjects participated in the experiment. The participants were asked to walk without shoes along a straight line over a foot pressure measurement floor (wearing socks does not affect system performance). Each subject was first asked to walk over the pressure sensor 5 times at normal speed and then 5 times at fast speed. When a person walks over the foot pressure measurement floor, cumulative foot pressure images are acquired. There are therefore 10×88 = 880 cumulative foot pressure records, with three cumulative foot pressure images per record, giving a total of 2640 images for studying the effect of walking speed. As mentioned above, each walk leaves three cumulative foot pressure images; we only used the first left-foot pressure image in the experiments.

Rotation and translation between cumulative foot pressure images decrease recognition performance. Hence, the Y axis of each cumulative foot pressure image must be aligned with the Y axis of the image frame. Fig. 3 shows a sample cumulative foot pressure image with the Y axis of the foot, $Y_f$, the Y axis of the image frame, $Y_I$, and the angle $\theta$ between them. The foot should therefore be rotated by $\theta$ degrees to align $Y_f$ with $Y_I$; this makes the algorithm resistant to rotation [4]. Then the center of mass of the cumulative foot pressure image is calculated and moved to the center of the frame. These processes are performed automatically. Fig. 3 illustrates the procedure for automatic normalization of a sample cumulative foot pressure image. Step 0 shows the original image. The original images have small dimensions, so in some cases part of the image data could be lost during rotation (after normalization, part of the foot may fall outside the frame), and the images are not all the same size. Therefore, in step 1, we enlarge the image to 180×80 by zero-padding the image matrix. In step 2, after enlarging the image matrix, the grayscale image is converted to a binary image to separate the foot pressure region from the background, and the angle $\theta$ between $Y_f$ and $Y_I$ is calculated. In step 3, the cumulative foot pressure image is rotated by $\theta$ degrees.


In step 4, the resulting image is converted to a binary image to find the coordinates of the mass center of the cumulative foot pressure image. Finally, in step 5, the mass center of the cumulative foot pressure image (the result of step 3) is moved to the center of the frame. The normalization process makes our proposed method resistant to rotation and translation.
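As an illustration, the following sketch implements steps 1-5 with NumPy and SciPy. It is not the authors' original code: the function name, the moment-based estimate of $\theta$, and the binarization threshold are our own assumptions, and the sign convention of the rotation may need adjusting for a particular dataset.

```python
import numpy as np
from scipy import ndimage

def normalize_cfpi(img, out_h=180, out_w=80, thresh=0.1):
    """Normalize a cumulative foot pressure image (steps 1-5)."""
    # Step 1: zero-pad the image into a fixed 180x80 frame
    # (assumes the input fits inside the frame).
    frame = np.zeros((out_h, out_w), dtype=float)
    h, w = img.shape
    r0, c0 = (out_h - h) // 2, (out_w - w) // 2
    frame[r0:r0 + h, c0:c0 + w] = img

    # Step 2: binarize, then estimate the angle between the foot axis
    # and the frame's Y axis from second-order central moments.
    ys, xs = np.nonzero(frame > thresh * frame.max())
    yc, xc = ys.mean(), xs.mean()
    mu20 = ((xs - xc) ** 2).mean()
    mu02 = ((ys - yc) ** 2).mean()
    mu11 = ((xs - xc) * (ys - yc)).mean()
    theta = 0.5 * np.degrees(np.arctan2(2 * mu11, mu20 - mu02))

    # Step 3: rotate the foot by theta degrees to align it with Y.
    rotated = ndimage.rotate(frame, theta, reshape=False, order=1)

    # Steps 4-5: re-binarize, locate the mass center, and shift it
    # to the center of the frame.
    mask = rotated > thresh * rotated.max()
    cy, cx = ndimage.center_of_mass(mask)
    return ndimage.shift(rotated, (out_h / 2 - cy, out_w / 2 - cx), order=1)
```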

In this paper, Gabor filters are used to extract features. The Gabor filters (kernels) are defined as [10]:

$$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^2}{\sigma^2}\,\exp\!\left(-\frac{\|k_{\mu,\nu}\|^2\|z\|^2}{2\sigma^2}\right)\left[\exp\!\left(i\,k_{\mu,\nu}^{T} z\right) - \exp\!\left(-\frac{\sigma^2}{2}\right)\right] \qquad (1)$$

where $\mu$ and $\nu$ are the orientation and scale, respectively, $z = (x, y)$ is the pixel position, and the wave vector $k_{\mu,\nu}$ is defined as $k_{\mu,\nu} = k_\nu e^{i\phi_\mu}$ with $k_\nu = k_{\max}/f^{\nu}$ and $\phi_\mu = \pi\mu/8$. Here $k_{\max}$ is the maximum frequency, $f$ is the spacing factor between kernels in the frequency domain, and $\sigma$ represents the ratio of the Gaussian window width to the wavelength. In this experiment, we used five scales, $\nu \in \{0, \dots, 4\}$, and eight orientations, $\mu \in \{0, \dots, 7\}$. We also used the standard settings of [10], $k_{\max} = \pi/2$, $f = \sqrt{2}$ and $\sigma = 2\pi$.
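A minimal sketch of the corresponding filter bank is given below, assuming the standard parameter values above; the 31×31-pixel kernel size is our own assumption, as the original spatial support is not stated.

```python
import numpy as np

def gabor_kernel(mu, nu, kmax=np.pi / 2, f=np.sqrt(2), sigma=2 * np.pi, size=31):
    """Gabor kernel of orientation mu and scale nu, following Eq. (1)."""
    k = kmax / f ** nu                        # k_nu = k_max / f^nu
    phi = np.pi * mu / 8.0                    # phi_mu = pi * mu / 8
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    gauss = (k ** 2 / sigma ** 2) * np.exp(-k ** 2 * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    wave = np.exp(1j * k * (x * np.cos(phi) + y * np.sin(phi))) - np.exp(-sigma ** 2 / 2)
    return gauss * wave

# 5 scales x 8 orientations = a bank of 40 complex kernels; features are
# typically the magnitudes of the filter responses, downsampled and concatenated.
bank = [gabor_kernel(mu, nu) for nu in range(5) for mu in range(8)]
```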

In the proposed method, since the dimension of the feature vectors is high (7920), we reduce the dimension via a combination of Eigenfoot [11] and LDA.

First, we reduce the feature space dimension from 7920 to less than 440 via Eigenfoot, and then we apply LDA. In this way, the feature vector dimension is reduced from 7920 to 87 (the number of classes minus 1).
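This two-stage reduction can be sketched with scikit-learn as follows; the function name and the use of sklearn's PCA (standing in for the Eigenfoot stage) are our own assumptions about an equivalent implementation.

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def reduce_dim(X_train, y_train, X_test, n_pca=140, n_classes=88):
    """Eigenfoot (PCA) followed by LDA, as in the proposed method."""
    pca = PCA(n_components=n_pca)                 # 7920 -> n_pca
    Z_train = pca.fit_transform(X_train)
    Z_test = pca.transform(X_test)
    lda = LinearDiscriminantAnalysis(n_components=n_classes - 1)  # -> 87
    return lda.fit_transform(Z_train, y_train), lda.transform(Z_test)
```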

In this paper, to evaluate the performance of the proposed approach, we also use the raw normalized images as feature vectors. Fig. 5 shows some of the Eigenfoot feature vectors.

In this paper, we use three classifiers: nearest neighbor (NN) as the simplest classifier, nearest subspace (NS), and the sparse representation classifier (SRC) as one of the newest and most efficient classifiers. In the SRC, we use two different sparse recovery approaches, Basis Pursuit (BP) and smoothed L0 norm (SL0) [12]. The SRC is briefly described in the next section.
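The decision rules of the first two classifiers are standard; a minimal sketch is given below (our own illustration, not code from the paper).

```python
import numpy as np

def nn_classify(train_feats, train_labels, y):
    """Nearest neighbor: label of the closest training feature vector."""
    dists = np.linalg.norm(train_feats - y, axis=1)
    return train_labels[np.argmin(dists)]

def ns_classify(class_mats, y):
    """Nearest subspace: the class whose training-sample span
    reconstructs y with the smallest residual."""
    residuals = []
    for A_i in class_mats:                     # A_i: (m, n_i) matrix of class i
        x, *_ = np.linalg.lstsq(A_i, y, rcond=None)
        residuals.append(np.linalg.norm(y - A_i @ x))
    return int(np.argmin(residuals))
```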

A basic problem in object recognition is to use labeled training samples from $k$ distinct object classes to correctly determine the class to which a new test sample belongs. A dictionary matrix is a matrix in which each column is the feature vector of one of the training samples. Assume that there are $n_i$ training data samples for the $i$th class, where each data sample is represented by a vector of $m$ elements. These vectors are used to construct the columns of the matrix $A_i$:

$$A_i = [v_{i,1}, v_{i,2}, \dots, v_{i,n_i}] \in \mathbb{R}^{m \times n_i} \qquad (2)$$

where $v_{i,j} \in \mathbb{R}^m$ is a column vector that represents the feature extracted from training data sample $j$ of subject $i$. It is assumed that a test sample $y$ from class $i$ can be represented as a linear combination of the training data from class $i$ with scalar coefficients $\alpha_{i,j}$ [12]:

$$y = \alpha_{i,1} v_{i,1} + \alpha_{i,2} v_{i,2} + \dots + \alpha_{i,n_i} v_{i,n_i} \qquad (3)$$

Since the membership $i$ of the test sample is initially unknown, we define a new matrix $A$ for the whole training set as the concatenation of the $n$ training samples of all $k$ object classes:

$$A = [A_1, A_2, \dots, A_k] = [v_{1,1}, v_{1,2}, \dots, v_{k,n_k}] \qquad (4)$$

A linear representation of the test data $y$ can then be written as:

$$y = A x_0 \in \mathbb{R}^m \qquad (5)$$

where $x_0 = [0, \dots, 0, \alpha_{i,1}, \dots, \alpha_{i,n_i}, 0, \dots, 0]^T$ is a coefficient vector whose entries are zero except for those associated with the $i$th class. As the entries of the vector $x_0$ encode the identity of the test sample $y$, it is tempting to obtain it by solving the linear system of equations $y = Ax$. Note that in equation (3) all the training data samples of a given subject are used to form a representation of the test data. This representation is therefore more descriptive than that of methods that only incorporate part of the training set, such as the nearest subspace (NS) method.
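Concretely, the dictionary of equations (2) and (4) is just a column-wise concatenation of the per-class training matrices, as in the following sketch (our own illustration):

```python
import numpy as np

def build_dictionary(class_mats):
    """Stack the per-class matrices A_1..A_k of Eq. (2) into A, Eq. (4),
    and normalize the columns (step 2 of the SRC procedure in Table I)."""
    A = np.hstack(class_mats)                  # shape (m, n), n = sum of n_i
    return A / np.linalg.norm(A, axis=0, keepdims=True)
```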


To solve the system $y = Ax$, the number of equations $m$ and the number of unknowns $n$ are important. If $m = n$ (and $A$ is invertible), the system has a unique solution. However, in our case the system is underdetermined (i.e., $m < n$) and there is no unique solution. Usually, the $\ell_2$ norm is used in this case and the estimate is expressed as follows:

$$\hat{x}_2 = \arg\min_x \|x\|_2 \ \ \text{subject to} \ \ Ax = y \qquad (6)$$

where $\hat{x}_2$ is the minimum-norm solution, which can be obtained simply by computing the pseudo-inverse of $A$. However, this solution does not contain information that is useful for recognition. The sparser the recovered $x$ is, the easier it is to accurately determine the identity of the test sample $y$. The sparsest solution for $x$ can be computed as follows [12]:

$$\hat{x}_0 = \arg\min_x \|x\|_0 \ \ \text{subject to} \ \ Ax = y \qquad (7)$$

where $\|x\|_0$ is the zero norm, defined as the number of nonzero elements of $x$. In fact, if the columns of $A$ are in general position, then whenever $y = Ax$ holds for some $x$ with fewer than $m/2$ nonzeros, $x$ is the unique sparsest solution, $\hat{x}_0 = x$ [13]. However, the problem of finding the sparsest solution is difficult (NP-hard in general). Recently, many approximation methods have been proposed to solve this problem. In this paper we use two such methods: $\ell_1$-norm optimization, also known as linear programming (LP) or Basis Pursuit (BP) [13], and the smoothed L0 norm (SL0) [13]. The cost function of the $\ell_1$-norm optimization is expressed as follows:

$$\hat{x}_1 = \arg\min_x \|x\|_1 \ \ \text{subject to} \ \ Ax = y \qquad (8)$$

In SL0, the zero norm is smoothed with a Gaussian function as follows:

$$F_\sigma(x) = \sum_{i=1}^{n} \exp\!\left(-\frac{x_i^2}{2\sigma^2}\right), \qquad \|x\|_0 \approx n - F_\sigma(x) \qquad (9)$$

The idea of SL0 is to maximize $F_\sigma(x)$ for a small $\sigma$, subject to $Ax = y$. In other words, the zero norm is approximated by a smooth and differentiable function. The convergence speed of SL0 is higher than that of BP, and its performance is also better [14].
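A compact sketch of the standard SL0 iteration is given below (in the spirit of [14]; the step size, inner-loop count and stopping threshold are our own illustrative choices):

```python
import numpy as np

def sl0(A, y, sigma_decrease=0.5, L=3, mu0=2.0, sigma_min=1e-3):
    """Smoothed-L0 sparse recovery: gradient steps on F_sigma(x)
    projected back onto the feasible set {x : Ax = y}."""
    A_pinv = np.linalg.pinv(A)
    x = A_pinv @ y                          # minimum L2-norm start, Eq. (6)
    sigma = 2.0 * np.max(np.abs(x))
    while sigma > sigma_min:
        for _ in range(L):
            # ascend F_sigma(x) = sum_i exp(-x_i^2 / (2 sigma^2))
            x = x - mu0 * x * np.exp(-x ** 2 / (2 * sigma ** 2))
            # project back so that Ax = y still holds
            x = x - A_pinv @ (A @ x - y)
        sigma *= sigma_decrease             # gradually sharpen the approximation
    return x
```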

Given a new test sample $y$ from one of the classes in the training set, we first compute its sparse representation $\hat{x}$ via (8) or (9). For each class $i$, let $\delta_i$ be the characteristic function that selects the coefficients associated with the $i$th class; that is, $\delta_i(x)$ is a new vector whose only nonzero entries are the entries of $x$ that are associated with class $i$. Using only the coefficients associated with the $i$th class, one can approximate the given test sample as $\hat{y}_i = A\,\delta_i(\hat{x})$. Recognition is then performed by assigning the test data to the object class that minimizes the residual between $y$ and $\hat{y}_i$ as follows:

$$\min_i \ r_i(y) = \|y - A\,\delta_i(\hat{x})\|_2 \qquad (10)$$

Table I summarizes the complete recognition procedure via sparse representation (SRC).

TABLE I.  Summary of the complete recognition procedure via sparse representation (SRC).

Sparse Representation-based Classification (SRC)

1: Input: a matrix of training samples $A = [A_1, A_2, \dots, A_k] \in \mathbb{R}^{m \times n}$ for $k$ classes and a test sample $y \in \mathbb{R}^m$.

2: Normalize the columns of $A$.

3: Compute the sparse representation $\hat{x}$ for the test sample using one of the methods (8) or (9).

4: Calculate the residual $r_i(y) = \|y - A\,\delta_i(\hat{x})\|_2$ for each class $i$.

5: Output: the identity of $y$ as $\mathrm{identity}(y) = \arg\min_i r_i(y)$.
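Putting the steps of Table I together, a minimal SRC sketch (our own illustration, with the solver of step 3 passed in, e.g. the SL0 function above or a BP solver) could look like:

```python
import numpy as np

def src_classify(A, labels, y, sparse_solver):
    """SRC, following Table I. labels[j] is the class of column j of A."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)    # step 2
    x_hat = sparse_solver(A, y)                          # step 3: BP or SL0
    classes = np.unique(labels)
    residuals = []
    for c in classes:                                    # step 4
        x_c = np.where(labels == c, x_hat, 0.0)          # delta_i(x_hat)
        residuals.append(np.linalg.norm(y - A @ x_c))
    return classes[np.argmin(residuals)]                 # step 5: Eq. (10)
```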


The performance of a recognition algorithm depends on both feature extraction and classification; if the features are well chosen, even the simplest classifier can achieve very good performance. In the rest of this section, we evaluate the proposed approach under different conditions.

We use Gabor filters for feature extraction and Eigenfoot+LDA for dimension reduction. To evaluate the proposed approach, we also use raw images as features, and to investigate the effect of LDA on the results, we repeat all the experiments without LDA. Finally, we report the results of all cases using the different classifiers: nearest neighbor (NN), nearest subspace (NS) and the sparse representation classifier (SRC). In the SRC, we use two different sparse recovery approaches, Basis Pursuit (BP) and smoothed L0 norm (SL0).

To evaluate the performance of the proposed system, we use the two following scenarios:

1- Training and test data are both collected from normal walking.


2- Training data are collected from normal walking and test data are collected from fast walking.

Figs. 6-13 show the recognition accuracy of the proposed system under the different conditions.

By comparing Fig. 6 with Fig. 7, Fig. 8 with Fig. 9, Fig. 10 with Fig. 11, and Fig. 12 with Fig. 13, it is clear that using Gabor filter responses as feature vectors improves the recognition accuracy of the proposed approach.

By comparing Fig. 6 with Fig. 8 and Fig. 10 with Fig. 12, we see that adding LDA to Eigenfoot for dimension reduction reduces the accuracy when raw images are used as features. However, comparing Fig. 7 with Fig. 9 and Fig. 11 with Fig. 13 shows that adding LDA to Eigenfoot significantly increases the recognition accuracy when Gabor features are used.

Based on the above, using Gabor filters for feature extraction and Eigenfoot+LDA for dimension reduction increases the accuracy of the results.

According to Figs. 8 and 12, in which Gabor filters are not used, the results of NN, NS and SL0 are very close, while the results of BP are very poor. Using Gabor filters as the features substantially improves the results for BP; however, the other three classifiers also improve, and the performance of BP remains weaker than theirs.

In Table II, the results of our proposed methods are compared with the results of [7], which were obtained on the same database. Zheng et al. proposed a method in which Locality-Constrained Sparse Coding (LSC) is used for feature extraction, and LDA and NN are used for classification; they set the reduced dimensionality to 200 [7]. Table II reports the results of our proposed method (Fig. 9 and Fig. 13) for the case where we first reduce the feature space dimension from 7920 to 140 via Eigenfoot and then use LDA to reduce it to 87 (the number of classes minus 1).

As can be seen in Table II, the results of our proposed methods in both scenarios outperform the results of [7]. The normalization makes our proposed method resistant to rotation and translation; it should be mentioned that without the normalization phase, the results of our proposed methods are about 4% lower. We have also shown that recognition performance is improved by using Gabor filters. As Table II shows, our proposed method with the NN classifier outperforms the other classifiers. This indicates that the selected features and dimension reduction are so well suited to this type of image that the best results are obtained with the simplest classifier (NN).

Table II also shows that, for each classifier, the results of our proposed method in scenario 1 and scenario 2 are close to each other, which means that our proposed method is robust to walking at different speeds. Based on these results, we propose a final system consisting of the normalization process, Gabor filters for feature extraction, dimension reduction via a combination of Eigenfoot and LDA, and nearest neighbor (NN) as the classifier.

In this paper, a method was proposed in which the images are first normalized automatically with respect to direction and position, which makes the method robust to rotation and translation. Gabor filters were then used for feature extraction, and dimension reduction was performed by combining Eigenfoot and LDA. Finally, different classifiers, namely nearest neighbor (NN), nearest subspace (NS) and the sparse representation classifier (SRC), were used for recognition. In the SRC, two different sparse recovery approaches were used: Basis Pursuit (BP) and smoothed L0 norm (SL0). The proposed method was evaluated on the CASIA Gait-Footprint database [9]. The nearest neighbor (NN) classifier not only outperforms the other three classifiers, but is also robust to walking at different speeds and has low computational complexity. Therefore, our final proposed method consists of the normalization preprocessing, Gabor filters for feature extraction, dimension reduction via a combination of Eigenfoot and LDA, and nearest neighbor (NN) as the classifier.

In the future, we will focus on evaluating the proposed method on other databases that contain cumulative foot pressure images of subjects walking with different shoes. Also, in order to improve system performance, we will try other features and combine this biometric with other biometrics such as gait video and face.
