Long range correlation of preceded pixels relations and application to off‐line signature verification
2016; Institution of Engineering and Technology; Volume: 6; Issue: 2; Language: English
10.1049/iet-bmt.2016.0046
ISSN: 2047-4946
Authors: Hans Loka, Elias N. Zois, George Economou
Topic(s): Natural Language Processing Techniques
IET Biometrics, Volume 6, Issue 2, pp. 70-78. Research Article.

Long range correlation of preceded pixels relations and application to off-line signature verification

Hans Loka (Electronics Engineering Department, Technological and Educational Institute of Athens, Agiou Spiridonos Str., 12243 Egaleo, Greece), Elias Zois, corresponding author (ezois@teiath.gr; Electronics Engineering Department, Technological and Educational Institute of Athens, Agiou Spiridonos Str., 12243 Egaleo, Greece), George Economou (Physics Department, University of Patras, Patras, 26500, Greece)

First published: 04 January 2017. https://doi.org/10.1049/iet-bmt.2016.0046

Abstract

Lately, off-line signature verification systems have been reintroduced based on the idea of modelling the signature images with various relations among their pixels. In this paper, a modified version of the partially ordered set feature extraction procedure is presented by enabling distant range interactions between preceded relations of pixel groups. In this way, the spatial diversity of correlation that exists among signature features can be exploited. The motive behind this approach is related to our belief that the particular idiosyncratic writing style characteristics of each individual will be present over the whole length of the signature. Experiments involve the well-known Center of Excellence for Document Analysis and Recognition (CEDAR) and Ministerio de Ciencia y Tecnologia (MCYT) datasets in two popular writer-dependent training modes: in the first mode, genuine and simulated forgeries were utilized, while in the second one only genuine and random forgeries were utilized during the training stage of the classifier. In both cases, the testing phase employs genuine, simulated and random forgeries, while receiver operating characteristic curves along with decision-oriented FAR, FRR and average error metrics assess the proposed feature extraction method. The results obtained show that the long range correlation of grid features can be efficiently employed for off-line signature verification.

1 Introduction

A review of the recent literature provides strong indication that handwritten signature verification still remains an open field of great challenge and diverse experimentation [1, 2].
Signature verification, which is considered to be the art of validating the handwritten signature as a means of establishing the consent or the presence of a person, can be broadly classified into two major categories [3]. Online signature verification is the first category [4]: signatures are usually acquired with the use of pen-oriented devices such as personal digital assistants (PDAs) and/or some sort of phablet. The primary components which constitute an online signature instance are sets of numerous time-oriented functions, such as horizontal and vertical positions, as well as pressure and the type of actions, including pen-up/down events. In principle, and due to the diverse type of information which is provided, online systems are considered to be more accurate in terms of rejecting mimics, defined as forgeries (skilled or random) presented as genuine, and vice versa [5]. Off-line signature verification systems form the second acquisition category [1, 6], in which case the signal under inspection has the form of a conventional grey-level image in which all previously available dynamic information has collapsed into timeless information comprised only of pixel imprints. Though off-line signature verification systems are considered to be less accurate when compared with their online counterparts, their use is sometimes unavoidable due to the nature of the case under investigation, e.g. in forensic cases.

Off-line signature verification has indeed grown up but not aged; it has matured. Throughout the previous years, an intense effort regarding off-line signature verification has been carried out by several research groups. The verification chain of off-line systems can be broken down into the following stages: (a) pre-processing, in which the original images are subject to enhancement operations; (b) feature extraction, in which the image is transformed into numbers under a hypothesised signature model; (c) training, in which sets of signatures previously characterised as genuine or forgery, whether or not they belong to the person under study, are used in order to divide the formed feature space into two corresponding sub-regions by means of pattern recognition techniques defined as classifiers; and (d) testing, during which unknown signature samples are presented to the classifier so that it can decide on the claimed authenticity of the questioned sample by means of a binary decision or an underlying score which may or may not express the probability of the accompanying decision.

Off-line feature extraction, and therefore modelling of the handwritten signature, is among the most challenging tasks to be performed; in the authors' opinion it is an art, the building blocks of which are signal processing and ad hoc intuition. Though not entirely newcomers, recent feature extraction methods have refocused their attention on modelling pixel-related information by placing structuring elements (grids of various shapes and detail) on the signature image on a local or global scale. Subsequently, the properties of the signature are quantified by means of assessing statistical properties such as texture and/or shape characteristics. A short survey of the literature reveals important and representative efforts regarding the use of imposed grids for feature generation. For example, surroundedness [7], proposed by Kumar et al., models the signature shape and additionally provides a measure of signature texture in terms of the spatial distribution of black pixels around a signature candidate pixel.
Signature texture has also been modelled with the use of local binary patterns (LBPs) [8], while Serdouk et al. [9] provide fresh insight by using gradient local binary patterns in order to estimate gradient features based on the LBP neighbourhood. Pixel densities and corresponding distribution descriptors, such as the extended shadow code and directional probability density functions over grids, have also been used by Bertolini et al. [10], Batista et al. [11] and Rivard et al. [12]. Graphometric circular grids are another way of measuring grid-oriented properties of the static signature image [13]. Histograms of oriented gradients used by Yilmaz et al. [14], scale invariant feature transform descriptors introduced by Solar [15], and speeded up robust feature (SURF)-based variations of grid features provided by Malik et al. [16], Pal [17] and Pal et al. [18] have also been found to contribute significant results to the off-line signature verification research community. Optical flow [19] and histogram-based local shape [20], proposed by Pirlo and Impedovo, provide another perspective on the pixel-oriented modelling of static signatures.

Automatic signature verification systems can also be divided into writer-dependent (WD) and writer-independent (WI) approaches. On the one hand, WD training modes make use of a dedicated model (classifier) for every person. On the other hand, WI setups are not writer-dedicated systems and, additionally, they do not exploit the properties of the primary WD feature space. WI systems are basically trained on two large groups: the first group models the differences of features between samples originating from genuine writers, whereas the second one models the differences of features between genuine samples and samples from other writers, simulated or random. From then on, it is expected that the learning procedure over these two subspaces will support the following testing phase: given (a) a reference set of signatures belonging to one writer who has not participated in the WI training procedure and (b) a questioned signature sample accompanied by a claim of authenticity for this specific writer, a WI system will successfully authenticate, or not, the presence of the claimed identity.

In both WD and WI cases, there are two major training approaches. The first one trains the classifier with the use of genuine and simulated forgery samples only. Then, the testing phase evaluates signature samples from the genuine, simulated and occasionally random forgery classes, i.e. genuine samples from other writers of the dataset. The second one trains the classifier by utilising genuine and only random forgeries. Usually, the decision in the testing phase is much harder to take when random forgeries are engaged, because this training mode introduces a bias into the design of the classifier through the placement of the decision threshold. However, the work of Batista et al. [11] and Rivard et al. [12] clearly recommends this kind of training since 'in practice, only random forgeries are available during the design of a signature verification system'. The disadvantage of this protocol is obviously the selection of the appropriate threshold which will separate the genuine or positive class from any other kind of forgery during the testing stage.
The way in which the decision threshold is chosen is another matter of debate; frequently, the decision threshold is evaluated with the use of the training or the testing set, for either each writer or their entire population. However, it also seems appropriate for the selection of the threshold to take place with the use of a validation set instead. Usually, receiver operating characteristic (ROC) curves are also employed in order to provide evidence of the strength of any proposed system by means of the equal error rate and/or the area under the curve. Lately, a hybrid WI-WD approach has been introduced which attempts to cope with some of the above issues [21]. In another recent effort, Ferrer et al. [22] propose the creation and use of synthesised signatures as forgeries, which the authors believe will add a significant boost to the design of signature verification systems.

Recently, some of the authors devised and tested a methodology for signature authentication that is based on a series of binary grid features defined over a 5 × 5 image window and applied along the signature trace [23]. Key factors in the development of the method were the efficient reshuffling of primary grid features and the selection of certain first-order transitional probabilities among them. In this paper, we propose a feature extraction method which extends the previous methodology in a spatially diverse manner, thus introducing its long range correlation (LRC) variant. The motive behind this approach is related to our belief that the particular idiosyncratic writing style characteristics of each individual will be present over the whole length of his/her signature. The proposed work exploits partially ordered elements of pixel assortments (posets) not on a local scale but rather on a spatially extended one; this is accomplished by measuring poset properties between sections of the signature that reside in distant equimass segments. Additionally, two WD protocols are presented with the use of the Center of Excellence for Document Analysis and Recognition (CEDAR) and Ministerio de Ciencia y Tecnologia (MCYT) datasets, which clearly reflect the aforementioned ways of WD training.

This work is organised in the following way. Section 2 presents the databases and the pre-processing steps. Section 3 briefly revisits the feature extraction methodology and provides its LRC version. Section 4 provides the experimental setup and the corresponding results. Discussion and conclusions are provided in Section 5.

2 Materials and methods

2.1 Datasets

Two popular and well-known databases were employed in order to assess the proposed feature extraction method and the experimental protocols. The first signature database was created at CEDAR, Buffalo University [24]. A total of 48 signature samples (24 genuine and 24 simulated) confined to a 50 mm × 50 mm square box were provided for every one of the 55 enrolled writers and digitised at 300 dpi. The simulated signatures in the CEDAR database are a blend of random, simple and skilled forgeries. The CEDAR signature corpus can be accessed at http://www.cedar.buffalo.edu/NIJ/data. The second signature database is the off-line signature corpus MCYT-75 [25, 26]. The signature samples were acquired with the use of the same ink pen and paper templates placed over a pen tablet. A total of 15 genuine and 15 simulated (skilled) signature samples were provided for every one of the 75 enrolled writers at 600 dpi. The MCYT-75 signature corpus is publicly available at http://www.atvs.ii.uam.es [8].
2.2 Pre-processing

The enhancement of any input signature prior to the feature extraction stage is carried out by means of the following pre-processing steps: (a) thresholding using Otsu's method [27] and (b) thinning, whose pruning level is treated as a parameter of this work. In particular, due to the different recording resolutions of the CEDAR and MCYT datasets, and in order to (b1) keep the derived signatures as informative as the original versions while (b2) reducing any random noise originating from the use of several writing tools, the pruning level for CEDAR and MCYT has been set to one and two, respectively. Finally, any resulting image is segmented in order to provide four equimass sections, defined hereafter as SG1–SG4 [20, 28]. Figs. 1a–d depict the various stages of the pre-processing procedure for a signature sample of the CEDAR dataset; a minimal code sketch of this chain is given below.

Figure 1: Signature pre-processing stages (CEDAR). (a) Original image, (b) binary image, (c) thinned image, (d) equimass segmented portions (SG1–SG4) of the thinned image.
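The following is a minimal sketch of this pre-processing chain using scikit-image; the spur-pruning step and the exact equimass rule of [20, 28] are not reproduced, and the column-wise equal-mass split shown here is only one plausible reading of the segmentation, not the authors' exact procedure.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import thin

def preprocess(gray, n_segments=4):
    """Binarise, thin and split a signature image into equimass segments (sketch)."""
    # (a) Otsu thresholding; the signature ink is assumed darker than the background
    binary = gray < threshold_otsu(gray)

    # (b) thinning to a one-pixel-wide trace (pruning of spurious branches,
    #     level 1 for CEDAR and 2 for MCYT in the paper, is omitted here)
    skeleton = thin(binary)

    # (c) equimass segmentation into SG1..SG4: cut along columns so that each
    #     segment holds roughly 1/n of the trace pixels (assumed reading of [20, 28])
    col_mass = skeleton.sum(axis=0).cumsum()
    cuts = [np.searchsorted(col_mass, col_mass[-1] * k / n_segments)
            for k in range(1, n_segments)]
    return skeleton, np.split(skeleton, cuts, axis=1)

# usage: skeleton, (sg1, sg2, sg3, sg4) = preprocess(gray_image)
```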
3 Feature extraction

In this work, the proposed process of transforming a pre-processed handwritten signature into a set of representative numbers, or features, is accomplished by utilising a spatial variation of the partially ordered grid feature extraction method recently presented in [23], denoted hereafter as the 'LRC poset variant' or simply LRC. Though details regarding poset feature generation can be found in the aforementioned reference, the following paragraphs offer a revisited description by refreshing the postulates of the method and covering the necessary steps regarding the formulation of the LRC poset variant.

To begin with, Fig. 2 presents, at a first glance, the structure of the poset feature extraction method along with its LRC variant. For any signature sample, the feature extraction stage creates 12 separate feature vectors which relate to 12 complementary feature extractors or schemes. Each scheme models the signature with one eight-level arrangement. Each level produces 32 features for each pair of signature segments, making the overall dimensionality of the feature vector equal to 1536. Finally, each feature vector is used individually during the training and testing stage of the corresponding classifier and, subsequently, the derived scores/decisions are combined in order to provide the final assessment metrics.

Figure 2: Structure of the poset feature extraction method along with its LRC variant.

3.1 Postulates of partially ordered sets

A partially ordered set (or poset) captures in an intuitive way the notion of an arrangement of the elements of a set. A poset (X, ⪯) is a mathematical structure which contains a set X, comprised of its elements, along with a binary or precedence relation (⪯) used to designate the ordering of one subset of elements of the set over another. In other words, the precedence relation specifies that, for some subsets of elements of the set, one of the subsets precedes the other. The concept of partiality also means that not every subset needs to be related to another one by means of the precedence operator; for some subsets it may hold that neither has precedence over the other, contrary to the notion of total ordering, which requires that every pair of subsets be related. A characteristic example of a precedence relation is the inclusion or subset relation (⊆) between subsets: it describes the case of having every element of a subset A also contained in a subset B, and it offers a way to compare elements of sets and consequently form chains of comparable subsets.

One representative case of a poset is the powerset of a given set equipped with the inclusion relation. Posets can be depicted graphically with the use of lattice-based graphs. In this case, the mathematical structure of the powerset-poset is designated by (2^X, ⊆), where each subset, or element of the powerset, is represented by a vertex, while the association between the (k1, k2)-indexed vertexes of the subsets is represented by a connecting line or edge e_{k1,k2}. Fig. 3 graphically depicts the powerset-poset pair for the case of an X set with cardinality equal to four (|X| = 4). It is easily perceived by inspection that there are 32 edges on a four-element poset or, mathematically expressed, N \cdot 2^{N-1} = 4 \cdot 2^{3} = 32. The primary elements of the poset are four 5 × 5 binary image masks whose placement is defined in the subsequent paragraph.

Figure 3: Graphical example of a four-element {1, 2, 3, 4} set poset along with some designated edges.

Let us assume a set of 32 elementary pixel assortments (EPAs), depicted in Fig. 4a in the form of grid masks. Each mask has the constraint that the Chebyshev distance between its starting and final pixel equals two. A partition of this specific EPA primary set is simply a way of dividing the EPA elements into subgroups according to a property. The total number of possible partitions of any set X with cardinality N is provided by the following Bell number relation:

B_N = \sum_{k=1}^{N} S(N, k)    (1)

Figure 4: Scheme description. (a) The 32 elementary 5 × 5 binary image masks, (b) definition of a scheme (outer parts) along with the orthogonal levels.

The term

S(N, k) = \frac{1}{k!} \sum_{j=0}^{k} (-1)^{j} \binom{k}{j} (k - j)^{N}    (2)

calculates the number of ways to partition a set of N elements into k non-empty subsets. Given that the cardinality of the EPA set is equal to 32, the number of partitions that the elements of the EPA set can provide is a really vast number (~1.28 × 10^26). Due to this fact, two extra conditions are imposed. The first is to create partitions of eight subgroups, each one containing four elements. According to the terminology followed both in [23] and in this work, we denote (a) a partition of the EPA set as a scheme and (b) each one of the eight subgroups as a level, i = 1,...,8 (Fig. 2). The second constraint is that the levels of each scheme must obey an orthogonality rule, which means that we keep only those schemes whose levels are organised in such a way that no element of a level can be expressed as the union of the remaining elements of the same level. Fig. 4b depicts graphically the aforementioned definition of a scheme. The application of these two constraints reduces the count to a more tolerable number of 2582 potential schemes. Further narrowing is achieved by clustering schemes into groups of schemes (GoSs) according to the associated intra-group Jaccard distance of each GoS [23]. Intuitively, a low value of the intra-group Jaccard distance strongly indicates that the features extracted by the schemes of the specific GoS will be highly correlated, and are thus expected to contribute poorly to the verification task. Thus, GoSs having low intra-group Jaccard distance are not considered at all.
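As a quick sanity check on the partition count quoted above, relations (1) and (2) can be evaluated directly; the short sketch below (standard Bell/Stirling formulas, not code from the paper) reproduces the ~1.28 × 10^26 figure for the 32-element EPA set.

```python
from math import comb, factorial

def stirling2(n, k):
    """Stirling number of the second kind: ways to partition n elements into k non-empty subsets (relation (2))."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1)) // factorial(k)

def bell(n):
    """Bell number: total number of partitions of an n-element set (relation (1))."""
    return sum(stirling2(n, k) for k in range(1, n + 1))

print(bell(32))          # 128064670049908713818925644 ~ 1.28e26, as cited
print(stirling2(32, 8))  # partitions into exactly eight non-empty subgroups; the
                         # four-elements-per-level and orthogonality constraints of
                         # the paper reduce the usable schemes much further (to 2582)
```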
Given (a) that the theoretical maximum intra-group Jaccard distance for eight-level schemes is 0.4286 [23] and (b) the trade-off between the intra-group Jaccard distance and the total number of schemes in a GoS, a 12-scheme GoS with intra-group Jaccard distance equal to 0.3 has been randomly selected, with the scheme numbers displayed in Fig. 2. For each one of the 12 schemes, every level unfolds its powerset-poset as Fig. 5a depicts.

Figure 5: Powerset and poset generation for a scheme. (a) 16-ary powerset of a level, (b) ordered transitions of one level's powerset with respect to inclusion.

3.2 Long range correlation of posets

According to the previous section, the evaluation of the poset components for each signature pixel requires the detection of two connected pixel assortments, i.e. an edge on the graph of Fig. 5b. This is accomplished by using two spatially separated 5 × 5 grids denoted with the labels master (M) and slave (S). Thus, given any (M)-designated master window sliding over a signature pixel, we count any encountered pixel assortment of the poset. In this work, the LRC variant of the (M)-(S) grids is exploited; this is accomplished by allowing the poset measurements to be taken only when the (M) and (S) grids belong to different segments of the four-fold partitioned image (Fig. 6). Specifically, for one placement of the (M) grid on one segment of the signature image, the (S) grid is placed on all pixels of the remaining segments. In this way, it is expected that the particular features that characterise the behaviour of a specific writer will be active in other parts of the signature image as well. Given that (a) there are 32 poset edges for each of the eight levels of a scheme and (b) there are six pairwise combinations of the equimass segments SGk, k = 1,...,4, it follows that the overall dimensionality of the proposed feature vector for one scheme is 32 × 8 × 6 = 1536 feature components. Each one of the corresponding feature components of a scheme, as expressed in (3), counts the occurrence of one poset edge for a specific level. Fig. 6 provides an optical representation of the feature extraction process. The final feature vector is formed by appending all the relevant sub-features.

Figure 6: Feature extraction. (a) LRC approach: (M) and (S) grids in the spatially modulated poset evaluation; signature lines are not to scale with the grids, (b) activation of poset edges and corresponding feature index increment.

4 Experimental protocol – results

4.1 Classifier description

Support vector machines (SVMs) [29] have proved to be a helpful and practical tool for data classification in a number of applications, including signature verification. Let us consider the feature-label pairs of a training set of k samples, (x_i, y_i), i = 1,...,k, where x_i \in \mathbb{R}^{n} and y_i \in \{-1, +1\}. Then the SVM solves the following optimisation problem:

\min_{w, b, \xi} \; \tfrac{1}{2} \| w \|^{2} + C \sum_{i=1}^{k} \xi_i \quad \text{subject to} \quad y_i \left( w^{\mathrm{T}} \phi(x_i) + b \right) \ge 1 - \xi_i, \; \xi_i \ge 0    (4)

In other words, the SVM maps the training vectors to a higher-dimensional space with the use of the kernel function and simultaneously finds a separating hyper-plane with a maximal margin. The term

K(x_i, x_j) = \phi(x_i)^{\mathrm{T}} \phi(x_j) = \exp\left( -\gamma \| x_i - x_j \|^{2} \right)    (5)

is denoted throughout the literature as the mapping kernel function; in this work, a radial basis form has been selected. The parameters C (the margin penalty) and \gamma (directly related to the kernel width \sigma) are those which must be properly tuned during training by means of cross-validation trials in order to achieve the lowest possible generalisation error in the testing phase.
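The paper's experiments use LIBSVM (see below); purely as an illustration of the RBF-SVM training and the exponential grid search over C and γ anticipated in Section 4.2.1, a writer-dependent model for one scheme could be fitted as in the following scikit-learn (LIBSVM-backed) sketch, where the grid ranges and the accuracy-based model selection are placeholders for the paper's exact settings and average-error criterion.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.svm import SVC

def train_writer_model(genuine_feats, forgery_feats):
    """Fit one binary RBF-SVM for a single writer and scheme (illustrative sketch).

    genuine_feats / forgery_feats: arrays of shape (n_samples, 1536) holding the
    LRC poset features of the positive and negative training classes.
    """
    X = np.vstack([genuine_feats, forgery_feats])
    y = np.hstack([np.ones(len(genuine_feats)), -np.ones(len(forgery_feats))])

    param_grid = {
        "C":     2.0 ** np.arange(-5, 16, 2),   # exponentially growing sequences;
        "gamma": 2.0 ** np.arange(-15, 4, 2),   # illustrative ranges, not the paper's
    }
    # leave-one-out cross-validation as in the WDSV1 protocol; scoring defaults to
    # accuracy, a stand-in here for the paper's minimum-average-error criterion
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=LeaveOneOut())
    search.fit(X, y)
    return search.best_estimator_

# the signed distance from the separating hyper-plane is then used as the per-scheme
# score of a questioned sample: score = model.decision_function(x_test.reshape(1, -1))
```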
In this work, the experimental protocol was conducted with the use of the LIBSVM library and the LIBSVM tools [30].

4.2 Experimental protocol

The employed verification strategy relies on the framework of WD off-line signature verification. For each writer, a corresponding model is provided by a dedicated binary SVM classifier. In accordance with the two major WD training approaches explored in the literature, two corresponding strategies were followed. The first training strategy (WDSV1) makes use of genuine and simulated forgeries. The second strategy (WDSV2) utilises genuine and only random forgeries during the training stage of the classifier. In both training approaches, the testing phase makes use of genuine, simulated and random forgeries, and the training and testing experiments were subject-disjoint, i.e. samples of different subjects were used in training and testing in order to avoid underestimating the error rates.

4.2.1 Training with genuine and simulated forgeries

In the WDSV1 training mode, five and ten genuine samples were randomly selected in order to create the positive (or genuine) class for each scheme; for the CEDAR case, 16 genuine samples were additionally utilised. The negative class was represented by an equal number of simulated forgery samples. A leave-one-out cross-validation procedure combined with a grid-search strategy was adopted in order to find the best binary SVM training parameters, by allowing the parameter C to be searched over an exponentially growing sequence and γ over a respective pattern. Throughout the cross-validation procedure, the metric employed to select the optimal SVM operating parameters was the minimum value of the average error, defined as the average of the following two types of errors: the false rejection rate (FRR_CV) and the false acceptance rate (FAR_CV). Both FAR_CV and FRR_CV, and consequently the average error, were evaluated as functions of the decision threshold. The use of this measure can be justified by the fact that an equal number of positive and negative samples was used to train the system. For an input feature vector, the cross-validation procedure also outputs a score value for each j-th scheme; this is the SVM distance of the input sample from the separating hyper-plane. As a consequence, 12 scores are provided, which are in turn averaged; their range is also recorded and stored for the post-processing evaluation metrics of the testing samples.

The WDSV1 testing phase was performed by using (a) the remaining genuine samples, (b) the remaining simulated forgery samples and (c) random forgery samples. For case (c), three samples randomly selected from each one of all other participating writers (i.e. 54 for the CEDAR case and 74 for the MCYT case) were used. During the testing phase, when a sample under question enters the 12 binary SVM classifiers, each classifier outputs two values: a binary decision along with the corresponding SVM score, j = 1,...,12. For the case of employing the decisions, the efficiency of the WDSV1 testing stage was quantified with the use of the FRR for the genuine samples and the FAR for the simulated forgery and random forgery samples. Since 12 binary decisions participated, a fusion rule based on voting provided the final decision for each writer; for all writers, a common decision rule based on majority voting was employed, as sketched below.
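A minimal sketch of this fusion step follows; it combines the twelve per-scheme decisions by majority vote (the six-out-of-twelve rule detailed in the next paragraph, possibly raised per writer) and averages the twelve SVM scores for the score-based evaluation. The function name and signature are illustrative.

```python
import numpy as np

def fuse_scheme_outputs(decisions, scores, min_votes=6):
    """Combine the 12 per-scheme SVM outputs for one questioned signature.

    decisions: twelve accept/reject votes (+1 / -1); scores: twelve signed distances
    from the separating hyper-planes. min_votes implements the six-out-of-twelve
    rule and may be raised for individual writers.
    """
    accept = int(np.sum(np.asarray(decisions) > 0)) >= min_votes   # decision-level fusion
    avg_score = float(np.mean(scores))                             # score-level fusion
    return accept, avg_score
```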
The decision rule states that if at least six out of the twelve schemes vote positively for the questioned sample, then the overall decision rule accepts it; otherwise it does not. In addition, for each writer the minimum number of positive votes was allowed to be flexible, although it was always bounded from below by the six-out-of-twelve rule. In the case of employing the scores, their average was also calculated. Next, ROC curves were estimated by allowing the decision threshold to range over the score interval recorded during the aforementioned cross-validation procedure. Since three classes were present during the testing stage, the efficiency of the WDSV1 testing stage was quantified with the use of the following best rates: the FRR and FAR for genuine and simulated forgery samples, along with the associated FAR value for the random forgery samples. The FRR and FAR metrics were taken at the operating point of the lowest testing average error. The experiments were repeated ten times for different training and testing sets, and the averaged results along with their standard deviations are presented in Table 1.

Table 1. Best error rates, % (standard deviations in parentheses), for the WDSV1 training mode. The first three error columns are decision-based and the last three ROC-based; within each group the columns give the FRR (genuine), FAR (simulated forgeries) and FAR_RF (random forgeries).

Dataset  Genuine samples  FRR           FAR           FAR_RF         FRR          FAR          FAR_RF
CEDAR    5                7.23 (6.00)   8.47 (5.72)   14.10 (9.76)   4.54 (4.44)  2.60 (4.10)  4.28 (5.15)
CEDAR    10               4.17 (4.23)   3.30 (2.94)   12.86 (8.99)   1.82 (2.59)  1.01 (1.70)  3.70 (5.07)
CEDAR    16               1.65 (1.39)   1.42 (0.92)   10.41 (8.17)   1.12 (1.73)  0.74 (0.62)  2.15 (4.12)
MCYT     5                12.83 (8.77)  11.52 (7.48)  14.15 (8.79)   8.24 (5.72)  5.07 (4.95)  8.66 (7.87)
MCYT     10               6.64 (6.29)   4.75 (4.58)   10.13 (9.15)   3.20 (3.79)  1.04 (2.11)  4.75 (4.37)

Inspection of the results in Table 1 raises two issues which need further discussion. The first one concerns the obvious and clear difference in performance between the decision-based and the ROC-oriented results. It is a fact that there is a significant difference, observed in both types of error, which sometimes may reach up to 50%. Clearly, this is due to the underlying decision mechanism carried out f
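For the ROC-based columns of Table 1, the averaged scores are swept over a range of decision thresholds and the FRR/FAR pair at the lowest average error is reported, together with the random-forgery FAR at the same operating point; a minimal sketch of this evaluation is given below (the threshold range and the use of the genuine/skilled average error are assumptions consistent with the description above).

```python
import numpy as np

def roc_best_operating_point(genuine_scores, skilled_scores, random_scores, n_thresholds=1000):
    """Sweep a threshold over averaged SVM scores and return (Ave, FRR, FAR, FAR_RF)
    at the minimum average error over the genuine and skilled-forgery classes."""
    genuine = np.asarray(genuine_scores)
    skilled = np.asarray(skilled_scores)
    random_ = np.asarray(random_scores)

    lo = min(genuine.min(), skilled.min(), random_.min())
    hi = max(genuine.max(), skilled.max(), random_.max())
    best = None
    for t in np.linspace(lo, hi, n_thresholds):
        frr = np.mean(genuine < t)       # genuine samples rejected
        far = np.mean(skilled >= t)      # skilled forgeries accepted
        far_rf = np.mean(random_ >= t)   # random forgeries accepted
        ave = (frr + far) / 2.0
        if best is None or ave < best[0]:
            best = (ave, frr, far, far_rf)
    return best
```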