Off‐line verification technique for Hindi signatures
2013; Institution of Engineering and Technology; Volume: 2; Issue: 4; Language: French
10.1049/iet-bmt.2013.0016
ISSN 2047-4946
Authors: Srikanta Pal, Umapada Pal, Michael Blumenstein
Topic(s): Vehicle License Plate Recognition
IET Biometrics, Volume 2, Issue 4, p. 182-190. Article, Free Access.

Srikanta Pal (Corresponding Author), School of Information and Communication Technology, Griffith University, Gold Coast, Queensland, Australia
Umapada Pal, Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, West Bengal, 700108 India
Michael Blumenstein, School of Information and Communication Technology, Griffith University, Gold Coast, Queensland, Australia

First published: 01 December 2013. https://doi.org/10.1049/iet-bmt.2013.0016. Citations: 14

Abstract

Handwritten signature is one of the oldest biometric attributes used for authentication of an individual or a document.
The purpose of this study is to present an empirical contribution towards the understanding of signature verification using a novel method involving off-line Hindi (Devnagari) signatures. Although signature verification involving Western signatures has been well studied, relatively little attention has been devoted to non-Western signatures such as Chinese, Japanese, Arabic and Persian. In this study, the performance of an off-line signature verification system involving Hindi signatures, whose style is distinct from Western scripts, was investigated. The gradient feature, Zernike moment features and SVMs were considered for verification. To the best of the authors' knowledge, Hindi signatures investigated as part of a large dataset have never been used for the task of signature verification, and this research work is only the second important report using Hindi signatures in this area. An encouraging accuracy of 90.69% was obtained using the gradient feature. The Hindi signature database employed for experimentation consisted of 2400 (100 × 24) genuine signatures and 3000 (100 × 30) forgeries. Error rates of 11.50% FRR and 7.12% FAR were obtained through experimentation using gradient features.

1 Introduction

Today, biometric technologies are increasingly being used to ensure identity verification or the authorisation of access to sensitive data. Among all of the biometric authentication systems, handwritten signatures have been accepted as an official means to verify personal identity for legal purposes on such documents as cheques, credit cards and wills [1]. Signature verification has been a topic of renewed intensive research over the past several years [2] because of the important role it plays in numerous areas, including in financial applications. Automatic signature verification systems can be classified into two categories [3, 4]: on-line and off-line.
In an on-line technique, signatures are created on a digitiser and dynamic information such as speed and pressure is captured in addition to a static image of the signature [5, 6]. In an off-line technique, signatures are inscribed on a piece of paper and then scanned to obtain the signature image [7]. Off-line signature verification deals with the verification of signatures that appear in a static format [8]. Verification decisions are usually based on local or global features extracted from processed signatures. Excellent verification results can be achieved by comparing the robust features of the test signature with those of the user's signature using appropriate classifiers [9].

The use of signatures has been one of the more convenient methods for the identification and verification of human beings. A signature is considered a complete image with a special distribution of pixels and a particular writing style; it is not considered a collection of letters and words [10]. A person's signature may change radically during their lifetime. Great inconsistency can even be observed in signatures according to country, habits, psychological or mental state, and physical and practical conditions [11].

The remainder of this paper is organised as follows. Some related work is described in Section 2. The concept of signature verification is introduced in Section 3. Section 4 deals with the importance of Hindi signature verification. The Hindi signature database developed for the current research is described in Section 5. Pre-processing steps are described in Section 6. Some notable properties of Devnagari script are introduced in Section 7. Section 8 briefly describes the feature extraction techniques employed in this work. Details of the classifiers used are presented in Section 9. The experimental settings are presented in Section 10. Results and discussion are given in Section 11. Section 12 deals with the comparison of performance.
Error analyses are described in Section 13. Finally, conclusions and future work are discussed in Section 14.

2 Related work

There is a limited amount of work in the area involving off-line verification of non-Western signatures. A few of the signature verification techniques are taken into account and illustrated here.

Pal et al. [12] presented a signature identification technique for bi-script off-line signatures. In that method, signatures in Bangla (Bengali) and English were considered for the identification process. That report was an important contribution involving an Indian regional language in the signature identification/verification area. In that experiment, different features such as under-sampled bitmaps, modified chain code features and the gradient feature were computed from both background and foreground components and were employed for the purpose of signature identification. Support vector machines (SVMs) and nearest neighbour (NN) techniques were considered as classifiers. A database of 1554 English signatures and 1092 Bengali signatures was used to generate the experimental results.

In one of the papers by Pal et al. [13], a signature verification system based on a local language in India was introduced. In that paper, the performance of an off-line signature verification system involving Bangla (Bengali) signatures, whose style is distinct from Western scripts, was investigated. The Bangla signature database employed in the experiments consisted of 3000 forgeries and 2400 genuine signatures. That Bangla signature database, used for the task of signature verification, was a novel contribution to the field of off-line signature verification. The Gaussian grid feature extraction technique was employed for feature extraction and SVMs were considered for classification.

In another paper by Pal et al. [14], multi-script signature identification and verification involving English and Hindi signatures was presented. The intention of that paper was to present an empirical contribution towards the understanding of multi-script signature verification. In that paper, multi-script signatures were first identified based on the signature script, and the verification experiment was then performed based on the identified result. Two different results for identification and verification were determined and analysed.

More recently, Pal et al. [15] introduced a signature verification system employing Hindi signatures. The purpose of that paper was to investigate the performance of a signature verification system involving Hindi off-line signatures. In that study, two features, the gradient feature and the Zernike moment feature, were employed together with SVM classifiers. Thirty sets of Hindi signature images were considered for experimentation. Each set of signatures consisted of 30 skilled forgeries and 24 genuine signatures. For each signature set, the classifier was trained with 14 randomly chosen genuine signatures. The negative samples for training were 20 skilled forgeries. For testing, the remaining genuine signatures (ten signatures) and the remaining skilled forgeries (ten signatures) of the set considered were employed. In the present approach, 100 sets of Hindi signature images have been employed for experimentation. The skilled forgeries were not taken into consideration for training purposes in the proposed technique. The experimental settings, which are described in Section 10, were also different.

Another non-English signature verification approach was presented by Pal et al. [16], where a Bangla signature database consisting of 5400 (100 × 54) signatures was employed for experimentation.
The objective of that paper was to present an empirical contribution towards the understanding of a threshold-based signature verification method. Techniques such as under-sampled bitmap, intersection/endpoint and directional chain code were employed for feature extraction. The NN method was considered for classification. For experimentation, the system was trained with a set of 12 genuine signatures of each individual (class), and a distance map using the Euclidean distances between all 12 genuine signatures of each class (121 distances) was calculated. The average of all 121 distances for each class was computed, and this average distance was considered as the acceptance/rejection threshold of that class. An encouraging accuracy was achieved using the threshold-based technique on a Bangla signature database.

A signature verification competition for non-English on-line and off-line signatures was proposed by Liwicki et al. [17]. In that competition, the performance of 12 systems from 5 different institutes was evaluated based on Dutch and Chinese signatures.

Our present signature verification work also involves Hindi off-line signatures, but it differs from the earlier one in a few major respects. The experimental technique presented in our proposed work is totally different from the earlier one. The size of the signature dataset used for experimentation is also vastly different from the previous one. The accuracy obtained from the new experiments is encouraging.

Apart from non-Western signature verification, a substantial amount of work has been done on off-line verification of Western signatures. Armand et al. [18] presented an effective method to perform off-line signature verification and identification. Unique structural features were extracted from the signature's contour.
Using a publicly available database of 2106 signatures containing 936 genuine signatures and 1170 forgeries, a verification rate of 91.12% was obtained. Ramachandra et al. [19] proposed an off-line signature verification system based on a cross-validation principle and graph matching. Schafer and Viriri [20] presented an off-line signature verification system based on a combination of feature sets. The extracted features included aspect ratio, centroid feature, four surface features, six surface features, number of edge points and transition features. The verification of signatures was accomplished using the Euclidean distance classifier.

Signatures may be written in different languages and there is a need to undertake a systematic study in this area. From the work reviewed above, it may be noted that many published papers are available for Western signatures and only a few studies have been undertaken for signatures in Chinese, Japanese, Persian, Arabic etc. [21]. To the best of the authors' knowledge, there are only a few published papers dealing with signatures of Indic scripts and languages. The present work on Hindi signature verification would be considered a novel contribution to the field of signature verification. Some signature samples of Hindi script are shown in Fig. 1.

Figure 1: Examples of some Hindi signature samples

3 Signature verification concept

In the last few decades many methods have been developed in the area of pattern recognition, specifically for the off-line signature verification problem. In general, to deal with the problem of off-line signature verification, researchers have investigated a commonly used approach which is based on two different classes of patterns, class 1 and class 2, where class 1 represents the genuine signature set and class 2 represents the forged signature set.
However, types of forgeries other than skilled forgeries (random forgeries, simple forgeries, simulated forgeries, unskilled forgeries, casual forgeries etc.) are also considered, and these are discussed below. As different types of forgeries are used for signature verification experimentation, the scope of the signature databases used for verification is not limited to genuine signatures and skilled forgeries alone. Random forgeries have been widely used in verification. In this proposed approach, random forgeries are considered instead of skilled forgeries for training purposes during experimentation. In practice, the genuine signatures of other writers were treated as random forgeries; this is further discussed in the experimental settings section.

Usually two types of errors are considered in an off-line/on-line signature verification system: the false rejection or Type-I error, and the false acceptance or Type-II error. These error types are associated with two common error rates: the false rejection rate (FRR), which is the percentage of genuine signatures misclassified as forgeries, and the false acceptance rate (FAR), which is the percentage of forged signatures misclassified as genuine.

According to Coetzer et al. [22], there are three basic types of forged signatures, which are often taken into account:

Random forgery. The forger has no access to the genuine signature (not even the author's name) and reproduces a random one. In many cases, the forgeries are the forger's own genuine signature.

Simple forgery. The forger knows the author's name and the script, but has no access to a sample of the signature.

Skilled forgery. The forger has access to one or more samples of the genuine signature and is able to reproduce it.

According to Weiping et al. [23], forgeries may be classified into two basic classes, casual and skilled (or traced), which are often considered in the signature verification field.
Casual forgeries (also referred to as rough or random) are produced by knowing only the name of the person whose signature is being forged, with no previous knowledge of the appearance of the genuine signature. In some cases, the forger may even use their own signature as a forgery (substitution forgery). Casual forgeries are the most commonly found forgeries. In skilled forgeries, the forger knows the signature very well and has practised the simulation process. Therefore a skilled forgery is very similar to the genuine signature, and some features that are effective in casual forgery verification become ineffective in skilled forgery verification. Detection of simulated and disguised signatures was introduced by Liwicki et al. [24]. Fig. 2 outlines the three main phases of automatic signature verification: data acquisition and pre-processing, feature extraction and classification.

Figure 2: Process-flow of a general signature verification system

4 Importance of Hindi signature verification

In a multi-script and multi-lingual country like India, Hindi is one of the official languages, and it is used not only for reading and writing but also for signing. Hindi is widely used for official purposes throughout India; it is the state language of some states, such as Bihar, Uttar Pradesh and Madhya Pradesh, as well as a national language of India. Experiments with Hindi signatures for verification can be considered a new contribution to the area of signature verification. Hindi signature verification is quite different from that of other scripts with respect to the shape and structure of Hindi signature images. Most of the characters of Hindi hand-written words are touching. Hindi is written using Devnagari script. Many characters of Devanagari script have a horizontal line at the upper part called a headline, which is called shirorekha in Devanagari.
It joins two or more basic or compound characters to form a word. These headlines are present at the top portion of the characters and they are often used as supporting features in identifying Devanagari script. Hindi script is less blocky and presents a more sinuous shape. Hindi signatures always contain textual components. The gradient feature used in this research is suitable for analysing Hindi script given such characteristics. One of the co-authors of this paper has attained encouraging accuracy analysing Hindi characters, as presented in [25], where the gradient feature was considered. For Hindi signatures, there is no concept of an initial signature, which is usually seen in English signatures. By initial signature we mean the first character of the first name followed by the last name (family name).

5 Signature database

Although automatic signature verification has been an active research area for several decades, there has been no publicly available signature database for Hindi, the most popular official Indian script. Research in automatic signature verification has long been constrained by the unavailability of a standard database. Therefore a Hindi signature database was created for the purpose of this work.

Hindi signature collection and database preparation

Signatures in Hindi script were considered for this signature verification approach. As there has been no public signature corpus available for Hindi script, it was necessary to create a database of Hindi signatures. The signatures were collected from Kolkata, India. The majority of the signatures were contributed by students. The remaining signatures were collected from seniors of the locality. This Hindi signature database consists of 100 sets. Each set consists of 24 genuine signatures and 30 skilled forgeries. The sets are numbered by writer, from H-S-Set-001 to H-S-Set-100 (Hindi-Signature-Set).
In order to collect the genuine signatures corresponding to each individual, a collection form was designed. The form contained 24 boxes where the signatures could be written. From each individual, 24 genuine signatures were collected. A total of 2400 genuine signatures from 100 individuals were collected. For each contributor, all genuine specimens were collected in a single day's writing session. In addition, only skilled forged signatures were collected for this proposed work. In order to produce the forgeries, the imitators were allowed to practise their forgeries for as long as they wished with static images of genuine specimens. A total of 3000 forged signatures were collected from the writers. Some genuine signature samples, along with their corresponding forgeries, are displayed in Fig. 3.

Figure 3: Genuine and forged signatures

GPDS English database

Another database consisting of 100 sets from GPDS-160 [26] was also utilised for these experiments. As GPDS is a popular and publicly available dataset, it was considered for our experiments for the purpose of performance comparison. Each signature set of this corpus consists of 24 genuine signatures and 30 simple forgeries. The reason 100 sets were used from the GPDS on this occasion is that the Hindi dataset described previously comprised 100 sets, and it was considered important to have equivalent signature numbers for experimentation and comparison between the two datasets.

6 Pre-processing

The signatures to be processed by the system needed to be in a digital image format. Each signature was handwritten on a rectangular space of fixed size on a white sheet of paper. It was necessary to scan all signature document pages. At the very beginning, the images were captured in 256-level grey scale at 300 dpi and stored in TIFF format (tagged image file format) for the purpose of further processing.
In the pre-processing step, a histogram-based threshold technique was applied for binarisation. In this step, the digitised grey-level image was converted to a two-tone image. Subsequently, a noise removal technique was applied to remove the salt and pepper noise and the signature images were extracted from the signature-collecting document forms. The signature collecting form containing 24 genuine signatures in binary format is shown in Fig. 4. The extracted binary signature images were stored in TIFF format. A typical scanned signature and its corresponding binary image are shown in Figs. 5 and 6, respectively.

Figure 4: Signature collection form with genuine signature images

Figure 5: Scanned signature image

Figure 6: Binary signature image

7 Properties of Devnagari scripts

Devnagari is an oriental script descended from Brahmi script [27]. It is the most popular official script and national language of India. In Hindi script, the writing direction is from left to right and there is no concept of upper/lower case. Hindi script has about 50 basic characters. These characters are presented in Fig. 7.

Figure 7: Basic characters of Devnagari script

Vowels in this script generally take a modified shape in most words and are called modifiers or allographs. Modifiers generally do not disturb the shape of basic characters in the middle zone of a line. If the shape is disturbed in the middle zone, we call the resultant shape a compound character. Vowel modifiers of Devnagari scripts are shown in Fig. 8.

Figure 8: Vowel modifiers of Devnagari

A text line in such scripts can be partitioned into three zones (upper, middle and lower). The upper zone denotes the portion above the headline, the middle zone denotes the portion between the headline and baseline and the lower zone is the portion below the baseline.
The imaginary line separating the middle and lower zones is called the baseline.

8 Feature extraction

Feature extraction is a crucial step in any pattern recognition system. The Zernike feature and the gradient feature extraction techniques are described as follows.

Zernike moments feature

Zernike polynomials are an orthogonal set of complex-valued polynomials

\[ V_{nm}(x, y) = V_{nm}(\rho, \theta) = R_{nm}(\rho)\, e^{jm\theta} \tag{1} \]

where

\[ \rho = \sqrt{x^{2} + y^{2}}, \qquad \theta = \tan^{-1}(y/x) \tag{2} \]

and n − |m| is even and the radial polynomials {R_{nm}} are defined as

\[ R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} C(n, m, s)\, \rho^{\,n-2s} \tag{3} \]

where

\[ C(n, m, s) = \frac{(-1)^{s}\,(n-s)!}{s!\,\left(\frac{n+|m|}{2}-s\right)!\,\left(\frac{n-|m|}{2}-s\right)!} \tag{4} \]

The complex Zernike moments of order n and repetition m are given by

\[ Z_{nm} = \frac{n+1}{\pi} \iint f(x, y)\, V_{nm}^{*}(x, y)\, dx\, dy \tag{5} \]

where x^2 + y^2 ≤ 1 and the symbol * denotes the complex conjugate operator [28]. The Zernike moments can be computed from the scale-invariant central moments, as given by equations (6) and (7).

Computation of 576-dimensional gradient features

The grey-scale local-orientation histogram of the signature component is used for 576-dimensional feature extraction [29]. To obtain a 576-dimensional gradient-based feature vector, the following steps were executed.

Step 1: 2 × 2 mean filtering is applied 5 times on the input image.

Step 2: The grey-scale image obtained in Step 1 is normalised so that the mean grey scale becomes zero with a maximum value of 1.

Step 3: The normalised image is then segmented into 17 × 7 blocks. This block size was decided experimentally as a trade-off between accuracy and complexity. To obtain the bounding box of the grey-scale image, the image is converted into a two-tone format using Otsu's thresholding algorithm [30]. This excludes unnecessary background information from the image.

Step 4: A Roberts filter is then applied to the image to obtain the gradient image. The arc tangent of the gradient (direction of gradient) is quantised into 32 directions and the strength of the gradient is accumulated with each of the quantised directions.
The strength of the gradient f(x, y) is defined as follows

\[ f(x, y) = \sqrt{(\Delta u)^{2} + (\Delta v)^{2}} \tag{8} \]

and the direction of the gradient θ(x, y) is

\[ \theta(x, y) = \tan^{-1}\frac{\Delta v}{\Delta u} \tag{9} \]

where Δu = g(x + 1, y + 1) − g(x, y), Δv = g(x + 1, y) − g(x, y + 1) and g(x, y) is the grey level of point (x, y).

Step 5: Histograms of the values of 32 quantised directions are computed for each of the 17 × 7 blocks.

Step 6: The directional histogram of the 17 × 7 blocks is down-sampled into 9 × 4 blocks and 16 directions using Gaussian filters. Finally, a 9 × 4 × 16 = 576-dimensional feature vector is obtained.

9 Classifier details

In our experiments, we have used SVMs as classifiers. SVMs were originally defined for two-class problems and they look for the optimal hyperplane, which maximises the margin, that is, the distance to the nearest examples of both classes, namely the support vectors. Given a training database of M data: {x_m | m = 1, …, M}, the linear SVM classifier is then defined as

\[ f(x) = \sum_{j} \alpha_{j}\, x_{j} \cdot x + b \tag{10} \]

where {x_j} are the set of support vectors and the parameters α_j and b have been determined by solving a quadratic problem [31]. The linear SVM can be extended to various non-linear variants; details can be found in [31, 32]. In our experiments, the RBF kernel SVM outperformed other non-linear SVM kernels; hence, we report our verification results based on the RBF kernel only. Different gamma values for SVM training were established experimentally. No rule or formula was used for the selection of gamma values; the values were determined empirically until the desired accuracy was obtained. In this experiment, two different gamma values were employed for the SVM classifier, one for each of the two feature sets, to obtain the maximum accuracy. If those two gamma values were interchanged, the system would not necessarily produce the desired outcome.

10 Experimental settings

In the field of signature verification, there is a sparseness of publicly available signature databases.
The quality of the available databases also varies, as there has been no standard collection protocol. Besides, it is very costly to create a large corpus with different types of forgeries, especially skilled forgeries. In this experimental method of verification, genuine signatures were considered for training and skilled forgeries were considered for testing purposes only. For the experiments in the proposed research, our own database described in Section 5 was used. For each signature set, an SVM was trained with 12 randomly chosen genuine signatures. The negative samples for training (random forgeries) were the genuine signatures of the 99 other signature sets. Two signatures were taken from each set. In total, there were 99 × 2 = 198 random forgeries employed for training. For testing, the remaining 12 genuine signatures and the 30 skilled forgeries of the signature set being considered were employed. The numbers of samples for training and testing for experimentation with each signature set are shown in Table 1.

Table 1. Number of signature samples used per set for experimentation

            Genuine signatures   Random forgeries   Skilled forgeries
training    12                   198                n/a
testing     12                   n/a                30

11 Results and discussion

An investigation of the performance of a signature verification system involving Hindi signatures is presented here. For training and testing of the system, 5400 Hindi signatures were employed. Among these signatures, 2400 (24 × 100) samples were genuine and 3000 (30 × 100) samples were skilled forgeries. The experimental results of any signature verification system are associated with an FRR, FAR and AER (average error rate). Using the gradient feature and the Zernike moment feature, the FRR, FAR and AER were calculated. At this operational point, SVMs produced an FRR of 11.50% and an FAR of 7.12% using the gradient feature, whereas an FRR of 18.74% and an FAR of 14.60% were obtained using the Zernike moment feature.
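As a quick consistency check on these rates, the sketch below recomputes the AER and the corresponding accuracy from the reported FRR and FAR values. The text does not state the formula, so the sketch assumes that the AER is the unweighted mean of FRR and FAR and that accuracy is 100 minus the AER; the function and variable names are illustrative only. Under that assumption, the gradient-feature rates give 90.69%, which agrees with the accuracy quoted in the abstract.

```python
def average_error_rate(frr: float, far: float) -> float:
    """Unweighted mean of FRR and FAR, in percent (assumed AER definition)."""
    return (frr + far) / 2.0


# (FRR %, FAR %) pairs as reported in the text for each feature type.
reported_rates = {
    "gradient": (11.50, 7.12),
    "zernike": (18.74, 14.60),
}

summary = {}
for feature, (frr, far) in reported_rates.items():
    aer = average_error_rate(frr, far)
    # Assumption: accuracy is taken as the complement of the AER.
    accuracy = 100.0 - aer
    summary[feature] = (round(aer, 2), round(accuracy, 2))
    print(f"{feature}: AER = {aer:.2f}%, accuracy = {accuracy:.2f}%")
```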
We obtained a better accuracy compared with our previous Bangla signature verification work [13] as well as our Hindi signature verification work [15]. A 5-fold cross-validation technique was considered and an accuracy of 90.90% was obtained using the gradient feature. From the experimental results obtained, it is also observed that the performance of the gradient feature is very encouraging compared to the results obtained using the Zernike moment feature. The results obtained from these experiments using our Hindi dataset are comparatively better than the state-of-the-art methodology [11] and the accuracies are shown in Table 2.

Table 2.
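The per-set protocol of Section 10 (Table 1) can be sketched as follows. This is a minimal illustration rather than the authors' code: the text says only that two genuine signatures were taken from each of the other 99 sets, so picking them at random is an assumption, as are the tuple layout `(set index, kind, sample index)` and all function and variable names.

```python
import random

N_SETS = 100            # signature sets (writers) in the Hindi database
GENUINE_PER_SET = 24    # genuine signatures per set
FORGERIES_PER_SET = 30  # skilled forgeries per set


def build_split(target: int, rng: random.Random) -> dict:
    """Return training/testing sample identifiers for one target set."""
    genuine = list(range(GENUINE_PER_SET))
    rng.shuffle(genuine)

    # 12 randomly chosen genuine signatures train the positive class;
    # the remaining 12 are held out for testing.
    train_genuine = [(target, "genuine", i) for i in sorted(genuine[:12])]
    test_genuine = [(target, "genuine", i) for i in sorted(genuine[12:])]

    # Random forgeries: two genuine signatures from each of the other sets.
    # Assumption: the two signatures are picked at random.
    train_random_forgeries = []
    for other in range(N_SETS):
        if other == target:
            continue
        for i in rng.sample(range(GENUINE_PER_SET), 2):
            train_random_forgeries.append((other, "genuine", i))

    # Skilled forgeries of the target set are used for testing only.
    test_skilled = [(target, "forgery", i) for i in range(FORGERIES_PER_SET)]

    return {
        "train_genuine": train_genuine,
        "train_random_forgeries": train_random_forgeries,
        "test_genuine": test_genuine,
        "test_skilled_forgeries": test_skilled,
    }


split = build_split(target=0, rng=random.Random(0))
for name, samples in split.items():
    print(name, len(samples))
```

The printed counts correspond to the rows of Table 1: 12 genuine signatures and 198 random forgeries for training, 12 genuine signatures and 30 skilled forgeries for testing.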
Reference(s)