HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS
Mariano Fons Lluís
Dipòsit Legal: T. 876-2012

WARNING. Access to the contents of this doctoral thesis and its use must respect the rights of the author. It can be used for reference or private study, as well as research and learning activities or materials in the terms established by the 32nd article of the Spanish Consolidated Copyright Act (RDL 1/1996). Express and previous authorization of the author is required for any other uses. In any case, when using its content, full name of the author and title of the thesis must be clearly indicated. Reproduction or other forms of for profit use or public communication from outside TDX service is not allowed. Presentation of its content in a window or frame external to TDX (framing) is not authorized either. These rights affect both the content of the thesis and its abstracts and indexes.

UNIVERSITAT ROVIRA I VIRGILI
HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS
Mariano Fons Lluís
DL: T. 876-2012

DOCTORAL THESIS
Supervised by Dr. Enrique F. Cantó Navarro
Department of Electronic, Electrical and Automatic Control Engineering
Tarragona 2012

ESCOLA TÈCNICA SUPERIOR D’ENGINYERIA
DEPARTAMENT D’ENGINYERIA ELECTRÒNICA, ELÈCTRICA I AUTOMÀTICA
Avinguda dels Països Catalans, 26, Campus Sescelades, 43007 Tarragona – SPAIN
Tel. +34 977 559 610, Fax +34 977 559 605
e-mail: secelec@urv.net
http://sauron.etse.urv.es/DEEEA/

Enrique F.
Cantó Navarro, professor at the Department of Electronic, Electrical and Automatic Control Engineering of the University Rovira i Virgili, STATES: That the present thesis, entitled “Hardware accelerators for embedded fingerprint-based personal recognition systems”, presented by Mariano Fons Lluís for the award of the degree of Doctor, has been carried out under my supervision at the Department of Electronic, Electrical and Automatic Control Engineering of the University Rovira i Virgili.

Tarragona, March 2012
Doctoral Thesis Supervisor
Dr. Enrique F. Cantó Navarro

Abstract

The current technological age demands efficient and reliable automatic user authentication systems for everyday applications in which a certain security level is required before granting an individual access to confidential information, restricted areas or protected resources. Network logon, automated physical access control, automatic teller machines, personal digital assistants and mobile phones are some examples. While conventional solutions for user authentication have traditionally relied on knowledge-based or physical token-based methods such as passwords, personal identification numbers or identity cards, a new line of research dealing with biometrics-based approaches has opened up in recent decades. The biometrics field relies on the analysis of those physiological and/or behavioural human characteristics that are inherent, permanent and unique to each individual, such as iris, retina, face, fingerprints, gait, voice and handwriting.
Each of these features can be used as a biometric identifier linked to the individual’s identity. Although biometrics-based solutions are regarded as the most secure and reliable methods when the matching or recognition process is guided by human experts (e.g. manual and semi-automatic fingerprint recognition has been successfully applied to civilian and forensic applications for more than a century), difficulties arise when trying to develop fully automated personal authentication systems that match biometric features on their own, without the supervision of human experts. The design and development of automatic user authentication systems based on biometric features therefore remains a technological challenge. There are two main constraining factors.

The first one is the accuracy of the automated personal recognition algorithms. The algorithms used to compare and match those human characteristics automatically are not mature enough. Although manual recognition methods can reach acceptable accuracy, the reliability of current automatic algorithms, measured by indicators such as the false acceptance and false rejection rates (FAR/FRR), has proven insufficient for applications demanding high security, where confidential data and resources must not be compromised. Most biometric features have been analysed in depth by the scientific community, but the biometric verification problem is highly complex, and more research is needed to automate the experts’ reasoning in order to reach trustworthy matching reliability. The state of the art in biometric recognition algorithms shows that existing systems are inherently fallible, and that no single trait has been found to be truly distinctive.
Therefore, the general trend for making recognition algorithms more accurate is to increase the complexity of the processing by adding more computational stages, or even to combine multiple biometric features within the personal recognition system.

The second (but no less important) factor is the system architecture used to build such automatic personal authenticators as static or portable embedded systems. The complex signal processing required for personal verification demands high-performance physical platforms. The final goal is to increase people’s quality of life by spreading biometric security across a wide range of everyday consumer applications all over the world, so the ease of use of the system, the non-intrusive or harmless method used to acquire the biometric traits, the authentication time, the acceptability by the end user and the final cost of the system become major issues. A technical trade-off therefore exists when implementing such applications on embedded systems with limited processing resources while trying to meet real-world constraints such as high recognition accuracy, low cost, short time-to-market and real-time performance. Innovative architectures are needed to fulfil all those requirements on platforms other than conventional, traditionally expensive personal computer systems.

This work mainly addresses the second constraining factor, without losing sight of the first one.
It focuses on the search for new architectures able to meet the application requirements while providing enough flexibility to accommodate more complex and accurate algorithms in the near future (once the biometrics scientific community overcomes the accuracy constraints of current algorithms). The suggested architectures merge the flexibility of sequential processing provided by software with the efficiency, parallelism and acceleration provided by hardware. To allow an in-depth analysis, fingerprints have been selected, among the different physiological and behavioural personal traits, as the representative biometric feature exploited in this research work. This thesis does not focus on biometric algorithms; however, the state of the art in fingerprint-based recognition algorithms has been considered for the physical implementation of an automatic fingerprint-based authentication system (AFAS). The fingerprint recognition algorithm implemented in this work has not been developed from scratch; it is based on existing reference biometric algorithms that are well documented in the literature. In this way, it has been possible to develop, as a use case, an application with computational demands similar to those of today’s best-in-class systems.

This thesis studies in depth one architectural approach consisting of a multiprocessor system with one software block, based on central processing units (CPUs) and able to execute sequential processing tasks, and one hardware block, based on application-specific hardware accelerators, where parallel processing tasks can be synthesised in order to reduce the application execution time. Both kinds of processors, hardware-oriented and software-oriented, work concurrently in the application on field programmable gate array (FPGA) or system on programmable chip (SOPC) devices.
The main conclusions, demonstrated in this work through the physical implementation of the same application on different processing platforms and the comparison of the strengths and limitations of each solution, are as follows: (i) it is feasible to develop biometrics-based user authentication systems by means of hardware-software co-design techniques; (ii) acceleration factors of one to two orders of magnitude can be achieved by hardware-software embedded systems compared with purely software-based solutions; (iii) with the advent of hardware-software oriented platforms, more complex algorithms can be implemented and executed in real time, which benefits recognition accuracy and system reliability; and, in relation to hardware technologies, (iv) regarding development costs, application-specific hardware accelerators implemented in FPGAs or SOPCs are a good alternative to solutions based on application specific integrated circuits (ASICs), thanks to the short design and verification cycles of programmable logic electronic design automation (EDA) tools and to the high flexibility of programmable logic technologies for design maintenance (future application evolutions and upgrades). All these advantages make the proposed system architecture suitable for the development of automatic biometrics-based user authenticators as low-cost embedded systems. This thesis thus contributes to the spread and deployment of more reliable and more affordable biometric systems for today’s society.

The work is structured in 6 chapters. Its core consists of chapter 1, the introduction; chapter 5, which describes the main research work; and chapter 6, which presents the conclusions and the main publications derived from the work.
Apart from these sections, chapters 2 to 4 provide a survey on the state of the art in different areas related to biometric systems such as fingerprint-based biometric algorithms (chapter 2), system architectures and available physical devices from which to build biometric applications (chapter 3), and an extensive review of those other major topics like fingerprint sensors, biometric products and services, active research groups, international journals, conferences, technological evaluation competitions, and published research works in the field of fingerprint-based biometric systems (chapter 4). Readers interested in the state of the art on those different areas can consult the respective chapters. Readers mainly interested in the advances achieved in this thesis can skip the introductory chapters (2-4) and directly go through chapters 1, 5 and 6. ii UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 List of Acronyms 1-D 2-D 3-D ACM ADC AES AFAS AFIS AHB ALU AMBA AMD AMP ANSI APU ARM ASIC ASIP ASSP ATM BLPOC BM BRAM CAN CCD CDEFFS CISC CL CLB CMOS CN CORDIC CPU CrN CUDA CWT D2SP DAC DAPDNA DB DC DCM DDR DFT DHS DMA DNA DOG DPI One-Dimensional Space Two-Dimensional Space Three-Dimensional Space Association for Computing Machinery Analog to Digital Converter Advanced Encryption Standard Automatic Fingerprint Authentication System Automatic Fingerprint Identification System Advanced High-performance Bus Arithmetic Logic Unit Advanced Microcontroller Bus Architecture Advances Micro Devices, Inc. 
Asymmetric Multiprocessing American National Standards Institute Accelerated Processing Unit Advanced RISC Machine Application Specific Integrated Circuit Application Specific Instruction Set Processor Application Specific Standard Product Automated Teller Machine Band Limited Phase Only Correlation Bus Macro Block RAM Controller Area Network Charge Coupled Device Committee to Define an Extended Fingerprint Feature Set Complex Instruction Set Computing Certainty Level Configurable Logic Block Complementary Metal Oxide Semiconductor Connection Number Coordinate Rotation Digital Computer Central Processing Unit Crossing Number Compute Unified Device Architecture Continuous Wavelet Transform Dimensional DSP Digital to Analog Converter Digital Application Processor/Distributed Network Architecture Database Direct Current Digital Clock Manager Double Data Rate Discrete Fourier Transform Department of Homeland Security Direct Memory Access Deoxyribonucleic Acid Difference Of Gaussian Dots Per Inch iii UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
876-2012 DP-SRAM DRAM DSP DWT EBI EDA EEPROM EER ELFT EPP ESD FAR FBI FFT FIFO FIR FMR FNMR FPGA FPSLIC FPU FpVTE FRR FVC FVS GPIO GPP GPRS GPU GRS GSM HDL HLA HLSPS HPC HW I2C IC ICAP ID IDS IEC IEEE IET I/F IIR INCITS I/O IOB IP ISO Dual Port SRAM Dynamic Random Access Memory Digital Signal Processor Discrete Wavelet Transform External Bus Interface Electronic Design Automation Electrically Erasable Programmable Read Only Memory Equal Error Rate Evaluation of Latent Fingerprint Technologies Extensible Processing Platform Electrostatic Discharge False Acceptance Rate Federal Bureau of Investigation Fast Fourier Transform First In, First Out Finite Impulse Response False Match Rate False Non-Match Rate Field Programmable Gate Array Field Programmable System Level Integrated Circuit Floating Point Unit Fingerprint Vendor Technology Evaluation False Rejection Rate Fingerprint Verification Competition Fingerprint Verification System General Purpose Input/Output General Purpose Processor General Packet Radio Services Graphics Processing Unit Globally Removable Pixels Set Global System for Mobile Communications Hardware Description Language High Level Application High Level Security Portable System High-Performance Computing Hardware Inter-Integrated Circuit Integrated Circuit Internal Configuration Access Port Identification Intrusion Detection Systems International Electrotechnical Commission Institute of Electrical and Electronics Engineers Institution of Engineering and Technology Interface Infinite Impulse Response International Committee for Information Technology Standards Input/Output Input/Ouptut Block Intellectual Property International Organization for Standardization iv UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
876-2012 ISP ITL JMD LAN LCD LED LIN LLA LUT MAC MCU MDPI MEMS MIMD MINDTCT MINEX MIPS MMU MPU MTAP MTBF NFIS2 NIST NN NPI NRE OMINEX OTP PC PCB PCI PDA PFT PIN PLB PLD PLL PMCD POE POEBVA PRR PSTN PWM RAM RCS RD RF RFID RISC RMR RNMR In-System Programmable Information Technology Laboratory Justice Management Division Local Area Network Liquid Crystal Display Light Emitting Diode Local Interconnect Network Low Level Application Lookup Table Multiply-Accumulate Microcontroller Unit Multidisciplinary Digital Publishing Institute Microelectromechanical Systems Multiple Instructions, Multiple Data Minutiae Detector Minutiae Interoperability Exchange Test Million Instructions Per Second Memory Management Unit Microprocessor Unit Multi-Threaded Array Processor Mean Time Between Failures NIST Fingerprint Image Software Version 2 National Institute of Standards and Technology Neighbour Number Native Port Interface Non Recurring Engineering Ongoing Minutiae Interoperability Exchange Test One Time Programmable Personal Computer Printed Circuit Board Peripheral Component Interconnect Personal Digital Assistant Proprietary Fingerprint Template Personal Identification Number Processor Local Bus Programmable Logic Device Phase Lock Loop Phase Matched Clock Divider Point Of Entry Point Of Entry Bio-VISA Application Partial Reconfigurable Region Public Switched Telephone Network Pulse Width Modulation Random Access Memory Ridge Coordinate System Read Radio Frequency Radio Frequency Identification Reduced Instruction Set Computing Right Match Rate Right Non-Match Rate v UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
876-2012 ROC ROI ROM RS-232 RS-485 RTL SDK SDRAM SHA SIFT SIM SIMD SMP SOC SOPC SPEAR2 SPI SP-SRAM SRAM SW TFT UART UNED US USB USD VHDL VHSIC VLIW VLSI WR XML Receiver Operating Characteristic Region Of Interest Read Only Memory Recommended Standard 232 Recommended Standard 485 Register Transfer Level Software Development Kit Synchronous Dynamic Random Access Memory Similarity Histogram Approach Scale Invariant Feature Transformation Subscriber Identification Module Single Instruction, Multiple Data Symmetric Multiprocessing System On Chip System On Programmable Chip Scalable Processor for Embedded Applications in Real-time Serial Peripheral Interface Single Port SRAM Static Random Access Memory Software Thin Film Transistor Universal Asynchronous Receiver/Transmitter Universidad Nacional de Educación a Distancia Unites States Universal Serial Bus United States Dollar VHSIC Hardware Description Language Very High Speed Integrated Circuit Very Long Instruction Word Very Large Scale Integration Write Extensible Markup Language vi UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 List of Figures Figure 1. Figure 2. Figure 3. Figure 4. Figure 5. Figure 6. Figure 7. Figure 8. Figure 9. Figure 10. Figure 11. Figure 12. Figure 13. Figure 14. Figure 15. Security performance comparison of personal recognition technologies. Variety of ID tokens (e.g. smart cards) used in automatic personal recognition systems. Annual biometric industry revenues ($M USD). Original source: Biometrics Market and Industry Report 2009-2014 from International Biometric Group, 2009. Market share of the leading biometric technologies (excludes AFIS revenues). Original source: Biometrics Market and Industry Report 2009-2014 from International Biometric Group, 2009. Block diagram of the enrolment process. Block diagram of the authentication process. Block diagram of the identification process. 
Template – Query evaluation process. Recognition system errors and accuracy descriptors. Typical FMR and FNMR operation points for different applications. Ideal personal recognition system performance. No overlap between G(t) and I(t) distributions exist, and EER = 0 when selecting the proper application threshold. Detail of human finger. Examples of arch patterns: A&B) plain arches, C&D) tented arches. Detail of human hand. The radius joins the hand on the same side as the thumb, and the ulna on the same side as the little finger. Examples of loop patterns: E&F correspond to radial loops if they belong to fingers of the right hand, or ulnar loops if they belong to fingers of the left hand. Similarly, G&H are ulnar loops if they belong to the right hand or radial loops if they belong to the left hand. Patterns E&F are also called left loops, and G&H right loops regardless of which hands they belong to. Examples of whorl patterns: I&M are plain whorls, J&N are central pocket loops, K&O are double loops, and L&P are accidental whorls. Fingerprint impressions of different users. Example of low inter-class variability exhibited by fingerprints. Fingerprint impressions of the same user. Example of high intra-class variability exhibited by fingerprints. Human experts matching fingerprints. Automatic Fingerprint Authentication System. User enrolment process. User authentication process. Processing steps involved in the enrolment and authentication stages. Rolled fingerprinting technology. One-touch sensing technology. Sweeping sensing technology. Touchless sensing technology. Examples of biometrics-based products embedding fingerprint sweep sensors. Original fingerprint impression with well-defined, recoverable and unrecoverable regions. Field orientation map of a greyscale fingerprint. Image enhancement through contextual filtering. Directional morphological filter templates. Directional median filter templates. Singular points. 
Location of singular points in fingerprints. vii Figure 16. Figure 17. Figure 18. Figure 19. Figure 20. Figure 21. Figure 22. Figure 23. Figure 24. Figure 25. Figure 26. Figure 27. Figure 28. Figure 29. Figure 30. Figure 31. Figure 32. Figure 33. Figure 34. Figure 35. UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Figure 36. Figure 37. Figure 38. Figure 39. Figure 40. Figure 41. Figure 42. Figure 43. Figure 44. Figure 45. Figure 46. Figure 47. Figure 48. Figure 49. Figure 50. Figure 51. Figure 52. Figure 53. Figure 54. Figure 55. Figure 56. Figure 57. Figure 58. Figure 59. Figure 60. Figure 61. Figure 62. Figure 63. Figure 64. Figure 65. Figure 66. Figure 67. Figure 68. Figure 69. Figure 70. Figure 71. Figure 72. Figure 73. Figure 74. Fingerprint ridge maps. Fingerprint minutiae (ridge endings in blue, and ridge bifurcations in red). Image thinning. Level 3 features: fingerprint pores. Level 3 features: fingerprint dots and incipient ridges. Alignment result of template (a) and query (b) fingerprints based on minutiae sets (c). Recognition accuracy performance of single and multiple feature-based matcher systems. FPGA chip – Basic blocks. Xilinx Zynq-7000 SOC block diagram. Altera SoC FPGA block diagram. Tilera TILEPro64 block diagram. Aspex AsProCore block diagram. ClearSpeed CSX700 block diagram. Digital signal processing architectures. Quality & Cost balance in the development of an AFAS application. Embedded system architecture. Relationship between the size (amount of resources) of one FPGA device and its cost. Costs details are approximate (only for reference). Fingerprint recognition stages: composition of Enrolment and Authentication processes. AFAS application development workflow. Processing stages of the suggested fingerprint verification system. Authentication result decision. Genuine and Impostor distributions. 
False Match and False Non-match distributions. Receiver Operating Characteristic curve. Match-on-Card system architecture. Authentication-on-Board embedded system architecture. Development of one application as a set of sequential stages, and partitioning of each of the stages into hardware and software tasks that can be executed either sequentially or in parallel. Comparison of static FPGA-based design concept (left side) and run-time reconfigurable-FPGA-based design concept (right side). Embedded system architecture suggested in this work for the physical implementation of the AFAS application. In the semiconductor industry, the size of the chip drives its cost. Design of a chip of big dimensions (36 chips per wafer, left side), design of a chip of smaller dimensions (164 chips per wafer, central side), and real wafer photo (right side). FCD4B14 fingeprint sensor. Frame transmission sequence (1124 clocks). Fingerprint slices transmission sequence. Fingerprint image reconstruction process from the consecutive acquired slices. Fingerprint image acquisition process flow. Fingerprint acquisition tasks scheduling. Fingerprint image acquisition and reconstruction on-the-fly process. Fingerprint image enhancement algorithm: processing stages. Sobel mask operators: a) Sobel mask used to compute Gx, b) Sobel mask used to compute Gy. viii UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Figure 75. Figure 76. Figure 77. Figure 78. Figure 79. Figure 80. Figure 81. Figure 82. Figure 83. Figure 84. Figure 85. Figure 86. Figure 87. Figure 88. Figure 89. Figure 90. Figure 91. Figure 92. Figure 93. Figure 94. Figure 95. Figure 96. Figure 97. Figure 98. Figure 99. Figure 100. Figure 101. Figure 102. Figure 103. Figure 104. Figure 105. Figure 106. Figure 107. Figure 108. Figure 109. Figure 110. Figure 111. Figure 112. Figure 113. Figure 114. Figure 115. Figure 116. Figure 117. 
Fingerprint image segmentation: a) original image (greyscale, 256 levels), b) segmented image (average block threshold Gthreshold = 700). Fingerprint image normalization: a) input image (m = 111, σ2 = 1002), b) normalized image (m0 = 127, σ02 = 1272). Discrete isotropic filter Gf. Image isotropic filtering: a) normalized image, b) filtered image. Orientation maps computation: a) field orientation map b) filtered field orientation map. Discrete directional filter Hf,0º. Image directional filtering and binarization: a) greyscale image, b) binarized image. Image directional smoothing: a) binary image, b) smooothed image. Directional smoothing filters: a) Ridge orientation Φ’ = 0º, b) Ridge orientation Φ’ = 45º. Embedded system topology. Three-level hardware accelerator modular design. Image processing tasks scheduling. Computation of the directional gradient Gx of the image pixels p(j,i). Only shifts and additons of integer operands are needed to compute the gradient at pixel level. Image segmentation hardware coprocessor (part I). Image segmentation hardware coprocessor (part II). Image normalization hardware coprocessor. Image convolution with symmetric filters. Image isotropic filtering hardware coprocessor. Field orientation map hardware coprocessor. Symmetry of Gabor filters. Construction of directional Gabor filters Hf,Φ' from the field orientation data. Image binarization hardware coprocessor. Image smoothing hardware coprocessor. Fingerprint feature extraction algorithm: processing stages. Image pixel neighbourhood definition (kernel 3×3). Image thinning process flow. Image skeletonization: a) smooothed image, b) thinned image. Minutia descriptor definition. Minutiae extraction and verification: a) thinned image, b) extracted minutiae. Thinning process application flow. Iterative image thinning hardware coprocessor. Irreducible image thinning hardware coprocessor. Minutiae extraction hardware coprocessor. Fingerprints alignment algorithm: processing stages. 
Filtered field orientation maps to be aligned: a) Template fingerprint, b) Query fingerprint. Field orientation matrices correlation process. Overlapped region. Field orientation map rotation process. Fingerprints alignment methodology. Field orientation maps: a) Template fingerprint, b) Query fingerprint. Template and query field orientation maps alignment result. Template and query minutiae sets alignment result. Field orientation maps alignment algorithm: processing stages. Alignment process: hardware-software partitioning. ix UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Figure 118. Figure 119. Figure 120. Figure 121. Figure 122. Figure 123. Figure 124. Figure 125. Figure 126. Figure 127. Figure 128. Figure 129. Figure 130. Figure 131. Figure 132. Figure 133. Figure 134. Figure 135. Figure 136. Figure 137. Figure 138. Figure 139. Figure 140. Figure 141. Alignment hardware coprocessor. Partially-pipelined correlation process. Fingerprints matching algorithm: processing stages. Fingerprint minutia descriptor (d,φ,γ), where (x0,y0,β0) is a ridge ending used as reference and (x1,y1,β1) is a ridge bifurcation used as neighbour of the reference minutia. Characterization of minutia m0 from its 8 closer neighbours m1, … , m8. Corresponding template-query minutia pairs on the aligned images. Integration of the hardware matching coprocessor in the embedded system: basic interfaces. Matching hardware coprocessor. Set of processing tasks applied to template and query fingerprints along the enrolment and authentication stages. 
Processing stages involved in the personal recognition algorithm: (1) & (2) fingerprint image acquisition of template (left side) and query (right side) based on sweeping technology sensors, fingerprint image reconstruction from acquired slices, image segmentation and normalization, (3) fingerprint image enhancement based on isotropic filtering, (4) & (5) computation of field orientation and filtered field orientation maps, (6) directional filtering and image binarization, (7) image smoothing, (8) & (9) image thinning, minutiae extraction and minutiae filtering, (A) & (B) template-query feature sets alignment and matching. System processors in charge of the execution of the application tasks. Physical layout of FPSLIC device. Block diagram of FPSLIC device. Atmel FPSLIC-based development platform. Fingerprint acquisition system under FPSLIC-based development platform. FPSLIC-based physical prototype developed in this work. Automatic fingerprint acquisition system developed with Atmel FPSLIC device. Physical layout of EXCALIBUR device. Automatic fingerprint authentication system developed under Altera EPXA10 evaluation platform. Altera EXCALIBUR-based development platform. Physical layout of VIRTEX-4 XC4VLX25 device. Automatic fingerprint authentication system developed under Xilinx ML401 evaluation platform. VIRTEX-4 XC4VLX25 FPGA floorplan. Partitioning between Static (black area) and Partially Reconfigurable (grey area) regions suggested in this work. Xilinx VIRTEX4-based development platform. x UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 List of Tables Table 1. Table 2. Table 3. Table 4. Table 5. Comparison of biometric technologies (H: high, M: medium and L: low performances). Original source: Handbook of Fingerprint Recognition, 2003&2009. Biometric technologies, application fields and key exploitation markets. Fingerprints classification. 
Table 4. Accuracy and efficiency indicators –EER (Avg. Matching Time)– of the three most accurate algorithms in FVC evaluation contests. Databases are different in each evaluation contest.
Table 5. Accuracy indicators –EER– of those published results in FVC-onGoing on-line evaluator web site since June 2009 till December 2011. Performance evaluation of fingerprint verification open algorithms (left side) and matching algorithms based on standard minutiae representation template format [ISO/IEC 19794-2 (2005)] (right side).
Table 6. Hybrid matching algorithms.
Table 7. DSP processors.
Table 8. Logic fabric & interconnect.
Table 9. FPGA devices.
Table 10. Fingerprint sensor suppliers.
Table 11. Fingerprint sensor devices.
Table 12. Companies that support the development of AFAS applications.
Table 13. Fingerprint-based processor chips.
Table 14. Fingerprint-based embedded system modules.
Table 15. Fingerprint-based SDKs.
Table 16. List of participants FVC2000.
Table 17. Performance results FVC2000.
Table 18. List of participants FVC2002.
Table 19. Performance results FVC2002.
Table 20. List of participants FVC2004.
Table 21. Performance results FVC2004 (open category).
Table 22. Performance results FVC2004 (light category).
Table 23. List of participants FVC2006.
Table 24. Performance results FVC2006 (open category).
Table 25. Performance results FVC2006 (light category).
Table 26. Published performance results FVC-onGoing (fingerprint verification) in period 2009-2011.
Table 27. Published performance results FVC-onGoing (minutiae matching - ISO) in period 2009-2011.
Table 28. Performance results Medium-Scale test FpVTE2003.
Table 29. Performance results ongoing PFT.
Table 30. List of participants PFTII and 1K sample enrolment/matching time results distribution.
Table 31. PFTII 1K sample template size distribution.
Table 32. PFTII recognition accuracy performance results.
Table 33. List of participants MINEX04 and recognition performance results under POEBVA database.
Table 34. FNMR at FMR = 1% for the POEBVA database in MIN:A encoding scheme. The SDK identified in the row produces the enrolment template. The SDK identified in each column produces the authentication template and performs the matching.
Table 35. FNMR at FMR = 1% for the POEBVA database in MIN:B encoding scheme. The SDK identified in the row produces the enrolment template. The SDK identified in each column produces the authentication template and performs the matching.
Table 36. Template generation times. Average execution time, in milliseconds, for generation of proprietary templates in each of the databases.
Table 37. Template matching times. Average execution time, in milliseconds, for matching of proprietary templates in each of the databases.
Table 38. List of participants OMINEX.
Table 39. FNMR values at FMR=0.05% in case of performing the match-on-card operation of one single enrolment fingerprint template and one authentication fingerprint template in 4 different scenarios (the enrolment and authentication templates are produced by different sources in each scenario).
Table 40. Median durations of the ISO/IEC 7816-4 VERIFY command (for genuine minutiae template comparisons) derived from 1210 genuine trials.
Table 41. Some of the most active academy-related research groups in the field of fingerprint biometrics.
Table 42. Some of the most relevant companies in the field of fingerprint biometrics.
Table 43. List of Journals.
Table 44. List of International Conferences.
Table 45. Software-based AFAS applications disclosed in literature.
Table 46. Hardware-based AFAS applications disclosed in literature.
Table 47. Hardware/software-based AFAS applications disclosed in literature.
Table 48. Performance comparison of the proposed algorithm against FVC2004 DB3 Open Category contest results.
Table 49. Performance comparison of the proposed algorithm against FVC2004 DB3 Light Category contest results.
Table 50. Computational platforms used in the execution time performance evaluation process.
Table 51. Enrolment process execution time performance.
Table 52. Authentication process execution time performance.
Table 53. FCD4B14 fingerprint sensor characteristics.
Table 54. Physical implementation in different platforms.
Table 55. Ridge pixels type definition.
Table 56. Tasks involved in the fingerprints matching stage.
Table 57. Technical description of SOPC devices used in this work.
Table 58. FPGA reconfiguration characteristics of each of the devices under evaluation.
Table 59. FPSLIC resource usage in the development of the automatic fingerprint acquisition system.
Table 60. Execution time performance reached in the enrolment stage: SW-only versus HW-SW implementations.
Table 61. Execution time performance reached in the authentication stage: SW-only versus HW-SW implementations.
Table 62. FPGA resources usage in each of the contexts in which the application is partitioned.
Table 63. Spatial partitioning of the programmable logic device into one static and one reconfigurable regions.
Table 64. Execution time performance reached in the enrolment stage: SW-only versus HW-SW implementations.
Table 65. Execution time performance reached in the authentication stage: SW-only versus HW-SW implementations.
Table 66. FPGA resources usage in each of the contexts in which the application is partitioned.
Table 67. Execution time performance comparison table.

List of Publications

Articles in Indexed Journals
E. Cantó, M. Fons, F. Fons, M. López and R. Ramos. Fast self-reconfigurable embedded system on Spartan-3. Journal of Universal Computer Science (under second review).
M. Fons, F. Fons, and E. Cantó. Biometrics-based consumer applications driven by reconfigurable hardware architectures. Future Generation Computer Systems, Vol. 28, No. 1, pp. 268-286, 2012, Elsevier.
F. Fons, M. Fons, E. Cantó, and M. López. Deployment of run-time reconfigurable hardware coprocessors into compute-intensive embedded applications. Journal of Signal Processing Systems, Vol. 66, No. 2, pp. 191-221, 2012, Springer.
M. Fons, F. Fons, E. Cantó, and M. López. FPGA-based personal authentication using fingerprints. Journal of Signal Processing Systems, Vol. 66, No. 2, pp. 153-189, 2012, Springer.
F. Fons, M. Fons, E. Cantó, and M. López. Real-time embedded systems powered by FPGA dynamic partial self-reconfiguration: a case study oriented to biometric recognition applications. Journal of Real-Time Image Processing, pp. 1-23, 2011, Springer, doi:10.1007/s11554-010-0186-1.
F. Fons, M. Fons, and E. Cantó. Run-time self-reconfigurable 2D convolver for adaptive image processing. Microelectronics Journal, Vol. 42, No. 1, pp. 204-217, 2011, Elsevier.
M. Fons, F. Fons, and E. Cantó. Fingerprint image processing acceleration through run-time reconfigurable hardware. IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 57, No. 12, pp. 991–995, 2010, IEEE.
F. Fons, M. Fons, E. Cantó, and M. López. Trigonometric computing embedded in a dynamically reconfigurable CORDIC system-on-chip. Reconfigurable Computing: Architectures and Applications, Lecture Notes in Computer Science, Vol. 3985, pp. 122-127, 2006, Springer.
E. Cantó, N. Canyellas, M. Fons, F. Fons, and M. López. FPGA Implementation of the ridge line following fingerprint algorithm. Field Programmable Logic and Applications, Lecture Notes in Computer Science, Vol. 3203, pp. 1087-1089, 2004, Springer.

Book Chapters
M. Fons, and F. Fons. Exploiting run-time reconfigurable hardware in the development of automatic fingerprint-based personal recognition applications. Recent Applications in Biometrics, InTech, pp. 239-266, July 2011, ISBN 978-953-307-488-7.

Communications to International Conferences
E. Cantó, M. Fons, M. López, and R. Ramos. Acceleration of complex algorithms on a fast reconfigurable embedded system on Spartan-3. IEEE International Conference on Field Programmable Logic and Applications (FPL 2009), pp. 429-434, 31 August – 2 September 2009, Prague, Czech Republic, ISBN 978-1-4244-3892-1.
M. Fons, F. Fons, and E. Cantó. Embedded VLSI accelerators for fingerprint signal processing. IEEE International Symposium on Intelligent Signal Processing (WISP 2007), pp. 1-6, 3 – 5 October 2007, Alcalá de Henares, Madrid, Spain, ISBN 978-1-4244-0830-6.
F. Fons, M. Fons, and E. Cantó. Approaching fingerprint image enhancement through reconfigurable hardware accelerators. IEEE International Symposium on Intelligent Signal Processing (WISP 2007), pp. 1-6, 3 – 5 October 2007, Alcalá de Henares, Madrid, Spain, ISBN 978-1-4244-0830-6.
M. Fons, F. Fons, E. Cantó, and M. López. Design of a hardware accelerator for fingerprint alignment. IEEE International Conference on Field Programmable Logic and Applications (FPL 2007), pp. 485-488, 27 – 29 August 2007, Amsterdam, The Netherlands, ISBN 978-1-4244-1060-6.
M. Fons, F. Fons, and E. Cantó. Embedded security: new trends in personal recognition systems. IEEE International Conference on Ph.D. Research in MicroElectronics and Electronics (PRIME 2007), pp. 89-92, 2 – 5 July 2007, Bordeaux, France, ISBN 978-1-4244-1000-2.
F. Fons, M. Fons, E. Cantó, and M. López. Flexible hardware for fingerprint image processing. IEEE International Conference on Ph.D. Research in MicroElectronics and Electronics (PRIME 2007), pp. 169-172, 2 – 5 July 2007, Bordeaux, France, ISBN 978-1-4244-1000-2.
M. Fons, F. Fons, and E. Cantó. Hardware-software codesign of a fingerprint alignment processor. IEEE International Conference on Mixed Design of Integrated Circuits and Systems (MIXDES 2007), pp. 661-666, 21 – 23 June 2007, Ciechocinek, Poland, ISBN 83-922632-9-4.
F. Fons, M. Fons, and E. Cantó. Hardware-software co-design of a dynamically reconfigurable FPGA-based fuzzy logic controller. IEEE International Conference on Electronics, Circuits and Systems (ICECS 2006), pp. 1228-1231, 10 – 13 December 2006, Nice, France, ISBN 1-4244-0395-2.
M. López, E. Cantó, and M. Fons. Hardware-software co-design of a fingerprint image enhancement algorithm. 32nd Annual Conference of the IEEE Industrial Electronics Society (IECON 2006), pp. 3496-3501, 6 – 10 November 2006, Paris, France, ISBN 1-4244-0390-1.
F. Fons, M. Fons, and E. Cantó. System-on-chip design of a fuzzy logic controller based on dynamically reconfigurable hardware. International Transactions on Systems Science and Applications – ITSSA (SOAS 2006), Vol. 2, No. 2, pp. 191-196, September 2006, Erfurt, Germany.
M. López, E. Cantó, M. Fons, A. Manuel, and J. del Rio. Hardware coprocessor design for fingerprint image enhancement. IEEE International Midwest Symposium on Circuits and Systems (MWSCAS 2006), pp. 520-524, 6 – 9 August 2006, San Juan, Puerto Rico, USA, ISBN 1-4244-0173-9.
M. Fons, F. Fons, and E. Cantó. Design of an embedded fingerprint matcher system. IEEE International Symposium on Consumer Electronics (ISCE 2006), pp. 1-6, 28 June – 1 July 2006, Saint Petersburg, Russia, ISBN 1-4244-0216-6.
M. Fons, F. Fons, and E. Cantó. Design of FPGA-based hardware accelerators for on-line fingerprint matcher systems. IEEE International Conference on Ph.D. Research in MicroElectronics and Electronics (PRIME 2006), pp. 333-336, 12 – 15 June 2006, Otranto, Lecce, Italy, ISBN 1-4244-0157-7.
F. Fons, M. Fons, and E. Cantó. Custom-made design of a digital PID control system. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006), Vol. 3, pp. 1020-1023, 14 – 19 May 2006, Toulouse, France, ISBN 1-4244-0469-X.
M. Fons, F. Fons, E. Cantó, and M. López. Hardware-software co-design of a fingerprint matcher on card. IEEE International Conference on Electro/Information Technology (EIT 2006), pp. 113-118, 7 – 10 May 2006, East Lansing, Michigan, USA, ISBN 0-7803-9592-1.
F. Fons, M. Fons, E. Cantó, and M. López. Dynamically reconfigurable CORDIC coprocessor for trigonometric computing. International Conference on Architecture of Computing Systems (ARCS 2006), Lecture Notes in Informatics, Vol. P-81, pp. 254-263, 13 – 16 March 2006, Frankfurt am Main, Germany, ISBN 3-88579-175-7.
M. Fons, F. Fons, N. Canyellas, E. Cantó, and M. López. Hardware-software co-design of an automatic fingerprint acquisition system. IEEE International Symposium on Industrial Electronics (ISIE 2005), Vol. 3, pp. 1123-1128, 20 – 23 June 2005, Dubrovnik, Croatia, ISBN 0-7803-8738-4.
E. Cantó, N. Canyellas, M. López, M. Fons, and F. Fons. Coprocessor of the ridge line following fingerprint algorithm. XIX Conference on Design of Circuits and Integrated Systems (DCIS 2004), pp. 139-143, 24 – 26 November 2004, Bordeaux, France, ISBN 2-9522971-0-X.
F. Fons, M. Fons, and S. Ibáñez. Biometrics is the key. 24. Tagung Elektronik im Kraftfahrzeug – Neue Technologien, Integration und Systementwurf, 29 – 30 June 2004, Essen, Germany.

Communications to National Conferences
M. Fons, F. Fons, E. Cantó, and M. López. Procesador de alineamiento de huellas dactilares. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2007), pp. 27-34, 12 – 14 September 2007, Zaragoza, Spain.
F. Fons, M. Fons, E. Cantó, and M. López. Procesador hardware auto-reconfigurable de huella dactilar. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2007), pp. 19-26, 12 – 14 September 2007, Zaragoza, Spain.
E. Cantó, M. López, N. Canyellas, M. D. Palomera, M. Fons, and F. Fons. Coprocesador para la esqueletización de huellas dactilares. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2005), pp. 103-108, 13 – 16 September 2005, Sevilla, Spain.
M. López, E. Cantó, N. Canyellas, M. D. Palomera, M. Fons, and F. Fons. Diseño de un coprocesador hardware para segmentación de huellas dactilares. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2005), pp. 173-178, 13 – 16 September 2005, Sevilla, Spain.
E. Cantó, N. Canyellas, M. Fons, F. Fons, and M. López. Coprocesador de extracción de minutia para MicroBlaze. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2004), pp. 605-611, 13 – 15 September 2004, Barcelona, Spain.
M. Fons, F. Fons, N. Canyellas, M. López, and E. Cantó. Codiseño hardware-software de un algoritmo de matching biométrico. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2003), pp. 399-406, 10 – 12 September 2003, Madrid, Spain.
F. Fons, M. Fons, N. Canyellas, M. López, and E. Cantó. Planteamiento de una alternativa de solución al reto del proceso de matching sobre bases de datos grandes. Aplicación del método en los sistemas de identificación personal basados en biometría de huella dactilar. Jornadas sobre Computación Reconfigurable y Aplicaciones (JCRA 2003), pp. 597-610, 10 – 12 September 2003, Madrid, Spain.
M. Fons, F. Fons, N. Canyellas, M. López, and E. Cantó. Trends on personal recognition systems: evolving to biometric security. Workshop on Electron Devices and Microelectronics, June 2003, Tarragona, Spain.
F. Fons, M. Fons, N. Canyellas, M. López, and E. Cantó. Trusted smart cards: a future new generation of embedded systems that merges biometrics and system-on-chip technology. Workshop on Electron Devices and Microelectronics, June 2003, Tarragona, Spain.

Other Publications
F. Fons, and M. Fons. FPGA-based automotive ECU design addresses AUTOSAR and ISO 26262 standards. Xcell Journal, Issue 78, pp. 20-31, January 2012, www.xilinx.com.
F. Fons, and M. Fons. Auf die finger blacken. Elektronik Journal, pp. 16-18, October 2010, www.elektronikjournal.com.
F. Fons, and M. Fons. Making biometrics the killer App of FPGA dynamic partial reconfiguration. Xcell Journal, Issue 72, pp. 24-31, July 2010, www.xilinx.com.
F. Fons, M. Fons, E. Cantó, and M. López. Dynamically reconfigurable CORDIC coprocessor for trigonometric computing. Mitteilungen – Gesellschaft für Informatik (GI) e. V., Parallel-Algorithmen und Rechnerstrukturen, No. 23, pp. 34-43, December 2006, ISSN 0177-0454.
F. Fons, M. Fons, and E. Cantó. System-on-chip design of a fuzzy logic controller based on dynamically reconfigurable hardware. International Transactions on Systems Science and Applications (ITSSA), Vol. 2, No. 2, pp. 191-196, Xiaglow Research, 2006, ISSN 1751-1461.
M. López, E. Cantó, M. Palomera, F. Fons, M. Fons, and N. Canyellas. Hardware-software codesign for fingerprint biometric identification. Instrumentation Viewpoint, Issue 3, pp. 7-10, Spring 2005, SARTI Technological Development Centre of Remote Acquisition and Data Processing Systems, ISSN 1697-2562.

TABLE OF CONTENTS
Abstract
List of Acronyms
List of Figures
List of Tables
List of Publications
  Articles in Indexed Journals
  Book Chapters
  Communications to International Conferences
  Communications to National Conferences
  Other Publications
1. Introduction
  1.1. Personal Recognition Systems
  1.2. Fingerprint Biometrics
  1.3. Physical Implementation of Fingerprint-Based Recognition Systems
  1.4. Thesis Motivation and Scope
2. Automatic Personal Authentication Using Fingerprints
  2.1. Personal Authentication Process
    2.1.1. Enrolment
    2.1.2. Verification
  2.2. Fingerprint Acquisition
    2.2.1. Sensing Techniques
    2.2.2. Optical Sensors
    2.2.3. Solid-State Sensors
    2.2.4. Ultrasound Sensors
    2.2.5. Conclusions
  2.3. Image Enhancement
    2.3.1. Image Conditioning
    2.3.2. Field Orientation Map Computation
    2.3.3. Ridge Frequency Map Computation
    2.3.4. Segmentation and Region Mask Estimation
    2.3.5. Contextual Image Filtering
    2.3.6. Conclusions
  2.4. Feature Extraction
    2.4.1. Greyscale-based Features
    2.4.2. Singular Points
    2.4.3. Ridge Map
    2.4.4. Minutiae
    2.4.5. Pores and Other Ridge Details
    2.4.6. Conclusions
  2.5. Feature Alignment
    2.5.1. Alignment Based on Greyscale Information
    2.5.2. Alignment Based on Singular and Other Reference Points
    2.5.3. Alignment Based on Minutiae
    2.5.4. Alignment Based on Ridge Map
    2.5.5. Alignment Based on Field Orientation
    2.5.6. Alignment Based on Pores
    2.5.7. Hybrid Alignment Techniques
    2.5.8. Conclusions
  2.6. Feature Matching
    2.6.1. Correlation-Based Matching Techniques
    2.6.2. Minutiae-Based Matching Techniques
    2.6.3. Ridge or Non-Minutiae Feature-Based Matching Techniques
    2.6.4. Hybrid Matching Techniques
    2.6.5. Conclusions
  2.7. Conclusions
3. Architectures for Real-Time Digital Signal Processing
  3.1. Microprocessor / Microcontroller Chips
  3.2. Digital Signal Processor Chips
  3.3. Application-Specific Chips
  3.4. Structured ASIC Chips
  3.5. Field Programmable Gate Array Chips
  3.6. Multi-core Chips
  3.7. Array of Processing Elements
  3.8. Multiprocessor Systems
  3.9. Conclusions
4. Fingerprint-Based Biometric Systems
  4.1. Fingerprint Sensors
  4.2. Automatic Fingerprint Authentication Systems (AFAS)
  4.3. Technology Evaluation
    4.3.1. Fingerprint Verification Competition 2000 (FVC2000)
    4.3.2. Fingerprint Verification Competition 2002 (FVC2002)
    4.3.3. Fingerprint Verification Competition 2004 (FVC2004)
    4.3.4. Fingerprint Verification Competition 2006 (FVC2006)
    4.3.5. On-line Evaluation of Fingerprint Recognition Algorithms (FVC-onGoing)
    4.3.6. Fingerprint Vendor Technology Evaluation (FpVTE2003)
    4.3.7. Proprietary Fingerprint Template Evaluation (PFT)
    4.3.8. Fingerprint Template Evaluation II (PFTII)
    4.3.9. Minutiae Interoperability Exchange Test (MINEX04)
    4.3.10. Ongoing Minutiae Interoperability Exchange Test (OMINEX)
    4.3.11. Assessment of Match-on-Card Technology (MINEXII)
  4.4. Research Institutions
    4.4.1. Academy
    4.4.2. Industry
  4.5. Research Disclosure
    4.5.1. Journals
    4.5.2. Conferences
  4.6. Related Work
    4.6.1. AFAS SW
    4.6.2. AFAS HW
    4.6.3. AFAS HW/SW
    4.6.4. Other Biometric Systems
    4.6.5. Multibiometric Systems
  4.7. Conclusions
5. AFAS Architecture Approach with FPGA
  5.1. Background
  5.2. Design Flow
    5.2.1. Authentication Algorithm: Processing Stages
    5.2.2. Recognition Accuracy Evaluation
    5.2.3. Processing Speed Evaluation under Software-based Platforms
    5.2.4. Physical Platform: High-Performance and Low-Cost Driven Design
  5.3. Fingerprint Acquisition Approach
    5.3.1. Selection of the Sensing Technique
    5.3.2. Hw/Sw Partitioning
    5.3.3. Physical Implementation
  5.4. Image Enhancement Approach
    5.4.1. Algorithm Description
    5.4.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration
    5.4.3. Physical Implementation – Design Development – Design Implementation
  5.5. Feature Extraction Approach
    5.5.1. Algorithm Description
    5.5.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration
    5.5.3. Physical Implementation – Design Development – Design Implementation
  5.6. Fingerprints Alignment Approach
    5.6.1. Algorithm Description
    5.6.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration
    5.6.3. Physical Implementation – Design Development – Design Implementation
  5.7. Fingerprints Matching Approach
    5.7.1. Algorithm Description
    5.7.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration
    5.7.3. Physical Implementation – Design Development – Design Implementation
  5.8. System Integration
    5.8.1. Recognition Application: Test Case
    5.8.2. Atmel FPSLIC Development Platform and Physical Prototype
    5.8.3. Altera Excalibur EPXA10 Development Platform
    5.8.4. Xilinx Virtex-4 ML401 Evaluation Platform
  5.9. Conclusions
6. Research Contribution
  6.1. Conclusions of the Thesis
  6.2. Research Projects
    6.2.1. TRUST-eS
    6.2.2. DELFIN
    6.2.3. PIBES
  6.3. Future Work
References/Bibliography

1. Introduction

When an electronic engineer decides to pursue a PhD degree, many questions arise that need careful analysis before they can be answered properly. Taking on this challenge means accepting that some habits will change in the coming years; perhaps many of them, if the engineer chooses, as I did, to combine this new experience with the development of a professional career in the electronics industry. Once the challenge is accepted, two basic questions have to be answered.
The first one refers to the starting point of the work: “What research topic should I focus my thesis on?”, whereas the second one tries to foresee the end point: “What should the final goal of my research work be?”. In the following lines, I try to give a clear view of my personal answers to these questions.

The choice of the research topic is a key factor that largely determines the final success of the work. It is important to select a topic that you really like, since you will invest many working hours in it. It was therefore easy for me to answer the first question, since I love the electronic design of embedded systems based on any kind of smart device such as microprocessors, microcontrollers, field programmable gate arrays, or others. Once the technological research field has been identified, the next step consists in finding the application field where such a technology can bring novel contributions and outstanding benefits. Unfortunately, the selection of the application field was also easy. I call this unfortunate because my decision was influenced to some extent by the events that took place on September 11th, 2001.

My work is focused on the research of application-driven architectures for high-performance computing algorithms. Among the different high-performance computing algorithms that exist today in the fields of engineering, I selected those that address the personal recognition of human beings based on biometric features. This research field merges computer vision and signal processing in the form of complex and computationally expensive algorithms. Although it has been actively explored for three decades, interest in automatic user recognition systems based on biometric technologies has grown exponentially since the unfortunate events of 2001.
When going into depth in the biometrics field, one realizes that, apart from its inherent strengths, some limitations also exist. It is essential to cope with those limitations to reach the full deployment of biometric systems. As happens in many other application fields, most of the computing algorithms developed to solve a specific problem are traditionally architecture-driven; that is, during the development of the algorithm, the physical platform on which it will finally be executed is kept in mind. Normally this means personal computer-based platforms, so the algorithms are tuned to the architecture of those systems. Some highly expensive mathematical computations, or even complete algorithms that would take too long to execute, are discarded in the first design stages despite producing good results, because of the inherent limitations of the hardware platforms on which the final algorithms are supposed to run.

In this work, however, a new approach is studied, based on application-driven algorithms instead of architecture-driven algorithms. The main goal in this new scenario is to reach the best possible performance at algorithm level regardless of the amount of resources needed by the physical platform in charge of the processing. Biometric recognition algorithms are a good field of analysis because the accuracy of state-of-the-art biometric systems is not yet optimal, and their reliability needs to be further improved. Over the last three decades, the research community in industry and academia has developed many algorithmic approaches to automatic user recognition through the analysis of human biometric features. To date, however, infallible biometrics-based recognition remains an unmet challenge.
This work deals with the physical implementation of recognition systems, not only from an academic point of view but also from an industrial or commercial perspective. Therefore, a major constraint also has to be taken into consideration: the final cost of the system. A technical solution based on general-purpose CPUs and application-specific hardware accelerators developed on programmable logic in FPGA or SOPC devices is evaluated. This solution aims at providing flexible and cost-effective platforms on which to implement those best-in-class biometric algorithms and systems.

Based on this direction, I was able to find the proper answer to the second question as well: so many hours of hard work were spent to at least “try” to improve the world. It was important for me to direct the research towards a useful real-world application, aiming at increasing the quality of life of people all over the world. The final goal of this research work is therefore to prove the technical feasibility of the proposed embedded system architecture oriented to the development of those complex and high-performance biometric systems. It thus aims at facilitating the technological progress of current society by spreading biometric security across a wide range of daily-use applications, making it accessible to everybody at affordable costs.

I would summarize the present thesis as “a technical contribution made from both the academic and the industrial perspectives in the service of today’s society”. Welcome to the fascinating worlds of biometrics and electronics!

1.1. Personal Recognition Systems

Today the Earth has about 7,000,000,000 human inhabitants.
It is generally accepted that no two people in the world are completely identical; even identical twins show multiple physiological and/or behavioural differences. Once it is accepted that each person has distinctive features, and given the huge population of the world, at first glance it seems really difficult either to confirm (authentication problem) or to determine (identification problem) the identity of an individual automatically.

To deal with such automatic personal recognition issues, several methodologies have been developed over the years, either to authenticate the identity claimed by an individual or to identify a given individual from a database where a group of people is recorded. Those methodologies can be classified into three main groups according to the identity token on which the automatic personal recognition system is built:

(i) In the first methodology, the identity token is something physical, something that the user has, such as an ID card, a driver’s license, a car key, or a passport. Possession of that token is used as proof of the identity of its owner. The security level of this methodology is quite low because the recognition system is vulnerable to attacks such as theft or forgery of the physical token. If the token is stolen, for instance, any impostor in possession of it can compromise the recognition system by acting as its genuine owner.

(ii) In the second methodology, the identity token is not something physical but something the user knows or mentally possesses. The personal recognition system is based on the user’s knowledge of a specific item such as a personal identification number (PIN), a password, or the answer to specific questions. All this information is shared only between the user and the automatic recognition system itself.
Typical applications are network logon, website access, e-commerce, etc. Although the security level is higher than in the previous method, it is also inappropriate for applications demanding high reliability: PINs and passwords are short, and people normally have to remember so many of them that they choose ones that are easy to guess, or write them down somewhere so as not to forget them, which exposes them to fraudulent attacks. A further evolution therefore combines possession-based and knowledge-based tokens to establish proof of identity, as in the case of smart cards. Although reliability and security are slightly higher in this scenario, some weak points remain since, as with the previous techniques, this methodology cannot distinguish a genuine user from an impostor who has fraudulently acquired the possession token and discovered the knowledge token of the legitimate user.

(iii) In order to address the weak points of the previous systems and to increase the security and reliability of the whole recognition system, a new methodology was developed, based neither on “something the user has” nor on “something the user knows”, but on “something the user is”. The biometric features of individuals, universal and present in all human beings but unique to each person, become the tokens used as inputs of the personal recognition system, with the main advantage that such a token is inherent to the individual’s identity and cannot be stolen, lost, forgotten, shared or misplaced.
Biometric recognition deals with the use of biological measurements of distinguishing human traits, either physiological (face, fingerprint, hand geometry, iris, retinal pattern, DNA, ear, etc.) or behavioural (signature, voice-print, keystroke dynamics, gait, etc.), in order to recognize the identity of an individual reliably and accurately.

A wide variety of recognition systems is available, ranging from the simpler and less secure ones, based on a single methodology such as “something you have” or “something you know”, to more complex and reliable ones based on a mix of all three methodologies, as depicted in Figure 1 and Figure 2.

[Figure 1 ranks the token combinations from highest (+) to lowest (−) security level:
- Something you have + something you know + something you are
- Something you know + something you are
- Something you have + something you are
- Something you are
- Something you have + something you know
- Something you know
- Something you have]
Figure 1. Security performance comparison of personal recognition technologies.

Figure 2. Variety of ID tokens (e.g. smart cards) used in automatic personal recognition systems.

Many applications tend to introduce biometric features in the personal recognition process in order to increase the security and reliability of their products and services. Biometric recognition systems are being increasingly deployed in both public and private sectors, in a large number of governmental, forensic and civilian applications. Moreover, current trends indicate higher investment in biometrics in the coming years, as reported in Figure 3.

[Figure 3 is a bar chart of annual revenues in $M USD: 2009: 3,422.3; 2010: 4,356.9; 2011: 5,423.6; 2012: 6,581.2; 2013: 7,846.7; 2014: 9,368.9.]
Figure 3. Annual biometric industry revenues ($M USD). Original source: Biometrics Market and Industry Report 2009-2014 from International Biometric Group, 2009.

There is a popular misconception that automatic personal recognition based on biometrics is a fully solved problem. However, automatic biometric recognition is still an open research problem today. Several biometric technologies have been developed to date to deal with personal recognition in forensic, government or commercial applications. For any human physiological or behavioural characteristic to be a valid biometric identifier, it must have the following properties:
a) universality: the biometric identifier has to be a characteristic inherent to every individual;
b) distinctiveness: no two people can have identical biometric characteristics, so any person has to be sufficiently different from the rest in terms of the biometric identifier;
c) permanence: the biometric identifier has to remain invariant or stable over time;
d) collectability: it has to be possible to measure the biometric features of any individual quantitatively;
e) performance: the measurable application parameters such as recognition accuracy, processing speed, and robustness against environmental factors that could affect the reliability of the recognition system;
f) acceptability: the degree to which the intended population or end users accept the biometric trait; and
g) circumvention: the robustness of the biometric trait against fraudulent attacks that attempt to fool the system.
Biometric identifier | Universality | Distinctiveness | Permanence | Collectability | Performance | Acceptability | Circumvention
DNA | H | H | H | L | H | L | L
Ear | M | M | H | M | M | H | M
Face | H | L | M | H | L | H | H
Facial thermogram | H | H | L | H | M | H | L
Fingerprint | M | H | H | M | H | M | M
Gait | M | L | L | H | L | H | M
Hand geometry | M | M | M | H | M | M | M
Hand vein | M | M | M | M | M | M | L
Iris | H | H | H | M | H | L | L
Keystroke | L | L | L | M | L | M | M
Odor | H | H | H | L | L | M | L
Retina | H | H | M | L | H | L | L
Signature | L | L | L | H | L | H | H
Voice | M | L | L | M | L | H | H
Table 1. Comparison of biometric technologies (H: high, M: medium and L: low performances). Original source: Handbook of Fingerprint Recognition, 2003&2009.

Despite biometrics vendor claims, there is no best biometric technology. Each biometric trait has its strengths and weaknesses, and the choice of one or several traits depends on the requirements of the final application. To increase the recognition accuracy and security of an application, several biometric traits can be combined, giving what is known as a multimodal biometric system. A distinction between monomodal and multimodal biometric systems can therefore be made depending on how many biometric traits of the user are taken into consideration in the recognition process. No biometric trait is right for all situations. Table 1 lists the most thoroughly analysed human biometric identifiers and their features. Many biometric technologies are under continuous research and development today, to be exploited in several application fields across a variety of market segments, as shown in Table 2.
Biometric technologies:
- Physiological: Dental, DNA, Ear, Face, Facial thermogram, Fingerprint, Hand geometry, Hand vein, Iris, Nose, Odor, Palmprint, Retina
- Behavioural: Gait, Handwriting/Signature, Keystroke, Voice/Speech
- “Soft” biometrics: Height, Scars, Tattoos, Weight, Others

Applications:
- Forensic: Criminal identification/authentication, Victim identification/authentication
- Government: Civil identification/authentication
- Commercial: Consumer identification/authentication, Access control/attendance, Surveillance, Device/system access

Markets: Law Enforcement, Regional/State Government, National Government, Financial Services, Gaming, Hospitality, Health Care, High-Tech and Telecom, Industrial Manufacturing, Retail, Travel and Transportation, Prison Management, Military & National Security Community, Border Control & Immigration Checks, Entitlement Programs, Licensing, National Identity Card & Voter Registration, Banking and Financial Services, Personnel Management, Access Control, Information Systems Management

Table 2. Biometric technologies, application fields and key exploitation markets.

“Soft” biometric traits are characteristics that provide some information about the individual but lack the distinctiveness and permanence needed to differentiate two individuals sufficiently. Although they do not carry enough discriminatory information to fully assert the identity of the user, they can improve to some extent the accuracy of the recognition process when combined with real or “hard” biometric traits. Among the existing “hard” and “soft” distinctive traits, fingerprint-based biometric systems continue to be the leading biometric technology in terms of market share, followed by face, iris and voice recognition, as shown in Figure 4.

[Figure 4 is a pie chart: Fingerprint 46.0%, Face 18.5%, Middleware 12.9%, Iris 8.3%, Voice 4.9%, Vein 3.9%, Hand Geometry 2.9%, Other Modalities 2.6%.]
Figure 4. Market share of the leading biometric technologies (excludes AFIS revenues).
Original source: Biometrics Market and Industry Report 2009-2014 from International Biometric Group, 2009.

The term recognition is used interchangeably in this work to refer to either the identification or the authentication of individuals’ identities.

Identification refers to the process of associating an individual with one of the identities recorded in a database or, on the contrary, certifying that the subject is not enrolled in that database. The system asserts or denies the identity of an individual by searching for it among all the templates recorded in the database. The user is not required to claim an identity; instead, the identification system looks for the user’s identity by matching the user’s characteristic traits against those stored in the database templates. Typical applications are criminal investigations, where latent biometrics found at the crime scene are matched against a criminal database, and forensic investigations, where a corpse is identified by comparing the biometrics extracted from it with those recorded in national or international ID databases. Other typical examples are civilian applications, where it must be confirmed that a person belongs to a group of users with certain rights before he or she is granted privileges such as access to restricted areas, confidential information, or valuable resources. In this kind of application, the recognition system needs to check the user’s identity against all the identities in the database before deciding whether the user belongs to the group of people recorded there.
Because identification in large databases can be computationally expensive, classification and indexing techniques are often used to limit the number of comparisons performed by the recognition system. According to the characteristics of the biometric traits, the identities are classified into several mutually exclusive groups, and each identity is linked to only one of those groups. When trying to identify an individual, the group to which the individual belongs is determined first, and the user’s identity is then matched only against the template identities classified in the same group, thus reducing the identification overhead.

Authentication or verification refers to the problem of confirming or denying the validity of the identity claimed by the user before granting or denying access to the privileges of an application. In this scenario, because the user claims an identity, the recognition problem is reduced to the comparison of only two identities: the user’s identity and the claimed identity (which is recorded in a database or on a smart card).

In both scenarios, identification and authentication, the genuine identity of the individual first needs to be recorded in the system, in a database or on a smart card, during the enrolment process. In the enrolment process, the user is registered in the system: the biometric traits of the user, together with any other personal information, are recorded in a secure database. After registration, the system grants the user certain privileges in an application, which the user can exercise through the authentication or identification processes. Figure 5, Figure 6 and Figure 7 show the general diagrams of the enrolment, authentication, and identification processes, respectively.
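The three processes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Recognizer` class, the toy `similarity` function, the acceptance threshold and the class labels are all hypothetical names introduced here, and none of them reproduces the matcher developed in this thesis.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Features = Sequence[float]


def similarity(template: Features, query: Features) -> float:
    """Hypothetical similarity in (0, 1]: 1 for identical vectors, lower as they diverge."""
    return 1.0 / (1.0 + math.dist(template, query))


class Recognizer:
    """Toy recognizer: templates are stored per class (group) to limit 1:N searches."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold  # assumed acceptance threshold
        # class label -> list of (user id, template) enrolled in that class
        self.database: Dict[str, List[Tuple[str, Features]]] = {}

    def enrol(self, user_id: str, template: Features, group: str) -> None:
        """Enrolment: record the user's template under exactly one class."""
        self.database.setdefault(group, []).append((user_id, template))

    def authenticate(self, claimed_id: str, query: Features) -> bool:
        """Authentication: compare the query only with the claimed identity."""
        for entries in self.database.values():
            for user_id, template in entries:
                if user_id == claimed_id and similarity(template, query) >= self.threshold:
                    return True
        return False

    def identify(self, query: Features, group: str) -> Optional[str]:
        """Identification: search only the templates of the query's class."""
        best_id, best_score = None, self.threshold
        for user_id, template in self.database.get(group, []):
            score = similarity(template, query)
            if score >= best_score:
                best_id, best_score = user_id, score
        return best_id  # None means "user not present"


system = Recognizer(threshold=0.8)
system.enrol("alice", [0.10, 0.20, 0.30], group="A")
system.enrol("bob", [0.90, 0.80, 0.70], group="B")
accepted = system.authenticate("alice", [0.12, 0.21, 0.29])
found = system.identify([0.88, 0.79, 0.71], group="B")
missing = system.identify([0.88, 0.79, 0.71], group="A")
print(accepted, found, missing)
```

Restricting `identify` to one class is what reduces the number of comparisons; the price is that a query assigned to the wrong class can never be found, so the classifier itself becomes part of the error budget.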
The identification problem is also referred to as one-to-many matching, because the user’s identity must be compared against a large number of recorded identities. The authentication problem, in contrast, is known as one-to-one matching because only one comparison is carried out, so the final matching time is much lower than in the identification scenario. In certain scenarios, authentication is also called one-to-few matching if several templates of the user are stored in the database. One-to-few matching can also refer to identification applications where the database contains a very limited number of users. The identification problem can be seen to some extent as a generalization of the authentication or verification problem, since it can be implemented as a repeated execution of one-to-one matches.

Without loss of generality, in both the identification and the authentication scenarios, the automatic recognition process can be reduced to the comparison of two identities: (i) a known identity previously saved in a database and called template T, which represents the identity of any legitimate individual that possesses the proper rights in an application; and (ii) another identity corresponding to the user who attempts to use the application, referred to as sample or query identity Q.

[Figure 5: the user’s biometric traits (face, iris, finger, hand, teeth) pass through a monomodal or multimodal feature extractor; the resulting template ID is stored, together with the personal data, in the users’ database or on the user’s smart card.]
Figure 5. Block diagram of the enrolment process.

[Figure 6: the user’s traits pass through a monomodal or multimodal feature extractor to give the sample ID; a 1:1 feature matcher compares it with the template ID of the claimed identity and outputs match / not match.]
Figure 6. Block diagram of the authentication process.
[Figure 7: the user’s traits pass through a monomodal or multimodal feature extractor to give the sample ID and the sample class; a 1:N feature matcher compares the sample with the N template IDs of the users’ database and outputs the user ID or “user not present”.]
Figure 7. Block diagram of the identification process.

The automatic personal recognition system is in charge of matching the features of both identities and deriving from that analysis a similarity score s(T,Q) in the interval [0,1]. The closer the score is to 1, the more certain the recognition system is that both identities correspond to the same individual. Conversely, the closer the score is to 0, the more certain the system is that the identities correspond to different users. The score is then compared with a threshold in order to decide whether the query and the template match. As indicated in Figure 8, the threshold t determines the final decision of the recognition system.

    // Similarity score s(T,Q) and threshold t both lie in [0, 1] (0% – 100%)
    if (s(T, Q) >= t) {
        match = true;    // T = Q: Match OK
    } else {
        match = false;   // T ≠ Q: Match NOK
    }

Figure 8. Template – Query evaluation process.

Unfortunately, the accuracy of a system cannot be deduced in global terms; it can only be estimated from empirical data. The performance depends on several factors such as (i) the specific population or database on which the accuracy of the system is analysed; and (ii) the environmental conditions (user age, biometric data acquisition conditions, habituated/non-habituated users, attended/non-attended system, etc.) under which the test is done. Therefore, the measured performance of a system is only valid for the specific database and the specific environmental conditions of the test. This illustrates the complexity of characterizing system accuracy.
For such an empirical performance characterization to be as general as possible, the data used in the test have to be large enough to represent the whole population under any possible working condition. The set of patterns under test has to include a large number of samples of each category present in the population in order to represent objectively the real accuracy of the recognition system.

From the recognition accuracy perspective, any personal recognition system, authenticator or identifier, can commit two types of errors when matching a pair of biometrics: (i) declaring as corresponding, or genuine, two sets of biometrics that belong to different individuals, known as a false match error; or (ii) declaring as non-corresponding two sets of biometrics from the same individual, known as a false non-match error. To determine the recognition accuracy of an application properly, the system must be tested in two different scenarios: (i) matching pairs of patterns from different identities, and (ii) matching pairs of patterns that belong to the same identity. The first group is called the Impostor group, and the second the Genuine group. After the tests have been performed with both groups, the genuine and impostor distributions are obtained. Let I(t) denote the impostor distribution and G(t) the genuine distribution, both defined as a function of the application threshold t used to discern between positive and negative recognition responses. The typical distributions of an automatic personal recognition system are depicted in Figure 9. The x-axis corresponds to the similarity score or application threshold t, and the y-axis represents the probability, or percentage of pairs in the population, with similarity score t.
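As a rough illustration of how the two distributions might be estimated from test data, the Python sketch below scores every pair of labelled samples, sorts the scores into a genuine set and an impostor set, and bins each set into a histogram whose bars sum to 1. The `toy_score` function, the sample values and the bin count are placeholders chosen for this example only; a real evaluation would use the system's own matcher and a far larger database.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def score_sets(
    samples: Dict[str, List[float]],
    score: Callable[[float, float], float],
) -> Tuple[List[float], List[float]]:
    """Split all pairwise scores into genuine (same user) and impostor (different users)."""
    labelled = [(user, s) for user, items in samples.items() for s in items]
    genuine: List[float] = []
    impostor: List[float] = []
    for (user_a, a), (user_b, b) in combinations(labelled, 2):
        (genuine if user_a == user_b else impostor).append(score(a, b))
    return genuine, impostor


def histogram(scores: List[float], bins: int = 10) -> List[float]:
    """Fraction of scores falling in each of `bins` equal-width bins over [0, 1]."""
    counts = [0] * bins
    for s in scores:
        counts[min(int(s * bins), bins - 1)] += 1  # a score of 1.0 goes in the last bin
    total = len(scores)
    return [c / total for c in counts] if total else [0.0] * bins


def toy_score(a: float, b: float) -> float:
    """Hypothetical similarity in (0, 1] between two one-dimensional features."""
    return 1.0 / (1.0 + abs(a - b))


samples = {"user1": [0.10, 0.12], "user2": [0.90, 0.95]}
genuine, impostor = score_sets(samples, toy_score)
g_hist = histogram(genuine)   # empirical estimate of G(t)
i_hist = histogram(impostor)  # empirical estimate of I(t)
print(len(genuine), len(impostor))
```

With only four samples the histograms are far too coarse to be meaningful; the point is the pairing logic, which makes the number of impostor pairs grow much faster than the number of genuine pairs as the database grows.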
The biometric verification problem can be formulated as follows, with both distributions normalized over t ∈ [0,1]:

\[ \int_0^1 I(x)\,dx = 1, \qquad \int_0^1 G(x)\,dx = 1 \]

From both distributions, it is possible to extract the accuracy performance of the system. The accuracy is measured by means of several quantifiers:

- FMR(t) or false match rate, which measures the probability of accepting an impostor user as valid:
\[ FMR(t) = \int_t^1 I(x)\,dx \]

- FNMR(t) or false non-match rate, which measures the probability of rejecting a legitimate user as invalid:
\[ FNMR(t) = \int_0^t G(x)\,dx \]

- RMR(t) or right match rate, which measures the probability of accepting a legitimate user as valid:
\[ RMR(t) = \int_t^1 G(x)\,dx = 1 - FNMR(t), \qquad RMR(t) + FNMR(t) = 1 \]

- RNMR(t) or right non-match rate, which measures the probability of rejecting an impostor user:
\[ RNMR(t) = \int_0^t I(x)\,dx = 1 - FMR(t), \qquad RNMR(t) + FMR(t) = 1 \]

[Figure 9, left panel: distributions I(t) and G(t) (% population against t), with the areas FNMR(t0) and FMR(t0) on either side of the threshold t0. Right panel: error (0 to 100%) against t, showing the curves FMR(t) and FNMR(t) crossing at the EER point, at t = tEER.]
Figure 9. Recognition system errors and accuracy descriptors.

Apart from the distributions G(t) and I(t), Figure 9 shows the curves FMR(t) and FNMR(t), where the EER (Equal Error Rate) point is highlighted. EER denotes the error rate obtained at the threshold tEER for which the FMR and FNMR failure rates are identical:

\[ FMR(t_{EER}) = FNMR(t_{EER}) \]

In any application, the lower the FMR and FNMR rates are, the higher the accuracy of the recognition system is. However, as can be deduced from Figure 9, FMR and FNMR have opposite behaviours.
If the threshold t is increased to prevent impostors from being accepted, FMR decreases but FNMR increases; conversely, if t is decreased to prevent genuine users from being rejected, FNMR decreases but FMR increases. The selection of the proper threshold for an application is therefore a compromise. Applications that require a high security level, such as access control to restricted areas, demand a low FMR to keep impostors out, at the cost of rejecting legitimate users several times before accepting them (high FNMR). In contrast, forensic or criminal applications demand a low FNMR so as not to miss any potential criminal, at the cost of having to check many candidates (high FMR). Between these two market segments lie civilian or commercial applications requiring moderate security levels (e.g. access control to a gym). They operate at intermediate FMR and FNMR levels, avoiding many recognition retries before access is granted at the expense of accepting some non-registered users. Figure 10 shows the scope of each potential application. FMR and FNMR are normally plotted against each other in a 2-D graph called the ROC (Receiver Operating Characteristic) curve, which shows how the system behaves as a function of the threshold used to separate a right (genuine) from a wrong (impostor) personal recognition.

[Figure 10 plots FMR against FNMR: forensic applications sit at the high-FMR, low-FNMR end of the curve, high security applications at the low-FMR, high-FNMR end, and civilian applications in between, around the EER point.]
Figure 10. Typical FMR and FNMR operation points for different applications.

Based on the accuracy descriptors FMR(t) and FNMR(t), it is possible to determine the performance of both authentication and identification systems. Once a threshold t0 is fixed for the application,

\[ FMR(t_0) = FMR, \qquad FNMR(t_0) = FNMR \]

it is possible to deduce the accuracy exhibited by both systems in terms of the quantifiers FMR and FNMR.
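The threshold trade-off described above can be made concrete with a small Python sketch that estimates FMR(t) and FNMR(t) from finite sets of scores and searches a grid of thresholds for the equal error point. The score lists and the grid resolution are invented for illustration, and the decision rule assumed is the one of Figure 8 (accept when the score is greater than or equal to t).

```python
from typing import List, Tuple


def fmr(impostor: List[float], t: float) -> float:
    """False match rate: fraction of impostor scores accepted (score >= t)."""
    return sum(s >= t for s in impostor) / len(impostor)


def fnmr(genuine: List[float], t: float) -> float:
    """False non-match rate: fraction of genuine scores rejected (score < t)."""
    return sum(s < t for s in genuine) / len(genuine)


def equal_error_rate(
    genuine: List[float], impostor: List[float], steps: int = 100
) -> Tuple[float, float]:
    """Return (t_eer, eer) for the grid threshold where |FMR - FNMR| is smallest."""
    thresholds = [i / steps for i in range(steps + 1)]
    t_eer = min(thresholds, key=lambda t: abs(fmr(impostor, t) - fnmr(genuine, t)))
    return t_eer, (fmr(impostor, t_eer) + fnmr(genuine, t_eer)) / 2


# Invented scores: one genuine pair scores low and one impostor pair scores high,
# so the two distributions overlap and no threshold is error-free.
genuine = [0.90, 0.80, 0.85, 0.60, 0.95]
impostor = [0.10, 0.20, 0.30, 0.65, 0.15]

for t in (0.5, 0.7):
    print(t, fmr(impostor, t), fnmr(genuine, t))

t_eer, eer = equal_error_rate(genuine, impostor)
print(t_eer, eer)
```

On this data a threshold of 0.5 removes every false non-match but lets one impostor in, while 0.7 does the opposite, which is the compromise between high-security and forensic operating points. On real data FMR and FNMR rarely coincide exactly on a finite grid, which is why the sketch reports their average at the closest point.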
In the case of a personal authentication system where one single pattern per user is saved as template during the enrolment procedure, and one single pattern is matched as query in the authentication process, the accuracy performance of the system, represented by the descriptors FMR1 and FNMR1, is determined as:

\[ \mathrm{FMR}_1 = \mathrm{FMR}, \qquad \mathrm{FNMR}_1 = \mathrm{FNMR} \]

Similarly, assuming that in the personal identification system only one single pattern per user is used as template in the enrolment procedure and as query in the recognition procedure, that no indexing/classification mechanisms are used, and that up to N matches are carried out in the identification process, the accuracy performance of the system, represented by the descriptors FMRN and FNMRN, is determined as:

\[ \mathrm{FNMR}_N = \mathrm{FNMR}, \qquad \mathrm{FMR}_N = 1 - (1 - \mathrm{FMR})^{N} \]

From the above formulas, and assuming that the system is not ideal (FMR ≠ 0, FNMR ≠ 0), it is easy to see how difficult it is to achieve efficient FMR rates in identification systems that manage large or very large databases (N → ∞): for small FMR the identification error grows roughly as N·FMR, and it tends to 1 as N increases. Moreover, the state of the art points out the difficulty of achieving efficient FMR and FNMR rates in both authentication and identification systems, even small ones. Therefore, two main options are currently proposed in the scientific literature to overcome those limitations: (i) to improve the accuracy of the system by enhancing the recognition algorithm itself, trying to approach the ideal behaviour (FMR = 0, FNMR = 0) depicted in Figure 11, when handling monomodal biometric systems; and/or (ii) to fuse several biometric traits in order to build multimodal biometric systems able to gradually improve on the performance exhibited by each of the biometric features alone.

Figure 11. Ideal personal recognition system performance. No overlap exists between the G(t) and I(t) distributions, and EER = 0 when the proper application threshold is selected.

Biometrics technology has advanced tremendously over the last few years and has moved from research and academia to industry and the end market in the form of real-world applications. However, the state of the art in biometrics points out that there is still considerable room for improvement in the accuracy exhibited by today's automatic personal authentication and identification systems, whether based on monomodal or multimodal technologies. Robust biometric sensors, powerful computational platforms, and complex recognition algorithms are needed to improve on the performance of present systems.

1.2. Fingerprint Biometrics

The skin on the human fingers and palms of the hands, along with the skin on the soles and toes of the feet, features ridge-valley patterns, as shown in Figure 12, which have some well-known functional purposes:

(i) They increase the friction between the skin and other surfaces, thereby reducing slipping. Because of the wide variety of ridge orientations on the surface, the ridges are able to increase the drag in all directions. Thanks to the presence of skin ridges on our fingers and palms, objects held in our hands do not slip through our fingers.

(ii) The corrugation also benefits the sense of touch by increasing sensitivity and helping to distinguish different textures. The skin ridges also amplify and/or filter vibrations triggered when fingertips brush across an uneven surface. These processes help to transmit the signals of touch to deeply embedded nerves involved in fine texture perception.

Figure 12. Detail of human finger.
History of Fingerprints

Apart from those inherent roles of fingerprints, their distinguishing traits have allowed them to be used for personal recognition purposes throughout history. The first known usage of fingerprints as signs of identity dates back to prehistory, when the early pot makers used fingerprint impressions to identify the work as their own. There is evidence of fingerprints from the building of the pyramids in Egypt. In ancient Babylon, fingertips were pressed into clay to record business transactions. In ancient China, it was common practice to use inked fingerprints on official or legal documents: land sales, contracts, loans, criminal confessions, etc. Although fingerprints were used to establish identity in courts, it is not clear whether our ancestors were aware of the uniqueness of fingerprints. In recent history, however, the permanence of the ridge-valley patterns of fingerprints has been proven, the uniqueness of fingerprints has been theorised, the anatomical classification of fingerprints as a function of the ridge-valley flows has been carried out, and fingerprints have served governments worldwide during the past 100 years to provide accurate personal identification. In addition, they have been extended to other forensic, governmental and civilian applications, to the point that nowadays fingerprints have become the oldest and most widely used signs of identity in the biometrics field.

Fingerprints classification

The current fingerprint classification systems (e.g. FBI finger classification) are based on three distinct groups of patterns, and each of the groups is split into several subcategories, as depicted in Table 3. Some examples of each category are given in Figure 13, Figure 15 and Figure 16.

| Pattern category | Subcategory | Population percentage |
|---|---|---|
| Arch | Plain Arch, Tented Arch | 5% |
| Loop | Radial Loop, Ulnar Loop | 60~70% |
| Whorl | Plain Whorl, Central Pocket Loop, Double Loop, Accidental Whorl | 25~35% |

Table 3. Fingerprints classification.
Arches represent about 5% of the fingerprint patterns encountered in the population. In arch patterns, the ridges run from one side of the finger to the other without making any backward turn. There are two subcategories of arch patterns:

- Plain arches, where the ridge-valley pattern flows through the print, from one side to the other, with a small rise or wave in the centre but without significant changes in the orientation.
- Tented arches, where the ridges in the centre of the pattern are not continuous as in the case of the plain arches, but make a significant change. The ridges, which adjoin each other in the centre, converge and thrust upward, giving the impression of a pitched tent.

Figure 13. Examples of arch patterns: A&B) plain arches, C&D) tented arches.

Loops represent between 60% and 70% of the fingerprint patterns encountered in the population. In loops, the ridges run from one side of the finger, make a backward turn without twisting, and go out on the same side of the finger. Loops can be further divided into two subcategories:

- Radial loops, which flow toward the radius bone of the hand, as indicated in Figure 14.
- Ulnar loops, which flow toward the ulna bone of the hand, as shown in Figure 14 as well.

Figure 14. Detail of human hand. The radius joins the hand on the same side as the thumb, and the ulna on the same side as the little finger. [Labels: thumb, index finger, middle finger, ring finger, little finger, radius bone, ulna bone.]

Figure 15. Examples of loop patterns: E&F correspond to radial loops if they belong to fingers of the right hand, or ulnar loops if they belong to fingers of the left hand. Similarly, G&H are ulnar loops if they belong to the right hand or radial loops if they belong to the left hand. Patterns E&F are also called left loops, and G&H right loops, regardless of which hand they belong to.

Whorls constitute between 25% and 35% of the patterns. In whorls, the patterns possess two or more singular points of type delta, and there is a recurve preceding each delta. The whorl class is divided into four subcategories:

- Plain whorls, where there is at least one ridge making a complete circuit, which may be circular, oval, spiral, etc. in shape. They are the most common.
- Central pocket loops, similar to the previous patterns but where the ridges recurve a second time, forming a pocket within the loop.
- Double loops, where the ridges exhibit two different loop formations.
- Accidental whorls, covering the relatively small number of patterns that are too irregular to match the previous whorl subcategories or any other conventional type.

Figure 16. Examples of whorl patterns: I&M are plain whorls, J&N are central pocket loops, K&O are double loops, and L&P are accidental whorls.

Fingerprints characteristics

Independently of the biometric identifier(s) used, designing algorithms able to extract the salient features and match them in a robust way is still an open problem under active research. Although great progress has been made in the last decades, efficient automatic human recognition remains a technical challenge. As can be deduced from Table 1, fingerprint pattern recognition proves to be a reliable method, which features a good balance of all the desirable properties. Fingerprint classification is generally based on global features such as ridge-valley pattern structures and singularities, whereas fingerprint matching is mostly based on local features.
The science of fingerprint examination has matured to a point at which three different levels of detail are distinguished in fingerprints:

(i) Level 1 focuses on the overall appearance of the fingerprint patterns. At this level, aspects such as the general ridge flow, the fingerprint shape, its pattern classification, the ridge-valley orientation field, the frequency of the ridges along the image, or the presence of singular points such as cores and deltas are considered.

(ii) Level 2 goes into more depth and focuses on the analysis of the spatial distribution of the friction ridge details present along the pattern, called minutia points and based on ridge endings and ridge bifurcations.

(iii) Level 3 deals with the characterization of the intra-ridge details at the very fine level. This kind of analysis is only possible with high-resolution fingerprint impressions, from which salient characteristics can be extracted, such as the edge shape and the width of individual ridges, the location of dots and incipient ridges, the position of the sweat pores distributed along the ridges of the skin, etc.

The types of features exhibited by the fingerprint patterns directly depend on the scale used in the analysis. The higher the level of detail used in the fingerprint examination process, the higher the distinctiveness exhibited by the features under analysis.

Fingerprint classification and matching are extremely difficult pattern recognition problems due to the large intra-class variability exhibited by fingerprints, as well as the small inter-class variability.
Large intra-class variability refers to the fact that two fingerprint impressions of the same finger can differ considerably, as depicted in Figure 17, because of multiple factors such as displacement, rotation, or only partial overlap between consecutive acquisitions of the same finger, variable skin conditions, variable pressure, hostile environmental conditions present in the acquisition process, etc. Small inter-class variability refers to the fact that a high level of similarity can be present in two fingerprint impressions generated from different fingers, as shown in the examples of Figure 18.

Figure 17. Fingerprint impressions of the same user. Example of high intra-class variability exhibited by fingerprints.

Figure 18. Fingerprint impressions of different users. Example of low inter-class variability exhibited by fingerprints.

Among all biometric technologies, this work is focused on fingerprint traits for several reasons:

(i) Owing to their proven distinctiveness. The uniqueness of human fingerprints is not an established fact but an empirical observation. The overall pattern of ridges and valleys is largely determined by genetic factors. However, during its formation, minor development disturbances create local ridge irregularities. These disturbances are a consequence of the unique development environment of the foetus in the womb, so all ridge-valley patterns are unique when examined sufficiently closely. It is believed that no two people in the world have identical fingerprints. The permanence of fingerprints is a proven fact. Fingerprint formation occurs during embryo development; at about the seventh month of foetal development, fingerprints are fully formed, and they remain unchanged throughout an individual's life.

(ii) Owing to their non-intrusive acquisition methodology.
The first recognition systems were based on manual methods of fingerprint acquisition (by means of ink and paper techniques) and matching (by means of human experts in charge of comparing finger impressions), which proved time-consuming, slow and expensive. The increasing demands on fingerprint recognition systems soon required the development of automatic acquisition and matching techniques. The development of electronic fingerprint sensors helped to improve user convenience in the fingerprint acquisition stage, during either the enrolment or the authentication/identification processes. The continuous development of digital fingerprint sensors based on several techniques (especially those based on solid-state technologies) at progressively lower costs eases the development of automatic recognition systems in the form of embedded platforms such as cellular phones, smart cards, personal data assistants, etc. with small fingerprint sensors able to capture high-quality images in a reliable way.

(iii) Owing to the profound knowledge in this field and the continuous research advances. The background in digital signal processing and pattern recognition techniques applied to fingerprint biometrics reflects a growing interest in fingerprint recognition research and has turned fingerprint recognition into one of the most mature biometric technologies. The success of fingerprint-based recognition technology in the law enforcement application field (forensics, criminal investigations, national ID cards, etc.) after more than one century of experience has allowed fingerprint biometrics to spread into civilian, commercial and financial domains in the last decades, to the point that, nowadays, fingerprints continue to be the most widely used physiological characteristics in automatic personal recognition systems, and fingerprint recognition technology leads the market share, being exploited in a large number of real-world applications.

1.3. Physical Implementation of Fingerprint-Based Recognition Systems

The development of personal recognition systems based on fingerprints has a long history. The first applications were oriented to personal identification purposes, in both law enforcement and forensic fields, but the identification methodologies were progressively extended to other fields such as governmental and civilian applications. Moreover, fingerprint recognition focused on personal authentication is developed and exploited in many consumer applications today. Automatic recognition systems that identify individuals from the distinctive information available in fingerprints are referred to as AFIS (Automatic Fingerprint-based Identification Systems), whereas systems focused on authentication are known as AFAS (Automatic Fingerprint-based Authentication Systems).

Concerning personal identification in general, anthropometry was the first scientific system used by the police to identify criminals. It was developed in the 1880s by Alphonse Bertillon. The Bertillon System of Anthropometric Identification, based on a number of meticulous physical measurements of body parts (extremities, head, etc.), corporal descriptions (such as the colour of the eyes, hair, and skin, etc.) and photographs that produce a detailed definition of any individual, was superseded by fingerprint identification systems around the turn of the century, after general acceptance of the higher distinctiveness provided by human fingerprints. The Henry Classification System, developed by Edward Richard Henry and mainly focused on five types of prints (arch, tented arch, left loop, right loop and whorl), replaced the Bertillonage system as the primary method of human recognition throughout most of the world in the 20th century.

Figure 19. Human experts matching fingerprints.

Fingerprint recognition was formally accepted, and fingerprint characteristics were used as evidence in courts of law to establish a proof of identity. The first recognition systems were based on manual techniques of fingerprint analysis. As depicted in Figure 19, the personal recognition process required experts well trained in the methodologies of fingerprint acquisition, classification/indexing techniques, and fingerprint matching. A set of rules was developed to measure the biometric traits available in fingerprint impressions, to match different biometrics in order to decide whether a pair of biometric measurements belong to the same person or not, and to search for a given biometric measurement in a database consisting of a number of other measurements of the same biometrics. However, with the rapid expansion of fingerprint recognition in forensic and law enforcement applications, fingerprint databases became so large that manual verification became obsolete and infeasible because of its obvious tediousness and high costs.
Therefore, the automation of the fingerprint matching process became a priority, and the development of AFIS applications was the only way to cope with the needs related to criminal identification or identity verification. In the 1970s, computers were available, and the FBI knew it had to automate the process of classifying, searching for and matching fingerprints. The Japanese National Police Agency paved the way for this automation, establishing the first electronic fingerprint matching system in the 1980s. In the 1990s, AFIS applications came into widespread use around the country. This computerized system of storing and cross-referencing criminal fingerprint records became capable of searching millions of fingerprint files in minutes, thus revolutionizing law enforcement efforts.

Today's scenario shows that, among all the methods of personal identification used throughout history, such as branding, tattooing, distinctive clothing, photography, body measurements (Bertillon system), or others, biometrics in general, and fingerprinting in particular, has proven to be superior to those older methods.

This thesis is specifically focused on authentication applications, leaving the physical implementation of efficient identification systems as a second and future step of research. The reason is that authentication can be understood as the basic or elementary personal recognition problem. Once the authentication problem is solved, some of its solutions can be ported to some extent to the identification field. The authentication problem can be seen as one of the portions into which the identification problem is split. Therefore, each of the portions needs to be solved individually in order to solve the global problem, and authentication is the first of the steps.

Figure 20. Automatic Fingerprint Authentication System.
Accepting that a human expert, based on visual inspection and experimental measures, can reliably determine whether or not two fingerprint impressions correspond to the same finger, the next step consists of automating the way the human mind matches fingerprints, as summarized in Figure 20. The implementation of an electronic system able to perform the same task in an automated way, without the need for human supervision, is a challenge today because of the major differences between the capabilities of human minds and those of computers.

The design of highly secure, reliable and accurate biometrics-based authentication systems able to automate the human experts' methodology of establishing a proof of identity involves a multidisciplinary approach. The performance of the system is not only a matter of its recognition accuracy (in terms of indicators like FMR and FNMR); other factors such as its acceptability by the users, the security exhibited by the system against fraudulent attacks, the recognition response time, the amount of resources required to implement the application, its power consumption, or the system costs also contribute to the overall AFAS efficiency.

The minimum functional blocks that need to be present in any automatic fingerprint authentication system are the following:

(i) Fingerprint acquisition unit, in charge of acquiring digital impressions of the users' fingerprints in both the enrolment and authentication stages of the recognition process.

(ii) Image conditioning unit, responsible for enhancing the acquired fingerprint images to ease the subsequent feature extraction stages.

(iii) Feature extractor unit, with the role of identifying the discriminatory traits available in the fingerprint impressions. The set of extracted features is used as the inherent and distinctive information of the user in the following matching stages.

(iv) Feature matcher unit, aimed at matching the template and query feature sets in order to determine whether or not both fingerprints belong to the same finger/user.

Additional functional blocks, such as cryptographic units in charge of securing the information that flows out of the authentication system, can exist in order to improve the security of the whole system against potential fraudulent attacks. The next two figures depict the block diagram of a generic AFAS application. Figure 21 shows the units that are operative during the user enrolment process. Figure 22 shows the units that take part in the authentication process. The inoperative blocks in each of the stages are marked with dotted lines.

Figure 21. User enrolment process. [Block diagram of the automatic fingerprint authentication system. Inputs: legitimate identity and user's finger (T, template). Blocks: sensor, fingerprint acquisition unit, image conditioning unit, feature extractor unit, user's template T, feature matcher unit.]

Figure 22. User authentication process. [Block diagram of the automatic fingerprint authentication system. Inputs: claimed identity and user's finger (Q, query). Blocks: sensor, fingerprint acquisition unit, image conditioning unit, feature extractor unit, user's template T, feature matcher unit. Output: authentication result.]

The fingerprint acquisition phase is the input stage of the fingerprint recognition application in both the enrolment and authentication processes, and it has many implications for the design of the rest of the system.
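The enrolment and authentication flows of Figures 21 and 22 can be sketched in a few lines of code. The example below is only a structural illustration under strong assumptions: the image is a small grid of grey levels, conditioning is a fixed-threshold binarisation, the feature set is the set of dark-pixel coordinates, and the matcher is a Jaccard similarity. All names (`condition`, `extract`, `match`, `AFAS`) and the threshold value are hypothetical stand-ins and do not represent the algorithms developed in this thesis.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple

Image = List[List[int]]                 # toy grey-level image (0..255) from the acquisition unit
Features = FrozenSet[Tuple[int, int]]   # toy feature set: coordinates of ridge pixels


def condition(image: Image) -> Image:
    """Image conditioning unit (toy): binarise, dark pixels become ridge pixels (1)."""
    return [[1 if px < 128 else 0 for px in row] for row in image]


def extract(image: Image) -> Features:
    """Feature extractor unit (toy): collect the coordinates of ridge pixels."""
    return frozenset((r, c) for r, row in enumerate(image)
                     for c, value in enumerate(row) if value)


def match(template: Features, query: Features) -> float:
    """Feature matcher unit (toy): Jaccard similarity score in [0, 1]."""
    union = template | query
    return len(template & query) / len(union) if union else 0.0


@dataclass
class AFAS:
    threshold: float = 0.8                                    # hypothetical decision threshold
    templates: Dict[str, Features] = field(default_factory=dict)

    def enrol(self, user_id: str, image: Image) -> None:
        # Enrolment: conditioning -> feature extraction -> store template T
        self.templates[user_id] = extract(condition(image))

    def authenticate(self, claimed_id: str, image: Image) -> bool:
        # Authentication: the same chain yields query Q, which is matched against T
        template = self.templates.get(claimed_id)
        if template is None:
            return False
        return match(template, extract(condition(image))) >= self.threshold


# Hypothetical 3x3 "impressions" of two different fingers
finger_a = [[10, 200, 30], [220, 40, 210], [50, 230, 60]]
finger_b = [[200, 10, 200], [30, 220, 40], [210, 50, 230]]

system = AFAS()
system.enrol("alice", finger_a)
print(system.authenticate("alice", finger_a))   # genuine attempt
print(system.authenticate("alice", finger_b))   # impostor attempt
```

The point of the sketch is the shared processing chain: the matcher is the only unit that is idle during enrolment, while authentication reuses every enrolment stage before comparing Q against the stored T.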
The image acquisition process (and image reconstruction, in the case of sweeping sensors), as well as the image enhancement and feature extraction processes, must be fitted to the particular characteristics of the sensor device: quality aspects, robustness against hostile environmental conditions (humidity, etc.) or low-quality fingerprints (too dirty, too wet, with even surfaces because of work conditions in the case of miners, farmers, etc.), habituation of the user to the system, etc. The authentication result, that is, the assessment that the template and query fingerprints either match or do not match, is the output of the recognition application. The final performance of the biometric system depends on all the components in the chain: the sensor, the acquisition process, the quality of the acquired images, the recognition algorithm itself, etc.

The state of the art in fingerprint biometrics shows a slow but progressive improvement in the recognition accuracy of the fingerprint matching algorithms developed over the last four decades. Open technology evaluation contests such as the Fingerprint Verification Competitions (https://biolab.csr.unibo.it/) have been organized as a means of fair evaluation of the advances in fingerprint matching techniques. The participating algorithms are tested under the same conditions and on the same fingerprint databases to obtain objective comparative results. As an example, Table 4 shows some performance indicators of the best algorithms in the FVC2000, FVC2002, FVC2004 and FVC2006 contests.
| Database | Classification | FVC2000 | FVC2002 | FVC2004 | FVC2004(*) Light category | FVC2006 | FVC2006(*) Light category |
|---|---|---|---|---|---|---|---|
| DB1 | 1st | 0.67% (0.96s) | 0.10% (2.47s) | 1.97% (1.87s) | 3.89% (0.21s) | 5.564% (0.039s) | 5.356% (0.029s) |
| DB1 | 2nd | 1.17% (0.88s) | 0.24% (1.15s) | 2.72% (3.19s) | 4.18% (0.16s) | 5.978% (0.506s) | 5.564% (0.039s) |
| DB1 | 3rd | 5.06% (0.89s) | 0.25% (1.99s) | 3.38% (0.75s) | 4.78% (0.11s) | 6.122% (0.311s) | 5.888% (0.036s) |
| DB2 | 1st | 0.61% (1.03s) | 0.14% (2.03s) | 1.58% (0.83s) | 4.01% (0.23s) | 0.021% (1.100s) | 0.148% (0.091s) |
| DB2 | 2nd | 0.82% (0.93s) | 0.17% (2.44s) | 2.59% (0.57s) | 4.02% (0.18s) | 0.032% (1.461s) | 0.158% (0.066s) |
| DB2 | 3rd | 2.75% (0.17s) | 0.21% (1.12s) | 2.79% (0.62s) | 4.25% (0.21s) | 0.095% (0.899s) | 0.169% (0.056s) |
| DB3 | 1st | 3.64% (2.13s) | 0.37% (2.24s) | 1.18% (2.30s) | 2.92% (0.25s) | 1.534% (1.678s) | 1.634% (0.056s) |
| DB3 | 2nd | 4.01% (1.94s) | 0.72% (1.86s) | 1.20% (0.85s) | 3.21% (0.21s) | 1.608% (0.307s) | 1.645% (0.046s) |
| DB3 | 3rd | 5.36% (0.36s) | 0.81% (0.25s) | 1.64% (0.81s) | 3.53% (0.15s) | 1.645% (0.641s) | 2.351% (0.082s) |
| DB4 | 1st | 1.99% (0.77s) | 0.10% (0.73s) | 0.61% (0.53s) | 1.88% (0.21s) | 0.269% (0.601s) | 0.427% (0.049s) |
| DB4 | 2nd | 3.11% (0.69s) | 0.17% (1.06s) | 0.80% (1.06s) | 1.99% (0.16s) | 0.453% (0.238s) | 0.496% (0.054s) |
| DB4 | 3rd | 5.04% (0.11s) | 0.21% (2.04s) | 1.01% (0.65s) | 2.03% (0.11s) | 0.466% (1.023s) | 0.522% (0.062s) |

Table 4. Accuracy and efficiency indicators, given as EER (Avg. Matching Time), of the three most accurate algorithms in FVC evaluation contests. Databases are different in each evaluation contest. (*) Light category refers to the fact that the system in charge of the processing features limited computing resources. The enrolment and matching times are limited, as well as the maximum amount of memory allocated in the processing (4 Mbytes) and the template size (2 Kbytes).

Different personal computer platforms were used to carry out the performance evaluation of the algorithms submitted to each of the contests:

- FVC2000: Windows NT 4.0 and Linux RedHat 6.1 O.S. on PC Intel Pentium III, 450 MHz.
- FVC2002: Windows 2000 O.S. on PC Intel Pentium III, 933 MHz.
- FVC2004: Windows XP Professional O.S. on PC AMD Athlon 1600+, 1.41 GHz.
- FVC2006: Windows XP Professional O.S. on PC Intel Pentium 4, 3.2 GHz.

In 2009, a new tool was developed in order to measure the performance of fingerprint-based recognition algorithms without the need to hold specific evaluation contests. The new tool is called FVC-onGoing, and it is a web-based automatic evaluation system. Specific databases are used to measure the performance of the submitted algorithms in two different modalities: open systems, using either proprietary or standard template formats to define fingerprints, or ISO-based systems, using a standard minutiae-based template format according to ISO/IEC 19794-2 (2005). Table 5 shows some of the published results (https://biolab.csr.unibo.it/FVCOnGoing/).

Fingerprint verification, open algorithms (left side of Table 5):

| FVC-onGoing evaluation date | Database | EER |
|---|---|---|
| 31/08/2011 | FV-HARD-1.0 | 0.722% |
| 31/08/2011 | FV-STD-1.0 | 0.142% |
| 29/08/2011 | FV-HARD-1.0 | 0.687% |
| 29/08/2011 | FV-STD-1.0 | 0.108% |
| 15/05/2011 | FV-STD-1.0 | 0.176% |
| 15/05/2011 | FV-HARD-1.0 | 0.700% |
| 15/05/2011 | FV-HARD-1.0 | 2.021% |
| 15/05/2011 | FV-STD-1.0 | 0.418% |
| 14/05/2011 | FV-HARD-1.0 | 1.257% |
| 14/05/2011 | FV-STD-1.0 | 0.293% |
| 14/05/2011 | FV-STD-1.0 | 6.701% |
| 14/09/2010 | FV-HARD-1.0 | 0.735% |
| 14/09/2010 | FV-STD-1.0 | 0.118% |
| 26/08/2010 | FV-HARD-1.0 | 6.769% |
| 25/08/2010 | FV-STD-1.0 | 3.649% |
| 18/04/2010 | FV-STD-1.0 | 3.649% |
| 02/04/2010 | FV-HARD-1.0 | 0.813% |
| 01/03/2010 | FV-HARD-1.0 | 0.827% |
| 01/03/2010 | FV-STD-1.0 | 0.194% |
| 24/02/2010 | FV-STD-1.0 | 0.216% |
| 24/02/2010 | FV-HARD-1.0 | 0.824% |
| 25/11/2009 | FV-HARD-1.0 | 1.046% |
| 25/11/2009 | FV-STD-1.0 | 0.261% |
| 31/08/2009 | FV-STD-1.0 | 0.665% |
| 20/07/2009 | FV-HARD-1.0 | 1.528% |
| 20/07/2009 | FV-STD-1.0 | 0.281% |
| 15/07/2009 | FV-STD-1.0 | 1.265% |
| 24/06/2009 | FV-STD-1.0 | 1.618% |

Fingerprint matching, ISO/IEC 19794-2 template format (right side of Table 5):

| FVC-onGoing evaluation date | Database | EER |
|---|---|---|
| 26/12/2011 | FMISO-HARD-1.0 | 1.089% |
| 15/05/2011 | FMISO-STD-1.0 | 0.234% |
| 15/05/2011 | FMISO-HARD-1.0 | 1.113% |
| 14/05/2011 | FMISO-STD-1.0 | 0.380% |
| 14/05/2011 | FMISO-HARD-1.0 | 1.588% |
| 24/03/2011 | FMISO-STD-1.0 | 0.234% |
| 24/03/2011 | FMISO-HARD-1.0 | 1.103% |
| 15/12/2010 | FMISO-STD-1.0 | 0.258% |
| 15/12/2010 | FMISO-HARD-1.0 | 1.407% |
| 30/11/2010 | FMISO-STD-1.0 | 1.017% |
| 15/09/2010 | FMISO-STD-1.0 | 1.334% |
| 22/07/2010 | FMISO-STD-1.0 | 0.570% |
| 22/07/2010 | FMISO-HARD-1.0 | 2.315% |
| 02/04/2010 | FMISO-STD-1.0 | 0.559% |
| 09/03/2010 | FMISO-HARD-1.0 | 2.400% |
| 26/02/2010 | FMISO-HARD-1.0 | 1.700% |
| 26/02/2010 | FMISO-STD-1.0 | 0.432% |
| 09/02/2010 | FMISO-HARD-1.0 | 1.927% |
| 12/10/2009 | FMISO-STD-1.0 | 0.317% |
| 26/09/2009 | FMISO-HARD-1.0 | 2.552% |
| 26/09/2009 | FMISO-STD-1.0 | 0.582% |
| 09/09/2009 | FMISO-STD-1.0 | 0.405% |
| 20/07/2009 | FMISO-HARD-1.0 | 2.430% |
| 20/07/2009 | FMISO-STD-1.0 | 0.598% |

Table 5. Accuracy indicators (EER) of the results published on the FVC-onGoing on-line evaluator web site from June 2009 to December 2011. Performance evaluation of fingerprint verification open algorithms (left side) and matching algorithms based on the standard minutiae representation template format [ISO/IEC 19794-2 (2005)] (right side).

As an increasing number of electronic applications are being deployed today, the need to authenticate ourselves to machines easily and safely becomes more and more relevant in the current technological age. The development of powerful and low-cost computational platforms in the form of embedded systems has attracted considerable interest as an alternative to the expensive HPC platforms traditionally in charge of the biometric recognition process. In this direction, deciding on efficient architectures for integration is an open research problem and the main objective of this thesis. There is a major design compromise that needs to be handled: application needs versus available technologies.
The continuous but slow accuracy improvement experienced by the recognition algorithms clearly points to the need to develop highly flexible platforms able to absorb future algorithm updates in the field, covering both hardware and software upgrades, and thus to avoid the penalty of having to replace the complete physical platforms later on because of the new demands of the evolving recognition algorithms and/or the new biometric-based applications.

1.4. Thesis Motivation and Scope

Thesis Motivation

In the last decades, the biometrics field, or the idea of using body and behavioural measurements to establish a proof of identity of human beings, has been gaining popularity. With the advent of the current technological age, the automation of the personal recognition process is on the rise, and the physical implementation of biometric systems is a reality. Both private and public sectors increasingly adopt biometric recognition systems to enhance the security of their products/services. One of the most representative examples is the set of AFIS/AFAS applications, which use the discriminative information available in human fingerprints to recognize the identity of any individual. The fingerprint comparison process involves a large variety of pattern matching and image processing tasks, which demand powerful computational platforms in order to optimize the execution time of the applications.

The exploitation of personal recognition systems based on biometric technologies, and their usage as a first filtering step in any customized service/product, is becoming relevant and provides added value that is appreciated by society.
However, a benchmark of biometric systems reveals the main handicaps of current physical systems: the low accuracy of the state-of-the-art recognition algorithms, the reduced security level and the privacy issues related to the biometric data, and the existing compromise between execution time and system costs. One proof of this is the recent patent US007800592, developed by Apple Inc. and known as “Hand held electronic device with multiple touch sensing devices”, which points to personalization rather than security as the final purpose of the biometrics technology because of the low reliability exhibited by today's systems. Another example is the report “Biometric Recognition: Challenges and Opportunities” from the National Research Council (2010) on the state of the art of automated biometrics-based recognition security, where it is argued that existing technologies as implemented are inherently fallible, and that more research and better practices are needed before they can be relied upon in high-security contexts. A further example is that, although tiny fingerprint scanners have been built into many laptops and other mobile devices, they are not widely used for the simple reason that they do not work reliably enough. Therefore, the market clearly demands more reliable and more efficient products, and there is a real need that justifies the research in this area.

Although society has sometimes overlooked this reality and has seen biometric recognition as an ideal and already proven solution, the implementation of biometric recognition systems raises many open issues in several disciplines and numerous sources of uncertainty. Biometric recognition systems are highly complex, and need to be addressed as such.
For this reason, at the start of the twenty-first century, there is a change in the trends of personal recognition algorithms: the original architecture-driven recognition algorithms are evolving into application-driven algorithms, where the main objective is the search for a reliable algorithm able to meet the application demands (recognition accuracy) regardless of the computational power required from the platforms in charge of the processing. The physical platform in charge of executing the recognition algorithm is open, and it must not constrain the processing algorithm itself. The main goal is to develop an ideal or foolproof algorithm able to reach the requested recognition accuracy, whatever physical resources are finally needed. Many research lines are focused on the analysis of alternative architectures, such as HPCs, multiprocessor platforms, GPUs or FPGAs, to the traditional personal computer platforms or embedded systems based on general-purpose microprocessors or DSPs used in forensic, governmental or commercial systems.

In order to spread biometric security all over the world, it is necessary to think about physical platforms based on powerful embedded systems that can be easily integrated into any kind of product, such as mobile phones, automatic teller machines, laptops, personal digital assistant devices, access control systems, etc. In this direction, the usage of programmable logic devices offers some advantages. Since a continuous improvement of the recognition algorithms is foreseen, flexible platforms able to absorb any potential modification of the recognition algorithms in the field, on existing products and without incurring further development costs, are valuable.
In this sense, the deployment of made-to-measure digital circuits used as application-specific hardware accelerators, making use of parallelism and pipelining techniques on field programmable logic devices, can be easily justified. Unlike flexible platforms based on purely software execution on fixed hardware resources, such as HPC, GPU or multiprocessor platforms, the usage of programmable logic devices in the form of FPGAs or SOPCs adds a new dimension to the development of the physical platform: it allows flexibility not only at software level but also at hardware level thanks to their programmable resources.

The physical implementation of the suggested applications is also a challenging and motivating factor, since it needs a multidisciplinary approach that covers many different fields such as (i) system architecture and hardware topics, (ii) software topics, (iii) system integration and hardware-software co-design techniques, (iv) recognition algorithms, (v) programmable logic devices such as FPGAs or SOPCs and hardware description languages, (vi) EDA tools linked to FPGA and SOPC devices, (vii) power consumption topics, (viii) cost optimization, etc. It gives a general view of most of the aspects to be taken into consideration in the development of such real-world products.

Hence, the motivations for selecting this thesis are diverse, and fingerprint-based recognition applications implemented on programmable logic become a challenging field of research that justifies the intellectual effort invested in this work.

Thesis Scope

This thesis encompasses the study of one specific system architecture on which to develop high-performance applications such as those derived from the biometrics field.
The proposed architecture is justified in terms of the accomplishment of the application functional requirements, as well as in terms of improvement of the execution time, minimization of the hardware resources, and optimization of the system costs in comparison with other existing systems, which results in some valuable advances in the state of the art of fingerprint-based personal recognition applications.

This thesis provides the means (i) to understand the purpose of biometrics technology, the issues underlying the design of biometric systems, and the application fields that benefit from the exploitation of biometrics technology, (ii) to receive an in-depth survey of the state of the art in automatic fingerprint-based recognition algorithms and systems, (iii) to explore an alternative architecture solution suitable for the design of highly secure, reliable and accurate fingerprint-based authentication systems, and (iv) to identify the intrinsic limitations of the existing solutions, the main trade-offs in the design of those physical systems, and the open issues that need to be addressed to provide better systems in the future.

The thesis is organized in 6 chapters. After the introductory chapter, which presents the research topic and the main objectives of this work, the thesis addresses each of the following aspects in the next chapters.

Chapter 2 is focused on the development of fingerprint matching algorithms and presents an overview of the recent advances in fingerprint recognition, covering sensing techniques, fingerprint image enhancement, feature extraction, feature alignment and matching processes. It provides a general view of the different functional modules and stages that take place in an AFAS application, as well as the main weaknesses and limitations that need to be addressed in the coming years.
Chapter 3 surveys the existing system architectures of computational platforms oriented to the development of image and signal processing applications that demand real-time performance, like biometric systems. The chapter is organized in different sections, and each section analyses the pros and cons of the potential solutions. Among the different options presented, the deployment of computational platforms based on field programmable logic devices appears to be a promising research direction, able to overcome some of the intrinsic limitations of the solutions already available in the computer science arena.

Chapter 4 provides an in-depth survey of the state of the art in fingerprint authentication systems and their adoption in multiple civilian recognition applications. A review of the existing fingerprint sensors that can be easily integrated in embedded system architectures is also provided. To help readers who do not have a background in biometric technologies, the main research institutions, technology evaluation programs, and bibliographic sources on fingerprint biometrics are also given.

Chapter 5 deals with the main contributions of this thesis. It discusses the technical design, the physical implementation, and some maintenance aspects of fingerprint recognition systems based on programmable logic devices under system-on-chip architectures. Two different computational platforms, with dissimilar features but under the same basic system architecture, are used to implement the same personal recognition algorithm. These two approaches to the same real application make it possible to evaluate the feasibility of the proposed system architecture in both scenarios.
The fingerprint recognition algorithm suggested in this work is not developed from scratch but is based on some best-in-class biometric algorithms well described in the literature [Jain et al., 1999a], [Maltoni et al., 2009]. Each of the processing stages that take place in the recognition algorithm is justified. The accuracy of the proposed system is evaluated with a public database and discussed in this chapter. Both implementations constitute a technical proof of concept of the proposed architecture, which stands out as a suitable alternative to existing systems. Its main advantages and drawbacks are also detailed.

In order to overcome some of the limitations of the suggested system, Chapter 6 covers, in its conclusions and future work sections, the technological aspects that can help to improve the performance of the proposed systems, and those other points that require further investigation. The research publications produced by the author during the realization of this thesis are also listed in this chapter.
The following aspects are not in the scope of this thesis, although some remarks on them are given throughout the text: (i) the available techniques for securing biometric systems against external attacks, such as the inclusion of cryptographic processors or protocols to provide further security in the bidirectional communication link between the end-user and the recognition system itself, or in the internal links used to exchange information among the different functional blocks that are part of the computational platform; (ii) other specific aspects related to the development of reliable recognition algorithms aimed at overcoming the intrinsic limitations in accuracy and in robustness against low quality fingerprint impressions featured by current algorithms; (iii) some privacy issues related to the biometric data management of the recognition systems; and (iv) the benefits of the run-time reconfigurability featured by some existing FPGAs, which can help in lowering the system costs and the power consumption of the recognition systems at the expense of the reconfiguration overhead. Readers interested in the development of embedded systems that exploit the run-time reconfigurability of programmable logic devices may refer to the thesis “Embedded electronic systems driven by run-time reconfigurable hardware”, written by Francisco Fons (Universitat Rovira i Virgili, 2012), where some detailed examples of the implementation of biometric embedded systems under such a technique are also provided.
Because of the limited accuracy exhibited by the presented recognition algorithm, and the fact that the present thesis has not dealt with security aspects related to cryptographic protocols, the intended field of application of this work, as it is, is the set of applications requesting moderate reliability levels, such as most of the commercial applications in use today: access control to restricted facilities (a library, a gym, a car park, etc.), personal devices (mobile phones, PDAs, laptops, cars, etc.), web access, etc., which do not imply excessive risks in case of false matches or fraudulent attacks. In the near future, however, if cryptographic processing is embedded in the system and the reliability of existing recognition algorithms is further improved, the proposed system architecture would also be valid for other market segments such as law enforcement, forensic and high-security applications, or even fingerprint-based identification applications (since most of the processing stages involved in the authentication process are also valid for identification).

Although absolute security does not exist, the implementation of efficient biometrics-based personal recognition systems improves the security of those applications previously handled through either physical personal ID tokens or knowledge-based tokens. This work aims at helping to establish confidence in the usage of automatic biometric systems, reducing fraud, and enhancing public safety in daily use applications.

2.
Automatic Personal Authentication Using Fingerprints

Although crucial aspects such as the distinctiveness and the permanence (stability over time) of human fingerprints are already accepted and proven facts respectively, the physical implementation of automatic systems in charge of recognizing a person based on the analysis of his/her inherent fingerprint features is still a technological challenge. Many scientific aspects need to be taken into consideration, ranging from biometric algorithms [Jain et al., 1999a], [Jain et al., 1999c] and security and privacy concerns [Mateos and Pizarro, 2005], to system design issues like hardware and software implementation, power consumption, cryptographic protocols for the safe exchange of information, and so on [Maltoni et al., 2003], [Maltoni et al., 2009].

This section covers the development of personal authentication systems from an algorithmic point of view. The state of the art in fingerprint recognition algorithms is provided, with special emphasis on each of the stages of the processing. The aim is to obtain reliable algorithms able to cope with multiple known constraints, such as low quality fingerprint impressions and large intra-class and small inter-class fingerprint variabilities, that limit the recognition accuracy of current automated systems.

2.1. Personal Authentication Process

The fingerprint-based personal authentication problem consists in verifying the identity claimed by one individual through the analysis of his/her own fingerprint characteristics. The user authentication process is usually implemented as a checkpoint in those applications that feature personalized access or access limited to registered users only. The user who requests access to the application must first present his/her identity to the authentication system.
The authentication system is then responsible for confirming that the user is really the person he/she claims to be, and that he/she possesses the proper application privileges, by comparing or matching the biometric traits of the user with the biometric traits of the claimed person. Depending on the authentication domain, the personal recognition system can operate either as an on-line system (where an immediate authentication response is needed) or an off-line system (where a long response delay is allowed), and either as an attended process (manual support is allowed in the recognition process) or an unattended process. This work focuses on on-line and fully automatic unattended systems.

The authentication system tries to find the right answer to the question: “Is the user really who he/she claims to be?”, and tries to do so in real time and in an automatic way, without any human support apart from the user's own interaction with the authentication system. As a result of the authentication process, the system either accepts or rejects the claim of identity submitted by the user. To do this, the AFAS bases the analysis on fingerprints. The AFAS conducts either one to one, one to few, few to one, or few to few comparisons between one or several on-line fingerprint signatures directly obtained from the user (one or several instances of one or several fingers of the user), and one or several fingerprint templates (instances of either one single finger or several fingers) related to the claimed identity (pre-stored off-line in the authentication system during the enrolment stage). After the comparison, the AFAS decides whether the identity claimed by the user corresponds to the genuine user's identity or not.

The authentication process is split in two main stages: (i) the enrolment and (ii) the verification or authentication phase. Both stages are covered in the next sub-sections.

2.1.1.
Enrolment

The enrolment stage refers to the process of recording the user's biometric features into the authentication system, either in a centralized database where all the legitimate users are registered, or in a decentralized and individualized database such as a personal smart card issued to each user. The fingerprint characteristics of every user are recorded together with his/her personal data in this stage, so the identity of each user is properly trusted by the system. It is understood that only those users enrolled in the system will have the right application privileges.

The enrolment process consists of several steps. In the first step, called fingerprint acquisition, a digital bitmap of the user's fingerprint is obtained by means of an electronic fingerprint sensor. A greyscale image with the ridge-valley flow pattern of a specific finger of the user is handled as the physical input to be processed in the enrolment stage. In order to improve the quality of the acquired image, the original fingerprint impression is enhanced in a second step by digital signal processing techniques. The fingerprint enhancement process facilitates the next step, the feature extraction process, where the distinctive features linked to the individual's identity are identified, encoded, and grouped in a compact representation called a template. The template is usually a lossy compressed version of the fingerprint. It is not possible to recover the original fingerprint image from the template, but the template has enough information to define the uniqueness of the fingerprint. This template, together with the personal data of the user, becomes the identity token used by the authentication system.
The template stores the unique and invariant distinctive characteristics inherent to the user, so it becomes the personal identifier of the user. In the last step of the enrolment process (template storage), the user's information is stored in the AFAS, in either a central database or a personal smart card. The sequential steps that take place in the enrolment process are depicted in Figure 23.

The enrolment process needs to be inherently secure, since the authentication system bases its decisions on the enrolled templates, and this directly affects the accuracy and reliability of the complete application. For this reason, and depending on the security requirements of the authentication system, all the personal information (user's data, fingerprints, templates, etc.) that flows outside the system should be protected by encryption techniques when being transferred from external acquisition modules (fingerprint sensors, and users' personal data transmission interfaces) to the processing units, or from the processing units to external storage resources (portable smart cards, or external users' databases).

2.1.2. Verification

Once the user is enrolled in the authentication system, he/she becomes a valid owner of the application privileges. In order to make use of those privileges, the user needs to claim his/her identity and to present his/her fingerprints to the authentication system. The authentication system is then responsible for extracting the biometric features from the acquired fingerprints, and matching them against the template features corresponding to the claimed identity (it is assumed that the template features were properly stored in the enrolment stage). The comparison of both sets of biometric features results in the decision on whether or not they belong to the same finger/user.
In case of a positive match, the authentication system considers the user a genuine client and gives him/her access to the application. Otherwise, the user is judged to be an impostor and access to the application is denied.

As in the enrolment stage, the authentication stage is split in several steps: fingerprint acquisition, fingerprint enhancement, feature extraction, and feature matching. Although most of these steps are identical to those of the enrolment stage, one new processing step takes place in the authentication stage. In the feature matching step, the query and template features are compared in order to decide whether or not the original query and template fingerprints correspond to the same finger/user. Thanks to the believed uniqueness of human fingerprints, once it is decided that both fingerprints are generated from the same finger, it follows that the template and query users are the same individual. On the contrary, if after the comparison it is decided that both fingerprints do not belong to the same finger, it is stated that both individuals are not the same (although it is possible that the template and query fingerprints are impressions of two different fingers of the same user). The steps that take place in the authentication or verification process are depicted in Figure 23.

As in the enrolment stage, during the authentication stage the system needs to be protected against any malicious attack. Encryption techniques applied to the data sets received from outside (query fingerprint, query claimed identity, template, etc.), or to the data transferred from the authentication system to the external world (matching result in the form of an authentication OK/NOK response), are good alternatives to increase the robustness of the system against fraudulent attacks.
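The accept/reject decision described above can be sketched as follows. This is a minimal illustration, not the matcher developed in this thesis: it assumes that the query and template features are minutiae already aligned in a common coordinate frame, pairs them greedily, and compares the resulting score against a decision threshold. The names `Minutia`, `similarity` and `verify`, the tolerances, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minutia:
    """Hypothetical feature: position in pixels and ridge direction in degrees."""
    x: float
    y: float
    angle_deg: float


def similarity(query, template, dist_tol=10.0, angle_tol=20.0):
    """Toy score in [0, 1]: fraction of minutiae that can be paired.

    Assumes both sets are already aligned; a real matcher would first
    estimate the rotation and translation between the two impressions.
    """
    if not query or not template:
        return 0.0
    unused = list(template)  # each template minutia may be paired only once
    paired = 0
    for q in query:
        for t in unused:
            dist = ((q.x - t.x) ** 2 + (q.y - t.y) ** 2) ** 0.5
            d_angle = abs(q.angle_deg - t.angle_deg) % 360.0
            d_angle = min(d_angle, 360.0 - d_angle)  # wrap-around difference
            if dist <= dist_tol and d_angle <= angle_tol:
                unused.remove(t)
                paired += 1
                break
    return paired / max(len(query), len(template))


def verify(query, template, threshold=0.5):
    """Accept the identity claim when the score reaches the threshold.

    The threshold value is illustrative only; in practice it is tuned to
    trade false matches against false non-matches.
    """
    return similarity(query, template) >= threshold
```

Raising the threshold makes false matches less likely at the cost of more false non-matches, which is the trade-off that accuracy indicators such as the EER summarize.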
Figure 23. Processing steps involved in the enrolment and authentication stages. (Enrolment stage: fingerprint acquisition, fingerprint enhancement, feature extraction, and feature storage of the template A'. Authentication stage: fingerprint acquisition, fingerprint enhancement, feature extraction of the query B', and matching of the feature sets, whose result is compared with a threshold to yield authentication OK or NOK.)

2.2. Fingerprint Acquisition

Fingerprint acquisition refers to the process of capturing a detailed image of the user's fingertip. Since the acquired image has to be processed by an automatic system, the fingerprint image is usually presented in digital form. Some of the key parameters that characterize fingerprint sensors and their acquired images are the following:

(i) Resolution: it refers to the number of pixels or dots per inch (dpi) available in the acquired bitmap. Depending on the resolution of the sensor device, the resultant image will provide more or fewer fingerprint details. Resolutions in the range of 500 dpi guarantee an excellent distinction between ridges and valleys, as well as the isolation of minutia points. Lower resolutions, in the range of 200 to 300 dpi, result in lower accuracies, whereas higher resolutions, around 1000 dpi, allow the extraction of more specific finger details such as the pores present in the skin.

(ii) Sensing area: the sensing area of the fingerprint scanner is an important parameter linked to the size of the acquired images. The larger the fingerprint region to be acquired, the more finger details are captured and the more distinctive the image can be.
On the contrary, the smaller the acquired fingerprint, the more difficult it is to find overlap between different acquisitions of the same finger, which leads to false non-match errors and reduces the accuracy of the recognition system.

(iii) Dynamic range or depth of the pixel: it refers to the number of bits used to encode the intensity value of each pixel. Most available fingerprint scanners acquire greyscale images with typical pixel depths of 4 bits (16 levels of grey) or 8 bits (256 levels of grey).

Other metrics used to characterize a fingerprint scanner are the following [Yau et al., 2004]:

a) Image quality score, which refers to the degree of accuracy with which an automated fingerprint recognition system can extract unique features from the acquired images for subsequent recognition. The quality of the acquired images thus plays a key role in the accuracy of the AFIS/AFAS applications.

b) Usability, defined as the range of environmental conditions (weather, light exposure, etc.) and finger skin conditions (worn-out, rigid, elastic, etc.) over which acceptable quality fingerprints can be acquired.

c) Consistency, defined as the rate at which the quality of the acquired fingerprints varies with the usage of the sensor. It measures the variation in the sensor acquisition performance over time.

d) Ergonomics, which refers to the ease of use of the sensor device by the intended user.

e) Acquisition speed, which indicates the processing time needed to obtain the bitmap or digital fingerprint impression.

f) System interface, which refers to the specific means of interconnection of the fingerprint scanner with off-chip CPUs or memory devices from which to manage the acquisition and storage processes. Current fingerprint scanners are provided with a wide range of either serial or parallel interface communication links (USB, I2C, SPI, RS-232, dedicated 4-bit or 8-bit parallel buses, etc.).
g) Price, which is one of the main limiting factors when dealing with low-cost embedded systems. Although fingerprint scanner technology is constantly improving and costs continue to decrease, the market relentlessly demands lower-cost fingerprint sensors in innovative products and applications.

2.2.1. Sensing Techniques

A general classification of fingerprint sensing techniques establishes two main groups: (i) manual or semi-automatic sensing techniques, and (ii) fully automatic sensing techniques.

In the case of manual techniques, rolled fingerprints are usually obtained. Rolled fingerprinting is the procedure in which the user rolls his/her finger from one side of the fingernail to the other, thus obtaining all available ridge detail. This method is normally used in law enforcement applications, where the acquisition process is based on ink-and-paper techniques and it is required to maximize the acquired fingerprint area. The user's finger is covered with black ink and pressed against a white paper card. The inked impression produced on the paper is then scanned by an electronic device to obtain a digital image of the whole finger pattern. This is an example of a manual or semi-automatic acquisition technique, where the acquisition and the quality control processes are guided and supervised respectively by human experts. This technique, depicted in Figure 24, is increasingly being replaced by equivalent digital techniques.

Figure 24. Rolled fingerprinting technology.

Fully automatic fingerprint acquisition techniques can be classified in three subcategories according to the live-scan methodology used in the acquisition process: a) one-touch sensors, b) sweep sensors, and c) touchless sensors.
Static or one-touch sensing refers to the procedure in which the user presses, without moving, his/her finger on the sensing surface of a digital scanner. The size of the sensing surface is normally big enough to acquire the required finger details. Although this methodology is characterized by its ease of use, its main drawback is that a latent fingerprint can remain on the sensing surface once the finger is removed from the sensor. Figure 25 shows the main parts of the process.

Figure 25. One-touch sensing technology.

There is a trade-off between the sensing area of the fingerprint scanner, the accuracy of the recognition system, and the cost of the sensor device. The sweeping acquisition technique tries to find a good balance between all of them while also solving the problem related to latent fingerprints. Sensor manufacturers tend to reduce the sensing area in order to lower the cost of their devices. In general, the larger the sensing area, the greater the device cost. The sensing surface of sweeping sensors is dramatically reduced in comparison with one-touch sensors. The sweeping sensor presents a sensing surface with a width similar to one-touch sensors, but its height is only several pixels. The user sweeps his/her finger over the sensing surface and a series of small slices of the fingerprint are collected during the acquisition process, as depicted in Figure 26.

Figure 26. Sweeping sensing technology.

To enable image reconstruction without dependence on the finger sweeping speed, the sensing area has enough lines per slice to guarantee the overlap between consecutive acquired slices.
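The role of this guaranteed overlap can be illustrated with a minimal sketch. It is not the reconstruction method of any particular sensor or of the cited literature: it assumes that each slice is a list of pixel rows of equal width, that consecutive slices overlap by at least one row, and it finds the overlap by minimizing the mean absolute difference between the bottom rows of one slice and the top rows of the next. The function names `estimate_overlap` and `stitch` are hypothetical, and no correction of the distortion caused by a non-uniform sweeping speed is attempted.

```python
def estimate_overlap(prev_slice, next_slice, min_overlap=1):
    """Return k such that the last k rows of prev_slice best match
    the first k rows of next_slice (mean absolute pixel difference)."""
    max_rows = min(len(prev_slice), len(next_slice))
    width = len(prev_slice[0])
    best_k, best_err = 0, float("inf")
    for k in range(min_overlap, max_rows + 1):
        tail = prev_slice[-k:]
        head = next_slice[:k]
        err = sum(abs(a - b)
                  for row_a, row_b in zip(tail, head)
                  for a, b in zip(row_a, row_b))
        err /= k * width  # normalize so different k values are comparable
        if err < best_err:
            best_k, best_err = k, err
    return best_k


def stitch(slices):
    """Rebuild a full image by appending only the new rows of each slice."""
    image = [row[:] for row in slices[0]]
    for prev_slice, next_slice in zip(slices, slices[1:]):
        k = estimate_overlap(prev_slice, next_slice)
        image.extend(row[:] for row in next_slice[k:])
    return image
```

A faster sweep yields a smaller overlap between consecutive slices and a slower sweep a larger one, so the overlap is estimated independently for every pair of slices.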
A processing algorithm is then needed in order to reconstruct the fingerprint image by searching for overlapped regions between consecutive slices and removing the distortion produced by the non-uniformity of the finger speed [Xia and O’Gorman, 2003]. A latent image can be left on a touch sensor by the oil residue of a previously applied finger. However, the sweeping action leaves no more than a slice-sized residue, and the sweeping technique is self-cleaning: the sweeping action cleans the device.

Finally, the 2-D or 3-D touchless sensing technique aims at sensing the fingerprint without subjecting the finger to any distortion or deformation due to contact with a sensing surface [Hiew et al., 2007]. This technology, summarized in Figure 27, is mainly based on optical sensors, and although the acquisition process is touchless, a guide is normally used to obtain the right distance between the finger and the optical sensor.

Figure 27. Touchless sensing technology.

Another classification of fingerprint sensors can be made according to the acquisition and subsequent signal processing stages that take place until the fingerprint digital bitmap is obtained. There are two main fingerprint sensing techniques:

(i) Off-line techniques, in which a relatively long response delay is allowed between the fingerprint acquisition process and the bitmap generation (e.g. digital bitmaps obtained from ink-and-paper impressions).

(ii) On-line sensing techniques, in which the acquisition process automatically results in a digital fingerprint impression. The bitmap is obtained on-the-fly during the acquisition process.

The on-line sensing techniques can be grouped into three generic categories, known as optical, semiconductor, and ultrasound sensors:
- Optical sensors;
- Solid-state or semiconductor sensors, such as:
  - Capacitive sensors,
  - Thermal sensors,
  - Pressure-based sensors,
  - RF-based sensors,
  - Micro electro-mechanical sensors; and
- Ultrasound-based or ultrasonic sensors.
Among all the sensors, semiconductor sensors are considered to be low cost; optical sensors are considered to have a high degree of stability and reliability; and ultrasound sensors are very precise and fraud-free, though expensive to implement. They are all covered in the next sections.

2.2.2. Optical Sensors

Optical sensors work as follows: the finger is exposed to a light source and, depending on the optical mechanism used (reflection or transmission), the reflected or transmitted light is captured by a CCD (Charge-Coupled Device) camera, a CMOS (Complementary Metal-Oxide-Semiconductor) camera, or a TFT (Thin Film Transistor) display acting as a sensor. The optical sensor is in charge of receiving the image of the fingerprint. Optical sensors are often selected for high-traffic environments where a robust sensor is needed. The resilient glass or plastic surface protecting the optical sensor suits the harsh environments typically found in multi-user open access applications. Optical sensors are quite common, even though their cost cannot be extremely low because of the mechanical assemblies and the camera.

Other sensors, called electro-optical scanners, make use of the electrical properties of some polymers, which emit light when excited with the appropriate voltage level. The polymer is directly connected to a CMOS camera. When the finger touches the sensing plane, the polymer emits light in those regions where the ridges touch. The optical sensor then collects the fingerprint image.

2.2.3. Solid-State Sensors

Semiconductor technology is the basis for the development of many silicon chips, fingerprint sensors among them [Mainguet et al., 2000].
Silicon fingerprint sensors can be classified according to the physical signal that the transducer converts into electrical information. Accordingly, there are capacitive and RF sensors based on electric fields, thermal sensors based on pyro-electric information, and pressure chips based on piezo-electric signals.

Capacitive
This technique measures the capacitance between the skin and a reference electrode on the surface of the silicon sensor. The metal electrode acts as one capacitor plate, and the contacting finger acts as the second plate. A passivation layer on the surface of the device forms the dielectric between these two plates. The distance between the skin and the sensing surface varies depending on whether there is a ridge or a valley. As the distance varies, so does the capacitance. The capacitance map read by the sensor array is then converted into a greyscale digital image. The main drawback of this technique is the vulnerability of the sensor to external electric fields such as those caused by electrostatic discharges (ESD).

RF
In this approach, the sensor injects a low radio-frequency signal into the finger, and the local electric field is read by the sensing pixels on the silicon, which act as antennas. The signal strength depends on the local conductivity of the skin (the active capacitive/resistive connection, which is affected by the distance between the skin and the pixel). The sensors can detect whether there is a ridge or a valley, so an image of the fingerprint pattern can be obtained. RF-based sensors can be configured to generate images of the internal layers of the skin: they can read below the surface of the skin, into the live layer.

Thermal
This technique makes use of pyro-electric sensors. Pyro-electric material converts temperature differentials into voltages.
The sensors are typically kept at a high temperature by electrical heating in order to increase the temperature difference between the sensor surface and the finger ridges. When the finger is applied to the sensor, the ridges are in contact with it and the valleys are not, so the temperature of the pyro-electric material changes sharply under the ridges, whereas under the valleys it remains almost unchanged (there is only a small temperature differential between the sensor and the air trapped in the fingerprint valley). The main drawback of this technique is that the thermal image disappears quickly. When the finger is applied to the sensor, a temperature differential is measured and a thermal image is obtained. However, after a short period of time (less than 100 ms) the image vanishes as the finger and the chip reach thermal equilibrium.

Pressure
This technique is based on piezo-electric material. When the finger is applied to the sensing surface, the ridges of the skin exert pressure on it, whereas the valleys do not. The pressure map exerted over a conductive membrane on a CMOS silicon chip is then converted into a digital image.

MEMS
This technique, based on microelectromechanical sensors, is similar to pressure devices. It consists of building a matrix of extremely tiny silicon switches, each of which corresponds to a pixel of the image. Once the finger is placed on the matrix, the switches in contact with ridges are closed by the pressure, whereas the switches below valleys remain open. This yields a binary image of the fingerprint. As with pressure-based sensors, this technology is affected by the coating layer on the sensing surface.
Few devices have been developed with this technology.

Solid-state fingerprint scanners present notable features such as high resolution, small size, zero maintenance, low power consumption and cost-effectiveness, which make them ideal for personal recognition applications built on embedded system architectures.

2.2.4. Ultrasound Sensors

Ultrasound scanners transmit high-frequency acoustic waves toward the fingertip. From the received echo signal, the distances between the sensing device and the finger can be measured. These distances are then converted into a representative image of the fingerprint ridge-valley pattern. The frequencies used range from 20 kHz to several gigahertz, and the energy is acoustic, not electromagnetic.

The main advantage of this technique over optical scanners is that it images the dermis, or sub-surface of the skin, rather than the surface, which may be worn or scratched. Optical fingerprint scanners rely on the difference in the indices of refraction of skin, air and the fingerprint platen in order to obtain a live-scan fingerprint image. Because of its limited depth of penetration, optical technology is particularly sensitive to the surface conditions of the skin. Ultrasonic fingerprint scanners, however, rely on the difference in the acoustic impedance of skin, air and the fingerprint platen. Because ultrasound can penetrate many materials, the performance of these scanners is invariant to the surface conditions of the finger. At each interface, sound waves are partially reflected and partially transmitted. This penetration produces return signals at successive depths. Ultrasound therefore permits imaging beyond air gaps and skin surface contamination into the true ridge-valley structure of the fingers, so that the essential details of fingerprints are mapped accurately.
Despite its advantages, ultrasound sensing requires a large device with mechanical parts, which is not suited to large production volumes at low cost.

2.2.5. Conclusions

This section has introduced the fingerprint acquisition process and the different types of fingerprint scanners available on the market, with silicon-based sensors standing out as a less expensive solution that paves the way toward the deployment of automatic personal recognition systems. Recent advances in technology have made it possible to extend the use of fingerprint capture devices from small-volume law enforcement applications to the larger-volume arena of civilian and commercial applications.

The trend in the electronics market is toward continuously reducing the size of semiconductor devices, so miniaturization also affects silicon-based fingerprint sensors. This results in smaller one-touch sensors and thus smaller fingerprint images (even with higher sensor resolutions). Since the sensor captures a smaller area of the finger, the amount of distinctive information is reduced and there is a strong need for sophisticated algorithms. To face those constraints, and given that a smaller area means a lower cost for silicon chips, the sweeping sensing technique becomes an efficient alternative to silicon-based one-touch sensors, since it allows large images to be acquired at lower cost. Figure 28 shows some examples of commercial products embedding sweeping technology sensors to perform personal authentication on platforms like desktops, laptops, mobile phones, PDAs, wireless devices or access doors.

Figure 28. Examples of biometrics-based products embedding fingerprint sweep sensors.

Capturing good quality fingerprint images under all possible working conditions is extremely difficult.
The sensing technology is not the only factor that plays an important role; the environmental and skin conditions also affect the resulting quality of the fingerprint impressions. The amount of sweat produced by the pores of the skin affects the acquisition process. Too much sweat causes the finger impression to appear smeared, while too little sweat leaves the skin dry, which makes it difficult to image. The performance of automated fingerprint recognition systems depends heavily on the capability of the fingerprint sensor to acquire good quality images regardless of the environmental and skin conditions. Besides, as addressed in the next sections, image processing stages can help to improve the quality of the acquired fingerprint impressions.

2.3. Image Enhancement

A fingerprint is the reproduction of the fingertip epidermis, which is composed of the ridges and the valleys of the skin. The ridge-valley pattern of each fingerprint is permanent: after injuries such as skin cuts, burns or abrasions affect the ridge structure, the skin regenerates and the ridge pattern is restored. The ridge-valley structures in a fingerprint provide essential information for recognition, to the point that the performance of a fingerprint recognition system relies heavily on the quality of the acquired images and the reliability of the discriminatory information extracted from them.

The quality of fingerprint impressions is affected by several factors:
a) The inherent quality of the user’s fingerprint, which depends on aspects such as the kind of job done by the user (manual workers like miners or farmers normally have thin or abraded friction ridges from which it is more difficult to extract the ridge pattern), the age (elderly people present very elastic skin, which can generate some distortions in the acquired images), etc.
b) The fingerprint acquisition conditions, such as the dryness or wetness of the finger skin during the acquisition process, which can introduce a certain level of noise in the resulting image.
c) The performance of the scanning sensor itself. Each fingerprint sensor is more or less robust against factors such as skin conditions, environmental working conditions, device aging effects, and users who are not habituated to the acquisition process (e.g. incorrect finger pressure on one-touch sensors, or incorrect sweeping speed on sweeping sensors, which can affect the quality of the obtained images).

All these factors can lead to inherently noisy or deficient fingerprint impressions. Therefore, the aim of the image enhancement stage is to improve the quality of the already acquired fingerprint images. Fingerprint image enhancement algorithms pursue two main objectives: (i) to improve the clarity and distinctiveness of ridge and valley structures in the fingerprint (filling holes, interrupted ridges, etc.), and (ii) to remove noise within the ridge-valley pattern (inter-ridge bridges, small artefacts, etc.).

A fingerprint impression can present regions of good, medium and poor quality, as depicted in the example of Figure 29. In well-defined regions, ridges are clearly differentiated from each other. In recoverable regions, ridges are corrupted by gaps, creases, smudges, links and the like, but they are still visible and the neighbouring regions provide enough information about their structure to enhance them. In unrecoverable regions, however, the level of degradation of the image does not allow the pattern to be retrieved: ridges are corrupted by such a severe amount of noise and distortion that no ridges and valleys are visible and the neighbouring regions do not allow them to be reconstructed.

Figure 29. Original fingerprint impression with well-defined, recoverable and unrecoverable regions.

The enhancement algorithms are in charge of identifying the unrecoverable regions and masking them out as too noisy for further processing. Once this step is done, the enhancement process focuses only on the well-defined and recoverable regions of the fingerprint. The final aim of the enhancement algorithm is to improve the clarity of ridge structures in those good quality and recoverable areas in order to facilitate the subsequent extraction of fingerprint features. A fingerprint enhancement algorithm should not produce any spurious ridge-valley structures, because spurious structures may change the individuality of input fingerprints and thus degrade the performance of the whole recognition system.

The fingerprint enhancement algorithm is generally composed of a set of signal processing steps. The algorithm receives the acquired greyscale image as input, applies several signal/image processing stages to it, and outputs an enhanced version of the original impression, with a mask of the low quality regions and a clear representation of the ridge-valley flow pattern in the good quality and recoverable regions. The output of the fingerprint enhancement process is normally a greyscale image as well.
Depending on the specific algorithm, all or only some of the following processing steps are carried out during the enhancement stage:
- image conditioning,
- field orientation map computation,
- ridge frequency map computation,
- segmentation and region mask estimation, and
- contextual image filtering.

Much as a fingerprint expert does, and despite the noise present in recoverable and unrecoverable regions, an enhancement algorithm can exploit the visual clues available in the fingerprint image, such as local ridge orientation, local ridge frequency, ridge continuity, etc., to improve the clarity of the acquired fingerprints.

2.3.1. Image Conditioning

Several preprocessing techniques adapt the acquired fingerprints to the next enhancement stages. Some of the most cited are the following:
a) Normalization
b) Histogram equalization
c) Isotropic filtering
d) Background subtraction and adaptive filtering
e) Dyadic scale-space filtering

Normalization
The normalization process of a fingerprint image is defined in [Hong et al., 1998]. The aim of image normalization is to reduce the variation in grey level along ridges and valleys in the different regions of the image. The normalization process does not change the ridge and valley structures; it cannot clean noisy areas, fill ridge breaks or separate parallel touching ridges. Its main purpose is to set the intensity level of the image within a specified range in order to facilitate the subsequent processing steps.
This makes the input images uniform, with similar characteristics regardless of the sensor device and the acquisition conditions, and thus removes the effects of grey level deformation caused by differences in finger pressure. Given a fingerprint image I(i,j) with N×M pixels, the mean m and variance σ² of the image are computed as:

$$m = \frac{1}{N \cdot M} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} I(i,j)$$

$$\sigma^2 = \frac{1}{N \cdot M} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \left( I(i,j) - m \right)^2$$

The input image I(i,j) can be transformed to obtain a new image I′(i,j) with new mean and variance values, m0 and σ0² respectively. The normalized image is obtained by applying the following transformation to the original image:

$$I'(i,j) = \begin{cases} m_0 + \sqrt{\dfrac{\sigma_0^2 \cdot \left( I(i,j) - m \right)^2}{\sigma^2}}, & \text{if } I(i,j) > m \\[2ex] m_0 - \sqrt{\dfrac{\sigma_0^2 \cdot \left( I(i,j) - m \right)^2}{\sigma^2}}, & \text{if } I(i,j) \le m \end{cases}$$

Histogram equalization
The histogram equalization technique converts the original image into a new image in which the greyscale levels are spread uniformly over the complete greyscale range. Since contrast is expanded for most of the image pixels, the transformation makes it easier to distinguish between ridges and valleys in the fingerprint pattern [Greenberg et al., 2002]. Given a digital image I(i,j) of size M×N pixels in 256-greyscale format, the probability density function of a pixel with intensity level k is given by:

$$P(k) = \frac{n_k}{n}$$

where 0 ≤ k ≤ 255 is the greyscale pixel intensity range, n_k is the number of pixels at intensity level k in the image, and n is the total number of pixels, n = M×N. The histogram of the image is obtained by plotting P(k) against k for the complete greyscale intensity range.
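The normalization transformation given above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the thesis: the function name `normalize` and the default targets `m0 = 100` and `var0 = 100` are assumptions chosen for the example, and the image is represented as a plain list of rows.

```python
import math


def normalize(image, m0=100.0, var0=100.0):
    """Map a greyscale image to a prescribed mean m0 and variance var0."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    m = sum(pixels) / n                           # image mean
    var = sum((p - m) ** 2 for p in pixels) / n   # image variance
    if var == 0:
        # A flat image carries no contrast to rescale.
        return [[m0 for _ in row] for row in image]
    out = []
    for row in image:
        new_row = []
        for p in row:
            dev = math.sqrt(var0 * (p - m) ** 2 / var)
            new_row.append(m0 + dev if p > m else m0 - dev)
        out.append(new_row)
    return out
```

Because the deviation of every pixel from the mean is only rescaled by the factor σ0/σ, the ridge and valley structures are left unchanged, which matches the remark that normalization cannot repair noisy areas.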
For each pixel (i,j) of the original image with a greyscale intensity I(i,j) = k_{i,j}, a new intensity I′(i,j) = k′_{i,j} is calculated by applying the uniform histogram equalization transformation:

$$k'_{i,j} = k_{\min} + \left( k_{\max} - k_{\min} \right) \cdot \sum_{k = k_{\min}}^{k_{i,j}} P(k)$$

where k_max and k_min represent the maximum and minimum greyscale intensity levels respectively, k_max = 255 and k_min = 0. The histogram equalization of fingerprints is performed locally on each pixel, taking into account a neighbourhood kernel of size W×W pixels centred on that pixel. This expands the contrast locally and changes the intensity of each pixel according to its local neighbourhood.

Isotropic filtering
Most of the existing fingerprint enhancement algorithms use filtering techniques to improve fingerprint impressions. The filtering techniques can be divided into two main groups, anisotropic and isotropic filtering, depending on whether the filter kernel is orientation sensitive or not. On the one hand, isotropic filtering properly preserves the features of the input images but can hardly improve their quality. On the other hand, anisotropic filtering can effectively remove noise from the image, but only when a reliable ridge orientation is provided. Isotropic filtering is normally used in the first stages of the enhancement process to condition the input image. The two most commonly used isotropic filters are the median filter and the Gaussian filter. Both aim to reduce the intensity dispersion between pixels in a neighbourhood area by removing isolated intensity peaks from the image [Wu et al., 2004].
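The uniform histogram equalization transformation given earlier in this passage can be sketched as follows. For brevity, the sketch applies the transformation globally to the whole image rather than over a W×W neighbourhood of each pixel; the function name `equalize` is hypothetical, pixels are assumed to be integers in the range 0 to 255, and the result is rounded to the nearest integer grey level.

```python
K_MIN, K_MAX = 0, 255


def equalize(image):
    """Global histogram equalization of an 8-bit greyscale image."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    counts = [0] * (K_MAX + 1)          # n_k for each grey level k
    for p in pixels:
        counts[p] += 1
    # cdf[k] holds the running sum of P(k') = n_k' / n for k' <= k.
    cdf, running = [], 0
    for k in range(K_MAX + 1):
        running += counts[k]
        cdf.append(running / n)
    return [[round(K_MIN + (K_MAX - K_MIN) * cdf[p]) for p in row]
            for row in image]
```

A local variant would compute `counts` and `cdf` from the W×W window around each pixel instead of from the whole image, at a correspondingly higher computational cost.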
Background subtraction and adaptive filtering
In [Hadhoud et al., 2006] the authors propose an image conditioning technique based on background subtraction and adaptive filtering as an alternative to other image conditioning steps. The background subtraction process removes the background of the image and increases the dynamic range between the background and the foreground. The minimum and maximum pixel intensities of the image are recorded, and the greyscale image is stretched by applying a transformation at pixel level. Given a fingerprint image I(i,j) with a resolution of 8 bits (256 grey levels), and with I_min and I_max being the minimum and maximum intensity levels of the image respectively, a new image I′(i,j) is obtained by computing the new intensity level of each pixel as follows:

$$I'(i,j) = \frac{I(i,j) - I_{\min}}{I_{\max} - I_{\min}} \cdot 255$$

The resulting image I′(i,j) is used as the input of an adaptive filtering process, which is suitable for enhancing images contaminated by noise. The frequency characteristics of the fingerprint image are as follows:
- the low frequency components of the image dominate flat areas (image background),
- edges are characterized by the presence of middle and high frequency components, whereas
- noise dominates the very high frequency range.

In this approach, the filtering structure keeps the low frequencies unchanged (background of the image), amplifies the middle and high frequencies (corresponding to the ridge-valley fingerprint pattern), and attenuates the very high frequency range (the part of the finger area dominated by noise). This effect is achieved with a filter structure that has low-pass and band-pass characteristics, where the frequency band is adapted dynamically according to the local variations of the fingerprint image. With such adaptive filtering structures, it is possible to remove the noise from the fingerprint impressions while enhancing the contrast between ridges and valleys.
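The pixel-level stretching step of this technique is simple enough to show directly. The sketch below covers only the greyscale stretching, not the adaptive filter that follows it; the function name `stretch` is hypothetical and the output is rounded to integer grey levels as an assumption.

```python
def stretch(image):
    """Stretch an 8-bit greyscale image to the full 0..255 range."""
    pixels = [p for row in image for p in row]
    i_min, i_max = min(pixels), max(pixels)
    if i_max == i_min:
        # A flat image has no dynamic range to stretch.
        return [row[:] for row in image]
    return [[round((p - i_min) * 255 / (i_max - i_min)) for p in row]
            for row in image]
```

For example, an image whose intensities span 50 to 250 is mapped so that 50 becomes 0 and 250 becomes 255, with the intermediate levels spread linearly in between.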
Dyadic scale-space filtering
Other enhancement techniques suitable for fingerprints are those based on scale-space theory [Cheng and Tian, 2004], which can be adopted as an alternative to contextual filtering techniques. The main drawback of contextual filtering is that the adaptive filter characteristics are deduced from the local information of the fingerprint, such as ridge width, field orientation, ridge frequency, etc. This information can be unreliable in areas corrupted by noise, which can lead to poor enhancements. Scale-space filtering, however, is a method that describes images or signals qualitatively, managing the ambiguity of scale in an organized and natural way. Scale-space theory simulates the behaviour of human vision and models visual operation at multiple scales. At some scales only the sketch of the image can be seen, whereas at other scales detailed information can be found. The image is first convolved with Gaussian filters of different sizes, and from the resulting scale-space image set a qualitative description of the original image is deduced that covers all scales of observation. The original fingerprint is decomposed into a series of images at different scales, ranging from coarse-grain to fine-grain details, in order to avoid the presence of noise at different scales. The number of details decreases and the noise is successively suppressed as the scale increases. The aim of this technique is to eliminate the influence of noise while enhancing the fingerprint [Cheng et al., 2002]. The reliable information at each scale is combined to enhance the fingerprint. In each iteration of the algorithm, an enhanced image with reduced noise and a more precise fingerprint ridge-valley pattern is obtained.

2.3.2.
Field Orientation Map Computation

The fingerprint can be viewed as an oriented texture defined by the ridge-valley flow, which changes slowly along the fingerprint. The field orientation map defines the main direction of the ridges and valleys of the skin in each local neighbourhood, as depicted in the example of Figure 30.

Figure 30. Field orientation map of a greyscale fingerprint.

Several methods have been proposed in recent decades to compute the orientation field of fingerprint images [Ratha et al., 1995b], [Turroni et al., 2011], [Maltoni et al., 2009]. Some of the most widely used techniques are described below:
a) Field orientation estimation based on gradient
b) Field orientation estimation considering singular points
c) Field orientation estimation through filter-banks

Field orientation estimation based on gradient
This technique relies on computing the gradient at each pixel from the greyscale intensity levels of the pixels in its local neighbourhood. The gradient points in the direction of maximum greyscale variation of the image, and is composed of the components in the x and y directions, Gx and Gy respectively. The gradient operator has a size of W×W pixels (W = 2w+1, w ∈ ℕ), and may be chosen among Sobel, Prewitt, Kirsch, Marr-Hildreth, etc. depending on the computational requirements.

In [Rao and Schunck, 1989] the authors define a method to estimate the field orientation map of any oriented texture by computing the gradient of a Gaussian-filtered image. The fingerprint image is first smoothed with a Gaussian filter of size W×W pixels tuned to the wavelength of the fingerprint pattern. The symmetric Gaussian filter has the form:

$$\mathrm{Gaussian}(x,y) = \frac{1}{2\pi\sigma^2} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

where the parameter σ adjusts the degree of smoothing applied by the filter. The next step consists of computing the gradient at each pixel (i,j) of the smoothed image. Convolving the smoothed image with the directional gradient operators (in the x and y directions) yields Gx and Gy for each pixel of the image, and from them the gradient magnitude G(i,j) and the field orientation θ(i,j) of pixel (i,j):

$$G(i,j) = \sqrt{G_x^2(i,j) + G_y^2(i,j)}$$

$$\theta(i,j) = \tan^{-1} \frac{G_y(i,j)}{G_x(i,j)}$$

Once the field orientation has been computed at pixel level for the complete image, the pixel orientation is smoothed in order to compute the average orientation over each neighbourhood region. The image is split into square blocks of size M×M pixels (M = 2m+1, m ∈ ℕ), and the orientation of each block is computed as:

$$\Phi = \frac{1}{2} \tan^{-1} \frac{\sum_{i=1}^{M} \sum_{j=1}^{M} G^2(i,j) \cdot \sin 2\theta(i,j)}{\sum_{i=1}^{M} \sum_{j=1}^{M} G^2(i,j) \cdot \cos 2\theta(i,j)}$$

The resultant orientation of the block is (Φ + π/2), since the gradient vector is perpendicular to the direction of anisotropy.

In [Hong et al., 1998] the authors develop a least mean square orientation estimation algorithm. The image is split into square blocks of size W×W pixels (W = 2w+1, w ∈ ℕ). Once both directional gradients have been computed for all pixels in a block, the local ridge orientation of the block (u,v) can be estimated by computing the least square estimate of the ridge orientation:

$$\Delta_x(u,v) = \sum_{i=1}^{W} \sum_{j=1}^{W} 2 \cdot G_x(i,j) \cdot G_y(i,j)$$

$$\Delta_y(u,v) = \sum_{i=1}^{W} \sum_{j=1}^{W} \left( G_x^2(i,j) - G_y^2(i,j) \right)$$

$$\theta(u,v) = \frac{1}{2} \tan^{-1} \frac{\Delta_x(u,v)}{\Delta_y(u,v)}$$

The orientation of a block θ(u,v) is estimated from the greyscale intensity levels of all pixels of the block. Since the orientation may be affected by the presence of noise in the local ridge-valley structures, the authors make this computation more robust against poor quality regions by adding a further processing step: a low-pass filtering of the previously computed local ridge orientation.
The low-pass filtering improves the estimation of the field orientation in uniform areas where the ridge orientation varies slowly, but can slightly decrease the accuracy in regions with singular points, where greater orientation changes occur. The orientation image is converted into a continuous vector field, defined by:

$$\phi_x(u,v) = \cos 2\theta(u,v)$$

$$\phi_y(u,v) = \sin 2\theta(u,v)$$

and the local ridge orientation of each block Φ(u,v) is computed by means of a smoothing operation at block level with a two-dimensional low-pass filter O(i,j), with unit integral and size H×H blocks (H = 2h+1, h ∈ ℕ), as:

$$\phi'_x(u,v) = \sum_{i=-h}^{+h} \sum_{j=-h}^{+h} O(i,j) \cdot \phi_x(u+i, v+j)$$

$$\phi'_y(u,v) = \sum_{i=-h}^{+h} \sum_{j=-h}^{+h} O(i,j) \cdot \phi_y(u+i, v+j)$$

$$\Phi(u,v) = \frac{1}{2} \tan^{-1} \frac{\phi'_y(u,v)}{\phi'_x(u,v)}$$

The resultant orientation angle is (Φ + π/2). The good performance of this algorithm has made it one of the most widely used in fingerprint recognition today.

Field orientation estimation considering singular points
The orientation pattern of a fingerprint is quite smooth and continuous except near the singular points (cores and deltas), where the orientation becomes discontinuous. The previous methods use local ridge gradient information to deduce the whole fingerprint orientation field. However, those methods often result in poor accuracy in the regions close to singular points. Because the ridge orientation presents typical patterns near the singular points, and these are stable even in poor quality fingerprint impressions, several methodologies have been developed to optimize the estimation of the fingerprint direction field by taking advantage of the local ridge features in the surroundings of singular points.
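As an illustration of the gradient-based least-squares estimator and the block-level smoothing described earlier in this section, the following Python sketch computes one orientation per block. It rests on several assumptions that are not in the source: a 3×3 Sobel operator, non-overlapping blocks of `block` pixels, a uniform (box) low-pass filter with replicated borders, and `math.atan2` in place of tan⁻¹ so that the quadrant is resolved. All function names are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # variation along columns (x)
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # variation along rows (y)


def gradients(image):
    """Return (Gx, Gy); border pixels are left at zero."""
    h, w = len(image), len(image[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            sx = sy = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    p = image[i + di][j + dj]
                    sx += SOBEL_X[di + 1][dj + 1] * p
                    sy += SOBEL_Y[di + 1][dj + 1] * p
            gx[i][j], gy[i][j] = sx, sy
    return gx, gy


def block_orientations(image, block=8):
    """Least-squares gradient direction theta(u, v) of each block."""
    gx, gy = gradients(image)
    h, w = len(image), len(image[0])
    theta = []
    for u in range(0, h - block + 1, block):
        row = []
        for v in range(0, w - block + 1, block):
            dx = dy = 0.0
            for i in range(u, u + block):
                for j in range(v, v + block):
                    dx += 2.0 * gx[i][j] * gy[i][j]
                    dy += gx[i][j] ** 2 - gy[i][j] ** 2
            row.append(0.5 * math.atan2(dx, dy))
        theta.append(row)
    return theta


def smooth(theta, h=1):
    """Low-pass filter the doubled-angle vector field with a box filter."""
    rows, cols = len(theta), len(theta[0])
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            sx = sy = 0.0
            for i in range(-h, h + 1):
                for j in range(-h, h + 1):
                    uu = min(max(u + i, 0), rows - 1)   # replicate borders
                    vv = min(max(v + j, 0), cols - 1)
                    sx += math.cos(2.0 * theta[uu][vv])
                    sy += math.sin(2.0 * theta[uu][vv])
            # The 1/(2h+1)^2 weight cancels in the ratio, so it is omitted.
            out_row.append(0.5 * math.atan2(sy, sx))
        out.append(out_row)
    return out


def ridge_angle(phi):
    """Ridge orientation in [0, pi): perpendicular to the gradient."""
    return (phi + math.pi / 2.0) % math.pi
```

Averaging cos 2θ and sin 2θ rather than θ itself is what makes the smoothing well defined, since orientations that differ by π describe the same ridge direction.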
Some algorithms estimate the field orientation map by combining gradient-based methods in the regions far from the singular points with singular-point-oriented methods in the regions near each singular point. The two models are combined smoothly through a weight function. The weight function guarantees that the orientation of each point of the image follows the gradient-based method if the point is far from the singular points, and the singular-point-oriented method if it is near a singularity. Since the combination model relies on the detected singularities, the performance of such algorithms is influenced by the accuracy of the singular point detection methods used. False, missing or displaced singular points from poor quality fingerprint regions or images can degrade the efficiency of those algorithms.

In [Zhou and Gu, 2004] and [Gu et al., 2004] the authors propose a combination method in which a polynomial model characterizes the field orientation map in the regions far from singular points, and a point-charge model estimates the orientation field locally in the non-continuous regions around the singular points. The point-charge model is symmetric, so it cannot give accurate results for singular points surrounded by asymmetric directional fields. To overcome this limitation, Nie et al. introduce in [Nie et al., 2005] an optimized version of the combination model that keeps the polynomial model but replaces the point-charge model with a singular point energy model, which can be symmetric or asymmetric depending on the type of singularity.

Field orientation estimation through filter-banks
A further category is composed of those algorithms that estimate the field orientation map by applying a set of directional filters to the fingerprint.
Instead of being estimated directly from the input fingerprint image, the orientation field is estimated from a set of filtered images obtained from the original fingerprint. This technique takes advantage of the fact that noise is reduced in the filtered images, so the ridge orientation can be estimated reliably, and the combination of those filtered images can directly yield the enhanced fingerprint. However, a huge processing effort is needed before the orientation field is estimated and the enhanced version of the original image is obtained with such a technique.

In [Hong et al., 1996] a set of band-pass Gabor filters, adjusted to the characteristics of the scanning sensor and able to cover the whole ridge orientation and ridge frequency maps, is pre-computed and stored off-line. The input image is then convolved with this set of Gabor filters, which produces a set of filtered images. In the filtered image corresponding to a given Gabor filter, undesired noise is largely removed and the ridges and valleys oriented approximately in the same direction as the filter are preserved, while all portions of the image with non-matching ridge orientations are smoothed out. A ridge extraction technique is then applied to each of the filtered images and a set of binary ridge maps is obtained. The combination, by means of a voting algorithm, of the true ridge maps extracted from the filtered images results in a coarse-level ridge map for the input fingerprint image. The coarse-level ridge map is then tuned locally by means of an orientation estimation algorithm to finally deduce the local orientation of each image pixel.
Once the field orientation map is obtained, the fingerprint image can be adaptively enhanced by using the local orientation information and the filtered images.

Jain et al. propose two types of filter-based algorithms that use a bank of Gabor filters to capture either global details or both local and global details of a fingerprint. In the first method [Jain et al., 1999b], the fingerprint representation scheme is used for classification purposes. According to the special patterns generated by the global ridge-valley flow in the central region of the fingerprint, fingerprints are classified into five categories (whorl, right loop, left loop, arch and tented arch). Gabor filters tuned to a ridge frequency of 1/10 pixels⁻¹ in 500 dpi images and to four selective orientations (0°, 45°, 90° and 135°) are used to capture the global ridge-valley pattern of any fingerprint. The fingerprint image is convolved with each of the four Gabor filters to produce four component images, which capture most of the ridge directionality information present in the fingerprint and thus form a valid representation scheme that allows the fingerprint to be classified into one of the five categories.

In the second method [Jain et al., 2000], the bank of Gabor filters is used to capture both local and global details of a fingerprint, so that fingerprints can be not only classified but also matched. Up to eight directions (0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135° and 157.5°) are required to completely capture the local ridge characteristics. The convolution of the original fingerprint image with the eight Gabor filters results in eight component images, from which the ridge directionality information can be deduced. As in the previous method, the region of interest is determined by locating a reference point in the image.
The region of interest is tessellated into several sectors, and each sector is coded through the grey level information of its corresponding pixels in each filtered image. The set of eight coded images is known as the FingerCode, which describes the fingerprint in a compact way.

A similar approach is presented in [Lee and Wang, 1999], where the authors propose the use of Gabor filter-based features for fingerprint ridge orientation estimation at block level. Given a greyscale fingerprint image, a set of predefined Gabor filters is derived covering m orientations θk, with θk = π·(k-1)/m and k = {1 … m}. The fingerprint image is tessellated into blocks, and each image block is convolved with the predefined set of Gabor filters, resulting in m Gabor features Gθk. Once the m Gabor features Gθk of a block have been obtained, the local ridge orientation Φ of the block can be estimated as follows:

$$\Phi = \frac{\sum_{k=1}^{m} G_{\theta_k} \cdot \theta_k}{\sum_{k=1}^{m} G_{\theta_k}}$$

The performance of most enhancement techniques relies heavily on the local ridge orientation map. If the reliability of the ridge orientation can be guaranteed, anisotropic filtering can produce better results than isotropic filtering. However, if the estimated orientation is not correct, anisotropic filtering can corrupt the real ridge-valley pattern and consequently generate false fingerprint features. For this reason, the orientation estimation algorithm used in an AFAS should not only calculate the correct ridge-valley orientation flow, but also avoid assigning an orientation to noisy blocks. To achieve this goal, the orientation must be discarded in unreliable regions, while the correctness of the estimated orientation is verified in reliable regions. The reliable estimation of the orientation field normally concentrates a large amount of the processing effort, which results in demanding execution times in on-line applications.
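As a minimal sketch of the block-level estimate of [Lee and Wang, 1999] given above, the following Python function computes Φ as the Gabor-feature-weighted mean of the filter orientations θk. The function name and the handling of an all-zero feature vector (returning None) are assumptions made here for illustration; the computation of the Gabor features themselves is left out.

```python
import math

def gabor_weighted_orientation(features):
    """Estimate the block orientation Phi from m Gabor features.

    features[k-1] holds the Gabor feature for theta_k = pi*(k-1)/m, k = 1..m.
    Returns None when all features are zero (assumption of this sketch:
    such a block carries no orientation information).
    """
    m = len(features)
    thetas = [math.pi * k / m for k in range(m)]  # theta_1 .. theta_m
    total = sum(features)
    if total == 0:
        return None
    # Weighted mean of the filter angles, as in the formula for Phi.
    return sum(g * t for g, t in zip(features, thetas)) / total
```

Because this is a plain weighted mean of angles, responses split between orientations close to 0 and close to π are averaged towards π/2 rather than wrapped around; the sketch makes no attempt to correct this.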
2.3.3. Ridge Frequency Map Computation

Apart from the ridge orientation map, the ridge distance or ridge frequency map is another intrinsic characteristic that defines the ridge-valley pattern of a fingerprint. In a local neighbourhood, the ridges and valleys flow in parallel. The greyscale ridge-valley texture can be modelled as a sinusoidal-shaped wave along a direction normal to the local ridge orientation. The frequency of this wave gives information about the distance between parallel ridges in the local neighbourhood. Fingerprint ridge distance is defined as the distance from a given ridge to its adjacent ridges. It can be measured as the distance from the centre of one ridge to the centre of its neighbour ridge. Ridge frequency is the reciprocal of ridge distance, and indicates the number of ridges within a unit length.

Two main groups of techniques exist for ridge frequency determination:

a) The first group considers that in fingerprint images scanned at a given resolution (e.g. 500 dpi) there is little variation in the spatial frequencies (inter-ridge distances) among different fingerprints, and that the average ridge distance of a fingerprint remains the same in all its regions. This implies that there is an optimal scale (average ridge frequency) for analyzing the fingerprint texture. Some methods presented in the literature make use of such an average ridge frequency, which is deduced from the resolution of the scanning sensor and from the average inter-ridge distance obtained experimentally from a representative set of tested fingerprints.
In practice, since each finger has a different dominant ridge frequency, and within the same finger the inter-ridge distance varies slightly from region to region according to the acquisition conditions (elasticity of the skin, uneven pressure exerted on the sensing surface, etc.), the use of the average ridge frequency approach results in an inaccurate enhancement process.

b) The second group, in contrast, computes the ridge frequency in each local neighbourhood in order to obtain a more accurate ridge frequency map.

Several methods in the literature address the computation of the local ridge frequency map of an image. Some of them are described below.

In [Hong et al., 1998] the authors develop an algorithm that estimates the ridge frequency map of a fingerprint image I(i,j) in the spatial domain. To do this, the image is split into blocks (u,v) of size W×W pixels, and a local ridge frequency is deduced for each block. For each image block, a window of size L×W pixels, centered on the block and oriented in the direction Φ normal to the orientation of the block, is extracted. The mean grey level value X of each column m in the oriented window is computed as:

$$X(m) = \frac{1}{W}\sum_{n=0}^{W-1} I(u,v), \qquad m = 0, 1, 2, \ldots, L-1$$

$$u = i + \left(n - \frac{W}{2}\right)\cos\Phi(i,j) + \left(m - \frac{L}{2}\right)\sin\Phi(i,j)$$

$$v = j + \left(n - \frac{W}{2}\right)\sin\Phi(i,j) + \left(\frac{L}{2} - m\right)\cos\Phi(i,j)$$

If no minutiae or singular points appear in the oriented window, the mean grey levels of the columns in the oriented window form a sinusoidal-shaped wave, which has the same frequency as the ridges and valleys in the oriented window. The ridge frequency of each block F(u,v) is estimated from the average number of pixels between two consecutive grey level peaks along a direction orthogonal to the local ridge orientation.
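The oriented-window procedure above can be sketched in Python as follows. This is an illustrative sketch, not the implementation of [Hong et al., 1998]: the function names are hypothetical, the image is assumed to be a list of lists indexed as image[u][v], sampling uses nearest-neighbour rounding, the window is assumed to lie fully inside the image, and a peak is taken to be a simple local maximum of X(m).

```python
import math

def x_signature(image, i, j, phi, W, L):
    """Mean grey level X(m) of each column m of the oriented window at (i, j)."""
    signature = []
    for m in range(L):
        total = 0.0
        for n in range(W):
            u = i + (n - W / 2) * math.cos(phi) + (m - L / 2) * math.sin(phi)
            v = j + (n - W / 2) * math.sin(phi) + (L / 2 - m) * math.cos(phi)
            # Assumption: nearest-neighbour sampling, no bounds checking.
            total += image[int(round(u))][int(round(v))]
        signature.append(total / W)
    return signature

def ridge_frequency_from_signature(signature):
    """Reciprocal of the average distance between consecutive peaks of X(m).

    Returns None when fewer than two peaks are found, i.e. when the block
    has no well-defined frequency under this simple peak criterion.
    """
    peaks = [m for m in range(1, len(signature) - 1)
             if signature[m - 1] < signature[m] >= signature[m + 1]]
    if len(peaks) < 2:
        return None
    average_distance = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    return 1.0 / average_distance
```

Blocks for which the second function returns None correspond to the case where the frequency has to be obtained from neighbouring blocks instead.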
Moreover, since ridge frequency and inter-ridge distance change slowly within a local neighbourhood, in those blocks where minutia points, singular points, or low quality ridge-valley patterns are present, the ridge frequency of the block is estimated by interpolation from the frequency values of neighbour blocks that have a well-defined frequency. A low-pass filter Y(i,j) of size H×H (H=2h+1, h∈N) with unit integral is applied to the ridge frequency map F(u,v) to obtain an enhanced version F′(u,v) as follows:

$$F'(u,v) = \sum_{i=-h}^{+h}\sum_{j=-h}^{+h} Y(i,j) \cdot F(u+i, v+j)$$

Since the ridge frequency methodology proposed by Hong et al. is computationally expensive, new approaches have been suggested in the literature to reduce the computational load of the algorithm. In this direction, in [Bernard et al., 2002] and [Hong et al., 1996] the authors propose determining the ridge frequency map of fingerprints through filter-banks.

Maio and Maltoni propose in [Maio and Maltoni, 1998] a new approach to estimate the local ridge-line density in digital images, assuming that fingerprint patterns can be modelled by purely sinusoidal waves. Let fρ,θ,α,ν(x,y) be a sinusoidal signal that models the local shape of the ridge-line pattern over the xy plane. Let α and ν be the amplitude and the frequency of the wave respectively, θ∈[-π/2,π/2) the local direction, and ρ a constant that corresponds to the local average intensity of the image pixels. Under these conditions, the ridge-valley pattern in a local neighbourhood of the image can be modelled by the general analytic expression:

$$f_{\rho,\theta,\alpha,\nu}(x,y) = \rho + \alpha \cdot \sin\left(\nu \cdot (x \cdot \sin\theta + y \cdot \cos\theta)\right)$$

The authors develop a methodology to calculate the unknown frequency ν at pixel level on digital fingerprints based on partial derivatives of the image.
The method to calculate ν requires a discrete square window of size W×W pixels to be defined and centered on the pixel where the local frequency has to be estimated. The choice of the window size W×W involves a trade-off. It has been shown that the unknown frequencies of digital images are estimated well when large windows are selected as the local neighbourhood of each pixel. However, if the ridge-line density varies noticeably, large windows have an undesired smoothing effect. This approach fails when dealing with poor quality images that cannot be modelled by sinusoidal waves.

All the previous methods are based on the idea that fingerprints can be modelled by sinusoidal waves. However, many poor quality prints cannot be modelled as sinusoidal signals. In order to deal with this limitation, some approaches estimate the ridge distance map of fingerprints in the frequency domain.

In [Kovács-Vajna et al., 2000] the authors address the computation of the local average ridge distance map in the spatial-frequency domain by means of a spectral approach. Given a two-dimensional discrete fingerprint pattern in the spatial domain, the original image is tessellated into non-overlapped square blocks of size W×W pixels, and a spectral analysis of the pattern in each block is carried out in order to estimate its average ridge distance. If f(x,y) is the greyscale value of the pixel with coordinates x,y∈{0,…,W-1} in an image block, the discrete Fourier transform of f(x,y) is defined as:

$$F(u,v) = \frac{1}{W}\sum_{x=0}^{W-1}\sum_{y=0}^{W-1} f(x,y) \cdot e^{-\frac{2\pi j\,(x \cdot u + y \cdot v)}{W}}$$

where j is the imaginary unit, and u,v∈{0,…,W-1}. F(u,v) is the 2-D discrete Fourier transform, or 2-D discrete spectrum, of f(x,y). Both f(x,y) and F(u,v) can be considered as elements of two W×W matrices.
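The transform defined above can be evaluated directly, as in the following Python sketch. It is a naive O(W⁴) evaluation meant only to make the definition concrete; the function names are hypothetical. Reading a ridge distance off the strongest non-DC harmonic as W/√(u²+v²) is an assumption of this sketch for a purely sinusoidal block, and is not necessarily how the cited spectral method proceeds.

```python
import cmath
import math

def dft2(block):
    """2-D DFT of a W x W block, using the 1/W normalisation of the text."""
    W = len(block)
    F = [[0j] * W for _ in range(W)]
    for u in range(W):
        for v in range(W):
            acc = 0j
            for x in range(W):
                for y in range(W):
                    acc += block[x][y] * cmath.exp(-2j * math.pi * (x * u + y * v) / W)
            F[u][v] = acc / W
    return F

def dominant_ridge_distance(block):
    """Period, in pixels, of the strongest non-DC harmonic of the block."""
    W = len(block)
    F = dft2(block)
    best_mag, best_uv = -1.0, None
    for u in range(W):
        for v in range(W):
            if (u, v) != (0, 0) and abs(F[u][v]) > best_mag:
                best_mag, best_uv = abs(F[u][v]), (u, v)
    u, v = best_uv
    # Fold indices above W/2 to negative frequencies (real input is symmetric).
    if u > W // 2:
        u -= W
    if v > W // 2:
        v -= W
    return W / math.hypot(u, v)
```

For realistic block sizes an FFT routine would normally replace the quadruple loop; the direct form is kept here because it mirrors the equation term by term.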
The Fourier transform makes it possible to move from the spatial domain to the frequency domain, and the inverse Fourier transform allows moving back again:

$$f(x,y) = \frac{1}{W}\sum_{u=0}^{W-1}\sum_{v=0}^{W-1} F(u,v) \cdot e^{\frac{2\pi j\,(x \cdot u + y \cdot v)}{W}}$$

F(u,v) is in general a complex number. If F(u,v) is expressed as |F(u,v)|·e^(j·arg F(u,v)), and the fact that f(x,y) is real is taken into account, the imaginary part of the previous equation can be discarded, and the image can be understood as a sum of harmonics whose phase and amplitude are modulated by the complex coefficients F(u,v):

$$f(x,y) = \frac{1}{W}\sum_{u=0}^{W-1}\sum_{v=0}^{W-1} \left|F(u,v)\right| \cdot \cos\left(\frac{2\pi\,(x \cdot u + y \cdot v)}{W} + \arg F(u,v)\right)$$

A regular pattern f(x,y) can therefore be seen as a linear combination of periodic signals in the frequency domain (harmonics). The set of all harmonic coefficients of the image block is commonly referred to as its spectrum, and the average ridge distance of the block is then estimated from its spectrum.

The spectral algorithm estimates the average ridge distance at block level in a two-step procedure: first, the average ridge distance is estimated in the reliable blocks of the image, and then this information is propagated to the remaining regions (areas affected by noise or other defects, which make the local distance estimation by the spectral approach unreliable) to complete the computation. Since the algorithm can identify those blocks in which the ridge orientation estimate is not reliable, a diffusion mechanism is used to extend the estimation to blocks corrupted by noise or with a high curvature pattern, by propagating the distance values of the closest reliable blocks. Given a square block centered on (x,y), the average ridge distance in that block is represented by d(x,y).
If the estimate produced by the spectral approach turns out to be unreliable, a reliable average ridge distance can be deduced by means of a smoothing procedure, based on geometric considerations, from those neighbour blocks that feature a reliable estimate. To do so, a 2-level neighbourhood is taken into account. For the closest blocks, an estimate d′(x,y) is computed as:

$$d'(x,y) = \frac{1}{4}\left[d(x+1,y) + d(x-1,y) + d(x,y+1) + d(x,y-1)\right]$$

And for the diagonal neighbours, a second estimate d″(x,y) is obtained:

$$d''(x,y) = \frac{1}{4}\left[d(x+1,y+1) + d(x-1,y-1) + d(x-1,y+1) + d(x+1,y-1)\right]$$

By averaging both equations, the average ridge distance d(x,y) can be estimated from all eight neighbour blocks, which completes the average distance map of the fingerprint image.

In [Zhan et al., 2005], an improved method to accurately estimate the fingerprint ridge distance is presented, based on the spectral approach. The fingerprint ridge distance is estimated at block level as in the original method. The main novelty of this algorithm is that the analysis of fingerprints is done in the continuous frequency domain instead of the discrete frequency domain. The fingerprint ridge distance d is therefore a decimal fraction in the continuous spectral approach, which gives a higher precision than the integer value typically obtained in the original discrete spectral approach [Kovács-Vajna et al., 2000]. Another algorithm to estimate the fundamental frequency of unknown, periodic non-sinusoidal signals in the frequency domain is developed by Xudong Jiang in [Jiang, 2000].
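The two-level neighbourhood smoothing used by the spectral approach of [Kovács-Vajna et al., 2000], described above, can be sketched as follows. The function name is hypothetical, and the sketch assumes that the block at (x, y) is not on the border of the map and that all eight neighbours hold reliable estimates; the handling of unreliable neighbours is not covered here.

```python
def smooth_unreliable_distance(d, x, y):
    """Estimate d(x, y) from its eight neighbours in the block distance map d."""
    # First level: the four closest (edge-sharing) neighbours, d'(x, y).
    d1 = (d[x + 1][y] + d[x - 1][y] + d[x][y + 1] + d[x][y - 1]) / 4.0
    # Second level: the four diagonal neighbours, d''(x, y).
    d2 = (d[x + 1][y + 1] + d[x - 1][y - 1]
          + d[x - 1][y + 1] + d[x + 1][y - 1]) / 4.0
    # Average of both estimates.
    return (d1 + d2) / 2.0
```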
In general, several factors constrain the reliable estimation of the ridge frequency map of fingerprints:

a) the inherent variability of ridge distance from one person to another;
b) the intrinsic variability of ridge distance within the same fingerprint;
c) the intrinsic variability of ridge direction within one fingerprint, which can also affect the ridge frequency estimation methods;
d) the presence of noise, such as low contrast, ridge breaks, ridge joints, and so forth, which may distort ridge frequency estimation methods as well;
e) the occurrence of minutia points, which may disturb the estimation of the ridge distance; and
f) the existence of high curvatures, such as regions containing singular points, which also makes the reliable estimation of local ridge distances difficult.

Despite all these constraints, the ridge distance and field orientation maps provide valuable information for designing adaptive filters able to enhance images locally.

2.3.4. Segmentation and Region Mask Estimation

Segmentation of fingerprint images is an important step in fingerprint recognition. It refers to the process of isolating the valid fingerprint area, also known as foreground, from the image background. It aims to identify non-ridge regions and unrecoverable ridge regions of very low quality (smudged or noisy areas) and exclude them all as image background, where no valid information is available or can be extracted. Furthermore, segmentation aims at identifying those good quality and recoverable regions of the image from which reliable and discriminatory fingerprint features can be extracted.

A good segmentation technique should be insensitive to the contrast of the original image, should be robust against the fingerprint acquisition conditions and the resulting image quality levels, and should provide consistent results throughout the whole variety of images used in any application.

Several methods have been developed for the segmentation of fingerprint images. Most of them can be classified into three main groups:

a) Segmentation techniques based on grey level information
b) Segmentation techniques based on directional information
c) Segmentation techniques based on grey level and directional information

Two types of features are used in fingerprint segmentation: block features and pixel features. Because pixel-based segmentation methods are time consuming, block features are mostly used. Independently of the type of features used, segmentation methods divide the fingerprint image into non-overlapped elements (blocks or pixels) and decide on the type of each element (foreground or background) by comparing the extracted feature vectors with certain thresholds. Accurate segmentation is especially important for the subsequent reliable extraction of discriminative traits such as minutiae, singular points, ridge distance and field orientation maps.

Segmentation techniques based on grey level information

The features used in fingerprint segmentation are mainly derived from the greyscale pixel intensity information of each local neighbourhood. A wide variety of features can be computed:

- the greyscale mean and variance in a local neighbourhood,
- the greyscale contrast (or the quotient variance/mean) in a local neighbourhood,
- the frequency of ridges and valleys in a local neighbourhood,
- the histogram of pixel intensities in a local neighbourhood,
- the average intensity difference between ridges and valleys, or
- the cluster degree of a block.

Some works deal with only one of these features.
For instance, the fingerprint image is tessellated into non-overlapped blocks and the greyscale variance of each block is computed. If the variance is greater than a global threshold, the block is assigned to the foreground; otherwise it is assigned to the background. Other works, however, combine several features in the segmentation criterion. In this direction, in [Chen et al., 2004a] the authors present a segmentation algorithm based on multiple block features. The segmentation algorithm uses three types of features: the block cluster degree, the block mean information, and the block variance. The fingerprint image is partitioned into blocks of size W×W pixels, and for each of the blocks:

(i) The block cluster degree CluD measures how well the ridge pixels are clustered. It is defined as:

$$CluD = \sum_{i,j \in Block} \mathrm{sign}\left(I_{i,j}, Img_{mean}\right) \cdot \mathrm{sign}\left(D_{i,j}, ThreCluD\right)$$

where:

$$D_{i,j} = \sum_{m=i-2}^{i+2}\sum_{n=j-2}^{j+2} \mathrm{sign}\left(I_{m,n}, Img_{mean}\right), \qquad \mathrm{sign}(\alpha,\beta) = \begin{cases} 1 & \text{if } \alpha < \beta \\ 0 & \text{if } \alpha \geq \beta \end{cases}$$

Ii,j is the greyscale intensity level of pixel (i,j), Imgmean is the greyscale intensity mean of the whole image, and ThreCluD is an empirical threshold parameter.

(ii) The block mean information MeanI measures the difference between the local block mean and the global image mean:

$$MeanI = Block_{mean} - Img_{mean} = \left(\frac{1}{W \cdot W}\sum_{i,j \in Block} I_{i,j}\right) - Img_{mean}$$

(iii) The block variance Var is used as a third feature. In general, the variance of the ridge-valley structures in the foreground is higher than the variance of the noise in the background.
The block variance of one block is computed as:

$$Var = \frac{1}{W \cdot W}\sum_{i,j \in Block} \left(I_{i,j} - Block_{mean}\right)^2$$

A linear classifier tests a linear combination f of the features of each block:

$$f = w^T \cdot x = w_0 \cdot CluD + w_1 \cdot MeanI + w_2 \cdot Var + w_3$$

where w=[w0 w1 w2 w3]T is the weight vector, and x=[CluD MeanI Var 1]T is the feature vector. The linear classifier is trained for block classification, using the criterion of the minimal number of misclassified samples. Moreover, morphological operators (opening and closing) are applied as postprocessing stages to reduce the number of classification errors and to improve the segmentation performance of the algorithm.

Segmentation techniques based on directional information

Some segmentation techniques use the directional image map to discern between valid and non-valid fingerprint regions. These techniques exploit the existence of an oriented pattern in the foreground, and of a non-oriented isotropic pattern in the background. The features normally used in those techniques are:

- the average gradient of pixels in a block,
- the histogram of pixel gradients in a block,
- the gradient coherence at pixel level, or
- the standard deviation of the Gabor feature map.

In this scenario, the ridge orientation estimation task precedes the segmentation step. On the one hand, blocks without a dominant orientation or with an incorrectly estimated orientation are practically unrecoverable; therefore, it is reasonable to convert them to background. On the other hand, good quality and recoverable ridge regions are identified as foreground in the segmentation process and are analysed further.

In [Maio and Maltoni, 1997] the authors suggest tessellating the image into blocks of N pixels and discriminating foreground from background by using the average magnitude of the gradient in each block.
The gradient of a pixel is obtained from the directional gradients Gx and Gy:

$$G_{pixel} = \sqrt{G_x^2 + G_y^2}$$

Once the gradient of each pixel has been computed, the average gradient of one block can be deduced as:

$$G_{block} = \frac{1}{N}\sum_{N} G_{pixel}$$

Since the fingerprint area is rich in edges due to the ridge/valley alternation, the gradient response is high in the fingerprint area, whereas it is low in the background. Comparing the computed average gradient with a certain threshold allows the block to be assigned either to the foreground or to the background. In this method, however, noisy or low quality unrecoverable regions are normally considered as foreground, which reduces the accuracy of the subsequent feature extraction stages.

In [Jain et al., 1997b] the authors present a segmentation process based on the certainty level CL of the orientation field at pixel level. The certainty level of the orientation field at pixel (i,j) is defined as:

$$CL(i,j) = \frac{1}{W \cdot W} \cdot \frac{\sqrt{V_x(i,j)^2 + V_y(i,j)^2}}{V_e(i,j)}$$

where:

$$V_x(i,j) = \sum_{u=i-\frac{W}{2}}^{i+\frac{W}{2}}\ \sum_{v=j-\frac{W}{2}}^{j+\frac{W}{2}} 2 \cdot G_x(u,v) \cdot G_y(u,v)$$

$$V_y(i,j) = \sum_{u=i-\frac{W}{2}}^{i+\frac{W}{2}}\ \sum_{v=j-\frac{W}{2}}^{j+\frac{W}{2}} \left(G_x^2(u,v) - G_y^2(u,v)\right)$$

$$V_e(i,j) = \sum_{u=i-\frac{W}{2}}^{i+\frac{W}{2}}\ \sum_{v=j-\frac{W}{2}}^{j+\frac{W}{2}} \left(G_x^2(u,v) + G_y^2(u,v)\right)$$

W×W is the size of a local neighbourhood around pixel (i,j), and Gx and Gy are the gradient magnitudes in the x and y directions, respectively. For each image pixel, if the certainty level of the orientation field is below a certain threshold TCL, the pixel is marked as a background pixel; otherwise it is considered as foreground.

In [Bazen and Gerez, 2000] the authors present a segmentation technique based on the coherence of the directional field. According to the authors, the main distinction between foreground and background lies in the strength of the orientation of the ridge-valley structures.
Therefore, the coherence, which measures how well all squared gradient vectors share the same orientation, is proposed as the segmentation criterion. The coherence varies in the range [0,1]. If the squared gradient vectors are all parallel, the coherence is 1; if they are equally distributed over all directions, the coherence is 0. Between these two extreme situations, the coherence varies between 0 and 1, thus providing the required measure. First, the fingerprint impression is low-pass filtered for noise reduction as an image conditioning stage. Then, the directional gradients Gx and Gy of each pixel are computed taking a local neighbourhood N into account. Finally, the gradients at pixel level are averaged using a Gaussian filter. From the gradients, the coherence Coh of each pixel is computed as:

$$Coh = \frac{\sqrt{\left(G_{xx} - G_{yy}\right)^2 + 4\,G_{xy}^2}}{G_{xx} + G_{yy}}$$

where:

$$G_{xx} = \sum_{N} G_x^2, \qquad G_{yy} = \sum_{N} G_y^2, \qquad G_{xy} = \sum_{N} G_x \cdot G_y$$

A coherence threshold is used to distinguish between foreground and background areas, resulting in a segmentation mask. The segmentation mask normally presents spurious points or noisy parts in the fingerprint area, as well as finger regions in the background. In order to remove these undesired effects and to obtain a smooth and clustered fingerprint region, morphological operations are applied to the segmentation mask. The opening operation removes small foreground areas, while the closing operation fills holes in the foreground region. The resulting segmentation mask delimits the basic shape of the fingerprint in a compact way, covering the good quality and recoverable regions available in the print impression, even when dealing with noisy images.

In [Shen et al., 2001] the authors present a segmentation technique that is based on Gabor filters.
The fingerprint image is divided into blocks of size W×W pixels, and for each block, the response of m oriented Gabor filters is computed in order to determine whether the block belongs to the foreground or to the background. An even symmetric Gabor filter has the following spatial form:

$$g(x,y,\theta_k,f,\sigma_x,\sigma_y) = \exp\left\{-\frac{1}{2}\left[\frac{x_{\theta_k}^2}{\sigma_x^2} + \frac{y_{\theta_k}^2}{\sigma_y^2}\right]\right\} \cdot \cos\left(2\pi f x_{\theta_k}\right)$$

where:

$$x_{\theta_k} = x \cdot \cos\theta_k + y \cdot \sin\theta_k$$

$$y_{\theta_k} = -x \cdot \sin\theta_k + y \cdot \cos\theta_k$$

$$\theta_k = \frac{\pi \cdot (k-1)}{m}, \qquad k = 1, \ldots, m$$

f is the frequency of the sinusoidal wave, m denotes the number of orientations under study, θk is the kth orientation of the Gabor filter, and σx and σy are the standard deviations of the Gaussian envelope along the x and y axes, respectively. Since most local ridge structures of fingerprints have a well-defined local frequency and orientation, in each block f can be set to the reciprocal of the average inter-ridge distance, and the orientation θk is set to cover the range [0,π) by means of m equidistant orientations. Once the parameters of the Gabor filters have been decided for each block, m Gabor filters are obtained. The convolution of each Gabor filter with the grey level image block results in the magnitude of m directional Gabor features, computed as:

$$G(x_0,y_0,\theta_k,f,\sigma_x,\sigma_y) = \left|\ \sum_{x=-\frac{W}{2}}^{\frac{W}{2}-1}\ \sum_{y=-\frac{W}{2}}^{\frac{W}{2}-1} I(x_0+x,\,y_0+y) \cdot g(x,y,\theta_k,f,\sigma_x,\sigma_y)\ \right|$$

where I(x,y) denotes the greyscale fingerprint image, g(x,y,θk,f,σx,σy) is the oriented Gabor filter, and G(x0,y0,θk,f,σx,σy) refers to the magnitude of the Gabor feature. The standard deviation of the m Gabor features corresponding to one block is used for block segmentation.
After the m Gabor features Gθk have been obtained, their standard deviation G′ is computed as follows:

$$\overline{G_\theta} = \frac{1}{m}\sum_{k=1}^{m} G_{\theta_k}, \qquad G' = \sqrt{\frac{1}{m-1}\sum_{k=1}^{m}\left(G_{\theta_k} - \overline{G_\theta}\right)^2}$$

For good quality image blocks with a dominant ridge orientation, the values of one or several Gabor features are much larger than the rest. For poor quality image blocks or background blocks without local ridge orientation, in contrast, the values of all m Gabor features are close to each other. Therefore, the standard deviation G′ of the m Gabor features is used for two purposes: foreground/background segmentation and image quality estimation. Comparing the standard deviation of each block with a certain threshold results in the segmented image. The technique has several drawbacks: precision is lost at the borders of the region of interest and in low contrast regions, and some valid regions that may contain important ridge and minutiae information are lost as background.

In order to overcome these limitations, in [Alonso-Fernandez et al., 2005] the authors introduce some improvements to the Gabor-based segmentation algorithm. The loss of precision at the borders is addressed by introducing a tolerance box around the borders of the region of interest, which discards the ridges, singular points and minutiae points found in this area, since they are not stable for recognition. In order to avoid losing valid regions in the foreground, the authors use a block overlapping technique, with blocks of size W×W and an overlap of W/2 pixels. As a result, four different Gabor features are obtained for each W/2×W/2 block of the image, which are then averaged. Moreover, some additional heuristic constraints are imposed in order to discard those blocks that are not suitable for the frequency estimation algorithm. Experimental results show that the enhanced algorithm provides higher robustness than the original one.
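The Gabor-feature segmentation criterion of [Shen et al., 2001] described above can be sketched in Python as follows. This is a hedged illustration rather than the cited implementation: the function names and the threshold are hypothetical, and, as an assumption of this sketch that is not stated in the source, the block mean is subtracted before filtering so that the non-zero DC response of the even symmetric filter does not dominate the features.

```python
import math

def gabor_value(x, y, theta, f, sigma_x, sigma_y):
    """Even symmetric Gabor filter g(x, y, theta_k, f, sigma_x, sigma_y)."""
    x_t = x * math.cos(theta) + y * math.sin(theta)
    y_t = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-0.5 * (x_t ** 2 / sigma_x ** 2 + y_t ** 2 / sigma_y ** 2))
    return envelope * math.cos(2 * math.pi * f * x_t)

def gabor_features(block, f, sigma_x, sigma_y, m):
    """Magnitudes of the m directional Gabor features of a W x W block."""
    W = len(block)
    half = W // 2
    # Assumption of this sketch: remove the block mean before filtering.
    mean = sum(sum(row) for row in block) / (W * W)
    features = []
    for k in range(1, m + 1):
        theta = math.pi * (k - 1) / m
        acc = 0.0
        for x in range(-half, half):
            for y in range(-half, half):
                acc += (block[x + half][y + half] - mean) * gabor_value(
                    x, y, theta, f, sigma_x, sigma_y)
        features.append(abs(acc))
    return features

def gabor_std(features):
    """Standard deviation G' of the m Gabor features (m - 1 in the denominator)."""
    m = len(features)
    mean = sum(features) / m
    return math.sqrt(sum((g - mean) ** 2 for g in features) / (m - 1))

def is_foreground(block, threshold, f=0.1, sigma_x=4.0, sigma_y=4.0, m=8):
    """Foreground if G' exceeds a threshold (hypothetical, to be tuned)."""
    return gabor_std(gabor_features(block, f, sigma_x, sigma_y, m)) > threshold
```

The default values f = 0.1, σx = σy = 4.0 and m = 8 are illustrative only; in the method described above f follows from the average inter-ridge distance.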
A new fingerprint segmentation technique based on multiscale Gabor wavelet filter banks is proposed in [Bernard et al., 2002]. A two-dimensional Gabor wavelet has the general form:

$$g_{\omega,\theta}(x,y) = \frac{1}{\sigma_{\theta\perp}} \cdot e^{-\frac{v^2}{2\sigma_{\theta\perp}^2}} \cdot \frac{1}{\sigma_\theta} \cdot e^{-\frac{u^2}{2\sigma_\theta^2}} \cdot e^{i \cdot \omega \cdot u}$$

with:

$$u = x \cdot \cos\theta + y \cdot \sin\theta, \qquad v = -x \cdot \sin\theta + y \cdot \cos\theta$$

σθ and σθ┴ are scale parameters in the direction of the wave and in its orthogonal direction, respectively. Therefore, a two-dimensional Gabor wavelet is composed of a band-pass filter in the direction θ of the wave, and a low-pass filter in the orthogonal direction θ┴. A filter bank of Gabor wavelets comprising eight orientations and three frequencies is used to estimate the local orientation and frequency of the fingerprint. A total of 24 filters are applied to each local neighbourhood of the fingerprint at pixel level. For each point (x0,y0), a local neighbourhood of size W×W is considered. The fingerprint image f(x,y) is normalized as f′(x,y) to have a constant energy V, with the mean value of f′ equal to 0 and its norm independent of (x0,y0):

$$f'(x,y) = \frac{V}{v(x_0,y_0)} \cdot \left(f(x,y) - m(x_0,y_0)\right)$$

where m and v are the mean and variance of f in the local neighbourhood of (x0,y0), respectively. The local projection of f′ on each of the 24 filters of the bank is computed as:

$$A_{\omega,\theta} \cdot e^{i \cdot \varphi_{\omega,\theta}} = \frac{\displaystyle\int_{-\frac{W}{2}}^{\frac{W}{2}}\int_{-\frac{W}{2}}^{\frac{W}{2}} f'(x,y) \cdot g_{\omega,\theta}(x,y) \cdot dx \cdot dy}{\left\| f' \right\| \cdot \left\| g_{\omega,\theta} \right\|}$$

with:

$$A_{\omega,\theta} \in [0,1], \qquad \varphi_{\omega,\theta} \in [0,2\pi)$$

The local projection is a complex number. For segmentation purposes, only the magnitude Aω,θ of the projection is taken into account. For each pixel, the filter that gives the largest projection coefficient Aω,θ is selected and provides the local direction θ and the local frequency ω.
After that, a low-pass filter is applied to each parameter in order to obtain the final values to be used (θ′, ω′ and A′ω,θ). Because the energy of each pixel neighbourhood is normalized, the coefficients A′ω,θ are not influenced by the local contrast of the print. Two thresholds T1 and T2 (0 < T1 < T2 < 1) are used to segment the original fingerprint as follows:

(i) if 0 ≤ A′ω,θ(x0,y0) < T1, the pixel neighbourhood does not have an oriented and periodic structure, and the point (x0,y0) is considered as background;
(ii) if T1 ≤ A′ω,θ(x0,y0) < T2, the pixel neighbourhood has a weak oriented and periodic structure, and the point is therefore placed in a noisy area;
(iii) if T2 ≤ A′ω,θ(x0,y0) < 1, the neighbourhood of the given point has a strong oriented and periodic structure, and the point is therefore considered as foreground.

The segmentation technique is thus able to distinguish among background regions, noisy areas, and the region of interest. It avoids the detection of false ridges, singularities or minutia points in noisy areas, as well as the loss of valid features in foreground regions, and it provides a global quality score that can be used for the automatic rejection of low quality images.

Segmentation techniques based on grey level and directional information

This group is based on the fusion of multiple features, related to the grey level intensity of the image pixels and to the directional ridge-valley information. Both kinds of features are merged in an automatic classifier in order to distinguish the noisy areas present in the image from the foreground.

In [Bazen and Gerez, 2001] the authors present an algorithm for the segmentation of fingerprints based on pixel features.
The mean, the variance, and the coherence of the gradient are the three parameters used to perform segmentation:
(i) The mean corresponds to the average grey value in a local neighbourhood of each pixel. The mean grey value of the foreground pixels is in general darker than that of the background pixels.
(ii) The variance determines the greyscale uniformity level of the pixels in a local neighbourhood. In general, the variance of the ridge-valley structures in the foreground is higher than the variance of the noise in the background.
(iii) The coherence measures how well the gradients point in the same direction. Since a fingerprint mainly consists of parallel line structures, the coherence is considerably higher in the foreground than in the background.

For noise reduction purposes, once all three features have been computed for each pixel, they are averaged by means of a Gaussian filter. The averaged features of each pixel are merged together in a linear segmentation function f:

$$f = w^T \cdot x = w_0 \cdot Coh + w_1 \cdot Mean + w_2 \cdot Var + w_3$$

where w = [w0 w1 w2 w3]T corresponds to the weight vector, and x = [Coh Mean Var 1]T is the feature vector. The comparison of the segmentation function value for each pixel with a null threshold results in the segmented image. A linear classifier therefore needs to be applied to all image pixels. The classifier is first trained with a set of fingerprints in order to find the optimal values for the weight vector w that minimize the probability of misclassifying feature vectors x. Once the input fingerprint is initially segmented by the linear classifier, morphological operators are applied on the segmented image as a postprocessing task to repair noisy segmentation areas, reduce the number of classification errors, and obtain compact clusters in the segmented image.
First, small clusters that are incorrectly assigned to the foreground are removed by means of an open operation. Next, small clusters that are incorrectly assigned to the background are removed by means of a close operation.

In [Hadhoud et al., 2006] the authors combine four different features to perform fingerprint segmentation. The fingerprint image is tessellated into blocks of size W×W pixels and the following features are computed for each block: (i) the greyscale mean intensity level, (ii) the greyscale variance, (iii) the gradient, and (iv) the certainty level of the orientation field. The set of features corresponding to each block is compared with fixed thresholds in order to classify the block as image foreground or background.

In [Ratha et al., 1995b] the authors partition the fingerprint image into blocks of size W×W pixels. For each block, the orientation field is computed based on gradient magnitude. Once the field orientation has been determined for each block, the greyscale variance in the direction orthogonal to the orientation of the ridges is used to decide which part of the image belongs to the foreground and which part to the background. This segmentation method is based on the fact that background and noisy image regions have no directional dependence, so they present a low variance in all directions. In contrast, the regions of interest (fingerprint area) exhibit a very high variance in the direction orthogonal to the orientation of the pattern, and a very low variance along the direction of the ridges. The choice of a specific threshold determines the segmentation result. The effectiveness of this method depends on the reliability of the orientation field computation.

Similarly, some works make use of the directional histograms and the greyscale variance of each block to effectively isolate the regions of interest in fingerprint images.
Others take advantage of the greyscale variance at pixel level in a set of known directions, or the greyscale mean and variance together with the directional information to perform segmentation of fingerprint images.

Despite the large number of algorithms published in scientific literature, fingerprint segmentation is not a fully solved problem in fingerprint recognition. The segmentation stage mainly aims at improving the accuracy of the subsequent feature extraction stages. Spurious features or missing true features are generally produced by inclusion of unrecoverable low quality regions as foreground, or recoverable regions as background, respectively. An efficient fingerprint segmentation process helps to avoid such problems and to improve the accuracy of the recognition system. Besides, fingerprint segmentation reduces the size of the fingerprint image to be further processed, so it reduces the time spent in the following image processing tasks linked to the recognition algorithm.

2.3.5. Contextual Image Filtering

Some of the most common imperfections that affect fingerprint impressions are the following: (i) smudgy areas generated by wet fingers, over-pressured acquisitions, or over-inked impressions; (ii) regions with ridge breaks generated by dry fingers, under-pressured acquisitions, or under-inked impressions; and (iii) elastic distortions in the fingerprint pattern due to the inherent elasticity of fingerprints. All these types of noise can be mitigated by means of contextual image filtering techniques.

Contextual image enhancement or contextual filtering refers to the process of adapting the filter design to the input image characteristics (the scale, given by the scanner resolution, of the acquired prints, the local ridge orientation, the local ridge distance, etc.)
in order to efficiently improve the quality of the input images. One example of image enhancement through contextual filtering is depicted in Figure 31.

Figure 31. Image enhancement through contextual filtering.

A basis filter mask is first built with a reference orientation Φ = 0º. The enhancement process is carried out at pixel level, and the filter to be applied in each region of the image is first adjusted to the local features of the input print. As an example, the dominant ridge orientation and ridge frequency in a small region around each pixel are first computed, and the basis mask, tuned to the same ridge frequency, is rotated by the same orientation angle. Once the filter is adjusted to the characteristics of the local region under process, the enhancement can be carried out.

Given an input fingerprint I(x,y), the enhanced fingerprint image E(x,y) is determined by applying to each pixel location the filter mask f(i,j) of size K×K pixels tuned to the characteristics of the image:

$$E(x,y) = \sum_{i=-K}^{K} \sum_{j=-K}^{K} f(i,j) \cdot I(x+i, y+j)$$

The enhancement technique is based on the convolution of the input image with the proper filter mask in each local region. The enhancement filters are designed to make ridges clearly differentiated from each other. On the one hand, they provide a low pass filter along the ridge direction in order to connect broken ridges, link small ridge gaps, fill pores, and remove noise and dust. On the other hand, these filters perform band pass filtering in the direction orthogonal to the ridges in order to separate ridges from one another.

A classification of some of the most used image enhancement contextual filtering techniques is listed below:
a) Gabor filtering
b) Fourier filtering
c) Wavelet filtering
d) Morphological filtering
e) Directional median filtering

Gabor filtering

Adaptive enhancement techniques based on Gabor filtering are the most widely used techniques. Owing to the fact that a fingerprint can be modelled as a set of sinusoidal-shaped waves with well-defined frequency and orientation characteristics in a local region, a set of Gabor filters tuned to the frequency and orientation of each region of the image can be used to effectively remove noise and improve the clarity of the fingerprint ridge-valley pattern.

Hong et al. propose in [Hong et al., 1998] a fingerprint enhancement technique based on Gabor filters. Given a fingerprint image impression I(x,y), from which the ridge frequency and ridge orientation maps are deduced, the Gabor filter has the general form:

$$h(x, y, \Phi, f, \sigma_x, \sigma_y) = e^{-\frac{1}{2}\left(\frac{x_\Phi^2}{\sigma_x^2} + \frac{y_\Phi^2}{\sigma_y^2}\right)} \cdot \cos(2\pi f x_\Phi)$$

with:

$$x_\Phi = x\cos\Phi + y\sin\Phi, \qquad y_\Phi = -x\sin\Phi + y\cos\Phi$$

where f is the frequency of the local sinusoidal wave, Φ is the ridge orientation, and σx and σy are the standard deviations of the Gaussian envelope along the x and y axes respectively. The fingerprint image is split into non-overlapped blocks, and for each block (u,v) of the image, the ridge frequency f(u,v) and ridge orientation Φ(u,v) are used to define the customized Gabor filter. The accuracy in determining both parameters sets the performance of the filtering process. If f(u,v) is larger than the real ridge frequency, spurious ridges are created in the filtered image; if f(u,v) is too small, nearby ridges are merged into one. Similarly, if Φ(u,v) is not accurate enough, spurious or smoothed ridges can result. The selection of the values for σx and σy involves a trade-off: the larger the values, the more robust the filters are against noise, but the more likely they are to create spurious ridges and valleys, or to smooth the image to the extent that the ridge and valley details in the fingerprint are lost.
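The Gabor filter expression above and the pixel-level sum used for contextual filtering can be sketched as follows. This is a hedged illustration, not the implementation of [Hong et al., 1998]: the function names, the dictionary used to hold the mask, the `image[x][y]` indexing convention, and the treatment of pixels outside the image as zero are assumptions of the sketch.

```python
import math


def gabor_mask(phi, f, sigma_x, sigma_y, K):
    """Sample h(i, j, phi, f, sigma_x, sigma_y) for i, j in [-K, K]."""
    mask = {}
    for i in range(-K, K + 1):
        for j in range(-K, K + 1):
            x_phi = i * math.cos(phi) + j * math.sin(phi)
            y_phi = -i * math.sin(phi) + j * math.cos(phi)
            envelope = math.exp(
                -0.5 * (x_phi ** 2 / sigma_x ** 2 + y_phi ** 2 / sigma_y ** 2)
            )
            mask[(i, j)] = envelope * math.cos(2 * math.pi * f * x_phi)
    return mask


def enhance_pixel(image, x, y, mask):
    """E(x, y) = sum over (i, j) of h(i, j) * I(x + i, y + j).

    image is a list of rows indexed as image[x][y]; samples that fall
    outside the image are skipped (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    total = 0.0
    for (i, j), h in mask.items():
        xi, yj = x + i, y + j
        if 0 <= xi < rows and 0 <= yj < cols:
            total += h * image[xi][yj]
    return total
```

In a block-based scheme, one mask would be built per block from f(u,v) and Φ(u,v) and reused for every pixel of that block, which avoids recomputing the exponential and cosine terms for each pixel.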
The parameters σx and σy are normally selected based on empirical tests over a set of fingerprints. For each pixel (x,y) of the image I belonging to Block(u,v), the convolution of the pixel neighbourhood with the Gabor filter of size K×K, tuned to the characteristics of the pixel’s block, results in the enhanced image E(x,y) as follows:

$$E(x,y) = \sum_{i=-K}^{K} \sum_{j=-K}^{K} h\bigl(i, j, \Phi(u,v), f(u,v), \sigma_x, \sigma_y\bigr) \cdot I(x+i, y+j)$$

Gabor filters have the properties of spatial localization, orientation selectivity, and spatial-frequency selectivity, which make them suitable as band pass filters able to remove the noise and preserve the true ridge/valley structures.

Although contextual Gabor filtering is one of the most efficient fingerprint enhancement techniques, this approach suffers from some drawbacks. The Gabor filter is sensitive to the ridge orientation and the ridge frequency, and also to the spatial parameters of the Gaussian envelope. Gabor filtering can give inaccurate results in regions close to singular points, where abrupt changes in ridge frequency and orientation occur, or in noisy regions, where the estimated ridge orientation is unreliable. Moreover, Gabor filtering implies a large filter mask, which demands large memory resources and a high computational cost. For these reasons, modified Gabor design methods have been developed that aim at fast implementations at the cost of not improving, or even reducing, the enhancement performance with regard to the original approach.

In this direction, in [Jang et al., 2005] the authors propose a half Gabor filter-based enhancement technique suitable for on-line authentication applications based on embedded systems, where the original Gabor filter implementation can suffer from large computational cost and unacceptable execution times. The enhancement algorithm is based on two filters: a half Gabor filter and a half Gabor stabilization filter.
Two fast convolutions with the proposed filters are performed to enhance the input image. According to the authors, the proposed algorithm is faster than the conventional Gabor filter, reduces the memory requirements for filter mask storage, and enhances the ridge patterns as reliably as the original Gabor-based filtering technique.

Other methodologies developed in literature are based on the implementation of 1-D Gabor filters. In [Areekul et al., 2004], the original 2-D Gabor filter is transformed into a set of orientation separable 1-D Gabor filters in order to reduce the computational complexity of the filtering stage.

Other methods to reduce the computational load of Gabor filtering deal with the implementation of the algorithm itself. There exists an important trade-off between enhancement accuracy and processing time when implementing the algorithms on microprocessor platforms. The use of highly accurate estimation methods often leads to an excessive execution time. In this direction, in [Chen et al., 2004b] the authors replace the floating-point operations of Gabor filtering by fixed-point arithmetic in order to cope with the physical implementation of authentication systems on embedded platforms in which floating-point processing units are absent. Other techniques rely on less accurate estimation methods of the frequency/orientation (i.e. a discretized range of frequencies and orientations) to reduce the computational complexity of the algorithms [Chen and Chiu, 2006], [Hong et al., 1996], [Klimanee and Nguyen, 2004].

Fourier filtering

Enhancement techniques based on Fourier filtering were mainly developed by Sherlock, Monro, and Millard in [Sherlock et al., 1994].
In the proposed enhancement method, the fingerprint image is taken from the spatial domain to the frequency domain by application of a 2-D discrete Fourier transform. Given an original image, it is possible to estimate the ridge frequency and the ridge orientation in each local neighbourhood of the fingerprint. Because the exact computation of the ridge frequency and the ridge orientation for each local neighbourhood is highly expensive, a representative set of values for both parameters is used instead (orientation and frequency filter-banks). Once in the frequency domain, the image is directionally smoothed using N filters. The number of filters N used in the algorithm corresponds to the combination of n1 discrete directions and n2 ridge frequencies, and each filter is tuned to a particular ridge orientation and ridge frequency (N = n1 · n2). The Fourier filter is expressed in the form:

$$h(\rho, \theta) = H_{radial}(\rho) \cdot H_{angle}(\theta)$$

which is separable in polar coordinates as a function of the ridge frequency ρ and the ridge orientation θ. For Hradial(ρ), a second-order Butterworth band pass filter is chosen:

$$H_{radial}(\rho) = \frac{(\rho \cdot \rho_{BW})^4}{(\rho \cdot \rho_{BW})^4 + (\rho^2 - \rho_0^2)^4}$$

where ρ is the ridge frequency, ρ0 the central frequency of the filter, and ρBW the bandwidth. For Hangle(θ), the following function is used:

$$H_{angle}(\theta) = \begin{cases} \cos^2\left(\dfrac{\pi}{2} \cdot \dfrac{\theta - \theta_0}{\theta_{BW}}\right) & \text{if } |\theta| < \theta_{BW} \\ 0 & \text{otherwise} \end{cases}$$

where θ is the ridge orientation, θ0 the orientation of the filter, and θBW the angular bandwidth. The central values ρ0 and θ0 are selected from their small range of values, and ρBW and θBW are adjusted according to the bandwidth range the filters have to pass. Once each filter is independently applied to the Fourier image, the inverse Fourier transform is used to convert each image back to the spatial domain, thereby producing a set of N directionally filtered images called prefiltered images.
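The separable transfer function can be illustrated with a short sketch that only evaluates the gain of one filter at a frequency-plane sample; the Fourier transforms themselves are left out. The function names are hypothetical, and one choice is an assumption of this sketch rather than something stated in [Sherlock et al., 1994]: the angular pass band is tested on the difference θ − θ0 wrapped modulo π, so that the window is centred on the filter orientation.

```python
import math


def h_radial(rho, rho0, rho_bw):
    """Band pass term (rho*rho_bw)^4 / ((rho*rho_bw)^4 + (rho^2 - rho0^2)^4)."""
    num = (rho * rho_bw) ** 4
    den = num + (rho ** 2 - rho0 ** 2) ** 4
    return num / den if den > 0.0 else 0.0


def h_angle(theta, theta0, theta_bw):
    """Raised-cosine angular window centred on theta0."""
    # Assumption: orientations are compared modulo pi, so the difference
    # is wrapped into [-pi/2, pi/2) before the bandwidth test.
    d = (theta - theta0 + math.pi / 2) % math.pi - math.pi / 2
    if abs(d) < theta_bw:
        return math.cos(0.5 * math.pi * d / theta_bw) ** 2
    return 0.0


def filter_bank(n1, rho0_values):
    """Centre parameters (rho0, theta0) of the N = n1 * n2 filters."""
    return [(rho0, k * math.pi / n1) for rho0 in rho0_values for k in range(n1)]


def filter_gain(u, v, rho0, theta0, rho_bw, theta_bw):
    """h(rho, theta) = H_radial(rho) * H_angle(theta) at frequency sample (u, v)."""
    rho = math.hypot(u, v)
    theta = math.atan2(v, u)
    return h_radial(rho, rho0, rho_bw) * h_angle(theta, theta0, theta_bw)
```

Each prefiltered image would be obtained by multiplying the Fourier image by `filter_gain` for one entry of the bank and applying the inverse transform.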
The next step consists of the construction of the final enhanced image by means of an adaptive interpolation method applied to the set of prefiltered images. The output of the filtering stage is an enhanced version of the original image, which has been smoothed, pixel by pixel, in the specific frequency and direction of the ridges.

Wavelet filtering

Another technique developed for fingerprint image enhancement is the one based on Wavelet transform and Gabor filtering [Paul and Lourde, 2006], [Zhang et al., 2002], [Wen et al., 2005], [Hatami et al., 2005]. In this method, the fingerprint image enhancement process is based on Gabor filtering in the Wavelet domain instead of in the spatial domain. First, the image is transformed to the Wavelet domain. In the Wavelet domain, the transformed image is filtered by means of Gabor techniques, and then the enhanced image is reconstructed by inverse Wavelet transform. The Wavelet transform is used to eliminate noise from the input image and to increase the contrast between the ridge pattern and the background, and the Gabor filter method further enhances the ridge-valley pattern by making use of the orientation and frequency information.

Wavelets are basis functions able to represent signals in the time-frequency domain. The Wavelet transform can be seen as a transformation that maps the signal to a multi-resolution representation. The continuous one-dimensional Wavelet transform (CWT) is a decomposition of a continuous signal f(t) into a set of basis functions or Wavelets:

$$W(a,b) = \int f(t) \cdot \psi^{*}_{a,b}(t) \, dt$$

where ψ∗a,b(t) is the Wavelet function at scale a and translation b, obtained from the mother Wavelet ψ(t).
All the Wavelet functions used in the transformation are derived from the mother Wavelet through translation (shifting) and scaling (dilation or compression):

$$\psi^{*}_{a,b}(t) = \frac{1}{a} \cdot \psi\left(\frac{t-b}{a}\right)$$

where a represents the scale factor, and b represents the translation factor (a,b ∈ R, a ≠ 0).

When dealing with discrete signals, the discrete Wavelet transform (DWT) is used. The DWT is suitable to compute multi-resolution representations of discrete signals such as 2-D fingerprint impressions. A Wavelet transform of an image consists of four sub-images, with each sub-image occupying a quarter of the original image area. Each sub-image is computed by convolving the original image with low and high pass filters, in both row and column directions. The Wavelet transform is proven to be an effective tool for image de-noising. The image is decomposed at skeleton level and the de-noising is done on the sub-images. The inverse Wavelet transform of the enhanced sub-images improves the clarity and continuity of the ridge-valley structures in the original fingerprints.

Morphological filtering

In [Milici et al., 2005] the authors propose a fingerprint enhancement approach based on morphological filtering applied on greyscale images. The approach exploits the local ridge orientation characteristics of the fingerprint, but does not take into account other information such as the ridge frequency, in order to overcome the high computational demand of the previous techniques. The original image is split into non-overlapped blocks of size 8×8 pixels. The field orientation map corresponding to each block is computed, resulting in an M×N local direction matrix. A database of direction-oriented morphological structuring elements is developed. As shown in Figure 32, a set of 16 oriented structuring elements is used for enhancement purposes. Grey colour denotes pixel locations with a high grey level intensity, and white colour denotes low grey level locations.
Figure 32. Directional morphological filter templates (16 orientations, from Φ = 0.0º to Φ = 168.75º in steps of 11.25º).

For each region with a fixed local orientation Φ, a morphological greyscale closing operation by means of a 3×3 oriented structuring element sΦ is applied. The orientation of the structuring element sΦ corresponds to the orientation Φ of the block. The closing of an image f(x,y) by a structuring element sΦ(i,j) is denoted by f • sΦ, and is a transformation at pixel level, defined by a morphological dilation (⊕) followed by a morphological erosion (⊗):

$$f \bullet s_\Phi = (f \oplus s_\Phi) \otimes s_\Phi$$

with:

$$(f \oplus s_\Phi)(x,y) = \max_{i,j} \{ f(x-i, y-j) + s_\Phi(i,j) \}$$

$$(f \otimes s_\Phi)(x,y) = \min_{i,j} \{ f(x+i, y+j) - s_\Phi(i,j) \}$$

The directional morphological enhancement process, carried out at block level on the image, closes holes and connects broken ridges, thus preserving the ridges from fragmentations and discontinuities. The composition of the different filtered blocks results in the enhanced greyscale image.

Directional median filtering

Many types of noise can exist in a fingerprint impression. Gabor filtering is essentially a Gaussian modulated filtering technique, able to efficiently remove noise with a Gaussian distribution. However, the Gaussian-distribution noise model cannot exactly represent the whole noise present in a fingerprint image. In fact, pepper-and-salt noise, smudges in the valleys, or interrupted ridge lines cannot be modelled by Gaussian methods.
Therefore, in [Wu et al., 2004] the authors suggest the use of a directional median filter technique to reduce impulse-type noise along the direction of the ridge flow, as a complement to the existing techniques focused on the removal of Gaussian-distributed noise. The median filter is able to remove pepper-and-salt noise, join broken fingerprint ridges, fill the holes present in the finger impression, clean out the smudges of small and medium sizes present in the valleys, smooth irregular ridges, and remove small artefacts between ridges.

Median filtering is performed by replacing a pixel with the median value of the selected neighbourhood. In order to adapt the filter shape to the local ridge orientation, pre-selected windows of size (2m+1) × (2n+1) are built (m,n ∈ N). The authors use eight possible orientations (0º, 22.5º, 45º, 67.5º, 90º, 112.5º, 135º, 157.5º). Based on the defined directions, the shapes of the directional median filters are depicted in Figure 33, where black dots correspond to the reference pixels, and grey pixels represent the local neighbourhood taken into account in the median filters. Based on empirical data, the optimum neighbourhood size for the median filter is set to 9 pixels. The filtering process is done at pixel level. For each pixel, the adaptive median filter mask with the same direction as the local ridge is used. The restored fingerprint images prove more suitable than the original images for automatic feature extraction.

Figure 33. Directional median filter templates (panels for Φ = 0.0º, 22.5º, 45.0º, 67.5º, 90.0º, 112.5º and 135.0º).

Independently of the methodologies covered in this section, the design of filters efficiently adjusted to the local characteristics of the input images makes it possible to develop adaptive enhancement techniques for greyscale fingerprint impressions.
The enhanced images are directionally smoothed versions of the original prints, from which most of the unwanted information (noise) is removed, but which still contain the relevant and distinctive information (ridge-valley patterns) that needs to be taken into account in the next stages of the recognition process.

2.3.6. Conclusions

In general, most enhancement techniques published in scientific literature feature good performance when dealing with good quality fingerprints. However, when the quality of the input fingerprint images decreases, special effort is required to reach good enhancement results on low quality prints. Efficient image enhancement techniques are in charge of recovering the corrupted regions so that automatic fingerprint recognition systems can reach good accuracy. When regions of the fingerprint are affected by noise and their quality is low, the enhancement techniques make use of the surrounding areas with reliable information in order to reconstruct the degraded portions and to enhance the fingerprint impressions. However, if the quality level of the affected regions is too poor for them to be reconstructed in a reliable way from their neighbouring areas, those regions are labeled as unrecoverable and discarded in the enhancement stage.

Fingerprint image enhancement has become a necessary and common step that takes place after the image acquisition process and before the feature extraction phase in AFIS/AFAS applications.

2.4. Feature Extraction

Inherently, fingerprints have significantly high information content, which is substantially larger than that of other personal identifiers such as PINs or passwords. For reasons of security, most personal recognition systems do not store complete fingerprint images but only the sets of distinctive traits extracted from them.
Following the methodology of a human expert, in order to decide whether two finger impressions correspond to the same user, it is necessary to determine the salient features of the finger impressions which:
- can discriminate between different identities, as well as
- remain invariant when dealing with several instances from the same finger/individual,
in order to match them and take a decision based on the similarity level of both salient feature sets.

The fingerprint pattern, when analysed at different scales, exhibits several types of features:
(i) At a coarse level, a fingerprint is characterized by the ridge-valley flow and by distinctive features such as the field orientation map [Kulkarni et al., 2006], the ridge map [Xie et al., 2005], or its singular points [Park et al., 2006]. Coarse level features are aimed at fingerprint classification.
(ii) At a fine level, the salient fingerprint features correspond to the ridge characteristics called minutiae, mainly focused on two kinds of points: ridge endings (defined as the ridge point where a ridge ends) and ridge bifurcations (where the ridge diverges into two or more branch ridges). The scientific community has accepted minutia points as efficient finger characteristics, and their high discriminating power has been proven across a long list of biometric matching algorithms [Yager and Amin, 2004], [Maltoni et al., 2009]. Fine level features are mainly used for matching purposes.
(iii) At a very-fine level, the salient features are intra-ridge details such as sweat pores. However, the extraction of pores requires high-resolution fingerprint scanners (of about 1000 dpi) instead of traditional sensors (featuring typical resolutions around 500 dpi). Very-fine level fingerprint features aim at matching purposes as well [Zhao and Jain, 2010].
A classification of the distinctive features used to define fingerprints is as follows:
a) Greyscale-based features
b) Singular points
c) Ridge map
d) Minutiae
e) Pores and other ridge details

Each of the features is covered in the next sections.

2.4.1. Greyscale-based Features

Given an enhanced fingerprint image, it is possible to characterize the fingerprint based on the greyscale intensity of its pixels in every local region. For such a purpose, the enhanced image is initially tessellated into blocks, and for each of the blocks, a set of features that defines the block is deduced. Many parameters can be used as discriminatory information: the intensity mean, the standard deviation, or the variance in each local neighbourhood, their histograms, etc.

Other approaches make use of representation schemes that combine global and local information of a fingerprint by means of the application of a bank of Gabor filters with different orientations and frequencies. Contextual filtering captures coarse- and fine-level information of the enhanced image in a compact form, and the resultant images are used to characterize the acquired fingerprints.

Independently of the selected method, the set of features extracted from the images plays the role of fingerprint pattern descriptors. Correlation matching techniques are normally used to evaluate the similarity of fingerprints characterized through greyscale feature sets. Greyscale-based features alone exhibit a reduced fingerprint distinctiveness level. For this reason, they are normally merged with additional descriptor feature sets in order to improve the recognition accuracy of personal authentication systems.

2.4.2.
Singular Points

Singular points are usually considered as high-level features linked to fingerprint pattern classification. Singular points are the points where the orientation field is discontinuous. Sir Edward Richard Henry, in his book Classification and Uses of Finger Prints, defined two types of singular points in terms of the ridge-valley structures: the core and the delta. The pattern associated with core points and the one associated with delta points are different, so it is possible to discern between both types of singularities. The core is the topmost point of the innermost curving ridge, and the delta is the center of a triangular region where three different direction flows meet. These points are highly stable, and rotation and scale invariant. Because of that, some recognition algorithms use them as reference points in order to align fingerprint impressions. Figure 34 shows the singular points of a fingerprint.

Figure 34. Singular points (core and delta).

Based on the number, type, and position of such singular points, it is possible to classify fingerprints into five different classes: whorl, right loop, left loop, arch and tented arch. Whorls have one core and two delta points (twin loops have two cores and two delta points), loops and tented arches contain one core and one delta, and arch-type fingerprints do not have singular points, as shown in Figure 35. Apart from classification purposes, the number of cores and deltas available in a fingerprint and their relative position provide valuable information for fingerprint recognition purposes.

Figure 35. Location of singular points in fingerprints.

In order to accurately extract the singularities of a fingerprint impression, the image field orientation map needs to be analysed. Many techniques have been detailed in literature. Among them, the Poincaré index method is one of the most widely used techniques.

The Poincaré index is obtained by calculating the directional difference between the adjacent blocks in a counterclockwise closed contour around an image point/block. The summation of the cumulative change in the orientation through the closed contour states whether or not the center of the contour contains a singular point. When this computation technique is applied around a core point, the cumulative change in the orientation is π radians; around a delta point it is –π radians; and at locations that do not contain any singular point, the cumulative orientation change is zero. The Poincaré index at pixel (x,y), which is enclosed by the digital curve with Nψ pixels, can be computed as follows:

$$\text{Poincaré index}(x,y) = \frac{1}{2\pi} \sum_{k=0}^{N_\psi} \Delta(k)$$

with:

$$\Delta(k) = \begin{cases} \delta(k), & \text{if } |\delta(k)| < \pi/2 \\ \pi + \delta(k), & \text{if } \delta(k) \le -\pi/2 \\ \pi - \delta(k), & \text{otherwise} \end{cases}$$

$$\delta(k) = O\bigl(\psi_x(k'), \psi_y(k')\bigr) - O\bigl(\psi_x(k), \psi_y(k)\bigr), \qquad k' = (k+1) \bmod N_\psi$$

O(x,y) is the orientation field of the fingerprint, and ψx and ψy are the coordinates of the pixels in the enclosed contour around the candidate point (x,y). The Poincaré index takes the values ½, -½, and 0 for a core point, a delta point, and an ordinary point, respectively.

The singular points extraction method based on the Poincaré index analysis performs well on good quality images, from which it is feasible to get reliable field orientation maps. However, this technique is sensitive to noise, and its performance decreases in low quality images that feature less reliable field orientation maps.

In order to eliminate the transition of π radians that exists in the directional field between the orientations θ(x,y) = -½ π and θ(x,y) = ½ π, the doubled orientation 2θ(x,y) is proposed in [Bazen and Gerez, 2002].
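The basic Poincaré index computation over a closed contour can be sketched as below. The function names and the list-based contour are hypothetical. The sketch wraps every orientation difference into (−π/2, π/2] by adding or subtracting π, which is the usual modulo-π convention for orientations; this is an assumption of the sketch, and for differences above π/2 it uses δ − π rather than the expression of the third case given above.

```python
import math


def wrap_orientation_difference(delta):
    """Wrap a difference of two orientations into (-pi/2, pi/2]."""
    while delta > math.pi / 2:
        delta -= math.pi
    while delta <= -math.pi / 2:
        delta += math.pi
    return delta


def poincare_index(orientations):
    """Poincaré index from orientations sampled counterclockwise on a closed contour.

    orientations: list of O(psi_x(k), psi_y(k)) values in radians, one per
    contour pixel; the contour closes from the last sample to the first.
    """
    n = len(orientations)
    total = 0.0
    for k in range(n):
        k_next = (k + 1) % n
        total += wrap_orientation_difference(orientations[k_next] - orientations[k])
    return total / (2 * math.pi)
```

For a synthetic core, where the orientation advances by half the polar angle along the contour, the function returns a value close to ½; for a synthetic delta, where it recedes by half the polar angle, it returns a value close to -½; and for a constant orientation field it returns 0.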
This makes the transition from 2θ(x,y) = -π to 2θ(x,y) = π continuous; in other words, the orientation is treated as cyclic. As a result, the Poincaré index is doubled. Once the directional field has been computed for each image pixel, the method can identify whether a pixel corresponds to a singular point. Instead of summing the changes in orientation around the contour of a pixel, it sums the gradients of the doubled orientation over that contour. With this technique, a core point results in a cumulative change of 2π radians, a delta point in -2π radians, and every other image pixel in 0.

Another modification of the Poincaré index computation is suggested in [Zheng et al., 2006]. The authors propose an improved Poincaré index method to decide the type of candidate points. The index is computed over a disk region around the candidate point instead of along a closed contour. A disk of radius 15 pixels centered on the candidate point is selected. The disk region is split into 8 sectors, and the dominant direction in each sector is computed from the directional histogram of the pixels in that sector. The Poincaré index is then computed from these sector directions. If the computed index is ½, the candidate singular point is classified as a core; if the index is -½, the singularity is regarded as a delta; otherwise, the candidate is considered a spurious singular point.

Besides their use for fingerprint classification, singular points are also used as reference points for fingerprint alignment and matching.
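As a hedged illustration of the basic Poincaré index test described in this section, the following minimal Python sketch evaluates the index on an 8-pixel counterclockwise contour. The function names, the callable `orientation_at`, and the three synthetic orientation fields are assumptions made for the example only; orientation differences of π/2 or more are reduced by π so that a delta yields -π, as stated in the text.

```python
import math

# Counterclockwise 8-pixel contour around the candidate point,
# given as (dx, dy) offsets with the y axis pointing up (assumed convention).
CONTOUR = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def wrapped_difference(delta):
    """Map a difference of two orientations (defined modulo pi) to Delta(k)."""
    if abs(delta) < math.pi / 2:
        return delta
    if delta <= -math.pi / 2:
        return math.pi + delta
    return delta - math.pi


def poincare_index(orientation_at, x, y):
    """Sum the wrapped orientation changes along the contour, divided by 2*pi."""
    angles = [orientation_at(x + dx, y + dy) for dx, dy in CONTOUR]
    n = len(angles)
    total = sum(wrapped_difference(angles[(k + 1) % n] - angles[k]) for k in range(n))
    return total / (2 * math.pi)


# Hypothetical synthetic orientation fields centred on the origin.
def core_field(x, y):
    return 0.5 * math.atan2(y, x)    # orientation turns by +pi around the origin


def delta_field(x, y):
    return -0.5 * math.atan2(y, x)   # orientation turns by -pi around the origin


def flat_field(x, y):
    return 0.3                       # constant orientation, no singularity


core_value = poincare_index(core_field, 0, 0)
delta_value = poincare_index(delta_field, 0, 0)
flat_value = poincare_index(flat_field, 0, 0)
```

On these synthetic fields the three values come out close to ½, -½ and 0 (up to floating-point rounding), which matches the core, delta and ordinary-point cases. Points detected this way are the ones later used for classification and as alignment references.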
However, as fingerprint sensors keep evolving towards smaller sizes, partial print impressions may miss the singular points altogether. The distinctiveness of these coarse-level salient features is therefore not sufficient for accurate matching, and more features are needed for reliable matching in applications that deal with small fingerprint impressions.

2.4.3. Ridge Map

Fingerprints can be seen as graphical ridge patterns present on human fingers. Inspecting a greyscale fingerprint impression shows that the pixels located on fingerprint ridges usually exhibit lower grey level intensities than the pixels located on fingerprint valleys. A greyscale fingerprint image can therefore be transformed into an explicit representation of the complete ridge structure, known as the ridge map. Some fingerprint feature extraction algorithms obtain the ridge map by binarizing the input greyscale images, whereas others attempt to extract the ridge map directly from the greyscale fingerprints.

Ridge map extraction through binarization techniques

A first attempt to extract the ridge map of an enhanced image would be to classify image pixels as ridge or valley pixels by comparing their intensities with a global intensity threshold. Unfortunately, such a simple approach fails for noisy images or prints with regions of different contrast, caused by factors such as variations of pressure between the finger and the acquisition sensor, worn artefacts, or the environmental conditions during acquisition, because it does not take into account the strong correlation that exists among neighbouring intensity values in the fingerprint pattern. In general, different portions of an image may be characterized by different contrasts and intensities.
Therefore, a single threshold cannot guarantee acceptable binarization results in all working conditions. Among the numerous fingerprint binarization techniques published in the literature [Meenen and Adhami, 2005], [Zhang and Xiao, 2006], [Bartunek et al., 2006], [Maltoni et al., 2009], the most relevant approaches can be classified into three main groups:
a) Binarization through locally adaptive thresholding techniques
b) Binarization through contextual filtering/enhancement techniques
c) Binarization based on peak detection in the grey-level profiles along sections orthogonal to the ridge orientation.

The main purpose of the binarization process is to improve the clarity of ridge structures, maintain their integrity, avoid introducing spurious structures or artefacts, and retain the connectivity of the ridges while keeping different ridges separated. Some examples of ridge maps are depicted in Figure 36.

Figure 36. Fingerprint ridge maps.

With local thresholding techniques, the binarization threshold is adapted to the image conditions in each region of the fingerprint. The image is first partitioned into small regions, and each region is processed separately. From the local greyscale characteristics of each region, a binarization threshold is derived that separates fingerprint ridges from valleys.

Filter-based binarization approaches rely on applying a filter to the greyscale fingerprint image that either directly produces a binary result, or produces an output that is easily converted to binary by simple thresholding. These filters may be applied in the spatial or frequency domain, and are usually tuned to the frequency and orientation of the ridges in each local neighbourhood.
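The locally adaptive thresholding idea of group a) can be illustrated with a minimal sketch. It assumes, for the example only, that the block mean is used as the local threshold and that ridge pixels are the darker ones; the function name `binarize_blockwise`, the block size and the toy image are hypothetical and not taken from any of the cited works.

```python
def binarize_blockwise(image, block=4):
    """Binarize a greyscale image block by block.

    Each block is thresholded at its own mean grey level (an assumed,
    deliberately simple local statistic). Pixels darker than the local
    threshold are marked as ridge (1), the rest as valley (0).
    """
    rows, cols = len(image), len(image[0])
    ridge_map = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, rows))
                     for c in range(c0, min(c0 + block, cols))]
            threshold = sum(image[r][c] for r, c in cells) / len(cells)
            for r, c in cells:
                ridge_map[r][c] = 1 if image[r][c] < threshold else 0
    return ridge_map


# Toy image: a dark, low-contrast left half and a bright right half.
# A single global threshold (e.g. 120) would label the whole left half as ridge.
toy_image = [[10, 30, 10, 30, 200, 240, 200, 240] for _ in range(4)]
toy_ridge_map = binarize_blockwise(toy_image, block=4)
```

In this toy case every row of `toy_ridge_map` alternates between ridge and valley in both halves, because each 4×4 block is compared against its own mean. Filter-based binarization, introduced just before this sketch, replaces such a block statistic with the response of a filter tuned to the local ridge frequency and orientation.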
Filter-based approaches have provided some of the most promising results in the field of fingerprint image binarization. In works like [Wei et al., 2004], the original fingerprint is enhanced by means of oriented filters with zero DC component, so the threshold can be the same for every image block and equal to zero. That is, if the filter response at a point is above zero, the pixel is assigned to a valley; otherwise, it is assigned to a ridge. Pre-computing each oriented filter with a null DC component allows the threshold to be kept at zero for the complete image.

The third approach binarizes the ridge-valley pattern based on the following observation: when a one-dimensional sequence is built from the intensity values of the pixels located along a short line segment orthogonal to the local ridge orientation, the sequence usually exhibits an almost sinusoidal shape, with low and high intensity values corresponding to pixels located on the ridges and on the valleys intersected by the segment, respectively. Pixels can therefore be reliably identified as ridge or valley pixels based on these properties. A binary ridge map is obtained with ridge pixels set to ‘1’ and the remaining points to ‘0’. Moreover, a refinement step is normally applied to the resulting ridge map in order to remove holes and other artefacts. This technique is widely used in works such as [Ratha et al., 1995b], [Hong et al., 1996], [Onnia and Tico, 2002], [Gao and Xie, 2006] and [Kim and Park, 2003].

Ridge map extraction from greyscale images

Some works focus on ridge map extraction through a sampling method that uses the grey level information of pixels and the direction information to trace the ridge lines [Maio and Maltoni, 1997], [Jiang et al., 2001], [Ma et al., 2005]. At each ridge line tracing step, the algorithm attempts to locate a point representing the local ridge line in a section along the ridge direction.
By connecting all the traced points, a polyline approximation of the ridge pattern is obtained.

As already discussed, some ridge map extraction approaches differentiate fingerprint ridges and valleys directly in the greyscale domain, whereas others produce binary images in which the ridge-valley pattern is clearly distinguished. Independently of the representation technique used, the ridge map provides rich, unique and distinctive information that characterizes any fingerprint.

2.4.4. Minutiae

The fingerprint pattern, when analysed at a fine level, exhibits a type of salient features known as Galton’s characteristics or minutia points. As shown in Figure 37, minutia points are local discontinuities in the fingerprint ridge pattern, and mainly refer to two kinds of features: ridge endings, defined as the points where a ridge ends; and ridge bifurcations, defined as the points where a ridge diverges into two or more branch ridges. Minutia points are not uniformly distributed in the fingerprint. Depending on the size of the scanner sensing area and the relative position of the finger on the sensing surface, only a few minutia points may be available in the fingerprint impression; most of them may even be located in a relatively small area, which reduces the discriminatory information available about the global/partial print.

Most minutiae extraction approaches follow one of two techniques:
a) one group of techniques first enhances the fingerprint ridge structure, then obtains ridge lines through binarization and thinning processes, and finally extracts the minutiae set;
b) other techniques extract salient features directly from the greyscale enhanced images.

Figure 37. Fingerprint minutiae (ridge endings in blue, and ridge bifurcations in red).
Minutiae extraction through binarization and thinning

The discrimination between ridge and valley regions in the fingerprint image heavily influences the performance of the minutiae extraction process and hence the performance of the overall personal recognition system. The binary ridge map previously obtained is used by subsequent processes to detect minutia points. After binarization, the ridge map is submitted to a thinning process, which reduces the width of the ridges in the binary image to a single pixel. Thinning has to preserve the topological and geometric properties of the binary image and maintain its connectivity.

In the best case, a good binarization algorithm provides an image consisting of smooth, flowing, black ridges of uniform width on a white background. An image of this type can easily be transformed into a thinned image. However, when the binarized image deviates from this ideal case, problems can appear. Four basic effects in a binarized image can lead to problems in the thinning stage: (i) breaks in the ridge flow, (ii) close ridges that are merged together, (iii) rough edges on the ridge lines, and (iv) holes in the middle of ridge lines. Any of these binarization errors can produce structures in the thinned image that may be incorrectly identified as minutia points. The binarization and thinning stages are therefore critical to proper minutiae extraction. Apart from retaining the significant features of the original image, they must eliminate local noise without introducing distortions in the thinned pattern. The thinning operation is necessary to simplify the subsequent structural analysis of the image for the extraction of the fingerprint minutiae.
Several methods dealing with the thinning of binarized fingerprint images have been developed in the literature [Lam et al., 1992]. They can be classified into two main groups:
- Pixel neighbourhood-based methods
- Distance transform- and analytical-based methods

The methods based on the pixel neighbourhood skeletonize digital images through iterative deletion of pixels under a criterion that preserves the connectivity of the image pattern. These algorithms delete successive layers of pixels on the boundary of the pattern until only a skeleton remains. The deletion or retention of a ridge pixel depends on the configuration of pixels in its local neighbourhood. Pixel-based methods can be sub-classified according to the way the pixels are examined: (i) sequential thinning algorithms, in which the pixels are examined for deletion in a fixed sequence in each iteration (e.g. from left to right, and from top to bottom), and the deletion of pixel p in the nth iteration depends on all the operations performed so far, that is, on the result of the (n-1)th iteration as well as on the pixels already processed in the nth iteration; and (ii) parallel thinning algorithms, in which the deletion of a pixel p in the nth iteration depends only on the result of the (n-1)th iteration. In the latter case, all pixels can be examined concurrently in each iteration. These algorithms are suitable for implementation on parallel processors, and the pixels that satisfy the set of removal conditions can be deleted simultaneously at the end of each iteration.
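To make the parallel scheme concrete, the sketch below follows the widely known two-sub-iteration thinning rules usually attributed to Zhang and Suen. It is an illustrative assumption, not the implementation used in this work: the function name, the list-of-lists image format (1 = ridge, 0 = background, with a background border at least one pixel wide) and the test bar are all hypothetical.

```python
def zhang_suen_thinning(image):
    """Thin a binary image (1 = ridge, 0 = background) to a one-pixel skeleton.

    Deletion decisions in each sub-iteration depend only on the image state at
    the start of that sub-iteration, so all pixels could be tested in parallel.
    The outermost rows and columns are assumed to be background.
    """
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    # Neighbours P2..P9: north first, then clockwise.
                    n = [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1],
                         img[r + 1][c + 1], img[r + 1][c], img[r + 1][c - 1],
                         img[r][c - 1], img[r - 1][c - 1]]
                    p2, _, p4, _, p6, _, p8, _ = n
                    ridge_neighbours = sum(n)
                    transitions = sum(1 for i in range(8)
                                      if n[i] == 0 and n[(i + 1) % 8] == 1)
                    if step == 0:
                        side_ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        side_ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= ridge_neighbours <= 6 and transitions == 1 and side_ok:
                        to_delete.append((r, c))
            # Simultaneous deletion at the end of the sub-iteration.
            for r, c in to_delete:
                img[r][c] = 0
            changed = changed or bool(to_delete)
    return img


# Hypothetical test pattern: a horizontal bar three pixels thick.
bar = [[1 if 2 <= r <= 4 and 1 <= c <= 8 else 0 for c in range(10)] for r in range(7)]
thinned_bar = zhang_suen_thinning(bar)
```

For the three-pixel-thick bar, the result keeps only pixels of the middle row, so every column contains at most one ridge pixel. Collecting the pixels in `to_delete` and removing them only after the scan is what distinguishes this parallel scheme from a sequential one, where each deletion would immediately influence the tests of the following pixels.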
Both pixel-based methods use local neighbourhood patterns as sufficient conditions to determine whether a pixel is a contour pixel, which can be removed, or not. This type of algorithms iteratively deletes contour pixels that are removable. The selected patterns must guarantee the connectivity of the resulting skeleton. There are many algorithms of this type that are based on the same principle; the differences among them are usually the set of patterns they use in the iterative processes. Some examples of thinning algorithms are [Gao and Hall, 1989], [Zhang and Wang, 1996], [Zhang and Suen, 1984] and [Zhang, 1997]. On the contrary, the second group, composed of those distance transform- and analytical-based methods, deals with the skeletonization of the image by producing a certain median or center line of the pattern directly, without the need of examining all the individual pixels. In this category, iterative and non-iterative methods for fingerprint image thinning have been suggested based on neural networks, Wavelet transformation, or mathematical morphology operators, among other techniques [Ji et al., 2007], [You et al., 2005], [Hongbin et al., 2007]. As depicted in Figure 38, the thinning process ends when the corresponding one-pixel-wide skeleton of the original fingerprint is obtained. Independently of the selected methodology, the thinning process should be performed without modifying the original ridge structure of the image, keeping endings and bifurcation points, and without breaking any ridges. 
However, the thinned version of the image can present structural imperfections to some degree, coming either from the original fingerprint or from the image processing stages carried out so far: (i) pairs of false ridge endings facing each other at a small distance, as a consequence of noisy broken ridge lines; (ii) small spurs or spikes consisting of a ridge bifurcation and a ridge ending placed close to each other as a result of the thinning process, which lead to false bifurcations and ending points; or (iii) incorrect links between two neighbouring ridges, originated by two ridge bifurcations placed close to each other, which lead to pairs of false bifurcations. The result is broken ridges, spurious or missing ridges, and holes. Therefore, a further purification stage is usually required to correct as many of the imperfections present in the thinned image as possible:
- to remove spikes, any ridge shorter than a given threshold is deleted, and
- to correct broken ridges, small gaps between ridge endings with similar orientations are connected.

Figure 38. Image thinning.

After this processing step, a list of candidate minutia points can be deduced from the enhanced thinned image. The minutiae can be extracted using the crossing number (CrN) at a point P, expressed as a function of the pixel values in a 3×3 neighbourhood.
The crossing number at ridge point P is defined as half of the cumulative successive differences between pairs of adjacent pixels belonging to the 8-neighbourhood of P, where ridge (black) pixels take value 1 and valley (white) pixels take value 0:

$$CrN = \frac{1}{2} \sum_{i=1}^{8} \left| P_i - P_{i+1} \right|, \qquad P_9 = P_1$$

with the neighbours of P labelled as follows:

P8 P1 P2
P7 P  P3
P6 P5 P4

A pixel P is considered a candidate ridge ending when CrN=1, a candidate ridge bifurcation when CrN=3 or CrN=4, and a non-minutia point otherwise. The spatial distribution, the ridge orientation, and the type of the minutia points found in the thinned image are the information normally recorded to characterize the fingerprint.

Direct minutiae extraction from greyscale images

In order to skip the computationally expensive binarization and thinning steps, other algorithms extract minutiae directly from greyscale images (i.e. without binarization and thinning) or from the thick ridges of binary images (i.e. without thinning).

Maio and Maltoni propose in [Maio and Maltoni, 1997] a technique based on a ridge line following algorithm to extract the minutiae directly from greyscale fingerprint images. For each ridge in the image, the algorithm keeps following the ridge line until it terminates (ridge ending found) or intersects another ridge line (ridge bifurcation found). The ridge line following algorithm determines the ridge map as well as the minutiae set of the fingerprint impression. Although this technique has lower computational complexity than the techniques that require binarization and thinning, it also has drawbacks: the feature extraction performance degrades when processing low quality fingerprints (noise and contrast deficiency issues).

In [Jiang et al., 2001] the authors suggest a modified version of the ridge line following algorithm. The new approach adaptively traces the greyscale ridges of the original fingerprint image and obtains a piece-wise linear skeleton image.
The proposed adaptive tracing algorithm contrasts with the fixed-step tracing algorithm suggested by Maio and Maltoni. In the new ridge following algorithm, the length of the tracing line between two consecutive ridge points adapts to the ridge contrast and to the bending level of the ridge. A long tracing line is obtained if there is little variation in the contrast and the bending level of the ridge is low. A high bending level (possibly facing a ridge bifurcation) or a large contrast variation (possibly facing a ridge ending) results in a short tracing line. The algorithm therefore speeds up the tracing process while maintaining the tracing precision. Each ridge in the skeleton is labeled with a number so that, after obtaining the minutiae set, each minutia is associated with one or two ridge numbers depending on whether a ridge ending or a ridge bifurcation is found. A postprocessing procedure is introduced, which is based not only on the spatial and structural relationship between the minutiae, but also on the associated ridge relationship and the orientation certainty level. This makes the task of differentiating spurious minutiae from true minutiae more reliable.

Instead of tracing only ridges, Liu et al. propose in [Liu et al., 2000] to trace both ridges and valleys and observe their structural relationship. A ridge and its two neighbouring valleys are traced; if the two valleys join, a ridge ending is found. Similarly, if the distance between the two valleys grows, a bifurcation is found.

In [Wu et al., 2004] the authors use a processing algorithm called the ridge contour following procedure. This algorithm efficiently extracts minutiae features from the thick ridges present in binarized fingerprint images, without the need to thin the binary images.
After binarizing a fingerprint image, a chain code representation technique detects the transitions from white (background) to black (foreground) in the image, thus capturing the contour boundary information from the edges of the fingerprint ridges. Tracing the ridge contour counterclockwise provides a way of finding significant minutia candidates. When the contour tracing algorithm reaches a ridge region where it makes a sharp left turn, a candidate ridge ending is deduced from the turning region. Similarly, when it makes a sharp right turn, a candidate bifurcation is marked at the turning location. A turning location typically consists of several contour points, and the minutia is placed at the center point of this small group of turning pixels.

The state of the art covered in this section shows that minutiae extraction is a key step in reaching high recognition accuracy in current personal recognition systems. When an originally noisy image is processed, imperfections such as spurs, holes, or short ridges commonly remain in the enhanced print even after the enhancement process. Those imperfections can result in spurious features or false minutiae in the skeleton patterns, which could affect the reliability of the whole authentication system. To overcome such effects, a further processing stage called minutiae filtering or minutiae verification is needed. This postprocessing step aims to ensure that only valid minutiae are taken into consideration as discriminative information of the fingerprint, and that any false minutia points that have been extracted are filtered out.
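The candidates that reach this verification stage are often obtained with the crossing number test defined earlier in this section for thinned images. The sketch below is a minimal illustration of that test; the function names, the list-of-lists skeleton format (1 = ridge, 0 = valley) and the two toy skeletons are assumptions made for the example, and only interior pixels are handled.

```python
# Offsets (d_row, d_col) of the neighbours P1..P8 of pixel P:
#   P8 P1 P2
#   P7 P  P3
#   P6 P5 P4
NEIGHBOUR_OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
                     (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(skeleton, row, col):
    """Half the sum of |P_i - P_(i+1)| around the 8-neighbourhood (P9 = P1)."""
    p = [skeleton[row + dr][col + dc] for dr, dc in NEIGHBOUR_OFFSETS]
    return sum(abs(p[i] - p[(i + 1) % 8]) for i in range(8)) // 2


def classify_pixel(skeleton, row, col):
    """Label an interior skeleton pixel following the CrN rule given in the text."""
    if skeleton[row][col] != 1:
        return "valley"
    crn = crossing_number(skeleton, row, col)
    if crn == 1:
        return "ending"
    if crn in (3, 4):
        return "bifurcation"
    return "non-minutia"


# Toy skeletons (hypothetical): a short horizontal ridge and a Y-shaped ridge.
line_skeleton = [[0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 0],
                 [0, 0, 0, 0, 0]]
y_skeleton = [[1, 0, 0, 0, 1],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0]]
```

With these inputs, the two extremes of the horizontal ridge are labelled as endings, its middle pixel as a non-minutia point, and the junction of the Y-shaped ridge at row 2, column 2 as a bifurcation. The labelled candidates are then passed to the verification stage.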
Most heuristics for minutiae verification are based on local analysis in a neighbourhood whose size is a direct function of the local ridge distance:
- two close ridge endings with similar orientations, separated by a distance shorter than a given threshold, are deleted since they are probably caused by spikes, broken or false ridges;
- single-neighbour pixels around the image border are ignored because they are not true minutiae;
- if many minutiae are detected in a small area (close bifurcations, facing endpoints, etc.), they are discarded because they are likely caused by noise.

The problem of automatic minutiae detection has been thoroughly studied but never completely solved. The main reason is that the quality of fingerprint impressions is sometimes low, and factors such as noise or contrast deficiency can produce fake minutiae or cause valid minutiae to be missed.

2.4.5. Pores and Other Ridge Details

Apart from the features discussed in previous sections, a new level of discriminatory information becomes available in fingerprints when the level of detail of the analysis is increased. In [Jain et al., 2006] and [Chen and Jain, 2007] the authors use detailed ridge information called level 3 features, composed of ridge pore distribution, ridge shape, ridge width, ridge contours and other permanent ridge details such as dots (isolated ridge units) and incipients (fragmented ridge units), in order to increase the discriminatory information extracted from a fingerprint. This information becomes more readily available when the scan resolution of fingerprint sensors is increased to around 1000 dpi. To extract those level 3 features, the authors propose a feature extraction algorithm based on the combination of Wavelet transform and Gabor filtering.
Pores are characterized by positions naturally distributed along the ridges where intensity values change abruptly from white to black, as shown in Figure 39.

Figure 39. Level 3 features: fingerprint pores.

To capture this sudden change, an image transformation based on a Wavelet band pass filter is suggested. After Wavelet-based filtering of the original image, the regions with high intensity variation are emphasized, so pore regions are represented by small blobs with low greyscale intensities. A Gabor filter enhancement process is applied afterwards to define the ridge and valley patterns. The linear addition of both filtered images results in an enhanced version from which the pores can be extracted by simple greyscale thresholding of the small blob regions within the ridges. Apart from locating the pores, Wavelet-based filtering allows the extraction of the ridge contours.

Other ridge details based on small ridge units, like dots or incipients, can be used as additional discriminatory information. Unlike dots, which are normal but small ridge units, incipients are normal forming units that remained “immature” at the time of differentiation when human fingerprint ridge formation stopped. As a result, an incipient is often much thinner than a dot, as depicted in Figure 40. However, it is difficult for an automated system to distinguish between a dot and an incipient in a digital fingerprint impression; therefore, the aforementioned extraction algorithm does not distinguish between the two feature types.

Figure 40. Level 3 features: fingerprint dots and incipient ridges.

The introduction of level 3 features thus provides additional discriminatory information. The fusion of all those ridge details (pores, ridge contour points, ridge dots and incipients) with the rest of level 1 and level 2 features (minutiae, ridge map, singular points, etc.) extracted from a fingerprint improves the accuracy of fingerprint matching systems in current AFAS applications.

2.4.6. Conclusions

The quality of an AFAS application depends heavily on the system’s ability to accurately extract the salient features inherent to fingerprints. Most AFAS pass a fingerprint image through several processing stages in order to increase the reliability of the information extracted. Although several methods dealing with the efficient extraction of fingerprint features have been described in the literature, there is still scope for improvement. Most of the algorithms rely on the coincidence of extracted features to declare whether two different fingerprint impressions come from the same finger. Although the saliency of fingerprint traits is well established, the state of the art in biometric systems indicates that further research is needed to improve the reliability of the feature extraction and feature matching algorithms when dealing with low quality fingerprints.

2.5. Feature Alignment

Once the template and query fingerprint images have been characterized by their discriminatory features, the next step is the matching of both feature sets. Some recognition algorithms proceed directly with the matching of the extracted features, whereas many others perform an intermediate step consisting of the rotational and translational alignment of both feature sets. This section covers the most important feature alignment or registration methods proposed in the literature.
A classification based on the intrinsic features used in the alignment process is as follows:
a) alignment based on greyscale information
b) alignment based on singular and other reference points
c) alignment based on minutiae
d) alignment based on ridge map
e) alignment based on field orientation
f) alignment based on pores
g) hybrid alignment techniques

There are roughly two types of methods for aligning fingerprints: (i) The first type quantizes the transformation parameters into finite sets of discrete values and searches for the best alignment solution in the quantized parameter space. The alignment accuracy of these methods is limited by the discretization. (ii) The second type first detects corresponding features and then estimates the alignment transformation parameters. The alignment accuracy of this second group tends to be better, at the expense of a higher computation cost.

2.5.1. Alignment Based on Greyscale Information

One of the first approaches for fingerprint alignment (and matching) was based on the direct correlation of pixel intensities. Although alignment techniques based directly on greyscale images make use of very rich information content, they have the disadvantage of being greatly affected by image contrast variation, image quality, and the image distortions originated in the fingerprint acquisition stage or in the image enhancement phases. Fingerprint greyscale intensities are not stable features, and may change among different impressions. Moreover, this kind of approach deals with large amounts of information, which makes the process complex. Given the enhanced versions of the original template and query fingerprints, both images are superimposed and the correlation between corresponding pixels in different relative positions (taking into consideration different relative displacements and rotations) is computed.
For each of the relative positions, a similarity score is derived from the greyscale features, and the position with the highest similarity is used as the alignment result.

In this direction, in [Bazen et al., 2000] the authors propose a correlation-based algorithm that uses not the whole fingerprint image but some regions of fixed size (24×24 pixels), in order to reduce the computational complexity of the alignment process. Several image blocks are selected from the template fingerprint to be used as fingerprint features, and their corresponding feature blocks in the query fingerprint are found. This is done by shifting each template feature block over the query fingerprint. At each relative position, the grey-level distance between the template feature and the corresponding area in the query is determined by summing the squared grey-level differences for each pixel. After the template feature has been shifted over the entire query fingerprint, the location where the distance is minimal is chosen as the corresponding position in the query fingerprint. As a result, the corresponding position for each template-query feature pair is found. After this processing stage, N pairs of corresponding feature blocks are available, from which the best translation parameters that align the template and query fingerprints can be determined.

In [Jain et al., 2000] fingerprints are represented by means of the FingerCode. FingerCode represents the greyscale texture characteristics of an image through the application of some specific filtering processes. First, a landmark reference point of the image (core point) is identified. The local neighbourhood of the reference point is tessellated into sixteen sectors and is used as the region of interest that defines the fingerprint.
A bank composed of eight directional Gabor filters (oriented at 0º, 22.5º, 45º, 67.5º, 90º, 112.5º, 135º, and 157.5º) is applied to the region of interest in order to obtain eight instances of the image. For each filtered image, each sector is coded through the grey level of its pixels. The set of eight coded images constitutes the FingerCode, which defines the fingerprint in a compact way. Since it is desirable to obtain representations of fingerprints that are translation and rotation invariant, translation invariance is accomplished by establishing a reference point such as the core point of the image. However, FingerCode is not rotationally invariant. Therefore, in order to deal with rotation, in the enrolment stage the template fingerprint is rotated 11.25º around the reference point, and a total of ten FingerCode templates are generated from both images (the original and the one rotated 11.25º). Each FingerCode corresponds to a different rotated instance in steps of 11.25º (-45º, -33.75º, -22.5º, -11.25º, 0º, 11.25º, 22.5º, 33.75º, 45º, 56.25º), and all ten FingerCodes are stored in the database. In the authentication stage, a new FingerCode is deduced from the query fingerprint. Finally, in the matching stage, the query FingerCode is compared against the ten template FingerCodes and a similarity score is deduced from each comparison. The final matching score is taken as the maximum of the ten scores, and corresponds to the best alignment of the two fingerprints being matched. The effectiveness of this alignment method mainly depends on the accuracy with which the reference point is found in both images. Matching fingerprints in this method consists of finding the Euclidean distance between the corresponding FingerCodes, so the matching process is very fast. However, the preprocessing involved in the alignment process is computationally expensive.
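The matching step over the rotated templates can be sketched as follows. This is a minimal illustration under stated assumptions: each FingerCode is treated as a plain list of numbers, the stored templates are kept in a dictionary keyed by rotation angle, and the highest similarity is taken to correspond to the smallest Euclidean distance. The function names and the toy feature vectors are hypothetical.

```python
import math


def euclidean_distance(code_a, code_b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(code_a, code_b)))


def best_rotation_match(query_code, template_codes):
    """Compare the query against every rotated template.

    template_codes maps a rotation angle in degrees to the stored template
    vector for that rotation. Returns (angle, distance) of the closest one,
    i.e. the best alignment under the minimum-distance assumption.
    """
    distances = {angle: euclidean_distance(query_code, code)
                 for angle, code in template_codes.items()}
    best_angle = min(distances, key=distances.get)
    return best_angle, distances[best_angle]


# Toy data: three of the ten rotated templates, with made-up feature values.
toy_templates = {
    -11.25: [1.0, 2.0, 5.0],
    0.0: [1.0, 2.0, 3.5],
    11.25: [4.0, 2.0, 3.0],
}
toy_query = [1.0, 2.0, 3.0]
toy_angle, toy_distance = best_rotation_match(toy_query, toy_templates)
```

In this toy case the unrotated template is the closest one, at a distance of 0.5. The comparison itself is only a handful of vector distances, which is consistent with the observation that matching is fast while the filtering needed to build each code dominates the cost.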
These techniques based on greyscale information have had limited success due to the wide variation of pixel intensities (caused by noise, dirt, pressure variability, etc.) and the high computational resources (large image storage and computational power) needed for alignment.

2.5.2. Alignment Based on Singular and Other Reference Points

One popular solution for the alignment of fingerprints is based on the localization of a unique reference point for each fingerprint. All fingerprints have at least one singular point of type core or delta (with the exception of arch-type fingerprints), or other reference points that can be used as landmark features linked to the fingerprint singularity. The idea consists in detecting a unique reference point for each fingerprint, computing its corresponding reference orientation, and using these two features (location and orientation) for alignment purposes. Many reference point-based alignment techniques focus on the localization of singular points [Nilsson and Bigun, 2002], [Li et al., 2005]. However, plain arch fingerprints do not have singular points. Moreover, when dealing with partial prints, singular points may not be available in the fingerprint impression. Therefore, the use of singular points needs to be broadened to any other reference point, including any well-defined position available in all fingerprints. Thus, alignment techniques focused on the localization of other reference points available in all fingerprints have been developed in the literature [Porwik and Wieclaw, 2004], [Liu et al., 2005], [Xie et al., 2010]. Most of these methods are based on the analysis of the field orientation consistency of the image. Each approach is able to indicate more than one reference point candidate.
Among all potential candidates, a reference point is selected for each fingerprint. The reference orientation is deduced from the local ridge orientation around the reference point. Since reference points can be found in all fingerprints, reference point-based alignment methods have the advantage that, once the reference points of the template and query fingerprints are detected, alignment is relatively simple. However, such techniques have the disadvantage that reference points are hard to detect accurately when the quality of the acquired fingerprints is poor. Moreover, problems arise when the alignment method does not consider the inherent elasticity of the skin. Techniques that treat the images as rigid bodies and ignore the non-linear distortions present in fingerprint impressions are less reliable than those that account for skin elasticity when searching for the reference points. Therefore, a robust fingerprint recognition system should not rely solely on these singularities to perform fingerprint alignment.

2.5.3. Alignment Based on Minutiae

Aligning fingerprints is not a trivial task. It is not possible to guarantee the existence of a unique and robust reference point to act as a guide for alignment. No single singular point and no single minutia point in the template fingerprint can be safely used as an absolute reference, because it may not be detected in future samples (query fingerprints). In order to overcome the limitations of alignment methods based on a single point, a further approach works with multiple alignment points. This approach makes use of the minutiae set that characterizes any fingerprint.
The minutiae-based alignment methodology mitigates most of the weaknesses exhibited by the previous methods: (i) It considers the elastic deformations that can affect fingerprints. Certain tolerances are allowed in the spatial distribution and orientation of minutia points when aligning two different fingerprint impressions. (ii) It overcomes the problem of partial prints that do not share common features. Owing to the use of multiple features (the minutiae set) in the alignment stage, the probability of having partial prints without overlapping minutia points is notably lower than with single reference-point alignment methods. Only a limited number of shared minutia points is needed between two impressions of the same finger to properly align both fingerprints.

The minutiae-based alignment techniques published in the literature can be classified as follows: a) single pair-based alignment methods, where the alignment is based on a single pair of points deduced from the template and query minutiae sets; and b) multiple pairs-based alignment methods, where the alignment takes multiple minutia pairs into consideration. The reference point used for alignment is either a physical minutia pair, or some kind of average point deduced from the set of corresponding minutia pairs. The task of minutia pairing refers to the process of identifying corresponding minutiae. Once the template fingerprint T and the query fingerprint Q are aligned, corresponding minutiae will generally not overlap exactly due to the non-linear deformations that affect fingerprints, as shown in the example of Figure 41. If a pair of minutia points, from T and Q respectively, is within some tolerance in terms of spatial location and orientation, they are generally identified as a pair of corresponding minutiae.
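The tolerance test for minutia pairing can be sketched in a few lines of Python. The sketch assumes that both minutiae sets are already aligned and that each minutia is an (x, y, theta) tuple with theta in radians. The threshold values (10 pixels, 15º) and the greedy one-to-one pairing strategy are illustrative assumptions and are not taken from any of the cited works.

```python
import math


def angle_difference(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def is_corresponding(p, q, dist_tol=10.0, angle_tol=math.radians(15)):
    """True if minutiae p and q agree in location and orientation within tolerance."""
    spatial = math.hypot(p[0] - q[0], p[1] - q[1])
    return spatial <= dist_tol and angle_difference(p[2], q[2]) <= angle_tol


def pair_minutiae(template, query, dist_tol=10.0, angle_tol=math.radians(15)):
    """Greedy one-to-one pairing: each query minutia is used at most once."""
    used = set()
    pairs = []
    for i, p in enumerate(template):
        for j, q in enumerate(query):
            if j not in used and is_corresponding(p, q, dist_tol, angle_tol):
                used.add(j)
                pairs.append((i, j))
                break
    return pairs


# Toy minutiae sets, assumed to be already aligned.
template = [(100.0, 100.0, 0.10), (150.0, 80.0, 1.50), (60.0, 200.0, 3.00)]
query = [(152.0, 83.0, 1.45), (104.0, 97.0, 0.05), (300.0, 300.0, 0.00)]
pairs = pair_minutiae(template, query)
```

Widening the tolerances admits more elastic distortion but also raises the chance of pairing minutiae that do not truly correspond, which is the trade-off the tolerance box is meant to control.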
Minutiae-based alignment methods based on a single pair of points guarantee satisfactory and accurate alignment of the regions adjacent to the reference point pair; however, the alignment of regions far away from the reference pair is usually less satisfactory. In order to avoid this limitation, alignment techniques based on multiple pairs of minutia points have been developed. Alignment based on a single pair favours the best local overlap at the expense of global overlap, whereas alignment based on multiple pairs favours the best global overlap at the expense of local overlap.

Figure 41. Alignment result of template (a) and query (b) fingerprints based on minutiae sets (c).

Single pair-based alignment methods

The generalized Hough transform for fingerprint alignment, presented in [Ratha et al., 1996], is an example of a single pair-based alignment approach. A fingerprint image is characterized by a set of minutia points, and each minutia point is defined by its spatial coordinates (x,y) and the value of the ridge orientation at that point (θ). Given two fingerprint images, the space of all possible transformations is discretized into a finite set of values. For each pair of potentially matched minutiae, the translation and rotation necessary to align them are calculated. Evidence for this translation and rotation is computed and stored. After all possible matching minutia pairs have been tested, the translation and rotation parameters with the most accumulated evidence give the alignment result. This technique analyses each possible alignment between the template and query minutiae sets, and the algorithm finally selects one of the minutia pairs to proceed with the alignment of both images.
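The accumulation of evidence in a discretized parameter space can be sketched as follows. This is a minimal illustration of the voting idea, not the implementation of [Ratha et al., 1996]: minutiae are assumed to be (x, y, theta) tuples with theta in degrees, the query is rotated about the image origin before it is translated, each pair casts one vote, and the cell sizes (4 pixels, 5º) are arbitrary choices.

```python
import math
from collections import Counter


def hough_align(template, query, trans_step=4.0, rot_step=5.0):
    """Vote for the rigid transform (dx, dy, rotation) mapping query onto template.

    Returns the centre of the winning accumulator cell and its vote count.
    """
    votes = Counter()
    for px, py, pt in template:
        for qx, qy, qt in query:
            # Rotation that makes both minutia orientations coincide,
            # wrapped to [-180, 180).
            rot = (pt - qt + 180.0) % 360.0 - 180.0
            c, s = math.cos(math.radians(rot)), math.sin(math.radians(rot))
            # Translation left over after rotating the query minutia.
            dx = px - (c * qx - s * qy)
            dy = py - (s * qx + c * qy)
            cell = (round(dx / trans_step), round(dy / trans_step), round(rot / rot_step))
            votes[cell] += 1
    (ix, iy, ir), count = votes.most_common(1)[0]
    return ix * trans_step, iy * trans_step, ir * rot_step, count


def transform(minutiae, dx, dy, rot):
    """Rotate minutiae about the origin by rot degrees, then translate them."""
    c, s = math.cos(math.radians(rot)), math.sin(math.radians(rot))
    return [(c * x - s * y + dx, s * x + c * y + dy, (t + rot) % 360.0)
            for x, y, t in minutiae]


# Synthetic test: the template is the query moved by a known transform.
query = [(10.0, 20.0, 0.0), (50.0, 35.0, 40.0), (80.0, 90.0, 95.0),
         (30.0, 70.0, 150.0), (65.0, 15.0, 230.0)]
template = transform(query, dx=32.0, dy=-12.0, rot=20.0)
dx, dy, rot, count = hough_align(template, query)
```

In this synthetic case the five true pairs all fall into the same cell, while the votes of false pairs are scattered over the parameter space. With real data, coarser cells tolerate more distortion at the cost of a less precise alignment.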
Given p(x,y,φ) and q(x,y,φ) the template-query minutia pair that provides the best alignment of both fingerprints, the translation and rotation alignment offsets (X,Y,Φ) are computed as:

\[ (X, Y) = p(x, y) - q(x, y), \qquad \Phi = p(\varphi) - q(\varphi) \]

Similar approaches to single pair-based alignment are developed in works such as [Jain et al., 1997a], [Jain et al., 1997b] and [Jiang and Yau, 2000]. The goal is to find the transformation that minimizes the distance between all corresponding minutiae, and the pair of reference points used for alignment purposes coincide with minutia points in all of them.

Multiple pairs-based alignment methods

On the other hand, there exist many alignment methods based on the global alignment of multiple pairs of reference minutiae. In this approach, the pair of reference points used for alignment usually relies on the spatial average of those corresponding minutiae points. Given p={p1,…,pa} and q={q1,…,qb} the minutiae sets of the template and query fingerprints respectively, where a and b are the number of minutiae in each image, it is possible to find the corresponding minutia pair set $\{(p_1^c,q_1^c),\dots,(p_i^c,q_i^c),\dots,(p_n^c,q_n^c)\}$ by using local structural analysis of minutia points. Here, $p^c=\{p_1^c,\dots,p_i^c,\dots,p_n^c\}$ and $q^c=\{q_1^c,\dots,q_i^c,\dots,q_n^c\}$ are matched minutiae sets of p and q, $(p_i^c,q_i^c)$ is the corresponding minutia pair, and n is the number of corresponding minutia pairs. The virtual reference point pair $(R_0^p,R_0^q)$ is computed as:

\[ R_0^p(x, y) = \frac{1}{n}\sum_{i=1}^{n} p_i^c(x, y), \qquad R_0^q(x, y) = \frac{1}{n}\sum_{i=1}^{n} q_i^c(x, y) \]

Both fingerprints are aligned based on the reference point pair.
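The virtual reference points are simply the centroids of the matched minutiae, so the alignment offsets of the multiple pairs-based approach can be sketched with a few averages. The sketch below assumes matched minutiae are given as two equally long lists of (x, y, phi) tuples, with phi in radians and index i of one list corresponding to index i of the other. Wrapping each orientation difference to [-π, π) before averaging is an assumption added here to avoid discontinuities at ±π; it is not stated in the text.

```python
import math


def centroid(points):
    """Spatial average of a list of (x, y, phi) minutiae."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def multi_pair_alignment(template_matched, query_matched):
    """Reference points, translation and rotation from n corresponding pairs."""
    if not template_matched or len(template_matched) != len(query_matched):
        raise ValueError("need equally sized, non-empty lists of matched minutiae")
    r0p = centroid(template_matched)
    r0q = centroid(query_matched)
    # Translation: difference between both virtual reference points.
    shift = (r0p[0] - r0q[0], r0p[1] - r0q[1])
    # Rotation: mean orientation difference, each term wrapped to [-pi, pi)
    # (the wrapping is an assumption of this sketch).
    diffs = [(p[2] - q[2] + math.pi) % (2 * math.pi) - math.pi
             for p, q in zip(template_matched, query_matched)]
    rotation = sum(diffs) / len(diffs)
    return r0p, r0q, shift, rotation


# Toy example with three corresponding minutia pairs.
template_matched = [(10.0, 10.0, 0.5), (20.0, 30.0, 1.0), (30.0, 20.0, 1.5)]
query_matched = [(4.0, 12.0, 0.3), (14.0, 32.0, 0.8), (24.0, 22.0, 1.3)]
r0p, r0q, shift, rotation = multi_pair_alignment(template_matched, query_matched)
```

Because every matched pair contributes equally, a single badly located minutia shifts the result only by a fraction 1/n, which is the robustness argument behind multiple pairs-based methods.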
The amount of translation (X,Y) between both images is computed as the difference in position between the points of the reference point pair:

\[ (X, Y) = R_0^p(x, y) - R_0^q(x, y) \]

The amount of rotation Φ is calculated by averaging the difference of ridge orientation φ between the corresponding minutia pairs:

\[ \Phi = \frac{1}{n}\sum_{i=1}^{n} \left\{ p_i^c(\varphi) - q_i^c(\varphi) \right\} \]

The resulting alignment transformation is (X,Y,Φ). Other works that deal with multiple pairs-based alignment methods are [Zhu et al., 2005], [Lee et al., 2002], [Carvalho and Yehia, 2004] and [Udupa et al., 2001].

Apart from using minutiae coordinates and angles for alignment purposes, some algorithms make use of additional traits related to the inherent relationship between minutia points in order to increase the discriminatory information available in the alignment process. In [Carvalho and Yehia, 2004] and [Udupa et al., 2001] the authors propose a fingerprint minutiae alignment algorithm that additionally uses line segments formed by pairs of minutiae as reference information for alignment. In [Jiang and Yau, 2000] the authors use further information derived from the minutia local structures, such as distances between minutiae, relative differences between radial angles and minutiae directions, minutiae types, and ridge counts between minutia points. All this information is used to find reliable correspondences between the template and query minutiae sets.

2.5.4. Alignment Based on Ridge Map

Other techniques make use of the fingerprint ridge map information to align fingerprints [Hu et al., 2008]. Ridge maps or thinned ridge maps are compact representations of fingerprint images. Unlike the previously discussed features, such as singular points or minutia points, whose distribution on a fingerprint seems to be random, ridges cover the whole region of a fingerprint. Therefore, when dealing with small sensing surfaces or partial prints, ridge-based alignment suffers less degradation in performance than the previous methods.
In [Luo et al., 2000] two measurements of ridge similarity based on distance and angle are proposed for fingerprint alignment. Given two ridges Ri and Rj, from the template and query fingerprints respectively, with minutia points Mi and Mj, the similarity level between both ridges is computed as:

\[ \mathit{Diff}_{dist}(R_i, R_j) = \frac{1}{L_{min}} \sum_{k=1}^{L_{min}} \left| \mathit{Dist}(P_{ik}, M_i) - \mathit{Dist}(P_{jk}, M_j) \right| \]

\[ \mathit{Diff}_{ang}(R_i, R_j) = \frac{1}{L_{min}} \sum_{k=1}^{L_{min}} \left| \mathit{Ang}(P_{ik}, M_i) - \mathit{Ang}(P_{jk}, M_j) \right| \]

Both ridges are sampled at the average inter-ridge distance deduced from both fingerprints. Lmin is the length, in sampling points, of the shorter of the two ridges Ri and Rj. Dist(P,M) represents the distance between the ridge sampling point P and minutia M. Ang(P,M) represents the angle between the direction of the line segment MP and the orientation of minutia M. If Diffdist(Ri,Rj) and Diffang(Ri,Rj) are below their respective thresholds, both ridges Ri and Rj are considered to be similar enough, and such a ridge pair is used for fingerprint alignment.

In [Feng and Cai, 2006] fingerprints are characterized by ridge maps. The skeleton of an image is represented in a novel coordinate system called ridge coordinate system (RCS), which differs from the traditional Cartesian or polar coordinate systems. Many RCS representations of a single fingerprint are built, each one based on a fingerprint ridge and an oriented point on that ridge. Representations that are invariant to translation and rotation are achieved with this method. The set of thinned ridges of a fingerprint is represented by a list of points sampled at a fixed interval. Each point in the RCS representation is defined by a 7-dimension vector that describes the point with regard to the reference ridge and its reference point.
The alignment and matching of two fingerprints is done by determining the similarity score among all possible pairs of RCS representations for both ridge maps. The alignment result is the one that provides the largest amount of matched point vectors.

2.5.5. Alignment Based on Field Orientation

Fingerprint alignment methods based on field orientation maps have the potential to perform alignment more robustly than those previous techniques when dealing with poor quality images. Orientation fields have been widely researched and can be reliably calculated even for low quality prints [Maltoni et al., 2009]. In this new approach, there is no reliance on the existence of common landmarks such as singularities or minutia points for fingerprint alignment. Even in case of partial prints, alignment by orientation field is possible. However, orientation fields are coarse features, so they only provide a coarse alignment. Orientation field elements are based on local averages, and this imposes a fundamental limitation on the accuracy of the alignment. Therefore, for applications requiring a high degree of accuracy, additional fine-tune methods are needed to complement the orientation-based alignment.

In [Yager and Amin, 2005] authors investigate the use of orientation fields for fingerprint alignment purposes. Three different algorithms are presented: (i) The first one is based on the generalized Hough transform, and it works by accumulating evidence for transformations in a discretized parameter space.
Given the field orientation maps of the template (ΦT) and query (ΦQ) fingerprints, the basic idea is to discretize the transformation (translation and rotation) space and, for each pair of potentially matched orientation field elements, to compute the transformation parameters needed for alignment together with the evidence that these are the right alignment parameters. After the whole discrete transformation space has been evaluated, the parameters with the most accumulated evidence are identified as the valid ones for fingerprint alignment. (ii) The second approach identifies distinctive singularities of fingerprints based on local orientations, and uses them as landmarks for alignment. The algorithm is not exclusively focused on core and delta points, but looks for any other distinctive pattern present in the orientation field. A translation and rotation invariant feature vector is defined to characterize fingerprint images. The algorithm first uses a simple search to find similar distinctive patterns in the template and query fingerprints, and uses those similar pairs to calculate translation and rotation parameters. The parameters (∆x, ∆y, ∆Φ) that lead to the highest alignment evidence are selected as the alignment result. (iii) The third approach is based on the steepest descent algorithm. The algorithm begins with an initial estimate of the alignment parameters (∆x, ∆y, ∆Φ), consisting of the translation parameters (∆x, ∆y) that align the centers of mass of both orientation fields, and the rotation parameter (∆Φ) that gives the best result among a set of three predefined values: -15º, 0º and 15º. This is a reasonable choice if the rotation is expected to be small. After the initial estimate, the algorithm evaluates a set of transformation triplets close to the starting parameters, and the process is repeated until no neighbouring parameter set yields higher alignment evidence.
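The iterative refinement of approach (iii) can be illustrated with a generic local search. The sketch below is a discrete hill-climbing loop over integer (∆x, ∆y, ∆Φ) triplets, offered as a stand-in for the steepest descent procedure and not as the algorithm of [Yager and Amin, 2005]. The function `toy_evidence` is a hypothetical placeholder for the orientation-field alignment evidence, and the initial translation (`dx0`, `dy0`), which the method obtains from the centers of mass, is taken here as given.

```python
def initial_estimate(evidence, dx0, dy0, rotations=(-15, 0, 15)):
    """Keep the given translation and pick the best of three predefined rotations."""
    return max(((dx0, dy0, r) for r in rotations), key=lambda t: evidence(*t))


def hill_climb(evidence, start, steps=(1, 1, 1), max_iter=1000):
    """Move to the best neighbouring triplet until none improves the evidence."""
    current = start
    best = evidence(*current)
    for _ in range(max_iter):
        neighbours = [
            (current[0] + i * steps[0], current[1] + j * steps[1], current[2] + k * steps[2])
            for i in (-1, 0, 1) for j in (-1, 0, 1) for k in (-1, 0, 1)
            if (i, j, k) != (0, 0, 0)
        ]
        candidate = max(neighbours, key=lambda t: evidence(*t))
        score = evidence(*candidate)
        if score <= best:
            break  # no neighbour is better: locally optimal solution
        current, best = candidate, score
    return current, best


def toy_evidence(dx, dy, dphi):
    """Hypothetical evidence function with a single peak at (7, -3, 5)."""
    return -((dx - 7) ** 2 + (dy + 3) ** 2 + (dphi - 5) ** 2)


start = initial_estimate(toy_evidence, 0, 0)
solution, score = hill_climb(toy_evidence, start)
```

With a single-peaked toy function the search reaches the global maximum; on real orientation fields the evidence surface may have several peaks, which is why the quality of the initial estimate matters.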
Thus, the alignment algorithm finishes when it finds a solution that is locally optimal.

In [Lindoso et al., 2007a] the authors introduce a coarse alignment method based on the correlation of the orientation field of fingerprints. Cross-correlation (CC), or simply correlation, is a measurement of image similarity. The coherence of the orientation field is used as a measure of quality in the correlation process. Given the field orientation maps of the template (ΦT) and query (ΦQ) fingerprints, and their orientation coherence (Coh), the authors compute the correlation of the sine and the cosine of the estimated orientation angle weighted by its coherence when the query image is translated by (x,y) pixels and rotated by an angle α with respect to the template image:

\[ CC_{sin}(x, y, \alpha) = CC\left( \sin(2\Phi_T) \cdot Coh(\Phi_T),\; \sin(2\Phi_Q^{x,y,\alpha}) \cdot Coh(\Phi_Q^{x,y,\alpha}) \right) \]

\[ CC_{cos}(x, y, \alpha) = CC\left( \cos(2\Phi_T) \cdot Coh(\Phi_T),\; \cos(2\Phi_Q^{x,y,\alpha}) \cdot Coh(\Phi_Q^{x,y,\alpha}) \right) \]

\[ CC_{Coh}(x, y, \alpha) = CC\left( Coh(2\Phi_T) \cdot Coh(\Phi_T),\; Coh(2\Phi_Q^{x,y,\alpha}) \cdot Coh(\Phi_Q^{x,y,\alpha}) \right) \]

The correlation maximum determines the best translation (∆x, ∆y) and rotation (∆α) parameters that align the template and query fingerprints:

\[ (\Delta x, \Delta y, \Delta\alpha) = \underset{x, y, \alpha}{\arg\max}\; \frac{CC_{sin}(x, y, \alpha) + CC_{cos}(x, y, \alpha)}{CC_{Coh}(x, y, \alpha)} \]

A similar approach based on correlation is presented in [Wakahara et al., 2007]. For alignment purposes, the direction histogram of each image block is used rather than its dominant direction. The fusion of two similarity scores (one that treats the field orientation map as a rigid body, and another that allows elastic deformation when correlating fingerprints) determines the best alignment, taking into account the relative rotation and translation as well as the non-linear, elastic deformation that may exist between the template and query fingerprints.

2.5.6. Alignment Based on Pores

For high resolution fingerprint images, some works address fingerprint alignment based on pore analysis as an alternative to alignment methods based on minutiae. Since sweat pores are abundant even on small fingerprint areas, alignment by means of pores is possible in partial prints, where the overlap between two fingerprint impressions may be small. Along these lines, in [Zhao et al., 2010] the authors propose an alignment method based on fingerprint pore-valley descriptors. The pores available in a fingerprint are characterized by their location and orientation, as well as by the ridge orientation fields and valley structures around them. Rotation and translation invariant descriptors are defined for each pore, and a coarse-to-fine matching strategy is developed in order to find corresponding pairs of pores between two fingerprints. First, in the coarse matching, the ridge flow patterns in the neighbourhood of each pore are compared. All pairs of pores that pass coarse matching are submitted to fine matching. In the subsequent fine matching, pores are taken in groups of two, and the valley structures in the neighbourhood of the pores are compared. The alignment transformation parameters are estimated from the best corresponding pairs of pores found. The proposed pore-based alignment technique is particularly useful in scenarios where the acquired fingerprint images are small and the amount of features available (ridge map, minutia points, etc.) is very limited.

2.5.7. Hybrid Alignment Techniques

Apart from all the previous methods based on individual features, there exists a further set of methods based on the fusion of multiple sources of discriminatory information to align fingerprints. A common technique is the use of minutiae together with some other supplementary information to guide the alignment process. Jain et al.
propose in [Jain et al., 1997a] the use of minutia points and ridge maps for fingerprint alignment in an algorithm called Ridge correlation. When a minutia point is extracted, the shape and location of its associated ridge are also recorded. For each possible template-query minutia pair, if the associated ridges are similar, the minutiae sets are translated so that the candidate minutia pair is at the same place, and then rotated so that the associated ridges are aligned. This process is repeated for all possible minutia pairs found in the template and query fingerprints, and the alignment that leads to the greatest global correspondence determines the alignment result. The proposed method is vulnerable to the non-linear deformations of fingerprints.

In [Lee et al., 2003] the authors present a method that integrates multiple impressions of a finger in the enrolment stage in order to improve fingerprint verification performance. In the proposed algorithm, multiple impressions of a finger are coarsely aligned using the corresponding minutia pairs, and then finely aligned using the ridge information. The proposed method overcomes the limitations caused by small sensors or partial prints in AFAS applications. An integrated template is constructed during the enrolment process by mosaicking several acquired images. Since the enrolled template represents an enlarged finger region, the subsequent matching process becomes more efficient. In the recognition stage, the query image is matched against the enlarged template, which improves the performance of the fingerprint verification system. After a coarse alignment process based on minutiae [Lee et al., 2002], a refined alignment process based on ridge distance map computation takes place. The thinned images are used for such a purpose.
For each relative alignment of the template and query thinned images, the distance map corresponding to the overlapped area is computed. The optimal alignment is the one that minimizes the distance map between both thinned images.

A fingerprint alignment algorithm called similarity histogram approach (SHA) is presented in [Zhang et al., 2003]. It is a global alignment technique based on local structures, which are deduced from the minutiae and the ridge information available in the fingerprints. A minutia point Mk, its associated ridge, and its two nearest neighbour ridges form a feature vector as follows:

\[ M_k = (x_k, y_k, \alpha_k, \beta_k, \varphi_{1k}, \varphi_{2k}, d_{1k}, d_{2k}) \]

where xk and yk are the coordinates of the minutia, αk is its local ridge direction, βk is the difference between the directions of the two nearest ridges of the minutia point, (φ1k, φ2k) are the angle differences between αk and the lines connecting ridge sampling points with the minutia point, and (d1k, d2k) are the distances from the minutia point to the ridge sampling points. The feature vector set M={M1,…,Mk,…,MN}, which consists of the feature vectors of all N minutia points detected in the finger impression, determines the global minutiae structure of the fingerprint. For every pair of minutia points Mi and Mj (1 ≤ i ≤ j ≤ N) in M, if the Euclidean distance d(Mi,Mj) satisfies:

\[ Th_{low} \le d(M_i, M_j) \le Th_{high} \]

where Thlow and Thhigh are empirical thresholds, a local structure feature vector Pk is constructed to define the relationship between Mi and Mj as follows:

\[ P_k = (l_k, \mu_k, \nu_k, \varphi_{1i}, \varphi_{2i}, d_{1i}, d_{2i}, \varphi_{1j}, \varphi_{2j}, d_{1j}, d_{2j}) \]

where:

\[ l_k = d(M_i, M_j), \qquad \theta_k = \operatorname{arctg}\left( \frac{y_{M_j} - y_{M_i}}{x_{M_j} - x_{M_i}} \right), \qquad \mu_k = \alpha_i - \theta_k, \qquad \nu_k = \alpha_j - \theta_k \]

The local structure feature vector Pk is an invariant descriptor, independent of the rotation and translation of the fingerprint.
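The invariance of the local structure can be checked numerically. The following Python sketch computes only the three geometric components (l, μ, ν) of Pk for a pair of minutiae and verifies that they do not change when both minutiae undergo the same rigid motion. The ridge-related components (φ and d) are omitted, `math.atan2` is used in place of the arctangent of the quotient so that the quadrant is preserved and vertical segments are handled, and angles are wrapped to [-π, π); these choices are assumptions of the sketch rather than part of [Zhang et al., 2003].

```python
import math


def wrap(angle):
    """Wrap an angle in radians to [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def local_structure(mi, mj):
    """Return (l, mu, nu) for two minutiae given as (x, y, alpha) tuples."""
    xi, yi, ai = mi
    xj, yj, aj = mj
    l = math.hypot(xj - xi, yj - yi)          # distance between both minutiae
    theta = math.atan2(yj - yi, xj - xi)      # direction of the connecting line
    return l, wrap(ai - theta), wrap(aj - theta)


def rigid(m, dx, dy, rot):
    """Rotate a minutia about the origin by rot radians, then translate it."""
    x, y, a = m
    c, s = math.cos(rot), math.sin(rot)
    return (c * x - s * y + dx, s * x + c * y + dy, a + rot)


mi, mj = (10.0, 20.0, 0.3), (40.0, 60.0, 1.2)
before = local_structure(mi, mj)
after = local_structure(rigid(mi, 15.0, -7.0, 0.7), rigid(mj, 15.0, -7.0, 0.7))
```

Both tuples agree up to rounding error, because a rigid motion adds the same rotation to the minutia directions and to the direction of the connecting line, and leaves distances unchanged.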
A fingerprint is then characterized by a set of local structures P={P1,…,Pk,…,PQ}, where Q is the number of local structures deduced from the fingerprint minutiae set M. Given two images, one template and one query fingerprint, the algorithm extracts the local structure feature vectors from both images and constructs a local similarity matrix. For this purpose, a similarity function of local structure vectors is defined. In the similarity matrix, all possible combinations between the local structures of the template and query fingerprints are taken into account. A similarity histogram function of the rotation and translation parameters corresponding to a pair of fingerprints is also defined. From the results of the similarity matrix, a similarity histogram is computed for the translation and rotation parameters, from which the optimal translation (X,Y) and rotation (Φ) parameters needed to accurately align both fingerprints are deduced.

In [Gu et al., 2006] the authors incorporate the global orientation field information into an original minutiae-based alignment algorithm. The proposed alignment method first determines K pairs of minutia points between the template and query bitmaps that can be used as alignment points. For each corresponding pair, the alignment parameters (∆x,∆y,∆θ) are determined. After alignment, the point (x,y) and its orientation O(x,y) are mapped to the new point (x′,y′) with the orientation O′(x′,y′):

\[
\begin{pmatrix} x' \\ y' \\ O'(x', y') \end{pmatrix} =
\begin{pmatrix} \cos\Delta\theta & \sin\Delta\theta & 0 \\ -\sin\Delta\theta & \cos\Delta\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ O(x, y) \end{pmatrix} +
\begin{pmatrix} \Delta x \\ \Delta y \\ -\Delta\theta \end{pmatrix}
\]

With this transformation, it is possible to evaluate the similarity level between the template and query minutiae sets, as well as the similarity level between the template and query field orientation maps.
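Written out component by component, the transformation above amounts to a rotation of the coordinates, a translation, and a shift of the orientation value. A minimal Python sketch follows, with angles in radians; reducing the resulting orientation modulo π (ridge orientations being undirected) is an assumption added here and does not appear in the formula.

```python
import math


def apply_alignment(x, y, orientation, dx, dy, dtheta):
    """Map a point and its ridge orientation with parameters (dx, dy, dtheta)."""
    c, s = math.cos(dtheta), math.sin(dtheta)
    x_new = c * x + s * y + dx
    y_new = -s * x + c * y + dy
    # Orientation is shifted by -dtheta; the modulo pi is an assumption
    # of this sketch for undirected ridge orientations.
    o_new = (orientation - dtheta) % math.pi
    return x_new, y_new, o_new


# Point (1, 0) with orientation 3*pi/4, rotated by pi/2 and shifted by (5, 2).
x_new, y_new, o_new = apply_alignment(1.0, 0.0, 3 * math.pi / 4, 5.0, 2.0, math.pi / 2)
```

Applying the same function to every minutia and to every orientation field element yields the aligned feature sets on which the two similarity levels are evaluated.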
Among the K pairs of candidates, the one with the largest matching score, which is defined as the product of the minutiae similarity and the orientation field similarity, is selected as the alignment result.

In [Yager and Amin, 2006] the authors present an optimized alignment approach based on two stages. In the first stage, non-minutiae features are used to provide a robust initial alignment of fingerprints. The proposed method makes use of orientation field information, ridge curvature, and ridge frequency maps:
- the orientation fields contain information about the local direction of fingerprint ridges;
- the distribution of curvature values across the fingerprint is also a discriminative feature that measures the rate of change of the tangent along the ridge; and
- the ridge frequency is the inverse of the average distance between ridges in a local area in the direction perpendicular to the local orientation.

As with orientation fields, curvature maps and ridge frequency maps can be used as a basis for alignment. Given two fingerprint images, their field orientation, ridge curvature and ridge frequency maps are computed. For each possible fingerprint overlap, a set of evenly sampled points within the overlapped region is matched in order to compute the level of similarity between them. A minimum amount of overlap between the two prints is required in order to avoid a high similarity level being obtained from an alignment with little or no overlap. The parameters (∆Φ, ∆x, ∆y) that lead to the highest level of similarity are selected as the coarse alignment result. In the second stage, global minutiae information is used to fine-tune the alignment parameters (∆Φ, ∆x, ∆y) deduced in the first stage. The spatial position of minutia points can be localized with more accuracy than the features used in the first stage, which are essentially local averages.
A minutia p is stored as a 4-tuple (px, py, pθ, pt) that represents its coordinates, orientation and type (ridge ending/bifurcation) respectively, and a similarity level between minutiae sets is deduced for each possible alignment. The proposed two-stage alignment algorithm is reported to perform robustly with poor quality images (caused by either noise or differing skin conditions) and with fingerprint pairs that have relatively little overlap.

2.5.8. Conclusions

Fingerprint alignment is a vital stage in most fingerprint recognition systems. In order to determine the degree of similarity between two fingerprints, it is first necessary to find the translation and rotation parameters that align the prints so that their corresponding features may be matched. It is desirable to obtain representations of fingerprints that are translation and rotation invariant. Techniques that handle rotational alignment with low accuracy tend to compensate for this limitation by storing several rotated versions of the template fingerprint descriptor and matching all of them against the query fingerprint. This strategy normally requires a higher storage capacity, since several feature sets are saved for each enrolled fingerprint. Other alignment strategies typically involve processing various rotated versions of the query feature set for matching against a unique template. This latter strategy is more complex than the former (the rotation of the query print has to be done on-line in the authentication stage, and not in the off-line enrolment stage), but it reduces the storage costs. Independently of the alignment technique used, it is well known that fingerprint impressions can suffer from noise, poor quality, partial overlap, and non-linear distortions.
These factors can disturb the inherent properties of the features extracted from finger impressions (location and orientation of minutiae or singular points, ridge maps, field orientation maps, etc.), making it impossible to find a perfect alignment between two fingerprint feature sets coming from the same individual. Therefore, most algorithms attempt to find an alignment that minimizes these errors. One common way of dealing with the high complexity of the alignment process is to allow certain tolerances in the alignment and to incorporate as much discriminatory information as possible, which leads to hybrid fingerprint alignment methodologies.

2.6. Feature Matching

The two main properties of human fingerprints, from which the idea of designing personal recognition methods originated, are their uniqueness and their permanence over time. Although the permanence of human finger patterns is a proven fact, the uniqueness of fingerprints is based on empirical observation. Some statistical studies have been carried out to evaluate the probability of getting two identical fingerprints from different fingers [Maltoni et al., 2009]. It is believed, and nobody has demonstrated otherwise, that no two individuals in the world have identical fingerprints.
Despite the believed distinctiveness of fingerprints, reliably matching fingerprint images is an extremely difficult task because of two main factors: (i) The large intra-class variation, that is, the large variability that may exist among different impressions of the same finger, due to reasons such as displaced, rotated or partially overlapped fingerprint impressions, non-linear distortions originated in the acquisition process, noisy ambient or skin conditions (too dry or too wet fingers), sensitive sensors, 'low quality' fingers (with the ridges worn or the skin scratched due to occupation, as in manual workers, or age, as in elderly people) or inaccurate image enhancement and feature extraction algorithms. (ii) The small inter-class variation, that is, the high similarity that may exist among pattern impressions coming from different fingers. Apart from the aforementioned reasons, automated fingerprint matching systems do not use the entire discriminatory information available in fingerprints, as human expert methodologies do, but only a reduced representation extracted by an unsupervised machine. Consequently, state-of-the-art automatic matching algorithms do not reach the performance of human expert methodologies, and fingerprint recognition remains an open research problem today. Most of the automated matching techniques are abstracted from human expert methodologies.
In order to claim that two fingerprints are generated from the same finger, human fingerprint examiners evaluate several factors:
- the global pattern characteristics of both fingerprints must coincide, not only the type (left loop, right loop, whorl, arch or tented arch) but also the texture information, the structural ridge shape, etc.;
- a minimum set of local structures must also be identical (delta, core and minutia points, their ridge neighbourhoods, etc.); and
- the spatial disposition of local and global structures, once both fingerprints are aligned, must coincide within certain margins fixed by the allowable elastic distortion tolerances.
The most relevant automatic fingerprint matching algorithms can be classified according to the salient features taken into account in the matching process:
a) Correlation-based matching techniques, in which both images (template and query fingerprints) are superimposed and a correlation analysis at greyscale pixel level is carried out. This matching technique demands high computational power. If the images are not aligned, a set of potential template-query alignments is evaluated, covering a wide range of relative displacements and rotations in order to account for any partial overlap or relative rotation between the acquired images to match.
b) Minutiae-based matching techniques, where the ridge discontinuities of fingerprints (mainly ridge endings and ridge bifurcations) are extracted and used as discriminatory information for matching. Matching two fingerprints in minutiae-based representation becomes a point pattern matching problem, which consists of finding the alignment and the correspondences between pairs of characteristic points in the template and query minutiae sets.
A certain amount of elastic distortion is allowed during the alignment of both minutiae sets (the transformation is not restricted to a rigid body model) in order to take the elasticity of the skin into consideration. This matching technique is characterized by its high discriminating power and its good accuracy when both minutiae sets are reliable (neither missing nor spurious minutia points in either set), and by its reasonable computational load in comparison with the previous techniques. However, it requires accurate detection of minutia points, which is very challenging when dealing with low quality prints.
c) Ridge or non-minutiae feature-based matching techniques, in which features other than minutia points, such as local field orientation, frequency field orientation, ridge shape, texture information, pores, dots, incipient ridges, etc., are extracted from the ridge pattern and used as discriminatory information for matching purposes. A wide variety of features can be extracted from the finger impressions when the ridge-valley pattern is inspected at different levels of detail.
d) Hybrid matching techniques, which result from the combination of the previous techniques. This category has appeared as a consequence of the insufficient accuracy exhibited by individual matching techniques alone. Each of the aforementioned matching approaches has strengths and weaknesses. By combining different matching methodologies in hybrid solutions, the scientific community tries to improve the final performance of the resulting systems.

2.6.1. Correlation-Based Matching Techniques

In correlation-based processing, the degree of similarity between template and query fingerprints is estimated from the analysis of the greyscale ridge-valley patterns.
If the relative rotation and displacement of template and query prints are unknown, the correlation must be computed over a set of possible relative rotations and displacements, which demands very high computational power. In order to reduce this computational effort, some approaches first perform a coarse alignment of both fingerprints and then continue with the correlation process. The presence of non-linear distortion and noise significantly reduces the global correlation value between two genuine impressions of the same finger. To overcome such problems, correlation is usually done locally, only in certain distinctive or discriminatory regions of the fingerprint (regions of high curvature, minutiae regions, etc.) instead of over the entire image. Since the correlation is done locally, the matching algorithm becomes more tolerant to non-linear distortion.
In [Bazen et al., 2000] the authors develop a correlation-based matching algorithm on greyscale images. The matching process is split into three main steps: (i) selection of appropriate features from the template fingerprint, (ii) search for the corresponding features in the query fingerprint, and (iii) comparison of the relative positions of feature pairs in both fingerprints, first to align and then to match both fingerprints. The fingerprint features selected as reliable distinctive information for alignment and matching purposes consist of image blocks of 24×24 pixels. A combination of three kinds of discriminatory features is proposed: minutiae-based (fingerprint regions around minutiae locations), field orientation coherence-based (low coherence fingerprint regions), and correlation-based (other greyscale distinctive regions) features. Once those reference blocks are extracted from the template fingerprint, their corresponding feature blocks in the query fingerprint are identified. This is done by shifting each template feature block over the query fingerprint.
At each relative position, the grey-level distance between both blocks is determined from the squared grey-level differences of each pixel. After the template feature has been shifted over the entire query fingerprint, the location with the minimal grey-level distance is selected as the corresponding position in the query fingerprint. Following this procedure, a corresponding query feature is located in the query fingerprint for each template feature. Although such a matching process is able to deal with relative translations of feature blocks, it is not able to deal with large relative rotations of fingerprints. As a result of the feature correlation process, two sets of corresponding feature locations (x^T, y^T) and (x^Q, y^Q) are identified, where x = {x_1, …, x_N} and y = {y_1, …, y_N} are the coordinates of the feature blocks, N is the number of corresponding features found, and the superscripts T and Q refer to the template and query fingerprints respectively. In order to align both fingerprint images, the features found are handled in pairs. The relative position between the two features of a pair within one fingerprint is computed in polar coordinates. Given one pair of feature block positions (x_i, y_i) and (x_j, y_j), its relative position is computed as:

$$\rho_{i,j} = \sqrt{(\Delta x_{i,j})^2 + (\Delta y_{i,j})^2}, \qquad \varphi_{i,j} = \angle(\Delta x_{i,j}, \Delta y_{i,j})$$

where Δx_{i,j} and Δy_{i,j} are the coordinate differences between both blocks. A certain degree of tolerance (ρ_Th, φ_Th) is allowed when comparing pairs of feature blocks between template and query fingerprints. Such tolerance is able to handle some amount of non-uniform shape distortion caused by the inherent elasticity of fingerprints:

$$\left| (\rho^{T}_{i,j}, \varphi^{T}_{i,j}) - (\rho^{Q}_{i,j}, \varphi^{Q}_{i,j}) \right| \le (\rho_{Th}, \varphi_{Th}) \quad \text{for } 1 \le i, j \le N,\ i \ne j$$
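The pairwise comparison above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of [Bazen et al., 2000]: the function names, the wrap-around handling of the angle difference and the choice of returning raw counts are hypothetical, since the text does not specify how the agreeing pairs are combined into the final score.

```python
import math
from itertools import combinations


def relative_polar(p_i, p_j):
    """Relative position of block j with respect to block i as (rho, phi)."""
    dx = p_j[0] - p_i[0]
    dy = p_j[1] - p_i[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians (assumed wrap-around)."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def consistent_pairs(template_pts, query_pts, rho_th, phi_th):
    """Count the feature pairs whose relative position agrees in both prints.

    template_pts[k] and query_pts[k] are the (x, y) locations of the k-th
    corresponding feature block. Returns (agreeing pairs, pairs analysed).
    """
    if len(template_pts) != len(query_pts):
        raise ValueError("both sets must hold the same corresponding features")
    agree = 0
    total = 0
    # N features give N*(N-1)/2 unordered pairs
    for i, j in combinations(range(len(template_pts)), 2):
        rho_t, phi_t = relative_polar(template_pts[i], template_pts[j])
        rho_q, phi_q = relative_polar(query_pts[i], query_pts[j])
        total += 1
        if abs(rho_t - rho_q) <= rho_th and angle_diff(phi_t, phi_q) <= phi_th:
            agree += 1
    return agree, total
```

Because the angle is measured against the image axes, the check tolerates a translation of the query print but not a large rotation, which matches the limitation stated in the text.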
Given N pairs of corresponding feature blocks, a total of N·(N-1)/2 feature pair combinations are analysed, and a global matching score is deduced from them. The proposed correlation-based feature matching process demands high computational power, which makes this method challenging for real-time applications running on physical platforms with limited processing power.
A new approach based on correlation is developed in [Ross et al., 2002]. Instead of directly correlating the greyscale pixel intensities of template and query images, this method correlates several oriented ridge feature maps deduced from the original fingerprints. The proposed method uses as fingerprint descriptor a set of ridge feature maps that captures the local ridge strengths of the finger impression in several orientations. That set of ridge feature maps is used to represent, align and match fingerprints. The local ridge characteristics of a fingerprint are extracted by means of a set of Gabor filters tuned to the average inter-ridge frequency of the fingerprint and oriented according to eight predefined directions (0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, 157.5°). For each filtered image, the standard deviation map corresponding to the variation in local pixel intensities is computed considering a pixel neighbourhood of 16×16 pixels. Eight different standard deviation maps are deduced for each fingerprint. In the enrolment stage, the standard deviation maps of the template fingerprint are sampled at regular intervals (sampling period of 16 pixels) in both horizontal and vertical directions to obtain a two-dimensional ridge feature matrix descriptor of the template fingerprint. The set composed of eight ridge feature maps thus provides a compact representation of a fingerprint image.
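As an illustration of the last two steps (standard deviation map and regular sampling), the following Python sketch builds one ridge feature map from one already filtered image. It is a simplified sketch, not the code of [Ross et al., 2002]: the Gabor filtering itself is omitted, the function names are hypothetical, and anchoring each 16×16 neighbourhood at the top-left corner of its sampling cell is an assumption.

```python
import math


def local_std(image, top, left, size):
    """Standard deviation of the size x size neighbourhood starting at (top, left)."""
    values = [image[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


def ridge_feature_map(filtered, size=16, step=16):
    """Sample the standard deviation map of one filtered image every `step` pixels.

    `filtered` is a list of rows holding the response of a single oriented
    filter; one such map would be computed per orientation (eight in the text).
    """
    rows = len(filtered)
    cols = len(filtered[0])
    return [[local_std(filtered, r, c, size)
             for c in range(0, cols - size + 1, step)]
            for r in range(0, rows - size + 1, step)]
```

With a neighbourhood of 16×16 pixels and a sampling period of 16 pixels, a 32×32 image yields a 2×2 feature map per orientation.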
In the authentication stage, when a query print is presented to the system, the standard deviation maps of the query image are also computed and correlated against the set of ridge feature maps of the template. From the correlation process, the translation offsets that align both images are deduced, and a matching score is generated by computing the similarity level between the aligned feature maps. As in the previous work, the correlation-based alignment process does not account for the rotational offset between the query and the template feature maps. To account for rotational offsets, several rotated versions of the template ridge feature map need to be computed and correlated with the query feature map. This improves the alignment and matching stages at the cost of additional computational overhead.
In [Nandakumar and Jain, 2004] the authors improve the performance of a minutiae-based matcher by incorporating the spatial correlation of regions around minutia points in the fingerprint matching stage. Since the greyscale pixel values around a minutia point capture most of the local information, spatial correlation provides an accurate measure of the similarity between minutia regions. In order to reduce the computational effort required for correlation, a coarse alignment of the fingerprints is initially done based on minutia points. The coarse alignment dramatically reduces the correlation search space, and the alignment is further refined in the matching stage. After the coarse alignment process, the overlapped region between both prints is identified. Windows of 42×42 pixels centred on each template minutia point are recorded. Windows of 32×32 pixels centred on the same location in the corresponding aligned query fingerprint are also recorded. A correlation of pixel intensity levels between corresponding image windows is carried out on the enhanced template and query fingerprints.
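The local correlation of one window pair can be sketched as below. This is a hedged illustration only: the text does not state which correlation measure is used, so normalized cross-correlation is assumed here, and the function names are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equally long pixel lists (assumed measure)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def crop(image, top, left, size):
    """Flatten the size x size sub-window whose top-left corner is (top, left)."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def best_local_correlation(template_win, query_win):
    """Slide the smaller query window inside the larger template window.

    Returns (best score, row offset, column offset). With a 42x42 template
    window and a 32x32 query window, 11 x 11 offsets are evaluated.
    """
    t_size = len(template_win)
    q_size = len(query_win)
    q_flat = crop(query_win, 0, 0, q_size)
    best = (-2.0, 0, 0)  # below the minimum possible correlation of -1
    for dy in range(t_size - q_size + 1):
        for dx in range(t_size - q_size + 1):
            score = ncc(crop(template_win, dy, dx, q_size), q_flat)
            if score > best[0]:
                best = (score, dy, dx)
    return best
```

The margin of 10 pixels between both window sizes bounds the residual misalignment that the search can absorb.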
The size of the template windows around minutia points is chosen slightly greater than that of the query windows in order to make the correlation process tolerant to small errors in the location of minutia points. The local correlation of all template windows against the corresponding query windows in the overlapped area is computed, and a similarity matching score is deduced at the end of the process.
Other correlation-based matching techniques based on grey level information are proposed in [Lindoso et al., 2007a] and [Lindoso et al., 2007c]. The coherence of the orientation field and the standard deviation of the wavelet coefficients of fingerprint images are used as distinctive features to determine candidate regions for correlation. After a first coarse alignment of template and query fingerprints, distinctive local regions in both images are identified. A total of three local regions of 50×50 pixels are extracted from the query fingerprint and correlated against the corresponding regions of 100×100 pixels in the template print. In the correlation process, both translation and orientation factors are taken into account to refine the alignment of both images and deduce a reliable matching score for both fingerprints.

2.6.2. Minutiae-Based Matching Techniques

Minutiae means small details, and the term refers to the various ways in which ridges can be discontinuous. Many types of local ridge discontinuities have been identified, but ridge endings and ridge bifurcations are the two most prominent structures usually referred to as minutiae [Jain et al., 2010]. The relative position of minutia points, the direction of the ridges associated with minutiae, and the number of ridges (or valleys) between any pair of minutia points are some of the features that characterize a fingerprint.
The vast majority of automated fingerprint verification systems are minutiae-based. Proof of this is the existence of standard fingerprint representations based on the spatial distribution of minutia points along the pattern, such as ANSI/NIST-ITL, ISO/IEC 19794-2 or ANSI/INCITS 378. Minutiae-based representation schemes include minutiae location and orientation, and they may also include one or more global traits such as the field orientation map, the location of singular points (core and/or delta), and the fingerprint classification. Many minutiae-based matching algorithms with varying accuracy and efficiency are described in the literature, and a large number of minutiae-based recognition systems exist in the market. Depending on the recognition system, either standard or proprietary minutiae representation formats are used. Due to the frequent non-linear deformation of fingerprint images, directly ensuring global correspondence of minutia points under rigid body considerations is very difficult. Therefore, most matching algorithms first compute local similarity and then perform a global consolidation that allows, to some extent, the elastic deformation of finger patterns.

Local minutiae analysis

Apart from the basic minutiae set description (spatial coordinates x and y, orientation θ, and minutia type t), several models described in the literature use further information to define local minutiae neighbourhoods:
a) Model based on minutia point and greyscale information. In [Kovács-Vajna, 2000] the author uses the greyscale intensity of pixels in a local neighbourhood of minutia points to help in the fingerprint matching process. Minutiae regions from template and query images are compared to find potential correspondences. However, the grey levels and contrast around a minutia point can vary for several reasons (acquisition conditions, random noise, distortions, etc.), so comparing local pixel intensities is not a robust approach.
b) Model based on minutia point and associated ridge description. Algorithms such as [Jain et al., 1997a], [Jain et al., 1997b], [Luo et al., 2000] and [Cheng et al., 2004] use the ridge information as an aid to complement the minutia information (x, y, θ, t) in the minutiae-based alignment and matching stages of the recognition algorithm. When extracting the minutiae, the shape and location of the associated ridges are also used to characterize each minutia point.
c) Model based on minutia point and local ridge orientation. Algorithms such as [Tico and Kuosmanen, 2003] and [Qi et al., 2005] introduce a fingerprint representation scheme that relies on the orientation field pattern in a local neighbourhood around each minutia point. The orientation field provides a rough description of the fingerprint pattern, and it can be estimated with reasonable accuracy even from noisy input images. According to the authors, the minutia points capture only a very limited amount of the rich information content present in the fingerprint. Therefore, the orientation-based descriptor, consisting of the field orientation at sampling points around each minutia, provides additional discriminatory information.
d) Model based on minutia point, associated ridge and adjacent orientation distribution. In [Zhang et al., 2007] and [Cheng et al., 2004] the authors propose a novel local feature descriptor to represent fingerprints. The local feature descriptor for each salient feature (ridge ending or bifurcation) is composed of the minutia point, some sample points on its associated ridge, and the adjacent field orientation distribution in a broad region around the minutia point.
e) Model based on minutia point and neighbour minutiae. In this approach, the simplest model consists in defining minutia pairs. For each minutia point, its nearest minutia neighbour is identified, and a 2-D translation and rotation invariant descriptor is constructed to capture relative position and orientation [Udupa et al., 2001]. Additional information such as the ridge count (number of ridges located between minutia points) can also be used. Other methods use a central minutia point and its two nearest neighbours, which constitute a triplet [Jea and Govindaraju, 2005], [Jiang and Yau, 2000]. The triplet descriptor can be characterized by a feature vector that contains minutiae information as well as ridge counts. The local structure is rotation and translation invariant, and can tolerate elastic deformations. The next level of complexity generalizes to a central minutia and a set of its K nearest neighbours [Yang and Verbauwhede, 2003]. The feature vector is characterized by the direction of the edge connecting the central minutia and each of its neighbour minutiae, the relative orientation of minutia points, the distance between minutia points, and the relative orientation of the neighbour minutiae with respect to the central minutia. A further level of complexity corresponds to 3-D representations of minutia points and their local neighbourhoods [Cappelli et al., 2010]. In all of them, the comparison of the local descriptors linked to the template and query minutiae sets makes it possible to find corresponding minutia points between both sets, and from them to align and match the fingerprints.

Global minutiae analysis

Local structural information allows the similarity among low-level features to be determined. However, the fingerprint matching stage has to result in a global similarity score between complete fingerprints. For this purpose, a global structural analysis usually takes place as well.
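The simplest neighbour-based model of item e), the minutia pair, can be sketched in Python as follows. This is a hypothetical illustration, not the descriptor of [Udupa et al., 2001]: the three components chosen here (distance, direction of the connecting edge relative to the central minutia orientation, and relative orientation of the neighbour) and the function names are assumptions that merely satisfy the stated translation and rotation invariance.

```python
import math


def wrap(angle):
    """Wrap an angle to the range [0, 2*pi)."""
    return angle % (2 * math.pi)


def pair_descriptor(minutiae, index):
    """Invariant descriptor of minutia `index` and its nearest neighbour.

    Each minutia is an (x, y, theta) tuple with theta in radians.
    Returns (distance, edge direction relative to theta, relative orientation).
    """
    x, y, theta = minutiae[index]
    nearest = None
    for k, (xn, yn, tn) in enumerate(minutiae):
        if k == index:
            continue
        dist = math.hypot(xn - x, yn - y)
        if nearest is None or dist < nearest[0]:
            nearest = (dist, xn, yn, tn)
    dist, xn, yn, tn = nearest
    edge = math.atan2(yn - y, xn - x)
    # Subtracting theta removes the dependence on the global rotation
    return dist, wrap(edge - theta), wrap(tn - theta)
```

Because all three components are measured relative to the central minutia, rotating and translating the whole minutiae set leaves the descriptor unchanged, so template and query descriptors can be compared before any alignment is known.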
Once template and query fingerprints are properly aligned, the spatial distribution of both minutiae sets over the whole patterns is further analysed. The number of corresponding minutia pairs between template and query fingerprints in the overlapped area is deduced, considering certain tolerances to absorb the non-linear deformations inherent to fingerprints, and a global matching score is computed from it. Many methods use not only the global distribution of minutiae sets but also the local structural features to deduce the similarity matching score [Jiang and Yau, 2000]. The minutiae-based matching score S can be computed in multiple forms:

$$S = \frac{100 \cdot M_{TQ} \cdot M_{TQ}}{M \cdot N}$$ as in [Jain et al., 1997b]; or

$$S = \frac{100 \cdot M_{TQ}}{\max\{M, N\}}$$ as in [Jain et al., 1997a], [Yager and Amin, 2006]; or

$$S = \frac{100 \cdot \sum_{i,j} ml(i,j)}{\max\{M, N\}}$$ as in [Jiang and Yau, 2000], [Qi et al., 2005]; or

$$S = 100 \cdot \sqrt{\frac{M_{TQ} \cdot M_{TQ}}{M \cdot N}}$$ as in [Youssif et al., 2007]; or

$$S = \frac{100 \cdot 2 \cdot M_{TQ}}{M + N}$$ as in [Wakahara et al., 2007];

where M and N are the numbers of minutia points in the template and query fingerprints respectively, M_TQ is the number of corresponding minutia pairs found after alignment, and ml(i,j) represents the matching certainty level, considering both local and global structure similarities, between minutia i of the template fingerprint and its corresponding minutia j of the query fingerprint. In general, these algorithms use local structures to find an initial alignment, and global features to refine the alignment and to calculate the matching score:
(i) A few of the proposed methods are based on rigid body models and have no proper way to handle the non-linear distortion problem in fingerprint matching. Those algorithms suffer from poor accuracy due to the inherent elastic deformations that affect fingerprints.
(ii) Some others tolerate the elastic distortion by using bounding boxes of a predefined size around the template minutiae locations, and allowing the query minutia points to fall inside such tolerance boxes once the prints are aligned. The disadvantage of this approach is that large bounding boxes are necessary when large distortions are present, which increases the probability of a false match.
(iii) Finally, the third group deals with adaptive elastic bounding boxes whose size grows as the distance from the reference alignment regions increases, or explicitly models the non-linear distortions of fingerprint images. In that scenario, the size of the bounding boxes can be adjusted in each region, decreasing both the false accept and false reject rates [He et al., 2003]. Finally, a matching score is calculated based on the number of matched minutiae within each bounding box.

2.6.3. Ridge or Non-Minutiae Feature-Based Matching Techniques

Non-minutiae feature-based matching techniques refer to all those other techniques based on image or ridge features, different from minutia points, which can be extracted from fingerprints and used as discriminatory information for matching purposes. The most commonly used features lead to the following classification: a) matching approaches based on the field orientation map, b) matching approaches based on the ridge map, and c) matching approaches based on level 3 features.

Matching approaches based on orientation field

The field orientation map that describes the global structure of a fingerprint pattern is normally defined as a two-dimensional matrix whose elements are the ridge-valley directions coded in the range [-π/2,π/2) or [0,π) radians.
The orientation field has several advantages over the minutiae:
- it is almost continuous and smooth everywhere except in the regions near singular points or minutia points, and it is less sensitive to noise; and
- compared with the minutiae, it is less sensitive to the non-linear deformations of fingerprints originated by the acquisition process and the inherent elasticity of the skin.
A fingerprint matching algorithm based on the orientation field is proposed in [Kulkarni et al., 2006]. Template and query fingerprint images are translationally and rotationally aligned based on the location of core points. Once the position of the core points in both prints has been determined, window areas of 100×100 pixels around each core point are identified for fingerprint matching. Each window area is tessellated into m×n non-overlapped blocks. The size of the blocks depends on the image resolution of the fingerprint images. Each block includes at least one ridge and one valley of the fingerprint pattern. The dominant orientation of the ridges in each block is computed through greyscale pixel gradient analysis. In order to estimate the amount of rotation between template and query images, the following equation is used:

$$\theta_r = \frac{(\theta_Q(1,1) - \theta_T(1,1)) + (\theta_Q(m,n) - \theta_T(m,n))}{2}$$

where θ(1,1) refers to the orientation of the top-left window blocks, and θ(m,n) corresponds to the orientation of the bottom-right blocks. The query fingerprint (Q) is then rotated by θ_r with respect to its core point in order to align the query image with the template fingerprint (T), and a new square window around the core point is identified. The field orientation feature vector for each block of the rotated query fingerprint is computed.
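The rotation estimate above, together with the column-wise variance feature vector and distance that the text defines next, can be sketched in Python as follows. This is a minimal sketch under assumptions, not the code of [Kulkarni et al., 2006]: the function names are hypothetical, the block orientation matrices are assumed to be given as lists of rows, the variance is taken as the unnormalized sum of squared deviations per column, and angle wrap-around is ignored.

```python
def rotation_estimate(theta_q, theta_t):
    """Average orientation difference of the top-left and bottom-right blocks."""
    top_left = theta_q[0][0] - theta_t[0][0]
    bottom_right = theta_q[-1][-1] - theta_t[-1][-1]
    return (top_left + bottom_right) / 2.0


def column_variances(theta):
    """Feature vector: one (unnormalized, assumed) variance value per window column."""
    rows = len(theta)
    cols = len(theta[0])
    features = []
    for k in range(cols):
        column = [theta[i][k] for i in range(rows)]
        mu = sum(column) / rows
        features.append(sum((v - mu) ** 2 for v in column))
    return features


def orientation_distance(theta_q, theta_t):
    """Squared Euclidean distance between query and template feature vectors."""
    v_q = column_variances(theta_q)
    v_t = column_variances(theta_t)
    return sum((a - b) ** 2 for a, b in zip(v_q, v_t))
```

A smaller distance indicates more similar orientation statistics; how the distance is mapped to an accept or reject decision is not specified in the text.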
Once both overlapped windows are aligned, a feature vector ν based on the variance of the orientation field in each column of the window is computed for each fingerprint:

$$\nu = [\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2]$$

where:

$$\sigma_k^2 = \sum_{i=1}^{n} (\theta(i,k) - \mu_k)^2 \quad \text{for } k = 1, \ldots, m$$

θ(i,j) is the orientation field of each block (i,j) with 1 ≤ i ≤ m, 1 ≤ j ≤ n; µ_k is the average value of the orientation field in the kth column of the window; and m and n are the number of columns and rows respectively. After determining the orientation-based feature vectors of the aligned query (ν_Q) and template (ν_T) prints, a matching score is computed based on the norm d of the difference of both feature vectors:

$$d = \left\| \nu_Q - \nu_T \right\|_2^2 = \sum_{k=1}^{m} (\sigma_{Q,k}^2 - \sigma_{T,k}^2)^2$$

The variance of the orientation field in the previously aligned fingerprint regions proves to be a valid alternative or complement to other correlation-based or minutiae-based matching techniques. Although the field orientation has lower discriminating power than minutia points, orientation-based features are more robust against noise and non-linear deformations of fingerprints.

Matching approaches based on ridges

Ridge shape is a valuable feature that can be used to characterize fingerprints. Ridges vary in width from 100 µm (thin ridges) to 300 µm (thick ridges), and the relationship among ridges is invariant with time and robust to elastic distortions; therefore the ridge map constitutes distinctive information inherent to fingerprints. In [Xie et al., 2005] the authors present a matching technique in which two fingerprint ridge maps are directly compared. To do this, the whole ridge pattern of the template and query fingerprints is first extracted. A fingerprint is characterized by a set of ridges, and each ridge is defined by its local orientation and its relationship with its neighbour ridges. The ridge map is converted into the skeleton of the image, which is composed of a set of single-pixel-wide curves.
The skeleton image contains not only all of the minutiae information, but also the whole ridge pattern, from which it is possible to discern the relationship that exists among ridges. A ridge R is defined by one starting point P_m and one end point P_n. P_m and P_n can be ridge endings or ridge bifurcations. The direction from the starting point to the end point is defined for each ridge, and each ridge is sampled at a fixed interval from its starting point to its end point. A set Θ of sampling points P_i is deduced for each ridge R. At each sampling point P_i, a line normal to the ridge R is drawn. The line intersects the left-hand-side and right-hand-side neighbour ridges at points Q_i and S_i respectively. Q_i and S_i are called associate points of P_i. After inspecting ridge R with this technique, two sets ψ of left-hand-side and right-hand-side associate points are deduced. The left-hand-side neighbours of ridge R are called its upper neighbours (ψ_up), and the right-hand-side ones its down neighbours (ψ_down). Each ridge in the pattern is therefore characterized by its starting point P_m, its end point P_n, its sampling points P_i (set Θ), and its associate points Q_i and S_i deduced from the upper (ψ_up) and down (ψ_down) ridge neighbours respectively. The curvature γ of a ridge f is defined as:

$$\gamma = \int_{P_m}^{P_n} d^2 f$$

Given a ridge R, its curvature parameter γ is a descriptor invariant to image rotation and translation. Therefore, the similarity measure of two ridge curves is defined by their ridge lengths and their ridge curvatures. Given two ridges f_1 and f_2, with lengths d_1 and d_2 respectively, these two ridges are pre-matched to each other if the following conditions are satisfied:

$$\frac{d_2 - d_1}{d_2} \le Th_1$$

$$\eta = 1 - \frac{1 - \dfrac{d_1}{d_2}}{1 + \dfrac{d_1}{d_2}} \cdot \frac{\gamma_{f_1} - \gamma_{f_2}}{\gamma_{f_1} + \gamma_{f_2}} \ge Th_2$$

where Th_1 and Th_2 are thresholds set experimentally, and η is the similarity measure of both ridges. Given two fingerprint images, pairs of similar ridges are identified for proper alignment of both images. Taking each of the pre-matched ridge pairs found as alignment reference, the similarity level for every possible alignment is computed, and the best fingerprint alignment is selected by finding the maximal matched length N among all possible alignments. The similarity measure of two aligned ridge patterns is defined as:

$$score = \frac{N}{C \cdot distortion}$$

where N is the total length of all matched ridges in the best alignment scenario, C is a scaling constant, and the distortion factor describes the loss of performance caused by the resulting ridge structures (formed by well-matched and wrongly matched ridge pairs). The maximum matching score obtained from the different initial pre-matched pairs gives the final result. The algorithm takes the elastic distortion of fingerprints into account and fully utilizes the ridge information to assess the fingerprint matching score.
Other similar approaches are [Feng et al., 2005] and [Feng et al., 2006]. Both algorithms establish ridge-based feature correspondences between two fingerprints. The methodology consists of three stages: preprocessing, alignment and matching. In the preprocessing stage, ridge-based features are extracted from the thinned images, and relations between ridges in template and query fingerprints are analysed. In the alignment stage, a set of the K most similar feature pairs is found. In the matching stage, fingerprint matching is performed for each of the K feature pairs to produce a match score. The maximum of the K scores is selected as the final matching score of both fingerprints. In the former work, the ridges are represented by a list of points sampled equidistantly on the ridges.
The matching score is computed as the ratio of corresponding ridge points to that of all ridges in the overlapped region between template and query prints: N2 score = NT ⋅ NQ where NT and NQ are the numbers of ridge points of template and query fingerprints present in the overlapped region respectively, and N is the number of matched ridge points found. In the later work, minutiae information is incorporated to the ridge-based method. During the process of ridge matching, minutiae are also paired, and the matching score is computed according to both, the matched minutiae and the matched ridges, as follows: score = λ ⋅ sm + (1 − λ ) ⋅ sr = λ ⋅ ⎛ N⎞ ⎛ M M N2 ⋅ min ⎜ 1, ⎟ + (1 − λ ) ⋅ ⋅ min ⎜ 1, ⎜ T ⎜ T ⎟ MT ⋅ MQ NT ⋅ NQ ⎝ r ⎝ m⎠ ⎞ ⎟ ⎟ ⎠ where λ (0 ≤ λ ≤ 1) is used to weight the scores of matched minutiae and matched ridges, NT and MT are the numbers of minutiae and sampled points of the template fingerprint in the overlapped region, NQ and MQ are the numbers of minutiae and sampled points in the query fingerprint, N and M are the number of matched minutiae points and matched ridge points, and Tm and Tr are the thresholds of the numbers of matched minutiae and matched points below which corresponding scores are discarded, respectively. The ridge structure in a fingerprint can be viewed as an oriented texture pattern having a dominant spatial frequency and orientation in a local neighbourhood. The frequency is due to inter-ridge spacing present in the fingerprint, and orientation is due to the flow pattern exhibited by the ridges. By capturing the frequency and orientation of ridges in local regions in the fingerprint, a new representation and matching technique of fingerprints is possible. In this direction, in [Lee and Wang, 1999], authors propose a new matching technique based on ridge features. The method 81 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
makes use of Gabor filtering, and it only needs to perform core point detection before the feature extraction process, without any other preprocessing steps such as smoothing, binarization, thinning and minutiae extraction. The features extracted from the ridge pattern are the Gabor features, directly obtained from the convolution of the greyscale image with m 2-D Gabor filters oriented according to θk, where θk = π·(k-1)/m, k = 1… m. The greyscale fingerprint images are tessellated into blocks of size W×W pixels, and each block of the image is then sampled by m Gabor matrices, resulting in m Gabor features Gθk.

In the recognition stage, the comparison of two fingerprints must be based on the same reference point. The authors use the core point of the fingerprint (the point with the maximum curvature of the concave ridges in the fingerprint image) as the reference point to align the fingerprint impressions that need to be matched. A fast method based on Gabor features is used to detect the core point and align the fingerprints. Taking the core point as reference, the fingerprint images are divided into a set of N non-overlapped blocks of size W×W pixels. Sampling the image blocks with a set of m Gabor filters, each image is characterized by N·m Gabor features. A k-nearest neighbour classifier (k-NN), properly trained and conditioned to the input fingerprint characteristics, is then used to match fingerprints. In the k-NN experiments, a total of k images per individual are used as the training database. The k-NN classifier is used to match the previously aligned blocks corresponding to template and query fingerprints. The results show that Gabor features can be successfully applied to fingerprint recognition. Other similar methods based on Gabor filtering of the ridge pattern are [Jain et al., 2000] and [Munir and Javed, 2005].
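The block-wise Gabor sampling described above can be sketched as follows. This is only an illustration under stated assumptions: the even-symmetric (cosine) Gabor form, the values of `freq` and `sigma`, the use of the absolute response as the feature, and all function names are hypothetical choices, not details taken from [Lee and Wang, 1999]. Only the orientation rule θk = π·(k-1)/m and the W×W block sampling come from the text.

```python
import math

def gabor_orientations(m):
    """Orientations theta_k = pi*(k-1)/m for k = 1..m, as given in the text."""
    return [math.pi * (k - 1) / m for k in range(1, m + 1)]

def gabor_matrix(w, theta, freq, sigma):
    """W x W even-symmetric Gabor matrix (assumed form: Gaussian * cosine)."""
    c = (w - 1) / 2.0  # centre of the block
    matrix = []
    for i in range(w):
        y = i - c
        row = []
        for j in range(w):
            x = j - c
            # Coordinate along the direction theta of the sinusoidal carrier
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * freq * xr))
        matrix.append(row)
    return matrix

def block_gabor_features(block, m, freq=0.1, sigma=4.0):
    """Sample one W x W block with m Gabor matrices, giving m features."""
    w = len(block)
    features = []
    for theta in gabor_orientations(m):
        g = gabor_matrix(w, theta, freq, sigma)
        response = sum(block[i][j] * g[i][j] for i in range(w) for j in range(w))
        features.append(abs(response))  # assumption: magnitude used as feature
    return features

# Synthetic 16x16 block whose grey level varies only along x
W = 16
block = [[math.cos(2.0 * math.pi * 0.1 * (j - (W - 1) / 2.0)) for j in range(W)]
         for _ in range(W)]
features = block_gabor_features(block, m=8)
strongest = features.index(max(features))  # index of the dominant orientation
```

For an image split into N such blocks, concatenating the per-block outputs gives the N·m features mentioned in the text; the classifier stage is not sketched here.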
Matching approaches based on level 3 features

In latent fingerprint matching, when template and query level 1 and level 2 features are similar, a forensic expert often investigates level 3 details. Level 3 features include all dimensional attributes of a ridge such as ridge width and shape, pores, incipient ridges, breaks, ridge contours, creases and scars. Similarly to level 1 and level 2 features, level 3 features are permanent and unique, and provide additional information to individualize fingerprints. The matching techniques based on level 3 features are abstracted from the methodologies followed by forensic examiners, whose skills allow them to discriminate fingerprints by detailed visual inspection of ridge structures. In this direction, in [Chen and Jain, 2007] the authors suggest the usage of level 3 information like dots and incipient ridges in combination with level 2 features (minutiae) for fingerprint matching. Dots and incipient ridges are distinctive features composed of short and thin ridge formations between normal ridges. The authors show that, in a system matching partial query fingerprints against full template fingerprints, fusing additional ridge features with minutiae notably improves the recognition accuracy, especially in those scenarios where partial prints contain a very limited number of minutia points. The supplementary information provided by level 3 features gives further benefits in accuracy. Other research works dealing with level 3 features are [Jain et al., 2006], [Jain et al., 2007] and [Zhao and Jain, 2010].

2.6.4.
Hybrid Matching Techniques

Many authentication systems consider only minutiae features when matching prints, but minutiae-based fingerprint recognition has some limitations: (i) the accuracy of the minutiae extraction process relies on the quality of fingerprints, (ii) the minutiae set cannot characterize the global fingerprint pattern since it represents only a portion of the discriminative information available in a fingerprint, (iii) impostor prints may be labeled as matches even when the overall shape of the ridge patterns is very different, and (iv) it is hard to further improve the performance of purely minutiae-based matching systems. Human examiners use minutiae together with general ridge information to match fingerprints, so a simple but effective approach consists in combining minutiae with additional matching features. Researchers have proposed other representations of fingerprints based on texture, ridge structure, ridge shape, orientation field, pores, dots, incipient ridges, etc. to overcome the inherent limitations of purely minutiae-based systems. In this direction, many research works have shown the effectiveness of hybrid matching systems.

In [Park et al., 2008], a new representation and matching scheme for fingerprints is presented based on minutiae and SIFT (Scale Invariant Feature Transform) feature points. SIFT extracts stable characteristic feature points from an image and generates descriptors representing the texture around those feature points. SIFT feature points are based on texture analysis of the entire scale space. A scale space is constructed by applying a variable-scale Gaussian operator to a fingerprint image. Difference of Gaussian (DOG) images are obtained by subtracting subsequent scales.
The set of Gaussian-smoothed images and DOG images are called an octave. A set of such octaves is constructed by successively down sampling the original image. Characteristic SIFT feature points are extracted from fingerprints in scale space by observing each image point in DOG space. A point is decided as SIFT if it is a local minimum or maximum in a local neighbourhood, and its derivative in scale space is stable. Each characteristic point is then defined by an orientation-invariant descriptor. A window of size 16×16 pixels is used to generate a histogram of gradient orientation around each local extremum point. This technique permits to capture additional discriminatory information from fingerprint impressions. Typical fingerprints may contain up to a few thousand SIFT feature points, much more than the total content of minutia points (<100). Apart from minutiae points and minutiae-based matching, a matching process based on texture information around SIFT feature points is carried out. The fusion of minutiae-based matching and SIFT-based matching results in significantly better accuracy performance than any of both individual matcher systems alone, as indicated in Table 6. Ross et al. suggest in [Ross et al., 2003] the usage of both minutiae and texture information to represent and match fingerprints. The matching process consists of two stages: a) Minutiae matching process. Template and query fingerprints are represented by a set of minutia points. The comparison of both minutiae sets permits to deduce the transformation parameters (∆x,∆y,∆θ) to be applied in order to align the fingerprints, as well as a similarity score between both minutiae sets. b) Texture-based matching process. The ridge flow information is used to represent and match fingerprints. A set of Gabor filters, oriented into eight equally spaced directions, is applied to the rotationally-aligned template and query fingerprints. 
A total of eight filtered images are obtained that define the ridge strength at different orientations for each fingerprint. The resulting filtered images are tessellated into non-overlapped blocks, and the variance of the greyscale information in each block is computed. Each fingerprint is thus represented by an eight-dimensional ridge feature descriptor. After alignment of both ridge-based feature vectors, the Euclidean distance between both ridge feature maps in the overlapped area measures the similarity level of both ridge patterns.

Figure 42. Recognition accuracy performance of single and multiple feature-based matcher systems.

Finally, the fusion of minutiae-based and ridge-based scores results in a global measure of the similarity level between both fingerprints. The experiments show that minutiae information and ridge flow information complement each other. The fusion of those features provides higher recognition performance than either matching technique used separately, as indicated in Figure 42 and Table 6.

The performance of minutiae-only algorithms degrades progressively as the size of the acquired images or the quality of the fingerprint impressions is reduced. Since the number of minutia points decreases in partial prints, and it becomes harder to reliably obtain minutia points from poor quality images, some research works incorporate, in addition to minutia points, other fingerprint features based on ridge patterns such as ridge shape descriptors [Marana and Jain, 2005], ridge counts between minutia points [Jiang and Yau, 2000], or ridge points [Fang et al., 2007] in order to increase the discriminating power of fingerprints and hence avoid losing performance when dealing with small scanners or poor quality prints.
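The texture-based stage and the final score fusion can be illustrated with a minimal sketch. It assumes the Gabor-filtered images are already available as 2-D lists of grey levels. The function names, the mapping of distance to similarity (1/(1+d)) and the weighted-sum fusion rule are hypothetical choices for illustration, not the rules used in [Ross et al., 2003].

```python
import math

def block_variances(image, block):
    """Tessellate an image into non-overlapped block x block cells and
    return the grey-level variance of each cell, row by row."""
    rows, cols = len(image), len(image[0])
    variances = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            cell = [image[i][j]
                    for i in range(r, r + block)
                    for j in range(c, c + block)]
            mean = sum(cell) / len(cell)
            variances.append(sum((v - mean) ** 2 for v in cell) / len(cell))
    return variances

def ridge_feature_map(filtered_images, block):
    """Concatenate the block variances of every filtered image
    (one image per Gabor orientation)."""
    return [v for img in filtered_images for v in block_variances(img, block)]

def euclidean_distance(a, b):
    """Euclidean distance between two equally sized feature maps."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuse_scores(minutiae_score, texture_distance, weight=0.5):
    """Assumed fusion rule: weighted sum of the minutiae score and a
    similarity derived from the texture distance."""
    texture_score = 1.0 / (1.0 + texture_distance)
    return weight * minutiae_score + (1.0 - weight) * texture_score

# Toy usage with two 2x2 "filtered images" per fingerprint
template = ridge_feature_map([[[0, 2], [2, 0]], [[1, 1], [1, 1]]], block=2)
query = ridge_feature_map([[[0, 2], [2, 0]], [[1, 1], [1, 1]]], block=2)
fused = fuse_scores(0.8, euclidean_distance(template, query))
```

A smaller distance between the feature maps yields a larger texture similarity, so identical maps contribute the maximum texture score to the fused result.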
In [Jain et al., 2007] authors introduce a new direction for hybrid matching techniques. Authors propose a hierarchical matching system that utilizes fingerprint features at three different levels: (i) level 1 or pattern level, consisting of the ridge and field orientation maps; (ii) level 2 or point level, consisting of the fingerprint minutiae; and (iii) level 3 or shape level, consisting of pores and ridge shape information. Given two fingerprint impressions, the system first extracts level 1 and level 2 features and establishes the alignment of both images. If the correspondence level between the field orientation maps of both images (quantified as Score_1) is less than a certain threshold Threshold_1, the matcher rejects the query and the process stops at level 1. The similarity level between both images is set to Score_1 in such a situation. However, if the orientation correspondence is good enough, a matching technique based on minutia points is carried out, and from it, a matching score Score_2 is deduced. If the number of matched minutiae within the overlapped region of template and query fingerprints is less than a threshold Threshold_2 (equal to 12 minutia points in this work), the matcher rejects the query and the process stops at level 2 with a similarity response of Score_2. Otherwise, the matching process continues by inspecting level 3 features. The matched minutiae sets are further examined in the context of ridge pores and ridge contours around minutia points, and a refined similarity score Score_3 is deduced for template and query fingerprints. It is proven that level 3 features introduce additional discriminatory information, which allows improving the performance of the whole recognition system. 
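The early-exit logic of this hierarchical matcher can be summarized in a short sketch. The function signature and the use of callables for the level 2 and level 3 stages are assumptions made for illustration; only the decision flow (Score_1 against Threshold_1, number of matched minutiae against Threshold_2, then Score_3) follows the description above.

```python
def hierarchical_match(score_1, threshold_1, match_level2, threshold_2, match_level3):
    """Return (level_reached, similarity) following the three-level cascade.

    score_1      -- orientation-field correspondence (level 1)
    match_level2 -- callable returning (score_2, matched_minutiae)
    match_level3 -- callable taking matched_minutiae and returning score_3
    Both callables are hypothetical placeholders for the real matchers.
    """
    # Level 1: reject when the orientation fields do not correspond
    if score_1 < threshold_1:
        return 1, score_1

    # Level 2: minutiae matching, evaluated only if level 1 passed
    score_2, matched_minutiae = match_level2()
    if len(matched_minutiae) < threshold_2:
        return 2, score_2

    # Level 3: refine using pores and ridge contours around matched minutiae
    return 3, match_level3(matched_minutiae)

# Toy usage: 15 matched minutiae exceed the threshold of 12 quoted in the text
level, similarity = hierarchical_match(
    score_1=0.9,
    threshold_1=0.5,
    match_level2=lambda: (0.7, list(range(15))),
    threshold_2=12,
    match_level3=lambda matched: 0.95,
)
```

Passing the later stages as callables means the more expensive level 2 and level 3 computations run only when the previous level has not already rejected the query.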
Similarly, in [Chen and Jain, 2009], the authors incorporate all three levels of features in the characterization process of a fingerprint: (i) the pattern class type (whorl, left loop, right loop, arch, and tented arch) at level 1; (ii) the minutiae, the ridge period and the ridge curvature associated with each minutia at level 2; and (iii) the distribution of pore features at level 3. The authors evaluate the probability of matching two impostor fingerprints, that is, the probability of matching K minutia points taking into consideration not only their relative positions and directions, but also the local ridge periods and curvatures around each minutia, as well as the spatial distribution of pores in the neighbourhood of every minutia. Based on theoretical and empirical approaches, it is shown that the probability is higher when matching two fingerprints of the same class. Moreover, independently of the classes under match, the probability decreases when more features (level 1, level 2 and level 3) are incorporated in the individuality model of fingerprints.

Several hybrid matching techniques merging level 1, level 2, and/or level 3 features have been developed in the literature. In general, research works based on multiple features report that hybrid matching techniques perform better than each algorithm individually, as demonstrated by the results shown in Table 6. Certain tolerances are allowed when matching multiple features like orientation fields, ridge maps, minutiae, pores, etc. in order to tolerate the elastic deformations inherent to fingerprints. The goal of hybrid techniques is to fuse several similarity scores into a single and more reliable matching result.
876-2012 Research Work 1 [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] Fingerprint Feature Descriptor (Single EER) Singular Points Minutiae ~11.39% 1.79% 2.13% 7.50% 4.30% 4.74% 4.20% 4.00% 4.20% 7.10% 7.70% 4.81% 1.82% ~9.50% >10.00% ~6.92% 1.07% 0.97% 4.97% 2.85% 9.19% 3.58% 1.80% 3.53% 4.15% ~10.00% 1.20% 8.25% 4.92% 2 Minutiae and Ridge Features Ridge Features ~7.23% 8.44% 10.76% Field Orientation Ridge Pattern Spectrum Level 3 Hybrid Matching (Combined EER) ~4.00% 0.99% 1.07% 3.40% 2.50% 3.49% 3.40% 2.80% 2.90% 4.50% 3.00% 0.94% 0.78% ~7.50% ~3.50% 3.01% 0.46% 0.61% 3.58% 2.04% 7.49% 2.83% 1.00% 3.03% 3.30% ~4.28% 5.90% 5.09% 6.84% 4.83% ~5.83% 4.50% 2.20% 2.20% 6.10% 3.20% 0.70% 1.00% 7.80% 3.40% 2.00% 1.10% 4.20% 3.10% 1.60% ~3.00% ~2.00% 10.00% 2.97% 9.25% ~4.00% 8.60% 6.40% 6.53% 4.90% 3.50% 3.60% 5.70% 4.50% 1.18% 3.12% 2 ~4.00% ~8.46% 0.85% 0.71% 3.76% 2.65% 7.68% 2.97% [12] [13] [14] [15] [16] [17] [18] [19] 8.10% 8.10% 8.10% ~8.33% 5.10% 6.90% 6.90% 6.90% 2 2 2 2 2 2 2 2 8.86% 8.86% 8.86% [20] >10.00% 5.60% 3.50% 3.00% 8.10% 3.90% 1.00% 1.20% 8.50% 3.70% 4.00% 1.60% 7.10% 7.70% 2.60% ~4.50% ~4.00% [21] [22] [23] 3.30% 2.50% 5.80% 4.30% 2 2 ~4.00% ~3.50% 2 [24] 15.00% [25] ~6.15% 3.01% 2 [26] 7.31% [27] ~4.50% Note 1: EER values in each work are given for different databases. 
Note 2: Value not detailed.

[1]: [Ross et al., 2003]
[2]: [Park et al., 2008]
[3]: [Yager and Amin, 2005]
[4]: [Gu et al., 2006]
[5]: [Zhang et al., 2007]
[6]: [Jain et al., 2006]
[7]: [Nanni and Lumini, 2007]
[8]: [Ito et al., 2005]
[9]: [Shi et al., 2004]
[10]: [Jain et al., 2001]
[11]: [Qi et al., 2005]
[12]: [Wang et al., 2007a]
[13]: [Feng et al., 2005]
[14]: [Marana and Jain, 2005]
[15]: [Jain et al., 2007]
[16]: [Chen and Jain, 2007]
[17]: [Helfroush and Ghassemian, 2007]
[18]: [Ross et al., 2002]
[19]: [Nandakumar and Jain, 2004]
[20]: [Fang et al., 2007]
[21]: [Nanni and Lumini, 2008]
[22]: [Yager and Amin, 2006]
[23]: [Jain et al., 2000]
[24]: [Xie et al., 2008]
[25]: [Wang et al., 2008]
[26]: [Shi et al., 2006]
[27]: [Krivec et al., 2003]

Table 6. Hybrid matching algorithms.

2.6.5. Conclusions

A rich literature on fingerprint matching algorithms exists, and today, research on this topic is very active. The state of the art in biometrics-based recognition systems shows that a number of design factors create bottlenecks that limit the recognition accuracy exhibited by current systems. The performance of correlation-based techniques is affected by non-linear distortions and noise present in the acquired images. The performance of minutiae-based feature matching techniques relies on the accurate detection of minutia points and other singularities, and is affected by a wide variety of factors. In this direction, it is well known that it is difficult to automatically and reliably extract minutia-based representations from poor quality fingerprints arising from dry or wet fingers, or from fingers affected by scars, scratches due to accidents, injuries, or profession-related factors (e.g. farmer, miner, musician, etc.).
In addition, there is evidence that a fraction of the population may have fingers with a relatively small number of minutia points, thereby making fingerprint-based verification (based on minutiae) more vulnerable to failures for those individuals. Apart from minutia points, there exists a set of additional characteristics, derived from the fingerprint ridge pattern and known as level 3 features, which can also be used in fingerprint matching. Latent fingerprint experts usually rely on level 3 features, but they have not yet been widely introduced in commercial AFIS/AFAS applications. A major trend, focused on combining or fusing several fingerprint features (texture, minutiae, ridges, orientation field, pores or other small ridge details) in the matching stage, is regarded as a valid design solution to improve the performance of matching algorithms focused on single fingerprint features. Nevertheless, current automatic systems often cannot match the performance of forensic experts, especially when prints are small (partial fingerprint impressions), contain an insufficient number of features (minutiae points, ridges, etc.), or are affected by noise and non-linear distortions. Although big advances have been made in recent years with the introduction of additional discriminatory information (pores, ridge shape information, etc.) and the exploitation of hybrid matchers, the implementation of physical systems in charge of efficiently automating the fingerprint-based personal recognition process still remains a major challenge.

2.7. Conclusions

Fingerprints have been used for over a century and are the oldest and most widely used form of biometric identification. In spite of the widespread use of fingerprints in AFAS/AFIS systems today, and the popular misconception that fingerprint-based personal recognition is a fully solved problem, the state of the art points out that this is not entirely true.
The latest international performance competitions on automatic systems such as the Fingerprint Verification Competition contests (FVC2000, FVC2002, FVC2004, FVC2006, and the on-line FVC-onGoing edition) and the NIST evaluation programs (FpVTE2003, MINEX2004, OMINEX2005, MINEX2007, ELFT2007, ELFT2009) make evident that state-of-the-art automatic personal recognition systems present a limited accuracy in terms of indicators like FAR/FRR/EER. The performance of AFAS/AFIS systems can only be estimated from empirical data, and the estimates are very data dependent. Therefore, they are only valid for a specific database in a specific test environment. Nevertheless, the typical results achieved in the aforementioned evaluation programs, expressed in terms of performance indicators like EER, lie in the percent range (thousands of ppm), and not yet in the single-digit ppm range. Consequently, the accuracy reached by commercial fingerprint recognition applications today cannot rival that achieved by a dedicated, well-trained team of human fingerprint experts. Automatic applications that deal with fingerprint biometric systems are still far from the almost ideal system able to differentiate each inhabitant of the world based on his or her inherent fingerprint characteristics.

The root cause of this issue is that fingerprint images are rarely of perfect quality. They may be degraded and corrupted due to variations in skin and acquisition conditions. A critical step in studying the uniqueness of human fingerprints is to reliably extract discriminative feature information from fingerprint images in all possible environmental conditions. Research in automatic fingerprint recognition has been mostly an exercise of imitating the performance of human
fingerprint experts without access to the many underlying information-rich features an expert is able to glean by visual examination. Moreover, the automation of the fingerprint recognition process involves technological challenges linked to image processing and pattern recognition.

Biometrics has become the subject of intense research by both industry and academic institutions. Despite the excellent research work done in the last three decades, there is a great deal of research yet to be done. Considering both the immense interest in the field and the numerous opportunities for innovation and advancement, automated fingerprint verification will continue to be an important and exciting area of research in the coming years. In the area of recognition algorithms, the main issues that need to be addressed are:
(i) The right processing of poor quality fingerprints. When dealing with noisy and low quality fingerprint impressions (owing to factors such as wet/dry finger conditions, sensor limitations, age, etc.) the performance of the recognition algorithms drops dramatically.
(ii) The non-linear deformations present in fingerprint impressions owing to the inherent elasticity of the skin, and the problems related to the design of distortion-tolerant matcher systems.
(iii) The issues linked to the reduced sensing area of current live-scan sensing devices: partial prints may lead to small overlapped areas with no common singularities present in both template and query fingerprints, which makes matching more difficult.

The characteristic fingerprint features are generally categorized into three levels:
- Level 1 features, or patterns, are the macro details of fingerprints such as ridge flow, field orientation map, ridge map, singular points, etc.
- Level 2 features, or minutiae, refer to the Galton characteristics, such as ridge endings and ridge bifurcations.
- Level 3 features, or shape, include all dimensional attributes of the ridge such as ridge path deviation, ridge width, shape, pores, edge contours, dots, incipient ridges, breaks, creases, scars and other permanent details.

Level 1 features are useful for fingerprint classification and coarse alignment, while level 2 features have sufficient discriminating power to establish the individuality of fingerprints, so they are used for fine alignment and matching purposes. Moreover, level 3 features can provide additional discriminatory information for fingerprint matching in authentication/identification applications. Unfortunately, commercial AFAS/AFIS systems rarely use level 3 features. This is because, in order to extract fine details, higher resolution acquisition devices (above the FBI fingerprint acquisition resolution standard of 500 dpi) are required.

In the previous sections, the main fingerprint processing techniques and their peculiarities have been discussed. Most of the fingerprint matching algorithms developed in the last decades are minutiae-based. Non-minutiae features such as ridge-valley local textures, ridge geometry, ridge spatial relationship, pores, etc. are however receiving more interest now. Each of these approaches has its own strengths and weaknesses, and can be better or worse than others depending on the weight given to different factors: accuracy, computational cost, real-time performance, system cost, robustness against low quality fingerprint impressions, and so on. In terms of recognition performance, the combination of multiple features (level 1, level 2 and level 3) and the sensing of multiple fingers (from two to ten fingers) from the same user seem to be the most promising way to significantly improve the accuracy of fingerprint recognition systems up to the levels required by real-world applications.
In the current technological age, and in order to combat the continuing increase in identity fraud, we are more and more often asked to prove our identity to all kinds of governmental or commercial applications. If fingerprint technology matures to the required levels, with robust and reliable fingerprint matching algorithms and other associated issues like security and privacy concerns properly overcome, there will be an increasing demand for personal recognition systems based on fingerprints in the near future, applied to a wide range of market segments. Fingerprint-based recognition can have a profound influence on the way we conduct our daily business in the not-too-distant future.

3. Architectures for Real-Time Digital Signal Processing

Nowadays, the digital era increasingly demands the deployment of innovative, powerful and efficient architectures to deal with the high-performance applications that continuously emerge in many technological fields such as wired and wireless communication systems, military applications, audio and video signal processing, medical image diagnosis, etc. Special attention needs to be paid to key factors such as application performance, real-time execution, power consumption, time-to-market and associated end product costs before selecting the suitable system architecture for any application. Furthermore, factors such as the existence of proper EDA development tools, hardware/software support, design flexibility and other product maintenance aspects need to be taken into consideration. In this section, a survey of the available technological solutions for digital signal processing is presented.
The state of the art in semiconductor technology applied to modern computing is highlighted, without skipping those traditional solutions that are still in use today. A brief classification distinguishes eight different system architecture approaches:
• MPUs / MCUs
• DSPs
• ASICs / ASSPs
• Structured ASICs
• FPGAs
• Arrays of processing elements
• GPUs or multi-core chips
• Multiprocessor systems

A comparative analysis of the main benefits and trade-offs featured by each of the architectures is presented in the next sections. This benchmark focuses on architectures suitable for low-cost embedded system platforms on which to develop biometric applications. Therefore, HPC platforms such as mainframe computers, supercomputers, or personal computers are outside the scope of this chapter.

3.1. Microprocessor / Microcontroller Chips

This group refers to those general-purpose devices developed in order to satisfy the requirements of a broad range of applications. The brain of a general-purpose digital computer is normally called processor or CPU (central processing unit). The term microprocessor or MPU (microprocessor unit) refers to a processor or CPU that is implemented on a single integrated circuit device. Additional memory and input/output peripheral chips are required to support any application. The term microcontroller or MCU (microcontroller unit) refers to the combination of a general-purpose processor or CPU along with some memory and input/output peripherals embedded in the same chip in order to cut down on system size, cost and power consumption while improving aspects such as system integration, reliability and execution speed for the whole application. The heart of an MCU/MPU chip is the ALU (arithmetic logic unit), which is the processing block responsible for performing the arithmetical and logical operations on the input data.
In order to distinguish the MPU/MCU family of devices discussed in this section from other more complex chips covered in coming sections, the MPU/MCU group refers to uniprocessors or single-core devices, that is, each chip embeds one single CPU or ALU. Once technical aspects such as the instruction set (RISC versus CISC), the bus architecture (Von Neumann, Harvard, or Modified Harvard), the operand size of the ALU, the width of the address and data buses, the number of pipeline stages per instruction cycle, the system clock frequency, the size of the cache and on-chip memories, as well as the number of peripherals, I/O ports, etc. have been defined, the hardware architecture of the processor remains fixed. Therefore, the flexibility is given by the software functionality that the CPU has to execute. Thus, the same processor can cover a wide range of applications by modifying the program code and the input data that the CPU has to deal with.

The first commercial microprocessor chip was the Intel 4004, which was introduced in 1971, featuring a maximum system clock frequency of 740 kHz, a 4-bit CPU with a 4-bit data bus and a 12-bit address bus (both buses were multiplexed through the same 4 pins of the chip since the package was pin-limited). The design was fully developed with 2300 transistors.
From the release of the first commercial microprocessor to date, numerous performance improvements have been achieved:
- the width of the data buses has been increased from 4 to 8, 16, 32 or even 64 bits;
- the address buses (and the addressable memory map) have been largely extended;
- local high-speed cache memory has been added to the microcontrollers, as well as other large volatile and non-volatile memories;
- additional features have been integrated on-chip in order to increase their computational power: more efficient pipelining stages, mathematical coprocessors oriented to specific fields of application like floating-point units, and other peripherals like cryptoprocessors, timers, communication controllers, etc.;
- the system clock frequency has been notably increased up to the range of several GHz; and
- the size of the transistors has been shrunk, thus allowing the number of transistors per device to increase in step with Moore's Law without impacting the size of the chip. Today's processors can contain hundreds of millions of transistors.

MCUs offer a wide variety of on-chip peripherals and memory. Furthermore, MCU families often contain dozens of derivatives. For applications that require only modest integration, this often makes it possible to find one MCU with just the right mix of on-chip integration. However, when dealing with stringent high-performance applications, even with all the advances recently made based on the above traditional techniques, it seems that the processor technology has reached its own limit. If a conventional processor cannot meet the needs of a target application, it becomes necessary to evaluate alternative architecture solutions based on the usage of multiple processing units, on-chip or on-system, able to work concurrently. Parallelizing the processing is then the most efficient way to accelerate the application.

3.2.
Digital Signal Processor Chips

A digital signal processor or DSP is a special-purpose CPU that is developed in order to process certain forms of digital data more efficiently than general-purpose CPUs (MPUs/MCUs). Apart from the ALU to perform arithmetical and logical operations, DSP devices are often required to perform multiply-accumulate (MAC) operations in which two operands are multiplied together and the result is added to an accumulator where intermediate results can be stored. For this reason, DSP chips often contain special hardware MAC units. Therefore, DSPs consist of one or more ALUs and MACs, instruction decoders and some data path logic to move data between the arithmetic units and the data store. Besides, memory and many input/output peripherals can be embedded in the same chip to improve system integration, simplify product development and reduce time to market. DSP chips combine the attributes of traditional MCUs with dedicated signal processing cores.

DSPs offer medium signal processing speed. Much of their efficiency comes from their relatively high levels of parallelism. For example, most DSPs can execute arithmetical operations in parallel with load and store operations. Complex algorithms can be implemented by means of a sequence of instructions manipulating data in the desired way.

In recent years, the main DSP vendors such as Texas Instruments, Freescale or Analog Devices have evolved from general-purpose DSPs, intended to serve a wide range of applications, to DSP families focused on specific-purpose digital signal processing applications such as image or audio equipment. A classification of DSP chips according to their performance provides three main categories: (i) low-cost fixed-point DSPs, (ii) high-performance fixed-point DSPs, and (iii) floating-point DSPs. The cost, performance and data set requirements associated with the target application dictate the need for fixed-point or floating-point processing.
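To make the multiply-accumulate operation concrete, the sketch below computes an FIR filter as a chain of MAC steps, first in floating point and then in a fixed-point variant. The Q15 format (16-bit fractions scaled by 2^15), the function names and the sample values are illustrative assumptions; the text does not prescribe a particular number format or filter.

```python
def fir_filter(samples, coeffs):
    """FIR filter written as explicit multiply-accumulate (MAC) steps."""
    output = []
    for n in range(len(samples)):
        acc = 0  # accumulator holding the intermediate result
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]  # one MAC: multiply, then accumulate
        output.append(acc)
    return output

def to_q15(x):
    """Convert a value in [-1, 1) to a Q15 integer, with saturation."""
    return max(-32768, min(32767, int(round(x * 32768))))

def fir_filter_q15(samples, coeffs_q15):
    """Fixed-point variant: integer MACs in a wide accumulator,
    scaled back by 15 bits once per output sample."""
    output = []
    for n in range(len(samples)):
        acc = 0  # plays the role of the wide hardware accumulator
        for k, c in enumerate(coeffs_q15):
            if n - k >= 0:
                acc += c * samples[n - k]
        output.append(acc >> 15)
    return output

# Two-tap moving average in both number formats
float_out = fir_filter([1, 2, 3, 4], [0.5, 0.5])
fixed_out = fir_filter_q15([1000, 2000, 3000, 4000], [to_q15(0.5), to_q15(0.5)])
```

On a DSP with a hardware MAC unit, each pass through the inner loop body maps to a single MAC instruction, which is why such filters are a common benchmark for these devices.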
Table 7 presents a benchmark of the main DSP families available in the market today.

Supplier: Analog Devices
DSP family: Blackfin processors
Processing: Fixed-point
Cores: Single-core, Dual-core
Frequency: 200-600 MHz
On-chip features: ADC, CAN, DMA, Ethernet, FLASH/ROM, GPIO, PWM, Real Time Clock, SPI, SRAM, Timer, UART, USB, Watchdog, Others
Main applications: Automotive, Communications, Digital media appliances, Industrial instrumentation, Medical equipment, Networks, Others

Supplier: Analog Devices
DSP family: SHARC processors
Processing: Fixed-point, Floating-point
Cores: Single-core
Frequency: 60-450 MHz
On-chip features: Data ciphering, DMA, FIR/IIR/FFT accelerators, GPIO, PWM, Real Time Clock, ROM, SPI, SRAM, Timer, UART, Watchdog, Others
Main applications: Audio systems, Industrial instrumentation, Medical equipment, Telephony, Others

Supplier: Analog Devices
DSP family: TigerSHARC processors
Processing: Fixed-point, Floating-point
Cores: Single-core
Frequency: 250-600 MHz
On-chip features: DMA, DRAM, GPIO, SRAM, Timer, Others
Main applications: Automated test equipment, Industrial instrumentation, Medical equipment, Military equipment, Wireless communications, Others

Supplier: Analog Devices
DSP family: SigmaDSP processors
Processing: Fixed-point
Cores: Single-core
Frequency: Up to 172 MHz
On-chip features: ADC, DAC, FIR/IIR accelerators, GPIO, ROM, SPI, SRAM, Others
Main applications: Audio systems

Supplier: Freescale
DSP family: DSP56K/Symphony DSP
Processing: Fixed-point
Cores: Single-core, Dual-core
Frequency: 100-250 MHz
On-chip features: DMA, Filtering accelerator, GPIO, RAM, ROM, SPI, Timer, Watchdog, Others
Main applications: Audio systems, Communications, Industrial applications, Networks

Supplier: Freescale
DSP family: StarCore DSP
Processing: Fixed-point
Cores: Single-core, Dual-core, Triple-core, Quad-core, Six-core
Frequency: 250-1200 MHz
On-chip features: DMA, DFT/FFT accelerator, Other accelerators, Ethernet, GPIO, SPI, Timer, UART, Others
Main applications: Communications, Industrial applications, Medical equipment, Security, Space/Avionics, Video & Audio applications

Supplier: Freescale
DSP family: Digital Signal Controller
Processing: Fixed-point
Cores: Single-core
Frequency: 32-120 MHz
On-chip features: ADC, CAN, DAC, DMA, FLASH/ROM, GPIO, LIN, SPI, SRAM, Timer, Watchdog, Others
Main applications: Audio systems, Communications, Industrial applications, Medical equipment, Motor control, Security, Telephony

Supplier: Texas Instruments
DSP family: C5000 Ultra Low Power DSP
Processing: Fixed-point
Cores: Single-core, Quad-core
Frequency: 50-300 MHz
On-chip features: ADC, DMA, FFT accelerator, SRAM, USB, GPIO, Real Time Clock, ROM, UART, SPI, Timer, Watchdog, Others
Main applications: Audio systems, Automotive, Communications, Consumer electronics, Industrial applications, Medical equipment, Security, Others

Supplier: Texas Instruments
DSP family: C6000 Multicore DSP
Processing: Fixed-point, Floating-point
Cores: Single-core, Dual-core, Triple-core, Quad-core, Octal-core
Frequency: 150-1250 MHz
On-chip features: DMA, Ethernet, FFT accelerator, Other accelerators, GPIO, ROM, SPI, SRAM, Timer, UART, Others
Main applications: Automated test equipment, Communications, Medical equipment, Military equipment, Security, Video & Audio applications, Others

Supplier: Texas Instruments
DSP family: C6-Integra DSP+ARM
Processing: Fixed-point, Floating-point
Cores: DSP+ARM
Frequency: 200-1500 MHz (ARM & DSP)
On-chip features: Ethernet, GPIO, PWM, ROM, SRAM, Timer, UART, USB, Others
Main applications: Communications, Consumer electronics, Energy, Industrial applications, Medical equipment, Military equipment, Space/Avionics, Others

Supplier: Texas Instruments
DSP family: DaVinci Digital Media Processors
Processing: Fixed-point, Floating-point
Cores: DSP+ARM
Frequency: 135-1200 MHz (ARM), 300-1100 MHz (DSP)
On-chip features: DAC, DMA, GPIO, Graphics accelerator, Other accelerators, PWM, Timer, UART, USB, Others
Main applications: Digital video systems

Table 7. DSP processors.

Low-cost fixed-point DSP

This group comprises the Analog Devices Blackfin, Analog Devices SigmaDSP, Freescale DSP56K/Symphony, Freescale Digital Signal Controller, and Texas Instruments C5000 DSP families. They normally operate at modest clock speeds, up to 600 MHz, and they are usually single- or dual-MAC devices, so they can execute one or two instructions per cycle. Today, many modern embedded CPUs are actually faster than low-cost fixed-point DSPs.
In signal processing applications, however, embedded CPUs typically cannot compete with DSP processors on power and cost efficiency, and they usually lack the specialized on-chip integration and development tools needed for specific signal/image processing purposes. For instance, in applications such as motor control, where processing speed is not the most important metric but dedicated peripherals are required, low-cost fixed-point DSPs are often a good choice thanks to their cost-effectiveness, power saving, and efficiency. Low-cost fixed-point DSPs are marketed for applications requiring a combination of low cost, moderate DSP performance and high energy efficiency, such as portable audio players and other consumer applications.

High-performance fixed-point DSP

This group of processors normally supports multiple MACs per cycle and executes multiple instructions using VLIW (very long instruction word) techniques. It also includes SIMD (single instruction, multiple data) techniques, so it is possible to perform multiple operations in parallel. There are three mainstream competitors: the Freescale StarCore, Analog Devices Blackfin, and Texas Instruments C6000 DSP families. Some of them include multi-core DSP devices and/or specialized hardware and instructions to accelerate the critical tasks of the processing. Typical applications are wireless base stations, packet telephony media gateways, two- and three-dimensional imaging applications, video multi-conferencing, and radar and sonar systems.

Floating-point DSP

Only five main floating-point DSP processor families are available today: the Analog Devices SHARC and TigerSHARC, and the Texas Instruments C6000, C6-Integra, and DaVinci Digital Media Processor DSPs. Some of them include multi-core DSP devices or ARM+DSP core structures.
They combine VLIW with SIMD to support multiple floating-point operations in parallel. Operating frequencies reach up to 1500 MHz. Because of their additional circuit complexity and higher operating frequencies, floating-point DSP processors are not as energy-efficient as high-performance fixed-point DSP processors. Typical applications are radar and medical imaging. Traditionally, floating-point DSP processors have been more expensive than fixed-point DSPs, but this has been shifting in recent years. Both Analog Devices and Texas Instruments offer low-cost floating-point chips, with prices comparable to some fixed-point DSPs. This has opened up new application areas for floating-point DSPs, including automotive and consumer audio.

3.3. Application-Specific Chips

Apart from general-purpose devices such as microprocessors, microcontrollers or digital signal processors, which are intended for a wide range of applications, there is another class of devices on the market intended for very specific applications. Application-specific chips comprise application-specific integrated circuits (ASICs) and application-specific standard products (ASSPs). ASIC and ASSP devices feature the finest level of granularity: their functional circuits are built on silicon at the level of individual logic gates. The main difference between the two is that an ASIC is custom-created for a particular application and is intended for use by only one or very few companies, whereas ASSPs, although also built with ASIC technologies, are intended to be sold as standard parts to anybody who wants to use them. Both are intended for high-volume applications, since their development cycles are long and expensive and they need high volumes to pay off the costs involved and to reach competitive unit prices.
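The volume argument above can be made concrete with a small amortization sketch. It assumes a simple cost model that the thesis does not state explicitly (cost per device = one-off development cost divided by volume, plus unit price), and every figure in it is hypothetical and chosen only for illustration. The function names are likewise hypothetical.

```python
import math


def unit_cost(nre: float, price_per_unit: float, volume: int) -> float:
    """Cost per device once the one-off (NRE) cost is spread over the volume."""
    return nre / volume + price_per_unit


def break_even_volume(nre_a: float, price_a: float,
                      nre_b: float, price_b: float) -> int:
    """Smallest volume at which option A (high NRE, low unit price)
    is no more expensive per device than option B."""
    if price_a >= price_b:
        raise ValueError("option A never catches up: its unit price is not lower")
    return max(1, math.ceil((nre_a - nre_b) / (price_b - price_a)))


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    custom = {"nre": 1_000_000.0, "price": 5.0}   # custom silicon
    standard = {"nre": 0.0, "price": 25.0}        # off-the-shelf programmable part
    v = break_even_volume(custom["nre"], custom["price"],
                          standard["nre"], standard["price"])
    for volume in (1_000, v, 1_000_000):
        print(volume,
              unit_cost(custom["nre"], custom["price"], volume),
              unit_cost(standard["nre"], standard["price"], volume))
```

Under these assumed numbers the two options cost the same per device at 50 000 units. Below that volume the custom device is more expensive per unit, and above it the custom device is cheaper, which is the sense in which such devices "need high volumes".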
Although Moore’s law forecasts a predictable evolution towards finer and finer process geometries, shrinking mask lithography and increasing the number of metal layers to develop highly integrated devices increase the non-recurring engineering (NRE) costs of ASIC technology products. The increased mask expenditures, and the long verification cycles needed to minimize any risk of design errors (either in the full mask set or in the full metal layer set), make ASIC technology accessible only to high-volume applications and increasingly prohibitive for low- and mid-range volumes. Two types of ASIC technologies are supported in today’s marketplace: (i) full-custom ASIC and (ii) semi-custom ASIC devices.

Full-custom ASIC

Full-custom ASIC devices normally contain both analog and digital circuitry in the same chip, and are designed from a customer specification. This solution is generally used for high-volume applications where factors such as cost, power and performance become critical. This approach usually has the longest design-cycle times due to the custom nature of the methodology used.

Semi-custom ASIC

Semi-custom ASIC devices are mostly a digital solution in which the customer writes the design in RTL and provides it, together with timing constraints, to the ASIC provider, who is ultimately in charge of developing the chip. This solution is the most widely used approach today, since it seeks a low-cost product that features low power and high performance with a faster time-to-market than a full-custom ASIC. All digital logic is constructed from two basic elements: transistors and metal lines for interconnecting them.
The growing ASIC mask costs require ASIC vendors to spend more and more on verification tools and engineering effort in order to increase the likelihood that the silicon design works right the first time. Because ASICs and ASSPs are inherently hard-wired in a fabrication environment, they can provide a level of security (prevention of tampering) that programmable products like structured ASICs or FPGAs simply cannot. Moreover, custom design brings advantages in power consumption and operating frequency over the alternative devices covered in the next sections.

3.4. Structured ASIC Chips

The high costs and long design cycles of custom silicon chips are putting ASIC technology out of reach for applications that feature only low- or mid-range volumes and demand short time-to-market cycles. Many applications in market segments such as communications, industrial or medical do not drive volumes large enough to justify the development of ASICs, yet they normally demand the power, performance and cost of ASIC devices. Therefore, in those application-specific scenarios where the high NRE costs and long design/manufacturing cycles make ASIC devices inappropriate, alternative technologies like structured ASICs or FPGAs can optimize cost and performance for lower-volume applications. ASIC devices require the customization of two separate elements: the logic and the routing. Customizing both as direct-mapped hardware maximizes the speed (the working frequency) and the performance of the functional circuitry while minimizing its power consumption. However, once the ASIC is built, the design allows no flexibility in the form of programmability.
One option to increase flexibility is the structured ASIC technology approach, where the routing is customized but the logic is programmable by means of SRAM LUT-based logic cells configured at device power-up. The customized interconnect is “hard”: like ASIC technology it is very fast, but once customized it cannot be altered. The design logic is “soft”: it is composed of fine- and medium-grain programmable structured elements distributed along the logic fabric, implemented using programmable SRAM-based lookup tables that are configured by means of a bitstream loaded at device power-up. Structured ASIC technology bridges the large gap between FPGA and ASIC technologies.

Features                      Routing                      Logic
ASIC technology               Mask customized routing      Mask customized logic
Structured ASIC technology    Mask customized routing      SRAM programmable logic
FPGA technology               SRAM programmable routing    SRAM programmable logic

Table 8. Logic fabric & interconnect.

As indicated in Table 8, structured ASIC is an intermediate solution between fully-custom ASIC and fully-flexible (programmable) FPGA technologies:

(i) As opposed to ASICs, which require a new mask set for each new design, a structured ASIC consists of a pre-fabricated, common base array containing logic structures, memory, and other IPs designed to serve multiple customers. Customization occurs only at the routing interconnect level, keeping the design logic fully programmable. Fewer new masks lead to lower NRE costs, lower risks and faster manufacturing times. With few custom masks, the barriers to accessing the technology are lowered dramatically, so structured ASICs become a good alternative to ASICs from a cost point of view. The number of gates that can be embedded in a given area is slightly lower in a structured ASIC than in an ASIC device; the performance in terms of operating frequency is also slightly reduced, as is the power efficiency.
(ii) Structured ASICs combine programmable logic and custom routing in a device. The routing is customized at the foundry, and the programmability of the design logic is achieved by means of SRAM-based LUTs distributed along the device. The design task for structured ASICs thus consists of mapping the application-specific functionality onto a fixed arrangement of known cells. The gate density of a structured ASIC is slightly higher than that of an FPGA, as are the achievable operating frequency and the power savings. However, the NRE cost of structured ASICs is definitely higher than that of FPGAs.

Based on these attributes, the demand for structured ASICs continues to grow in several areas: military and aerospace, FPGA conversions for cost reduction, and low- to medium-volume ASIC requirements.

3.5. Field Programmable Gate Array Chips

As detailed in the previous section, structured ASIC technology features a customized routing interconnect and a programmable logic fabric. The next step in programmability is covered by FPGA technology, which provides flexibility at both logic and routing levels. The basic concept of Field Programmable Gate Array (FPGA) chips is depicted in Figure 43. FPGAs can conceptually be considered as a set of I/O blocks and an array of configurable logic blocks that can be connected together through a configurable interconnection matrix. In addition, standard memory and dedicated hardware blocks can be available along the interconnection matrix to build complex digital circuits.

Figure 43. FPGA chip – Basic blocks (logic blocks, I/O blocks, and routing interconnect).

It is common to categorize FPGA resources as being either fine-grained or coarse-grained.
In fine-grained architectures, each logic block can be used to implement only a very simple function (a primitive logic function such as NAND, AND, OR, etc., or a storage element such as a D-type flip-flop, D-type latch, etc.). Fine-grained architectures are particularly efficient when executing functions that benefit from massively parallel implementations. In coarse-grained architectures, however, each logic block contains a relatively large amount of logic compared to its fine-grained counterparts. A logic block might contain several LUTs, multiplexers, D-type flip-flops, and some fast carry logic. An important consideration with regard to architectural granularity is that fine-grained implementations require a relatively large number of connections into and out of each block compared to the amount of functionality that those blocks can support. By comparison, as the granularity of the blocks increases to medium-grained and higher, the number of connections into and out of the blocks decreases compared to the amount of functionality they can support. This is important because the programmable inter-block interconnect accounts for the vast majority of the delays associated with signals as they propagate through an FPGA. Nowadays, most FPGAs use coarse-grained architectures.

Instead of freezing the silicon functional blocks in the factory, as happens with general-purpose or application-specific chips such as ASICs, ASSPs, DSPs, MPUs, MCUs, etc., FPGAs are left in a pseudo form that can be moulded in the field according to the requirements of the final application. Thanks to the field programmability inherent to FPGA technologies, it is possible to develop totally flexible designs.
However, that polymorphism comes at a cost: usually a higher unit price, increased power consumption, slower clock speeds, and an almost complete absence of analog capabilities compared with the aforementioned devices. The continuous advances in FPGA technologies try to mitigate all these factors, to the extent that current FPGAs are increasingly competitive against hard silicon solutions. Moreover, apart from the soft-hardware definition, FPGAs offer other advantages such as lower development costs than those of custom silicon devices.

FPGA Technologies

An FPGA is an integrated circuit that contains many identical logic cells distributed along a matrix of wires and programmable switches. The logic architecture varies among device families, but generally each logic cell combines a few binary inputs into one or two outputs according to a Boolean function. In most families, the combinational output of the cell can optionally be registered so that clocked logic can be easily implemented. The cell’s combinational logic may be physically implemented as a small lookup table memory, or as a set of multiplexers and gates. FPGAs have become larger as the processes shrink. They have begun to include buses, special I/Os, embedded memory, embedded multipliers, dedicated adders, MAC units, other mathematical or DSP hardware blocks, etc., alongside the FPGA logic and the programmable interconnect. Any design is implemented by specifying the simple logic function for each cell, selecting the functional blocks needed among those available, and selectively closing the switches in the interconnect matrix. Complex designs are created by combining all the features embedded in the FPGA (the I/O blocks, the array of logic cells, the set of embedded blocks, and the matrix of wires and programmable switches). Three technologies are available in the deployment of FPGA products: (i) Antifuse technology, (ii) FLASH technology, and (iii) SRAM technology.
Depending on the particular technology used, the function implemented in the FPGA is either “burned” in permanently (Antifuse), or semipermanently (FLASH), or is loaded from an external memory or a microprocessor each time the device is powered up (SRAM). A benchmark based on the most important vendors of FPGA devices is shown in Table 9. The relevant features of each family are detailed.

Xilinx Inc. (www.xilinx.com)
  Artix-7: SRAM, 28-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES, ADC performance.
  Kintex-7: SRAM, 28-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES, ADC performance.
  Virtex-7: SRAM, 28-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES, ADC performance.
  Spartan-6: SRAM, 45-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES.
  Virtex-6: SRAM, 40-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES, ADC performance.

Altera Corporation (www.altera.com)
  Stratix V: SRAM, 28-nm. Dynamic reconfiguration, High-speed transceivers, Embedded HardCopy blocks, Embedded hard IPs, Configuration AES.
  Arria V: SRAM, 28-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES.
  Cyclone V: SRAM, 28-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs, Configuration AES.
  Stratix IV: SRAM, 40-nm. High-speed transceivers, Embedded hard IPs, Configuration AES.
  Arria II: SRAM, 40-nm. High-speed transceivers, Embedded hard IPs, Configuration AES.
  Cyclone IV: SRAM, 60-nm. High-speed transceivers, Embedded hard IPs.

Microsemi SoC Products Group (www.actel.com)
  Igloo: FLASH, 130-nm. Embedded non-volatile memory, Configuration AES.
  ProAsic 3: FLASH, 130-nm. Embedded non-volatile memory, Configuration AES.
  Fusion Mixed Signal: FLASH, 130-nm. Embedded non-volatile memory, Configuration AES, ADC performance.
  RT ProAsic 3: FLASH, 130-nm. Embedded non-volatile memory, Radiation tolerance.
  RTAX: Antifuse, 150-nm. Radiation tolerance.
  RTSX-SU: Antifuse, 250-nm. Radiation tolerance.
  Axcelerator: Antifuse, 150-nm.
  SX-A: Antifuse, 220-nm and 250-nm.
  eX: Antifuse, 220-nm.
  MX: Antifuse, 450-nm.

Atmel Corporation (www.atmel.com)
  AT40K: SRAM, 600-nm.
  AT40K-AL: SRAM, 350-nm.
  AT40KEL040: SRAM, 350-nm.
  ATF280: SRAM, 180-nm.
  Relevant features across these families: Dynamic reconfiguration, Radiation tolerance.

Lattice Semiconductor Corporation (www.latticesemi.com)
  LatticeECP3: SRAM, 65-nm.
  LatticeECP2/M: SRAM, 90-nm.
  LatticeSC/M: SRAM, 90-nm.
  LatticeXP2: SRAM, 90-nm.
  Relevant features across these families: High-speed transceivers, Configuration AES, Embedded hard IPs, On-chip FLASH.

Tabula, Inc. (www.tabula.com)
  ABAX: SRAM, 40-nm. Dynamic reconfiguration, High-speed transceivers, Embedded hard IPs.

SiliconBlue Technologies Corporation (www.siliconbluetech.com)
  iCE65 L: SRAM, 65-nm.
  iCE65 P: SRAM, 65-nm.
  iCE40 LP: SRAM, 40-nm.
  iCE40 HX: SRAM, 40-nm.
  Relevant features across these families: On-chip non-volatile configuration memory.

Achronix Semiconductor Corporation (www.achronix.com)
  Speedster 22i: SRAM, 22-nm.
  Speedster: SRAM, 65-nm.
  Relevant features across these families: High-speed transceivers, Embedded hard IPs.

Table 9. FPGA devices.

Antifuse-based devices are programmed off-line using a special device programmer. These devices are one-time-programmable (OTP) and non-volatile, which means that their configuration data remains when the system is powered down and they are immediately available as soon as power is applied to the system. In EEPROM or FLASH-based FPGAs, the data they contain is likewise non-volatile once programmed, so these devices are also available as soon as power is applied to the system. These devices can be configured off-line using a device programmer or, alternatively, some versions are in-system programmable (ISP). EEPROM and FLASH cells are smaller than their SRAM counterparts. This means that the rest of the logic can be much closer together, thereby reducing interconnect delays.

However, the majority of FPGAs are SRAM-based devices. The main advantage of this technique is that SRAM configuration cells can be programmed over and over again. This programming is done by the user, in the field, rather than by the manufacturer of the device. This flexibility is one of the main advantages over ASIC technologies, and gives the user access to complex integrated designs without the high engineering costs associated with application-specific chips.
Field programmability can be used to reprogram the content of the device dynamically in the field, in order to adapt the current design to changing trends/standards, to upgrade or add new functionality to the system, or to fix known bugs easily without changing the physical device or the whole electronic board where the FPGA device is installed.

Hard-core versus Soft-core processors

In recent years, FPGAs have become increasingly attractive as signal processing engines, sometimes used alone and sometimes in conjunction with a processor chip. Nowadays, a wide range of FPGA products is offered by many semiconductor vendors, including Xilinx Inc. (www.xilinx.com), Altera Corporation (www.altera.com), Atmel Corporation (www.atmel.com), Microsemi SoC Products Group (www.actel.com), and Lattice Semiconductor (www.latticesemi.com). Each FPGA vendor also offers its own selection of hard and soft IPs.

On the one hand, hard IPs come in the form of hard-wired blocks such as adders, multipliers, MAC functions, FFT operators, or even microprocessor cores, among others. These blocks are designed to be as efficient as possible in terms of power consumption, silicon area, and performance. Each FPGA family features different combinations of such blocks together with some quantity of programmable logic. Unlike programmable logic blocks, hard IP blocks contain only the number of transistors needed to implement the required functions. There are no programmable interconnections, so the routing capacitance is as small as possible to minimize signal propagation delays. Therefore, designs that use hard IPs can see relevant power consumption reductions and performance enhancements in comparison with the same implementation on programmable logic.

On the other hand, soft IPs refer to a source-level library of high-level functions that can be included in users’ designs.
These functions are typically defined using hardware description languages such as Verilog or VHDL at the register transfer level (RTL) of abstraction, and cover a wide range of standard processor cores, configurable processor cores, and other vendor IP cores.

MPU/DSP chip vendors such as Intel Corporation (www.intel.com), AMD Inc. (www.amd.com), Freescale (www.freescale.com), Texas Instruments (www.ti.com), Fujitsu (www.fujitsu.com), etc. develop an extensive portfolio of standard devices with different performance levels in order to cover a complete range of applications. However, those commercial devices usually do not match the real needs of the application exactly, so it is normal to pay an extra cost for a bigger device with more features than those really needed. A possible way of coping with this situation is the instantiation of soft-cores in structured ASICs, FPGAs or system-on-chip ICs:

a) MPU/DSP core vendors such as ARM (www.arm.com), Tensilica (www.tensilica.com), CoWare (www.synopsys.com), etc. offer HDL descriptions of standard processor cores as an alternative to physical standard chips, ready to be embedded into programmable logic devices like FPGAs.

b) An alternative to HDL descriptions of standard processor cores are the so-called configurable processor cores or application-specific instruction processors (ASIPs). An ASIP is a processor that can be tailored for specific applications in order to be much faster and more efficient than standard processor architectures such as ARM, MIPS or PowerPC. This is done by adding to the processor those special registers and execution units that efficiently perform task-specific algorithms.
There are three general ways to configure a processor: (i) by selecting from standard configuration options such as bus widths, interfaces, memories, and preconfigured execution units (floating-point units, DSPs, etc.); (ii) by adding new registers, register files, and custom task-specific instructions that support custom data types and operations; or (iii) by using programs that automatically analyse C code and determine the best processor configuration and instruction-set architecture extensions. The result in all cases is a new processor described in RTL that features an optimum configuration, specifically tailored to the target applications.

c) In addition to standard or configurable processor cores, there exist libraries of other components suitable for use in real applications in the form of IP cores described in RTL. A growing number of dedicated IP cores that perform specific digital signal processing functions are available today. Those predefined modules can be tuned as necessary, and provide a variety of standard DSP algorithms such as filters, correlators, sine/cosine operators, and other mathematical functions. Those IP cores can be used in conjunction with standard or configurable processor IP cores to support a wide variety of applications such as communications, imaging, multimedia, video, etc.

d) Apart from using off-the-shelf soft IPs, it is always possible to add custom descriptions to the design for those specific parts of the application that demand more computational power but are not covered by commercially available IPs. These additional cores and glue logic are known as user IPs. The user IP cores can be seen as special-purpose processors tuned by the design team itself according to the functional needs of the application.
User IP cores can be (i) defined by means of hardware description languages such as VHDL or Verilog, (ii) written in C/C++ derivatives such as Handel-C or SystemC, or (iii) synthesized with high-level design tools like Simulink/MATLAB.

Reconfigurable FPGAs

FPGAs are generic products customized at the point of use thanks to their programmability. In SRAM technology, the configuration and interconnection of logic elements, I/O pins and the rest of the resources of the FPGA are accomplished by downloading a bitstream into an on-chip configuration memory at device power-up. The bitstream defines (i) the functionality of each of the logic elements, and (ii) the connection or routing established among those logic elements. Many SRAM-based FPGAs can be reprogrammed in-circuit an unlimited number of times in only a fraction of a second. Design revisions, even for a fielded product, can be implemented quickly and painlessly without any modification of the electronic board where the FPGA is installed, just by changing the bitstream held in the configuration memory of the device.

In addition to their programmability, some FPGA families support dynamic reconfiguration: the functionality of the chip can be reprogrammed on-the-fly, while the application keeps running, with minimal impact on performance and power consumption. Modifying or changing the functional configuration of a device during its operation is a feature unique to FPGAs that makes it possible to reduce the hardware content needed in an application by multiplexing in time the functionality instantiated in the device. In the same device, different functionalities can be instantiated along the application execution time by making use of the same hardware resources, at the expense of the reconfiguration time.
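The idea that configuration bits define the function of a logic element, and that loading different bits reuses the same hardware for a different function, can be sketched with a software model of a lookup table. This is a toy illustration under stated assumptions, not a description of any vendor's bitstream format: the class `Lut` and the constants `AND2` and `XOR2` are hypothetical names.

```python
class Lut:
    """k-input lookup table: the configuration bits are the truth table."""

    def __init__(self, k: int):
        self.k = k
        self.bits = [0] * (1 << k)  # volatile, SRAM-like configuration memory

    def configure(self, bits: list[int]) -> None:
        """Load a new configuration (a tiny stand-in for a bitstream)."""
        if len(bits) != 1 << self.k:
            raise ValueError("expected 2**k configuration bits")
        self.bits = list(bits)

    def evaluate(self, *inputs: int) -> int:
        """Use the inputs as an address into the configuration memory."""
        if len(inputs) != self.k:
            raise ValueError("expected k inputs")
        address = 0
        for i, value in enumerate(inputs):
            address |= (value & 1) << i  # input i drives address bit i
        return self.bits[address]


AND2 = [0, 0, 0, 1]  # outputs for addresses 0..3
XOR2 = [0, 1, 1, 0]

if __name__ == "__main__":
    lut = Lut(2)
    lut.configure(AND2)
    print([lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)])
    lut.configure(XOR2)  # same resource, different function after reloading
    print([lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)])
```

The same two-input cell first behaves as an AND gate and then, after its four configuration bits are reloaded, as an XOR gate. The time spent in `configure` plays the role of the reconfiguration time mentioned above.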
When talking about dynamically reconfigurable computing, it is necessary to distinguish between partial and complete reconfiguration, according to the portion of the device that is reconfigured at a time. In partial reconfiguration, only one portion of the device is reconfigured while the other portions continue to perform their tasks without being stopped by the reconfiguration process [Becker et al., 2007]. In complete reconfiguration, however, the whole programmable logic fabric is reconfigured on-the-fly while the rest of the hard cores available in the device keep running.

Dynamic reconfiguration helps to improve the functional density of the device by removing the need to place in the FPGA functions that do not operate simultaneously. Instead, those mutually exclusive functions can be stored in memory and loaded into the programmable logic array when needed. This makes it possible to reduce the size of the FPGA, the size of the electronic board on which the FPGA is installed and the application deployed, and the power consumption of the embedded system.

FPGAs provide a valuable solution for those application areas that need flexibility in prototyping and quick design and manufacturing cycles to launch new products to the market rapidly. The key advantages of FPGAs are their flexibility and their natural orientation to parallel processing. Applications in continuous evolution, projects that tend to have low to medium production volumes (too low to amortize the cost of a full ASIC implementation), and complex designs that require frequent hardware field upgrades or multiple customizations are good examples of where to exploit the parallelism and flexibility of programmable/reconfigurable FPGAs.

3.6. Multi-core Chips

The current interest in multi-core or multiprocessing elements on-chip is driven by the fact that the semiconductor industry has reached the point where further improvements to uniprocessor or single-core architectures no longer deliver optimum throughput, owing to technological issues linked to silicon geometries, lithographic constraints, processing bottlenecks, clock frequency and memory bandwidth limitations, and power consumption. Multi-core solutions make it possible to use simpler and slower cores that consume less power but perform more efficiently. The increase in processing performance comes from the number of cores able to work in parallel, not from their operating frequency or from the use of superscalar pipelines and similar techniques to make them faster. It is important to remark that multiprocessing techniques only apply to applications with inherent parallelism; multi-core performance on a sequential application might be worse than that achieved with a single-core system.

Notable challenges in the development of multi-core systems include defining efficient communication mechanisms and arbitration protocols for access to shared resources (e.g. memory), designing appropriate parallel software architectures, creating a mapping of the application that makes good use of the processing resources, and deploying an efficient methodology to debug software in a multi-core environment. Developers need better tools (multi-core compilers, debuggers, etc.) to exploit multi-core processing in the implementation of real-world real-time embedded applications.

The state of the art points to several types of multiprocessing devices, either in the form of multiple cores embedded in the same chip, or multiple chips inside the same package (multi-chip package).
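The remark that multi-core processing only benefits applications with inherent parallelism can be made concrete with Amdahl's law, a standard model that the original text does not cite and that is brought in here only as an illustration. It assumes that a fraction of the single-core run time parallelizes perfectly over the available cores. The optional per-core overhead term is an additional, hypothetical simplification meant to show how a mostly sequential application can end up slower on several cores.

```python
def amdahl_speedup(parallel_fraction: float, cores: int,
                   overhead_per_core: float = 0.0) -> float:
    """Speed-up over a single core.

    parallel_fraction: share of the single-core run time that can run
        in parallel (0..1).
    cores: number of identical cores.
    overhead_per_core: hypothetical coordination cost added for each extra
        core, as a fraction of the single-core run time (this term is not
        part of Amdahl's original formula).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be within [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial = 1.0 - parallel_fraction
    normalized_time = (serial
                       + parallel_fraction / cores
                       + overhead_per_core * (cores - 1))
    return 1.0 / normalized_time


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.95):
        row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.2f}x"
                        for n in (1, 2, 4, 8))
        print(f"p={p:.2f} -> {row}")
    # With a small assumed overhead, a fully sequential task gets slower.
    print(f"{amdahl_speedup(0.0, 4, overhead_per_core=0.05):.2f}x")
```

With no parallel fraction the speed-up stays at 1 regardless of the number of cores, and any non-zero coordination overhead pushes it below 1.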
A classification of the existing multi-core chips can be made according to the nature of the processors embedded in the IC device:

a) Homogeneous multi-core chips, which refer to ICs composed of several processing units of the same type. With homogeneous cores on-chip, multiple identical cores work under symmetric multiprocessing (SMP) strategies, so the workload can be distributed among all the available processors. All cores share the same instruction set and data structures, so the application is scheduled as parallel tasks carried out by identical processors able to work concurrently.

b) Heterogeneous multi-core chips, which refer to ICs composed of several different processing units. With heterogeneous cores on-chip, a mixture of dissimilar cores, each oriented to the execution of specific tasks, normally works under asymmetric multiprocessing (AMP). The application is composed of a set of computational tasks of different nature. Those computational tasks or threads are divided by type, and each group of processors is devoted to the execution of one type of task. Each group of processors features different instruction sets and data structures, and the scheduling of the application seeks as much parallel processing as possible.

Homogeneous Multi-core Chips

Intel Corporation and AMD Inc. are the mainstream manufacturers of microprocessors. Both produce many multi-core processors for laptop/desktop environments, for servers/workstations, and for embedded system applications. Concerning embedded processor devices, Intel Corporation features many multi-core families such as Intel Core (i7, i5, i3), Intel Pentium, Intel Celeron, Intel Atom, Intel Xeon, Intel Itanium, and Intel Centrino. Similarly, AMD Inc.
also features an extensive set of multi-core (dual-, quad-, six- and eight-core) processors covering a wide range of operating frequencies, cache memory sizes and memory interfaces in order to provide scalable solutions for any kind of embedded system. Different families of devices are available, such as Opteron, Athlon, Turion, Geode, Radeon, Phenom, and Sempron. Other homogeneous multi-core chips are the ARM Cortex (www.arm.com) and SUN Niagara (www.oracle.com) families of devices.

Heterogeneous Multi-core Chips

One set of heterogeneous multi-core chips is covered by the hard system-on-chip OMAP-5 family of processors from Texas Instruments, which is especially oriented to mobile embedded system applications. Its multi-core architecture features two 2 GHz ARM Cortex-A15 processors for symmetric multiprocessing, and two Cortex-M4 processors for low-power offload and real-time control. Other dedicated cores embedded on-chip are DSPs, multimedia accelerators, and 2-D and 3-D image and video processing units based on 28-nm CMOS technologies, together with memory blocks and other standard peripherals like UART, SPI, USB, GPIO, memory controllers, etc.

Another set of heterogeneous multi-core chips is covered by the IBM CELL family of processors (www.ibm.com). CELL is a non-homogeneous multi-core processor based on one Power Architecture RISC processor core dedicated to the operating system and other control functions, and eight SIMD processors optimized for compute-intensive operations. The CELL processor supports many of today's programming models by introducing the concept of heterogeneous tasks or threads. The family of CELL processors is widely used in embedded game systems like the PlayStation.

Another type of heterogeneous multi-core chips is addressed by the programmable system-on-chip family of devices.
Programmable SOCs embed, in the same silicon die, different processing units such as one general-purpose single-core or multi-core MPU, some memory blocks, standard peripherals such as FPU, UART, ADC, Timer, etc., and one programmable logic fabric in the form of an FPGA. The state of the art in this technology is driven by SOC vendors like Xilinx Inc. and Altera Corporation, which have recently announced the market release of system-on-chip devices based on 28-nm semiconductor technologies.

Figure 44. Xilinx Zynq-7000 SOC block diagram.

The Zynq-7000 Extensible Processing Platform (EPP) from Xilinx Inc. combines, in a single device, an industry-standard ARM processor-based system with a programmable logic fabric. The architecture of the Zynq-7000 EPP is based on an ARM dual-core Cortex-A9 processor implemented in hardware and enhanced with a NEON engine (a single- and double-precision vector FPU), and other peripherals including memory controllers, CAN, USB, SPI, UART, ADC, Ethernet, GPIOs, etc. The programmable logic portion of the Zynq-7000 EPP consists of either an Artix or a Kintex Xilinx 7-series FPGA. In the FPGA, it is possible to instantiate custom peripherals and accelerators for any kind of digital signal processing application. The programmable logic fabric features dynamic reconfiguration capabilities. The hard-core processor system can operate at a maximum frequency of 800 MHz, and it boots first, prior to the configuration of the programmable logic. The added value of the Zynq-7000 EPP lies in the tight integration of the processing system with its programmable logic fabric through an AMBA interface provided with DMA and interrupt channels, which helps to eliminate bottlenecks in the communication between both parts. The system diagram of the Xilinx Zynq-7000 EPP device is depicted in Figure 44.
In the same direction, Altera Corporation is developing a new family of programmable SOC devices based on 28-nm technology known as SoC FPGA. It integrates an ARM-based hard processor system (consisting of a dual-core Cortex-A9 processor, a rich set of hardware peripherals, and memory blocks) and one FPGA fabric (consisting of the programmable 28-nm Altera Cyclone V or Arria V devices), efficiently interconnected by means of a high-bandwidth interface. The dual-core ARM processor system can operate at up to 800 MHz. In the processor system, embedded peripherals such as NEON/FPU, CAN, UART, DMA, GPIO, etc. are implemented in hardware in order to reduce power consumption and leave more programmable logic resources free in the FPGA fabric for application-specific purposes. In the FPGA, either custom or off-the-shelf IPs can be synthesized. It is possible to configure the FPGA fabric and boot the processor system independently, in any order, which provides flexibility in any design. Figure 45 shows the block diagram of the Altera SoC FPGA family of devices.

Figure 45. Altera SoC FPGA block diagram.

Other programmable systems-on-chip are the PSoC-5 family of devices from Cypress Semiconductor Corporation (www.cypress.com), which integrates one ARM Cortex-M3 CPU, configurable analog and digital peripheral functions, and memory in a single chip; the Atom E6X5C Configurable Processor from Intel Corporation (www.intel.com), which combines an Atom MPU/SOC with an Altera FPGA in a single package; and the SmartFusion customizable SOC device from Microsemi, formerly Actel (www.actel.com), which integrates an ARM Cortex-M3 CPU, flash-based FPGA logic, and programmable analog circuitry in the same chip.

3.7. Array of Processing Elements

Multi-core chips, or devices that embed a few processors on-chip, have been widely deployed for many years.
A different approach refers to the growing number of chips that contain tens or even hundreds of processing elements in the same silicon die. In order to clearly distinguish this kind of device from the multi-core chips covered in the previous section, this new group of ICs is identified in this section as arrays of processing elements. Many different ICs of this category exist in the market, differing in the nature and granularity of their processing elements. As with multi-core chips, a classification into homogeneous and heterogeneous devices can be made.

Homogeneous Arrays of Processing Elements ICs

Among the available homogeneous devices, the Tilera (www.tilera.com) TILE64™ and TILEPro64™ families of processors are composed of a set of up to 64 general-purpose MIMD (multiple instructions, multiple data) processors on-chip able to operate under VLIW techniques. Tilera's architecture eliminates the on-chip bus interconnect and places a communication switch on each processor core, arranging the cores in a grid on-chip. The combination of a core and a switch forms the basic building block, called a tile. Each tile is a complete full-featured 32-bit processor with integrated L1 and L2 cache memory. This means that each tile can independently run a full operating system, or multiple tiles can together run a multiprocessing operating system. The grid of tiles creates an efficient two-dimensional on-chip network that avoids data congestion and does not limit performance scalability as the number of cores increases. As depicted in Figure 46, apart from the array of tiles, the chip integrates four DDR2 memory controllers and one complete array of high-speed I/O interfaces. Unused tiles can be put into sleep mode in order to decrease the total power consumption of the device while they are not needed by the application.

Figure 46. Tilera TILEPro64 block diagram.

Ambric Inc.
(www.ambric.com) offers its homogeneous multiprocessor device Am2045, which comprises an array of 336 32-bit CPU/DSP cores and 336 2-kB memory blocks in a 2-D mesh interconnect. Am2045 is a massively parallel processor array chip with a MIMD architecture. Memory is distributed and accessed locally, not cached and not shared globally. CPUs in the chip are not multithreaded or multitasked; each processor is strictly encapsulated, accessing only its own code and memory. Point-to-point communication between processors is performed directly in the configurable interconnect, which offers a bandwidth of 792 Gbps.

Rapport Inc. (www.rapportincorporated.com) offers its KC256 chip, which acts as a coprocessor. It includes 256 parallel processing elements on a single chip that allow compute-intensive tasks to be offloaded from the main processor of a system.

Another example comes from Aspex Semiconductor (www.aspex-semi.com), with its AsProCore device. AsProCore features an array of up to 4096 processing elements, each provided with one ALU, memory, and access to a high-speed communications network. All processors work in parallel and execute instructions on data held in their local memory under a SIMD configuration. The ultra-high-speed communication network present on-chip allows the processing elements to share data at bandwidths of 3200 Gbps. The AsProCore array can act as a single memory-mapped coprocessor for any 32-bit CPU. The block diagram of the AsProCore IC is depicted in Figure 47.

Figure 47. Aspex AsProCore block diagram.

Nvidia Corporation (www.nvidia.com) revolutionized the parallel computing world in 2006-2007 by introducing its massively parallel architecture of GPU devices known as CUDA (Compute Unified Device Architecture). The CUDA architecture consists of hundreds of processor cores able to operate in parallel.
Any application based on CUDA is normally composed of a general-purpose processor (single-core or multi-core MPU) and the GPU acting as a companion chip. GPUs are coprocessors that always require a host processor, and multiple GPUs can be linked to a single host. The success of GPUs is mainly driven by the ease of their parallel programming model based on CUDA. In this programming model, the application developer has to identify those compute-intensive tasks that can be executed in parallel and map them all to the GPU, while the rest of the processing remains on the MPU. Mapping a task to the GPU involves writing a function that exposes the parallelism, together with specific keywords to move data to and from the GPU. The application has to launch hundreds of threads simultaneously, and the GPU manages and schedules those threads. Although GPU devices were initially conceived as graphics chips only, mainly oriented to medical or other imaging applications, over the years they have started to be used in general-purpose computational applications as well. Nowadays, the CUDA parallel hardware architecture together with the CUDA parallel programming model provides GPUs with excellent floating-point performance, and permits their usage in a wide range of applications like image processing, supercomputing, fluid dynamics, statistical physics, cryptography or financial analysis.

Heterogeneous Arrays of Processing Elements ICs

In recent years, a number of companies have started to offer more exotic heterogeneous architectures comprising one or a small number of CPU cores coupled with an array of processing elements of similar or dissimilar nature. Depending on the implementation, each of those processing elements can contain multipliers, adders, ALUs, MACs, counters, synchronizers, memory, etc. Additionally, other dedicated hardware processors can be embedded in the same chip. Many examples exist:

a) DAPDNA-2 device, developed by IPFlex Inc.
DAPDNA stands for Digital Application Processor/Distributed Network Architecture, and comprises two DAP processors (RISC CPUs) and one DNA (a reconfigurable hardware block with 376 32-bit processing elements) in the same chip.

b) CSX700 processor device, developed by ClearSpeed Technology Ltd. (www.clearspeed.com). CSX700 is a floating-point coprocessor that comprises two multi-threaded array processor (MTAP) cores, each one containing an array of 96 processing elements. Each processing element features double- and single-precision floating-point acceleration. The block diagram of the CSX700 is depicted in Figure 48.

Figure 48. ClearSpeed CSX700 block diagram.

c) Another example comes from Picochip Ltd. (www.picochip.com). Its picoArray architecture features several hundred 16-bit CPU/DSP cores connected to a sea of programmable interconnect. The picoArray is a massively parallel MIMD architecture. It may be further equipped with a complete ARM9 processor subsystem and other hardware accelerators like FFT and cryptographic engines in the same chip.

d) AMD Fusion is a new approach that delivers powerful x86 CPU cores and GPU capabilities in a single die, in a chip called Accelerated Processing Unit (APU).

Real-world applications demand ever higher levels of performance in terms of multi-functionality, high-speed signal processing, low latency or real-time responsiveness, and low power consumption. Putting more than one core in a chip is a good way to raise performance while keeping power under control. However, it is important to note that in parallel applications the performance increase is not guaranteed simply by increasing the number of cores; it is also necessary to keep the right balance of resources within a core.
Moreover, the advantages of architectures based on multi-cores or arrays of processing elements might be lost unless efficient tools are developed for deploying multi-core systems, able to overcome inherent issues in multiprocessing architectures like application partitioning, shared resource contention and arbitration techniques, task synchronization, cache coherency, load imbalance and load overheads, system bandwidth limitations, system programming/debugging, and reliable inter-core communication mechanisms.

3.8. Multiprocessor Systems

Adding many devices to a system can be costly, especially when considering the effects on system reliability, I/O bandwidth, sustainable thermal and power budgets, as well as packaging constraints (i.e. physical size and weight of the product). The integration of many processing devices in the same circuit or system means that data must be transferred from one device to others over some sort of interconnect (e.g. a standard bus), which limits communication bandwidth and system performance. However, in those applications where such limitations are not relevant or can be afforded, adding more devices to the embedded system is an option. In the case of supercomputing applications, the usage of scalable multiprocessor systems is a must.

Many designs use a general-purpose processor (GPP) coupled with a separate DSP on the same PCB to accelerate signal processing tasks. Other designs use a GPP (single-core or multi-core chip) coupled with a GPU under a heterogeneous coprocessing computing model. Another possibility is the usage of a GPP together with one or more FPGAs, as in the computer motherboard market.
In all those scenarios, the GPP is normally used to execute application flow control tasks and to manage user interface communications, whereas GPUs/FPGAs are configured to perform computationally intensive data processing tasks by exploiting parallel acceleration. GPUs are normally used for floating-point operations, and FPGAs are mainly intended for fixed-point computations.

Good examples of this technical approach can be found in the high-performance computing or supercomputing markets. Companies like XtremeData Inc. (www.xtremedatainc.com) or DRC Computer Corporation (www.drccomputer.com) combine Intel or AMD multi-core processors with FPGAs from Altera Corporation or Xilinx Inc. Similar examples are provided by Cray Inc. (www.cray.com), which uses the same approach but with GPUs from Nvidia Corporation instead of FPGAs.

Other examples can be found in the military market. Military systems are increasingly moving away from homogeneous platforms based on arrays of processors to heterogeneous systems with FPGAs. Applications such as radar, cryptography, software-defined radio or real-time surveillance video processing can often benefit from being run on FPGAs tightly coupled with multi-core GPPs. GPPs contribute the advantages of software solutions: they are easier to develop for, can scale from small to large systems, and can be easily ported to new processor generations. Hardware accelerators on FPGAs, in turn, contribute real-time responsiveness and higher performance.

In general terms, processor-centric solutions (including general-purpose MPUs/DSPs/GPUs, application-specific ASICs/ASSPs, and programmable FPGAs/SOCs) are a dominant force in today's semiconductor market. For cost-sensitive, high-volume applications like cellular phones and PC graphics cards, extremely specialized ASSPs and ASICs have been developed.
However, for many other applications, the valid options for implementing high-performance digital signal processing consist of a mixture of MPUs, MCUs, DSPs, GPUs, FPGAs or SOC processors. When dealing with complex applications that merge multiple and different processing tasks, the development of hybrid multiprocessing systems, in the form of multiple chips of different nature assembled on one single electronic board, makes it possible to reach the desired balance of programmability and specialization.

3.9. Conclusions

After this review of the state of the art in semiconductor technology and the most popular architectures for real-time digital signal processing, Figure 49 compares those architectures in terms of performance versus flexibility.

(∗) DSP/MPU refers to either single-core or multi-core devices.
Figure 49. Digital signal processing architectures.

Figure 49 ranges from approaches based on the high performance and reduced flexibility of dedicated ASIC/ASSP chips, to approaches with great flexibility but reduced performance offered by general-purpose single-core or multi-core MPUs/MCUs. Market requirements increasingly demand powerful platforms, but always with the constraint that they may not exceed the cost limits of the target applications. Having access to the technology is a key factor in reaching balanced cost-power-performance solutions.

The current technological age is characterized by an almost unlimited range of solutions that are in turn based on a huge variety of architectural implementations. Each new architecture generation is based on finer silicon geometries, which results in a higher integration level, with smaller transistors and larger numbers of them on-chip in the form of multi-core or multiprocessor devices. However, keeping pace with Moore's law is gradually becoming more expensive.
The costs linked to the fabrication of advanced CMOS semiconductors increase with each new process technology. Therefore, higher gross profits are needed to justify the investments in developing advanced devices that make use of those newer technologies. Outside of markets like consumer electronics, mobile handsets and PCs, there are few large-volume application markets, so programmable or general-purpose standard solutions tend to be more usual than custom or specialized devices.

The high integration and density improvements achieved with advanced CMOS semiconductors allow a reduction in the number of ICs required to build a system and in the number of off-chip interconnects, which leads to the optimization of the PCB surface and the consequent cost improvement. The on-chip and off-chip interconnections among the different parts of the circuits can be shorter, which makes it possible to decrease the parasitic capacitances and to reduce the power consumption or increase the communication bandwidth in the data transfers. Therefore, embedding multiple cores in the same chip is more energy- and performance-efficient than implementing the same functionality by means of multiple discrete chips. The trends in system architecture point to multi-core and system-on-chip approaches as the ones offering the best advantages in terms of cost, power, performance and reliability.

Thanks to the continuous advances in semiconductor technology, ever more complex digital signal processing applications are becoming realizable today. As the complexity of systems increases, system reliability is no longer solely defined by the hardware platform, typically quantified in terms of MTBF (mean time between failures). Nowadays, system reliability is increasingly determined by hardware and software aspects along the design, development and verification processes.
All these aspects need to be taken into consideration in the deployment of any complex application like the one based on fingerprint biometrics addressed in this thesis.

The application addressed in this work is oriented to consumer electronics, so low cost and short time-to-market are key features to meet. In this direction, flexible solutions based on system-on-chip technologies are suggested. System-on-chip technologies bring EDA tools able to define the system architecture easily and rapidly in a user-friendly environment. They allow embedding hard-core and soft-core CPUs and other hard-core or soft-core coprocessors in the same device. In this kind of development platform, the high-bandwidth processing portions of the code can be moved out of the processors and instantiated in the programmable logic portion of the device. This adds value since the FPGA portion can exploit parallelism to run algorithms at a slower clock speed, using less power, but with higher performance than that exhibited by a purely software solution. Prime candidates for moving to hardware (instantiated in the FPGA portion of the SOC device) are those processing bottlenecks (subroutines or functions) that consume the most processor cycles. Dedicated hardware coprocessors can be instantiated in the FPGA region of SOC devices, leading to hardware-software codesign implementations. In one SOC device there can be one or more (hard-core or soft-core) processors running legacy C code, closely coupled to parallel hardware coprocessors that offload the performance-critical computations through parallelism and pipelining acceleration techniques. Such flexible platforms, ready for dynamic evolution (reduced risk of component obsolescence), help designers to gain significant competitive advantages in the deployment of the products demanded today.
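The selection of candidates for hardware offloading described above can be sketched as a simple ranking of profiled functions by the processor cycles they consume. The profile figures, the function names and the 80 % coverage threshold below are hypothetical placeholders, not measurements or criteria taken from this work.

```python
def offload_candidates(cycle_profile, coverage=0.8):
    """Return the functions that together account for `coverage` of all cycles.

    cycle_profile: mapping of function name -> processor cycles consumed.
    coverage: target share of total cycles (an assumed selection rule).
    Returns a list of (name, share_of_total_cycles), heaviest first.
    """
    total = sum(cycle_profile.values())
    if total <= 0:
        return []
    ranked = sorted(cycle_profile.items(), key=lambda item: item[1], reverse=True)
    selected, covered = [], 0
    for name, cycles in ranked:
        if covered >= coverage * total:
            break
        selected.append((name, cycles / total))
        covered += cycles
    return selected


if __name__ == "__main__":
    # Invented profile of a software-only implementation.
    profile = {
        "stage_a": 520_000,
        "stage_b": 310_000,
        "stage_c": 90_000,
        "control_flow": 50_000,
        "user_io": 30_000,
    }
    for name, share in offload_candidates(profile):
        print(f"{name}: {share:.0%} of cycles")
```

In practice the profile would come from a profiler or cycle counters on the target processor, and the final choice would also weigh the logic resources and data transfer cost of each candidate.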
4. Fingerprint-Based Biometric Systems

There are many physical systems in the market aimed at personal recognition based on fingerprint biometrics. In technical terms, those recognition systems try to automate the measurement of the inherent fingerprint characteristics of an individual and the comparison of those measured traits with others previously recorded in either a centralized or an individualized database. Those systems are oriented to authentication or identification applications. In authentication approaches, the individual asserts his/her identity in the authentication stage, so the recognition system only needs to match the acquired features against those of the claimed identity to validate the individual. In identification applications, however, there is no identity claim on the part of the user, so the recognition system needs to match the measured traits of the individual against a subset or all of the templates registered in a database (depending on whether fingerprint classification techniques are applied in the matching process or not) to finally confirm whether the individual is recorded in that database.
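The difference between the two operating modes can be sketched as follows. The feature representation (a plain tuple of numbers), the similarity measure and the acceptance threshold are hypothetical stand-ins chosen only to keep the sketch self-contained; they do not represent the matching algorithm used in this thesis.

```python
def similarity(features_a, features_b):
    """Toy similarity in [0, 1]: share of positions holding equal values."""
    if len(features_a) != len(features_b) or not features_a:
        return 0.0
    equal = sum(1 for a, b in zip(features_a, features_b) if a == b)
    return equal / len(features_a)


def authenticate(database, claimed_id, features, threshold=0.8):
    """1:1 matching: compare only against the template of the claimed identity."""
    template = database.get(claimed_id)
    return template is not None and similarity(template, features) >= threshold


def identify(database, features, threshold=0.8):
    """1:N matching: search the database and return the best match, if any."""
    best_id, best_score = None, 0.0
    for user_id, template in database.items():
        score = similarity(template, features)
        if score > best_score:
            best_id, best_score = user_id, score
    return best_id if best_score >= threshold else None


if __name__ == "__main__":
    # Invented templates, for illustration only.
    enrolled = {"alice": (1, 4, 2, 7, 5), "bob": (3, 3, 9, 0, 6)}
    probe = (1, 4, 2, 7, 0)  # four of five values agree with alice's template
    print(authenticate(enrolled, "alice", probe))  # a single comparison
    print(authenticate(enrolled, "bob", probe))
    print(identify(enrolled, probe))  # up to len(enrolled) comparisons
```

Authentication costs one comparison per attempt, whereas identification grows with the number of enrolled templates unless a classification step narrows the search to a subset of the database.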
The accuracy exhibited by the recognition system in any of its modalities (authentication and identification) is not exclusively a matter of the recognition algorithm implemented in the application. Many surrounding aspects have some influence on the final system performance: the quality of the acquired images, which depends on the acquisition technique, the fingerprint sensor used in the system, and the environmental conditions (temperature, pressure, moisture, lighting conditions, etc.); the interface between the user and the system; the habituation of the user to the system; the age or job of the user; the finger conditions (too wet, too dry, etc.); the human guidance or supervision in the acquisition process; and the existence or not of quality checking points in the enrolment and recognition steps. Efficiently and reliably determining the identity of an individual in an automatic way remains an open research problem today.

This chapter covers the state of the art in the physical implementation of fingerprint-based personal recognition systems, the fingerprint sensing techniques and fingerprint scanner devices available in the market, and the main research institutions and dissemination media present in this area.

4.1. Fingerprint Sensors

The fingerprint sensor, also known as fingerprint reader or fingerprint scanner, is an image acquisition device considered the very front end of the biometric fingerprint identification or authentication system. When dealing with automatic fingerprint-based recognition systems, and depending on the application, users can simply touch a sensing surface, slide their finger over a sensing surface, or even place their finger on a specific area without touching any sensing device in order to gain access to their PDAs, laptops, PCs, wireless devices, mobile phones, workplaces or homes. As already introduced in section 2.2, there exist many fingerprint sensing techniques covering one-touch, sweeping and touchless devices.
Sensors based on any of those technologies are available in the market. Some of the most important fingerprint sensor suppliers and devices are listed in Table 10 and Table 11 respectively.

Company Name | Sensing Technology | URL
3M Cogent Systems | Optical sensing technique | www.cogentsystems.com
Atmel Corporation | Sweep thermal sensing technique | www.atmel.com
AuthenTec Inc. | RF modulation and active capacitance sensing technique | www.authentec.com
BioLink Solutions | Optical sensing technique | www.biolinksolutions.com
Biometrika srl. | Optical sensing technique | www.biometrika.it
CrossMatch Technologies | Optical sensing technique | www.crossmatch.com
Dermalog | Optical sensing technique | www.dermalog.de
Digent Co., Ltd. | Optical sensing technique | www.digent.com
DigitalPersona, Inc. | Optical sensing technique | www.digitalpersona.com
Egis Technology Inc. | Active capacitance sensing technique | www.egistec.com
Fingerprint Cards AB | Active capacitance sensing technique | www.fingerprints.com
Fujitsu | Capacitive fingerprint sensors (no longer supported) | www.fujitsu.com
Futronic Technology | Optical sensing technique | www.futronic-tech.com
L-1 Identity Solutions | Optical sensing technique | www.l1id.com
Lumidigm Inc. | Optical Multispectral sensing technique | www.lumidigm.com
NEC Corporation | Hybrid sensing techniques | www.nec.com
Nitgen & Company Co., Ltd. | Optical sensing technique | www.nitgen.com
Safran Morpho | Optical sensing technique | www.morpho.com
SecuGen Corporation | Optical sensing technique | www.secugen.com
Suprema Inc. | Optical sensing technique | www.supremainc.com
Tacoma Technology Inc. | Several sensing techniques | www.tacoma.com.tw
Tooan | Optical sensing technique | www.tooan.cn
TST Biometrics | Optical Multispectral touchless sensing technique | www.tst-biometrics.com
Ultra-Scan Corporation | Ultrasound sensing technique | www.ultra-scan.com
(Upek) AuthenTec Inc. | RF modulation and active capacitance sensing technique | www.upek.com
Validity Sensors, Inc. | RF sensing technique | www.validityinc.com
Veridicom International, Inc. | Capacitive sensing technique | www.veridicom.com
ZKSoftware Inc. | Optical sensing technique | www.zksoftware.com

Table 10. Fingerprint sensor suppliers.

Vendor | Sensor Model | Sensor Technology | Capture Method | Resolution (dpi) | Pixel Array | Image Resolution | Sensor Interface
3M Cogent | CSD 200 | Optical (reading of skin surface) | One-touch | 500 | 320 × 480 | 8-bit greyscale | USB interface
3M Cogent | CSD 301 | Optical (reading of skin surface) | One-touch | 500 | 500 × 500 | 8-bit greyscale | USB interface
3M Cogent | CSD 330 | Optical (reading of skin surface) | One-touch | 500 | 500 × 500 | 8-bit greyscale | USB interface
3M Cogent | CSD 450 | Optical (reading of skin surface) | One-touch | 500 | 750 × 800 | 8-bit greyscale | USB interface
Atmel | FingerChip AT77C101 | Thermal (reading of skin surface) | Sweep | 500 | 280 × 8 | 4-bit greyscale | 8-bit parallel interface
110 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Technical Specifications Vendor Sensor Model FingerChip AT77C102 FingerChip AT77C104 FingerChip AT77C105 AES850 AES1510 AES1610 AES1660 AES1710 AES1711 AES1750 AES2500 AES2501 AES2510 AES2550 AES2660 AES2810 (match-on sensor) AES3400 AES3500 AES4000 AFS2 AFS8500 AFS8600 ATW310 TCS1 TCS2 TCS4 TCS5 Sensor Technology Capture Method Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep Sweep One-touch One-touch One-touch One-touch One-touch One-touch Sweep One-touch One-touch Sweep Sweep One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch Resolution (dpi) 500 500 500 500 500 500 500 500 500 500 500 500 500 500 500 500 500 500 250 250 250 250 360 508 508 508 508 508 508 508 569 569 569 500 500 500 500 Pixel Array 280 × 8 232 × 8 232 × 8 92 × 8 128 × 8 128 × 8 128 × 8 128 × 8 128 × 8 128 × 8 192 × 16 192 × 16 192 × 16 192 × 8 192 × 8 192 × 8 128 × 128 128 × 128 96 × 96 128 × 128 96 × 96 96 × 96 124 × 8 256 × 360 208 × 288 192 × 4 144 × 4 360 × 510 360 × 510 360 × 510 296 × 560 296 × 560 400 × 560 500 × 500 600 × 600 750 × 800 750 × 800 Image Resolution 4-bit greyscale 4-bit greyscale 4-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 4-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale Sensor Interface 8-bit parallel interface SPI interface SPI interface I2C interface SPI interface 4-bit 
parallel interface SPI interface USB interface SPI interface USB interface 8-bit parallel interface SPI interface 8-bit parallel interface SPI interface SPI interface Serial interface USB interface USB interface Serial interface 8-bit parallel interface Serial interface USB interface SPI interface USB interface Serial interface USB interface 8-bit parallel interface Serial interface USB interface 8-bit parallel interface Serial interface USB interface 8-bit parallel interface Serial interface USB interface Serial interface 8-bit parallel interface Serial interface 8-bit parallel interface Serial interface 8-bit parallel interface SPI interface 8-bit parallel interface 8-bit parallel interface USB interface SPI interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface Atmel Atmel Atmel AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec BioLink BioLink BioLink Biometrika Biometrika Thermal (reading of skin surface) Thermal (reading of skin surface) Thermal (reading of skin surface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field 
(reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) Capacitive (reading of skin surface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) U-Match 3.5 U-Match 5.0 U-Match BI USB Fx1000 Fx2000 Fx3000 Biometrika Optical (reading of skin surface) (match-on sensor) Biometrika HiScan Optical (reading of skin surface) CrossMatch Verifier 300 LC 2.0 Optical (reading of skin surface) CrossMatch Verifier 310 LC Optical (reading of skin surface) CrossMatch Verifier 320 LC Optical (reading of skin surface) Note: Table 11 continues in next page. 111 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
876-2012 Technical Specifications Vendor Dermalog Dermalog Digent Digent Digent Digent Digent Digent Digital Persona Digital Persona Digital Persona EgisTec EgisTec Fingerprint Cards Fingerprint Cards Fingerprint Cards Fingerprint Cards Fujitsu Fujitsu Fujitsu Fujitsu Fujitsu Fujitsu Futronic Futronic Futronic Futronic Futronic L-1 Identity Solutions L-1 Identity Solutions L-1 Identity Solutions Lumidigm Lumidigm Lumidigm NEC Nitgen Safran Morpho Safran Morpho Sensor Model LF1 ZF1 FS10 FS11 FS20 FD1000 FM1000 FD2000 U.are.U 2000 U.are.U 4000 U.are.U 4500 ES603-WB SS801U FPC1010 FPC1011 FPC1030 FPC1031 MBF110 MBF200 SPF200 MBF300 MBF310 MBF320 FS50 FS80 FS81 FS88 FS90 BioTouch 500 DFR 2100 DFR 2130 DFR 2300 M301 M311 Venus J100 J110 HS100-10 Sensor Technology Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Capacitive (reading of skin surface) Capacitive (reading of skin surface) Capacitive (reading of skin surface) Capacitive (reading of skin surface) Capacitive (reading of skin surface) Capacitive (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical 
multispectral (reading of skin surface and subsurface) Optical multispectral (reading of skin surface and subsurface) Optical multispectral (reading of skin surface and subsurface) Hybrid (reading of skin surface and finger veins) Optical (reading of skin surface) Capture Method One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch Sweep Sweep One-touch One-touch Sweep Sweep One-touch One-touch One-touch Sweep Sweep Sweep One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch Touchless One-touch One-touch One-touch One-touch Resolution (dpi) 500 500 500 500 500 500 500 500 500 512 512 508 508 363 363 363 363 500 500 500 500 500 500 500 500 500 500 500 340, 500 686 500 500 500 500 > 600 500 500 500 500 Pixel Array 400 × 480 320 × 480 280 × 320 280 × 320 280 × 320 280 × 320 280 × 320 280 × 320 Ellipsoidal area Ellipsoidal area Ellipsoidal area 192 × 4 192 × 8 152 × 200 152 × 200 152 × 32 152 × 32 300 × 300 256 × 300 256 × 300 256 × 32 218 × 8 256 × 8 750 × 800 320 × 480 320 × 480 320 × 480 300 × 440 – 720 × 720 500 × 500 274 × 342 Ellipsoidal area Ellipsoidal area – 600 × 600 453 × 453 276 × 433 260 × 300 Image Resolution 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale – 4-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale Sensor Interface USB interface USB interface – – – USB interface USB interface USB interface USB interface USB interface USB interface SPI 
interface USB interface SPI interface USB interface Serial interface SPI interface Serial interface Serial interface 8-bit parallel interface 8-bit parallel interface SPI interface USB interface USB interface 8-bit parallel interface SPI interface USB interface 8-bit parallel interface SPI interface SPI interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface RS-232 interface USB interface Ethernet interface RS-232 interface USB interface USB interface RS-232 interface USB interface RS-232 interface USB interface USB interface eNBioScan-F MorphoSmart Optical (reading of skin surface) MSO Series 300 MorphoSmart Optical (reading of skin surface) MSO Series 1300 FDU02R SecuGen Optical (reading of skin surface) (Hamster III) Note: Table 11 continues in next page. 112 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
876-2012 Technical Specifications Vendor Sensor Model FDU03 SDU03M SDU03M2 (Hamster Plus) FDU04 (Hamster IV) BioMini BioMini Plus SFR300-S TCP2000 TCP3500 TMP1000 TMP2000 TOP1000 OP-100R OP-200 BiRD 3 BiRD 4 Model 203 (single finger) TCS1 TCS2 TCS3 TCS4 TCS5 VFS201 VFS202 VFS301 VFS302 FPS110 5th Sense FPS200 ZK4000 ZK7000 ZK8000 Sensor Technology Capture Method One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch One-touch Touchless Touchless One-touch One-touch One-touch Sweep Sweep Sweep Sweep Sweep Sweep Sweep One-touch One-touch One-touch One-touch One-touch Resolution (dpi) 500 500 500 500 500 500 500 500 500 500 400 – 600 400 – 600 500 500 250 or 500 508 508 254, 381 or 508 508 508 508 508 508 508 500 500 500 500 500 Pixel Array 260 × 300 258 × 336 288 × 320 260 × 340 288 × 288 256 × 256 128 × 128 256 × 256 256 × 256 256 × 256 240 × 320 480 × 640 240 × 320 480 × 640 480 × 640 256 × 360 375 × 375 187 × 187 256 × 360 208 × 288 248 × 4 192 × 4 144 × 4 200 × 2 200 × 2 200 × 2 200 × 2 300 × 300 256 × 300 280 × 360 288 × 360 288 × 360 Image Resolution 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale 8-bit greyscale Sensor Interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface USB interface 8-bit parallel interface – Several (USB, I2C, RS-232, Ethernet, SPI, etc.) Several (USB, I2C, RS-232, Ethernet, SPI, etc.) 
USB interface 8-bit parallel interface 8-bit parallel interface SPI interface USB interface SPI interface USB interface USB interface USB interface SPI interface SPI interface 8-bit parallel interface Parallel interface SPI interface USB interface USB interface USB interface USB interface SecuGen SecuGen Suprema Suprema Suprema Tacoma Tacoma Tacoma Tacoma Tacoma Tooan Tooan TST Biometrics TST Biometrics Ultra-Scan Corporation UPEK Authentec UPEK Authentec UPEK Authentec UPEK Authentec UPEK Authentec Validity Validity Validity Validity Veridicom Veridicom ZKSoftware ZKSoftware ZKSoftware Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Capacitive (reading of skin surface) RF Field (reading of skin surface and subsurface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Ultrasound (reading of skin surface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) Active Capacitive (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) RF Field (reading of skin surface and subsurface) Capacitive (reading of skin surface) Capacitive (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface) Optical (reading of skin surface)

Table 11. Fingerprint sensor devices.

4.2. Automatic Fingerprint Authentication Systems (AFAS)

A wide range of fingerprint-based biometric products is currently available on the market.
Many companies focus on different niches: processor chips oriented to the computational stages involved in the processing (e.g. image enhancement, feature extraction or fingerprint matching); autonomous embedded systems that handle the whole recognition process and can be easily integrated into high-level applications; and complete applications delivered as SDKs (Software Development Kits) to be executed on computer platforms or networks. Table 12 shows some of the companies in the fingerprint biometrics arena, with an indication of the type of products they offer.

Company Name | URL
3M Cogent Systems | www.cogentsystems.com
Analog Devices Inc. | www.analog.com
AuthenTec Inc. | www.authentec.com
BioEnable Technologies Pvt. Ltd. | www.bioenabletech.com
BIO-key International, Inc. | www.bio-key.com
BioLink Solutions | www.biolinksolutions.com
Biometrika srl. | www.biometrika.it
Digent Co., Ltd. | www.digent.com
DigitalPersona, Inc. | www.digitalpersona.com
Egis Technology Inc. | www.egistec.com
Fingerprint Cards AB | www.fingerprints.com
Fujitsu | www.fujitsu.com (no longer supported)
Futronic Technology | www.futronic-tech.com
Green Bit S. p. A. | www.greenbit.com
id3 Semiconductors | www.id3.eu
Idencom | www.idencom.com
Innovatrics | www.innovatrics.com
L-1 Identity Solutions | www.l1id.com
Neurotechnology | www.neurotechnology.com
Nitgen & Company Co., Ltd. | www.nitgen.com
OKI Semiconductor Co.,Ltd. | www.okisemi.com
Precise Biometrics | www.precisebiometrics.com
Privaris Inc. | www.privaris.com
Safran Morpho | www.morpho.com
SecuGen Corporation | www.secugen.com
Suprema Inc. | www.supremainc.com
Tacoma Technology Inc. | www.tacoma.com.tw
Texas Instruments Inc. | www.ti.com
Veridicom International, Inc. | www.veridicom.com
ZKSoftware Inc. | www.zksoftware.com
Zvetco Biometrics | www.zvetcobiometrics.com

[Product-type columns (Processor Chips, Embedded Systems, AFIS/AFAS HPC, SDKs): the per-company marks are not recoverable. The only surviving annotation in these columns is "(DSP)", which appears twice and most likely belongs to Analog Devices Inc. and Texas Instruments Inc.]

Table 12. Companies that support the development of AFAS applications.

Technical Specifications Vendor Processor Architecture 32-bit ARM940 MCU + D2SP (Dimensional DSP) engine 16/32-bit RISC DSP Throughput Frequency DNA (1) On-chip Memory DNA (1) Host Interface DNA (1) Tasks Management of external fingerprint sensor Image acquisition Feature extraction Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Template storage Feature matching Encrypted communication with host Management of external fingerprint sensor Image acquisition Feature extraction Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Template storage Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Template storage Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Template storage Feature matching Management of external fingerprint sensor Image acquisition Feature extraction Template storage Feature matching Management of external fingerprint sensor Image acquisition Feature
extraction Feature matching DNA(1): Data Not Available 3M Cogent SecurASIC Analog Devices ADSP-BF531 400 MHz 52 kB SRAM 1 kB ROM Parallel interface SPI interface UART interface SPI interface UART interface USB interface AuthenTec TCD50 32-bit RISC processor 144 MIPS 128 kB FLASH Fingerprint Cards FPC2000 RAM-based RISC processor 20–60 MHz Fingerprint Cards FPC2020 RAM-based processor 7–96 MHz 8 kB Program RAM 32 kB Data RAM Program RAM Data RAM SPI interface to external FLASH for firmware and template storage 128 kB FLASH 16 kB SRAM 8-bit parallel I/F SPI interface UART interface OKI ML67Q5250 32-bit ARM7 MCU + DFT (Discrete Fourier Transform) Accelerator 32-bit ARM7 MCU + DFT (Discrete Fourier Transform) Accelerator 32-bit ARM7 MCU + DFT (Discrete Fourier Transform) Accelerator 16-bit fixed-point CISC DSP (on-chip FFT hardware accelerator) Floating-point DSP 32 MHz Serial interface Smart card I/F SPI interface USB interface Serial interface Smart card I/F SPI interface USB interface 8-bit parallel I/F Serial interface Smart card I/F SPI interface USB interface I2C interface SPI interface UART interface USB interface Serial interface OKI ML67Q5260 32 MHz 128 kB FLASH 16 kB RAM 8 kB ROM 128 kB FLASH 16 kB RAM 8 kB ROM OKI ML67Q5270 32 MHz Texas Instruments TMS320C5515 100–120 MHz 320 kB RAM 128 kB ROM Texas Instruments TMS320C6712D 100–150 MHz 72 kB RAM

Table 13. Fingerprint-based processor chips.

Table 13 lists some of the processor chips that can take charge of the processing tasks carried out in an AFAS application. For each device, basic technical information is provided: its architecture, its operating frequency, its on-chip memory, its communication interfaces, and the computational tasks of the recognition algorithm that the device can execute.
Vendor 3M Cogent AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec AuthenTec BioEnable Module SA-4ii fingerprint module TCS4K Chipset TCS5D Chipset TouchChip TCEFC1 TouchChip TCEFD1 TouchChip TCEFD2 TouchChip TCESC1 TouchChip TCESD1 TouchChip TCESD2 FDA01M Processor DSP 600 MHz TCD50 TCD50 TCD50 TCD50 TCD50 TCD50 TCD50 TCD50 32-bit RISC MPU 32-bit RISC MPU SAMSUNG S3C2410 ARM9 200–266 MHz 32-bit ARM9 MPU 200 MHz Off-chip Volatile Memory 32 MB SDRAM – – – – – – – – 1 MB SDRAM Off-chip Non Volatile Memory 8 MB FLASH – – – – – – – – 1 MB FLASH Sensor TCS1 TCS4 TCS5 TCS1 TCS1 TCS2 TCS1 TCS1 TCS2 Optical One-touch 292 × 356 500 dpi Optical One-touch 300 × 398 500 dpi Optical One-touch 296 × 560 569 dpi AFS8600 AT77C101 FS11 FS20 MBF200 TCS1 TCS2 TCS3 FS11 FS20 AFS8600 AT77C101 FS11 FS20 MBF200 TCS1 TCS2 TCS3 FPC1011 FPC1011 MBF200 Optical One-touch 320 × 480 500 dpi Optical One-touch 320 × 480 500 dpi FingerChip FingerChip Host Interface Parallel interface RS-232 interface UART interface USB interface SPI interface UART interface USB interface USB interface USB interface USB interface SPI/UART interface SPI/UART interface SPI/UART interface RS-232 interface BioEnable FIM10 8 MB SDRAM 1 MB FLASH RS-232 interface Biometrika FxIntegrator 32 MB RAM 8 MB FLASH RS-232 interface Digent FC2055A1 TMS320C5502 DSP 200–300 MHz DNA (1) 2 MB FLASH RS-232 interface Digent FC2062A TMS320C6211 DSP 150–167 MHz DNA (1) 2 MB FLASH RS-232 interface Digent FC2062BL TMS320C6205 DSP 200 MHz DNA (1) 8 MB FLASH RS-232 interface Fingerprint Cards Fingerprint Cards Fujitsu FPC-AM3 Development Kit FPC-AMD3 MDFP200 FPC2020 FPC2020 32-bit RISC MPU Fujitsu MB91302 68 MHz ADSP-BF532 DSP 400 MHz ADSP-BF532 DSP 400 MHz Dedicated cryptographic and imaging processor Dedicated cryptographic and imaging processor – – 8 MB SDRAM 1 MB FLASH 1 MB FLASH 2 MB FLASH UART interface SPI interface UART interface SPI interface RS-232 interface Futronic FS83 16 MB SDRAM 16 MB FLASH RS-232 interface 
Serial interface RS-232 interface Serial interface Futronic FS84 16 MB SDRAM 16 MB FLASH id3 Semiconductors id3 Semiconductors Biomodule STA Light BIOmodule MOS DNA DNA (1) DNA DNA (1) Serial interface Serial interface (1) (1) Note: Table 14 continues in next page. 116 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Vendor Idencom Idencom Idencom L-1 Identity Solutions L-1 Identity Solutions Nitgen Nitgen Nitgen Nitgen Nitgen Nitgen Nitgen Nitgen OKI Module BioKey 3000 BioKey 3010 BioKey 3020 Bioscrypt MV1210 Bioscrypt MV1250 Processor Atmel ARM9 MPU Atmel ARM9 MPU Atmel ARM9 MPU TMS320C6712D DSP 100–150 MHz TMS320C6712D DSP 100–150 MHz S3C2450 ARM9 MPU 533 MHz S3C2450 ARM9 MPU 533 MHz S3C2450 ARM9 MPU 533 MHz S3C2450 ARM9 MPU 533 MHz S3C2410 ARM9 MPU 266 MHz S3C2410 ARM9 MPU 266 MHz S3C2410 ARM9 MPU 266 MHz S3C2410 ARM9 MPU 266 MHz ML67Q5250 Off-chip Volatile Memory DNA DNA DNA DNA DNA (1) Off-chip Non Volatile Memory DNA DNA DNA (1) Sensor FingerChip FPC1031 FingerChip FPC1031 FingerChip AFS2 AFS8500 Optical One-touch 266 × 300 500 dpi TCS1 TCS2 TCS4 Optical One-touch 266 × 300 500 dpi TCS1 TCS2 TCS4 AES2510 Host Interface Serial interface Serial interface Serial interface Parallel interface RS-232 interface Parallel interface RS-232 interface RS-232 interface Serial interface RS-232 interface Serial interface RS-232 interface Serial interface RS-232 interface Serial interface RS-232 interface Serial interface RS-232 interface Serial interface RS-232 interface Serial interface RS-232 interface Serial interface Serial interface Smart card I/F SPI interface USB interface Serial interface Smart card I/F SPI interface USB interface Serial interface Smart card I/F SPI interface USB interface Serial interface Smart card I/F SPI interface USB interface Serial interface Smart card I/F SPI interface USB interface Bluetooth interface Other RF interfaces USB interface Serial 
interface USB interface RS-232 interface USB interface RS-232 interface Serial interface USB interface RS-232 interface Serial interface USB interface (1) (1) (1) (1) (1) >1 MB FLASH >1 MB FLASH (1) FIM 4060 FIM 4110 FIM 4120 FIM 4140 FIM 5060 FIM 5110 FIM 5120 FIM 5140 MK67Q5250 32 MB SDRAM 32 MB SDRAM 32 MB SDRAM 32 MB SDRAM 16 MB SDRAM 16 MB SDRAM 16 MB SDRAM 16 MB SDRAM – 16 MB FLASH 16 MB FLASH 16 MB FLASH 16 MB FLASH 8 MB FLASH 8 MB FLASH 8 MB FLASH 8 MB FLASH – OKI Evaluation Kit ML67Q5250-SDK-2510 Evaluation Kit ML67Q5250-SDK-1711 Evaluation Kit ML67Q5260-SDK-1751 Evalutation Kit ML67Q5260-SDK-1711 plusID60 plusID75 plusID90 MorphoSmart CBM MorphoSmart CBM-E MorphoSmart CBM V MorphoSmart SO XX1 OEM MorphoSmart SO-OEM ML67Q5250 – – AES2510 OKI ML67Q5250 – – AES1711 OKI ML67Q5260 – – AES1751 OKI ML67Q5260 BCM5890 ARM9 MPU 150 MHz – – AES1711 Privaris DNA (1) 2 MB FLASH AES2510 Optical One-touch 276 × 433 500 dpi Optical One-touch 453 × 453 500 dpi Optical One-touch 260 × 300 500 dpi Optical One-touch 258 × 336 500 dpi Safran Morpho DNA (1) DNA (1) DNA (1) Safran Morpho DNA (1) DNA (1) DNA (1) SecuGen SDA03M ARM MPU 400 MHz 32 MB RAM 32 MB FLASH SecuGen SDA04 ARM MPU 533 MHz 128 MB RAM 128 MB FLASH Note: Table 14 continues in next page. 117 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 
876-2012 Vendor Module Processor Off-chip Volatile Memory 128 MB RAM Off-chip Non Volatile Memory 128 MB FLASH 1 MB FLASH 1 MB FLASH Sensor Optical One-touch 260 × 300 500 dpi AFS2 Thermal Sweep 280 × 8 500 dpi Optical One-touch 280 × 320 500 dpi Optical One-touch 288 × 288 500 dpi Optical One-touch 256 × 302 500 dpi TCS1 TCS2 AFS2 Thermal Sweep 280 × 8 500 dpi Optical One-touch 280 × 320 500 dpi Optical One-touch 288 × 288 500 dpi Optical One-touch 256 × 302 500 dpi TCS1 TCS2 TCS4 Optical One-touch 280 × 320 500 dpi Optical One-touch 272 × 320 500 dpi TMP1000 OP100R ATW310 FPS200 Host Interface RS-232 interface Serial interface USB interface Serial interface Serial interface SecuGen SDA04M ARM MPU 533 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz ADSP-BF531 DSP 400 MHz Suprema Suprema SFM3000-FL SFM3010-FC 2 MB SDRAM 2 MB SDRAM Suprema SFM3020-OP 2 MB SDRAM 1 MB FLASH Serial interface Suprema SFM3030-OD 2 MB SDRAM 1 MB FLASH Serial interface Suprema Suprema Suprema Suprema Suprema SFM3040-OL SFM3050-TC1 SFM3050-TC2 SFM3050-TC2S SFM3500-FL SFM3510-FC 2 MB SDRAM 2 MB SDRAM 2 MB SDRAM 2 MB SDRAM 2 MB SDRAM 1 MB FLASH 1 MB FLASH 1 MB FLASH 4 MB FLASH 4 MB FLASH Serial interface Serial interface Serial interface RS-232 interface RS-485 interface RS-232 interface RS-485 interface RS-232 interface RS-485 interface RS-232 interface RS-485 interface RS-232 interface RS-485 interface RS-232 interface RS-485 interface RS-232 interface RS-485 interface Serial interface Serial interface Suprema SFM3520-OP 2 MB SDRAM 4 MB FLASH Suprema SFM3530-OD 2 MB SDRAM 4 MB FLASH Suprema Suprema Suprema Suprema Suprema SFM3540-OL SFM3550-TC1 SFM3550-TC2 SFM4000-TS4 SFM4020-OP 2 MB SDRAM 2 MB 
SDRAM 2 MB SDRAM 2 MB SDRAM 2 MB SDRAM 4 MB FLASH 4 MB FLASH 4 MB FLASH 512 kB FLASH 512 kB FLASH Suprema SFM5020-OP DSP 533 MHz 2 MB SDRAM 1 MB FLASH Serial interface RS-232 interface RS-485 interface RS-232 interface USB interface Serial interface USB interface Tacoma Texas Instruments Veridicom TME1000 Development Kit TMDXBDKFP5515 MatchBoard DSP 396 MHz TMS320C5515 DSP 100–120 MHz 32-bit RISC MPU NEC V832/833 144–180 MHz 16 MB SDRAM – 1 MB FLASH 512 kB FLASH 64 kB EEPROM 1 MB FLASH 8 MB SDRAM
DNA(1): Data Not Available

Table 14. Fingerprint-based embedded system modules.

Table 14 provides a benchmark of the existing embedded system platforms in charge of fingerprint-based personal recognition. For each embedded system, some technical details are given: the main processor core, the amount of off-chip memory available in the system, the fingerprint sensor device, and the communication interface that links the embedded system to an external host. Table 15 shows some of the most relevant SDKs developed in the area of fingerprint biometrics.

Vendor 3M Cogent AuthenTec AuthenTec BIO-key BioLink Biometrika Biometrika Biometrika Biometrika BioTrust TrueSuite TrueSuite Mobile Vector Segment Technology BioLink FxISO FX3 BioCard PKCS#11 FxGate SDK O.S.
Compatibility Windows XP Windows Vista Windows 7 Windows Windows Windows Windows CE Linux Win32 WinCE Linux Windows 95 / 98 / NT / 2000 Windows XP / Vista Linux Windows 9 / 2000 Windows XP / Vista Windows 98 / NT Windows 2000 / XP Windows 98 / NT / 2000 Windows XP / Vista Windows 7 Windows Vista Windows XP Windows XP Embedded Windows Server 2003/2008 Windows RDP Linux Windows Vista Windows XP Windows 7 Linux Windows 2003 Windows 2008 Windows Vista Windows XP Windows 7 WinCE 5.0 WinCE 6.0 Linux Windows 2000 Windows XP Windows Vista Windows 7 Windows XP Windows Vista Windows 7 Windows XP 32/64 Windows Vista 32/64 Windows 7 32/64 Linux 32/64 Windows Mobile 2003 Windows Mobile 5.0 Windows Mobile 6.0 Embedded Linux Windows NT Windows 2000 Windows XP Windows XP Window Vista Windows 7 Linux Tasks Identity management software for PC workstations Identity management software for PC workstations Identity management software for mobile embedded systems Identity management software for PC workstations Identity management software for PC workstations Identity management software for PC workstations Identity management software for smart cards Identity management software for smart cards (custom solution) Identity management software for Access Control and Time&Attendance embedded systems (custom solution) Digital Persona One Touch Identity management software for PC workstations EgisTec BioExcess Identity management software for PC workstations Futronic Fingerprint Recognition Identity management software for PC workstations Green Bit DactyMatch Identity management software for PC workstations id3 Semiconductors id3 Fingerprint Identity management software for PC workstations Innovatrics IDKit Identity management software for PC workstations Innovatrics L-1 Identity Solutions L-1 Identity Solutions IDKit Mobile IDKit Embedded BioEngine Fingerprint TouchPrint Live Scan Advanced Identity management software for mobile embedded systems Identity management software for PC 
workstations Identity management software for PC workstations Note: Table 15 continues in next page. 119 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Vendor SDK O.S. Compatibility Windows 2000 Windows 7 Windows Vista Windows XP Linux Windows CE Linux Windows 98SE / ME/ NT Windows 2000 / XP / Vista Linux Windows 95 / 98 / ME Windows 2000 / XP Windows XP Windows Vista Windows 7 Windows XP Windows Vista Windows 7 Windows Server 2003 / 2008 Visual Studio 2005 / 2008 Windows XP / 2000 / ME / 98SE Windows Vista Windows 7 Windows 7 / Vista / Server 2003 Windows XP / 2000 / ME / 98SE Windows 7 / Vista / Server 2003 Windows XP / 2000 Windows 98 / ME / 2000 Windows XP / Vista Windows 7 Linux Windows 98 / ME / 2000 Windows XP / Vista Windows 7 Windows 98SE / ME Windows 2000 Windows XP WinCE 3.0 WinCE 4.1 Windows 2000 Windows XP Windows Vista Windows 7 Linux Unix Windows 2000 Windows XP Windows 2000 Windows 2003 Server Windows XP Tasks Neurotechnology VeriFinger Identity management software for PC workstations Neurotechnology Nitgen Nitgen Precise Biometrics FingerCell eNBSP eNBioAPI Precise BioMatch Identity management software for mobile embedded systems Identity management software for PC workstations Identity management software for PC workstations Identity management software for PC workstations, smart cards and embedded systems Safran Morpho MorphoKit Identity management software for PC workstations SecuGen SecuGen SecuGen Suprema FDx SecuBSP BioAPI SecuSearch BioMini Identity management software for PC workstations Identity management software for PC workstations Identity management software for PC workstations Identity management software for PC workstations Suprema Image Identity management software for PC workstations Tacoma Tacoma Identity management software for PC workstations ZKSoftware ZK Finger Identity management software for PC workstations Zvetco Biometrics 
Zvetco Biometrics Authentec IdentiFi Identity management software for PC workstations Identity management software for PC workstations

Table 15. Fingerprint-based SDKs.

4.3. Technology Evaluation

The recognition accuracy of a fingerprint-based biometric system has to be determined empirically. For this purpose, matching algorithms are submitted to standard evaluation processes on public fingerprint databases that are accessible to anyone in the scientific community, in both industry and academia. The databases used in these evaluations contain samples large enough in each category to be representative of the whole population and to provide reliable results with proper confidence levels.

Several technology evaluation contests have been carried out over the last decade to measure, in a fair way, the progress made in the area of fingerprint recognition. The state of the art in fingerprint matching can be deduced from the results of these public evaluations, whose final aim is to publicize the current status of this active field of research. This section covers the recent evaluation programs that are mainly oriented to personal authentication applications in fingerprint-based biometrics.

4.3.1. Fingerprint Verification Competition 2000 (FVC2000)

FVC2000 was the first of the international FVC open contests for fingerprint verification algorithms, jointly organized by the Biometric Systems Laboratory (University of Bologna), the U.S. National Biometric Test Center (San Jose State University), and the Pattern Recognition and Image Processing Laboratory (Michigan State University).
FVC2000 attempted to establish a first common benchmark that allowed companies and academic institutions to compare unambiguously the performance of their fingerprint recognition algorithms when evaluated under the same conditions and with the same fingerprint databases. In this way, FVC2000 aimed at disseminating the progress made to date in this field of research.

Four different databases were used for evaluation purposes. The first three were collected using commercial fingerprint scanners of different technologies, such as optical and capacitive, covering large-area and small-area touch sensors. The fourth database was synthetically generated using the software tool SFinGe (Synthetic Fingerprint Generator) developed by the Biometric Systems Laboratory of the University of Bologna (http://biolab.csr.unibo.it/). Personal computers based on the Intel Pentium III processor running at 450 MHz under the Windows NT 4.0 and Linux RedHat 6.1 operating systems were used in the evaluation process. On these platforms, the maximum time for the enrolment and matching tasks was limited to 10 s and 5 s respectively. A specific protocol was defined, which every participant followed to adapt their algorithms to the contest evaluation criteria.

Table 16 shows the list of participants in the FVC2000 contest. The recognition accuracy (in terms of EER) and the execution time of each algorithm (in the enrolment and matching stages) are indicated in Table 17 for each of the databases used in the evaluation process.

ID | Participant institution
CETP | CEFET-PR / Antheus Technologia Ltda – Brazil
CSPN | Centre for Signal Processing, Nanyang Technological University – Singapore
CWAI | Centre for Wavelets, Approximation and Information Processing, Department of Mathematics, National University of Singapore – Singapore
DITI | Ditto Information & Technology Inc. – Korea
FPIN | FingerPin AG – Switzerland
KRDL | Kent Ridge Digital Labs – Singapore
NCMI | Natural Sciences and Mathematics, Institute of Informatics – Macedonia
SAG1 | Sagem S.A. – France
SAG2 | Sagem S.A. – France
UINH | Inha University – Korea
UTWE | University of Twente, Electrical Engineering – Netherlands
NEUR | Neurotechnologija Ltd. – Lithuania
TKEY | TeKey Research Group – Israel
FPI2 | FingerPin AG – Switzerland
Table 16. List of participants FVC2000.

ID | DB1 EER | DB1 Enrol time | DB1 Match time | DB2 EER | DB2 Enrol time | DB2 Match time | DB3 EER | DB3 Enrol time | DB3 Match time | DB4 EER | DB4 Enrol time | DB4 Match time
CETP | 5.06% | 0.81s | 0.89s | 4.63% | 0.85s | 0.98s | 8.29% | 1.49s | 1.66s | 7.29% | 0.65s | 0.72s
CSPN | 7.60% | 0.17s | 0.17s | 2.75% | 0.17s | 0.17s | 5.36% | 0.35s | 0.36s | 5.04% | 0.11s | 0.11s
CWAI | 7.06% | 0.22s | 0.32s | 3.01% | 0.23s | 0.30s | 11.94% | 0.46s | 0.57s | 6.30% | 0.16s | 0.20s
DITI | 23.63% | 0.65s | 0.72s | 13.83% | 1.21s | 1.28s | 22.63% | 2.59s | 2.67s | 23.80% | 0.52s | 0.60s
FPIN | 13.46% | 0.83s | 0.87s | 11.14% | 1.16s | 1.24s | 23.18% | 2.13s | 2.19s | 16.00% | 0.77s | 0.80s
KRDL | 10.66% | 1.00s | 1.06s | 8.83% | 1.16s | 2.88s | 12.20% | 1.48s | 1.60s | 12.08% | 0.70s | 0.79s
NCMI | 49.11% | 1.13s | 1.34s | 46.15% | 1.28s | 1.57s | 47.43% | 2.25s | 2.75s | 48.67% | 1.08s | 1.19s
SAG1 | 0.67% | 2.48s | 0.96s | 0.61% | 2.63s | 1.03s | 3.64% | 5.70s | 2.13s | 1.99% | 1.90s | 0.77s
SAG2 | 1.17% | 0.88s | 0.88s | 0.82% | 0.93s | 0.93s | 4.01% | 1.94s | 1.94s | 3.11% | 0.69s | 0.69s
UINH | 21.02% | 0.53s | 0.56s | 15.22% | 0.60s | 0.65s | 16.32% | 1.28s | 1.36s | 24.77% | 0.42s | 0.45s
UTWE | 7.98% | 10.40s | 2.10s | 10.65% | 10.42s | 2.12s | 17.73% | 10.44s | 2.31s | 24.59% | 10.42s | 4.17s
NEUR | 0.97% | 0.80s | 0.92s | 0.58% | 0.83s | 0.94s | 2.92% | 1.89s | 2.14s | 1.01% | 0.62s | 0.71s
TKEY | 3.41% | 0.02s | 4.77s | 2.44% | 0.02s | 4.77s | 6.17% | 0.02s | 5.20s | 22.36% | 0.02s | 5.48s
FPI2 | 4.53% | 0.88s | 1.43s | 6.21% | 1.34s | 2.04s | 12.78% | 4.07s | 5.67s | 6.27% | 0.77s | 1.14s
Table 17. Performance results FVC2000.

Although FVC2000 was not intended to predict performance in a real environment, the results in Table 17 suggest that the execution time of the enrolment and authentication tasks is in general quite acceptable: it could meet real-time expectations in typical one-to-one matching applications. However, the recognition accuracy exhibited by most of the algorithms, in terms of EER, is far from the targets expected in real-world applications that demand stringent security levels.

4.3.2. Fingerprint Verification Competition 2002 (FVC2002)

The interest that FVC2000 aroused in both the industrial and academic communities led the organizers to schedule a new competition in 2002. FVC2002 followed the same methodology as FVC2000, although different databases and processing platforms were used. Personal computers based on the Intel Pentium III processor running at 933 MHz under the Windows 2000 operating system were used for evaluation purposes. Two optical and one capacitive one-touch sensors were used to collect the first three databases. The fourth database was synthetically generated with the SFinGe application. As in the previous FVC contest, the maximum time for the enrolment and matching tasks was limited to 10 s and 5 s respectively.

Like FVC2000, the main aim of FVC2002 was to track recent advances in fingerprint verification and to provide an up-to-date picture of the state of the art in fingerprint technology. The list of participants and the performance results of each algorithm are shown in Table 18 and Table 19 respectively. Anonymous participation was accepted in order to increase the number of submissions with regard to the previous contest.
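The EER figures quoted throughout this section summarize the trade-off between accepting impostor comparisons and rejecting genuine ones: the EER is the error rate at the decision threshold where the false match rate and the false non-match rate coincide. As a minimal sketch of how such a figure can be estimated from two lists of similarity scores (this is not necessarily the procedure applied by the FVC organizers, which is not described here), the Python function below scans candidate thresholds and returns the point where both rates are closest. The function name, the convention that higher scores mean greater similarity, and the sample scores are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Estimate (EER, threshold) from similarity scores.

    Assumption: a comparison is accepted when score >= threshold.
    FMR  = fraction of impostor scores that are accepted.
    FNMR = fraction of genuine scores that are rejected.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above
    # the maximum so that the "reject everything" case is also covered.
    candidates = sorted(set(genuine) | set(impostor))
    candidates.append(candidates[-1] + 1.0)

    best_gap = float("inf")
    best = (1.0, candidates[0])
    for threshold in candidates:
        fmr = sum(s >= threshold for s in impostor) / len(impostor)
        fnmr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(fmr - fnmr)
        if gap < best_gap:
            # With finite samples the two rates rarely cross exactly,
            # so the EER is approximated by their mean at the closest point.
            best_gap = gap
            best = ((fmr + fnmr) / 2.0, threshold)
    return best


# Hypothetical scores, not taken from any FVC database.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real evaluation data the two lists would hold one score per genuine and per impostor comparison, so the resolution of the estimate depends on the number of comparisons available in each database.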
ID | Participant
PA02 | Andrey Nikiforov (Independent Developer) – United States
PA03 | Anonymous (Academy)
PA07 | Antheus Tecnologia Ltda – Brazil
PA08 | Neurotechnologija Ltd. – Lithuania
PA10 | Aldebaran Systems – United States
PA12 | Anonymous (Industry)
PA13 | Deng Guoqiang (Independent Developer) – China
PA14 | ActivCard Canada – Canada
PA15 | Bioscrypt Inc. – United States
PA16 | University of Tehran, Electrical and Computer Department – Iran
PA18 | Anonymous (Industry)
PA19 | Anonymous (Industry)
PA20 | Department of Computer Science and Information Engineering, Da-Yeh University – Taiwan
PA21 | Inha University – Korea
PA22 | AILab, Institute of Automation, The Chinese Academy of Sciences – China
PA24 | Biometrics System Lab, Beijing University of Posts and Telecommunications – China
PA25 | Anonymous (Other)
PA26 | Suprema Inc. – Korea
PA27 | Anonymous (Industry)
PA28 | FingerPIN AG – Switzerland
PA29 | HZMS Biometrics Co. Ltd. – China
PA31 | TeKey Research Group – Israel
PA32 | DATAMICRO Co. Ltd. – Russia
PA34 | Anonymous (Industry)
PA35 | Sagem – France
PA42 | Digital Fingerpass Corporation – China
PA45 | IDENCOM AG – Switzerland
PB05 | Siemens AG – Germany
PB15 | Bioscrypt Inc. – United States
PB27 | Anonymous (Industry)
PB35 | Sagem – France
LA22 | Digital Fingerpass Software Co.Ltd. & AI labs of Chinese Academy of Sciences – China
Table 18. List of participants FVC2002.

The results obtained provided some reference and guidance to participants and other developers in industry and academia for improving their algorithms.

Each line below gives one column of Table 19; the values follow the order of the ID line.
ID: PA02 PA03 PA07 PA08 PA10 PA12 PA13 PA14 PA15 PA16 PA18 PA19 PA20 PA21 PA22 PA24 PA25 PA26 PA27 PA28 PA29 PA31 PA32 PA34 PA35 PA42 PA45 PB05 PB15 PB27 PB35 LA22
DB1 EER: 2.57% 50.00% 3.74% 0.98% 4.37% 1.91% 1.46% 2.70% 0.10% 16.28% 6.12% 2.15% 6.71% 4.97% 17.34% 2.36% 35.00% 1.63% 0.25% 4.78% 2.72% 3.85% 3.02% 1.85% 0.67% 8.27% 1.17% 0.52% 0.63% 0.24% 0.61% 0.91%
DB1 Enrol time: 0.83s 7.59s 0.21s 0.52s 1.91s 0.26s 0.17s 0.80s 0.12s 1.22s 0.70s 0.19s 0.13s 0.69s 0.59s 0.58s 0.56s 0.53s 2.26s 0.52s 0.67s 0.01s 0.34s 0.56s 4.82s 0.51s 0.55s 0.54s 0.07s 1.30s 0.92s 0.35s
DB1 Match time: 1.29s 5.01s 0.61s 0.52s 1.91s 0.27s 0.44s 1.84s 2.47s 1.24s 0.71s 0.20s 0.16s 0.74s 0.68s 0.60s 0.67s 0.61s 1.99s 0.87s 0.68s 3.16s 0.59s 0.68s 1.97s 0.52s 0.67s 0.58s 0.24s 1.15s 0.76s 0.37s
DB2 EER: 0.85% 50.00% 2.78% 0.52% 7.86% 2.89% 1.85% 3.03% 0.17% 18.69% 4.24% 1.99% 15.92% 3.48% 16.29% 2.35% 35.18% 1.28% 0.14% 4.33% 2.71% 4.71% 3.71% 1.61% 1.18% 4.12% 1.21% 0.69% 1.03% 0.21% 1.21% 0.30%
DB2 Enrol time: 0.99s 7.52s 0.29s 0.67s 1.61s 0.30s 0.23s 0.80s 0.13s 1.63s 0.76s 0.26s 0.19s 0.99s 0.78s 0.68s 0.73s 0.70s 2.01s 0.67s 0.92s 0.01s 0.46s 0.59s 5.15s 0.66s 0.70s 0.59s 0.08s 1.15s 0.96s 0.50s
DB2 Match time: 1.62s 5.01s 0.84s 0.68s 1.60s 0.32s 0.76s 1.85s 2.44s 1.67s 0.79s 0.27s 0.23s 1.05s 0.89s 0.70s 0.85s 0.80s 2.03s 1.11s 0.95s 3.19s 0.96s 0.72s 2.02s 0.69s 0.88s 0.65s 0.24s 1.12s 0.84s 0.52s
DB3 EER: 1.54% 50.00% 7.75% 1.78% 6.69% 14.46% 3.99% 10.85% 0.37% 18.65% 27.84% 8.17% 10.03% 9.36% 14.96% 6.62% 42.25% 5.08% 0.72% 9.85% 6.59% 5.36% 14.25% 6.77% 1.78% 5.97% 3.48% 1.48% 0.81% 1.02% 2.17% 2.05%
DB3 Enrol time: 0.62s 6.31s 0.15s 0.35s 1.91s 0.16s 0.12s 0.53s 0.08s 0.83s 0.62s 0.13s 0.10s 0.90s 0.41s 0.36s 0.37s 0.37s 1.93s 0.32s 0.47s 0.01s 0.26s 0.48s 2.77s 0.35s 0.38s 0.33s 0.07s 1.31s 0.53s 0.27s
DB3 Match time: 0.89s 5.01s 0.30s 0.35s 1.91s 0.17s 0.35s 1.46s 2.24s 0.84s 0.63s 0.13s 0.12s 0.92s 0.47s 0.36s 0.45s 0.44s 1.86s 0.48s 0.47s 3.18s 0.31s 0.58s 1.21s 0.36s 0.43s 0.36s 0.25s 1.20s 0.48s 0.27s
DB4 EER: 0.28% 50.00% 7.56% 0.68% 5.71% 9.21% 1.42% 4.28% 0.10% 13.53% 10.17% 4.45% 3.50% 6.45% 10.05% 3.70% 43.96% 1.99% 0.21% 5.24% 4.91% 8.97% 5.89% 3.03% 1.10% 7.26% 3.03% 0.98% 0.61% 0.17% 1.71% 1.14%
DB4 Enrol time: 0.80s 6.79s 0.17s 0.68s 1.81s 0.21s 0.14s 0.61s 0.12s 0.98s 0.65s 0.14s 0.10s 0.62s 0.49s 0.69s 0.44s 0.58s 2.28s 0.40s 0.54s 0.01s 0.27s 0.50s 3.44s 0.42s 0.47s 0.47s 0.07s 1.17s 0.65s 0.28s
DB4 Match time: 1.14s 5.01s 0.40s 0.68s 1.81s 0.36s 0.37s 1.89s 0.73s 1.00s 0.66s 0.15s 0.11s 0.65s 0.56s 0.68s 0.54s 0.66s 2.04s 0.60s 0.54s 3.06s 0.38s 0.61s 1.42s 0.42s 0.52s 0.50s 0.17s 1.06s 0.56s 0.29s
Table 19. Performance results FVC2002.

4.3.3. Fingerprint Verification Competition 2004 (FVC2004)

Since FVC2000 and FVC2002 were successful initiatives, the organizers scheduled a new competition in 2004. FVC2004 used four new databases: two of them collected with optical one-touch scanners, the third collected by means of a thermal sweep sensor, and the last one synthetically generated with the SFinGe application. More powerful personal computers, based on the AMD ATHLON 1600+ processor operating at 1.41 GHz under the Windows XP Professional operating system, were used in this contest.

Some further changes were made to the organization of FVC2004 with regard to the previous contests. Two different sub-competitions, called open category and light category, were introduced. The open category followed the same criteria as the previous contests: the only limitation was on the maximum enrolment and matching execution times, set to 10 s and 5 s respectively. Additional constraints, however, were imposed on the light category. The light category was intended for algorithms executed on embedded system platforms such as token systems or smart cards, mainly characterized by reduced memory and processing power capabilities.
To emulate such limited resources, the maximum memory allocated by the application in that scenario was limited to 4 Mbytes, and the maximum size of the encoded templates was limited to 2 Kbytes. Additionally, the execution time of the enrolment and authentication stages was limited to 0.5 s and 0.3 s respectively. Under these conditions, less complex algorithms had to be designed to meet the requirements of the light category.

Table 20 presents the list of participants in FVC2004. Each participant was allowed to submit one algorithm to each category. The performance results in the open and light categories are shown in Table 21 and Table 22 respectively.

ID | Participant
P002 | Anonymous (Academy)
P004 | Gevarius – Russia
P006 | Anonymous (Industry)
P009 | Suprema Inc. – Korea
P011 | Anonymous (Academy)
P016 | Deng Guoqiang (Independent Developer) – China
P026 | Anonymous (Industry)
P027 | Zaklad Techniki Mikroprocesorowej – Poland
P039 | Jan Lunter (Independent Developer) – France
P041 | Miaxis Biometrics Co., Ltd. – China
P047 | Sonda Ltd. – Russia
P048 | Anonymous (Industry)
P049 | Neurotechnologija Ltd. – Lithuania
P050 | Anonymous (Independent Developer)
P051 | Nyoun – Korea
P052 | Integral Ltd. – Ukraine
P065 | ActivCard – Canada
P067 | Anonymous (Industry)
P068 | IDENCOM Germany GmbH – Germany
P071 | Institute of Automation, The Chinese Academy of Sciences – China
P072 | DATAMICRO Co., Ltd. – Russia
P075 | Ariel Unanue (Independent Developer) – Argentina
P078 | Anonymous (Industry)
P079 | Anonymous (Industry)
P080 | Anonymous (Industry)
P083 | Anonymous (Industry)
P087 | Anonymous (Industry)
P093 | Anonymous (Independent Developer)
P097 | NITGEN Co., Ltd. – Korea
P099 | Anonymous (Academy)
P101 | Bioscrypt Inc. – Canada
P103 | Testech Inc. – Korea
P104 | Anonymous (Independent Developer)
P105 | Li Lijuan (Independent Developer) – China
P106 | Anonymous (Academy)
P107 | Ji Hui (Independent Developer) – China
P108 | Beijing HanWang Technology Co., Ltd. – China
P109 | Anonymous (Academy)
P111 | Changsha XingTong technology development Co., Ltd. – China
P112 | Futronic Technology Company Limited – China
P113 | Anonymous (Industry)
P118 | Anonymous (Industry)
P120 | Morphosoric – Germany
Table 20. List of participants FVC2004.

Each line below gives one column of Table 21; the values follow the order of the ID line.
ID: P002 P004 P006 P009 P011 P016 P026 P027 P039 P041 P047 P048 P049 P050 P051 P052 P067 P068 P071 P072 P075 P078 P079 P080 P083 P087 P093 P097 P099 P101 P103 P104 P105 P106 P108 P109 P111 P112 P113 P118 P120
DB1 EER: 10.85% 4.10% 19.36% 3.62% 10.39% 8.87% 5.54% 9.78% 7.18% 7.61% 1.97% 7.47% 3.91% 10.06% 11.93% 8.41% 12.48% 7.68% 4.37% 10.81% 5.64% 13.65% 41.94% 13.53% 6.49% 14.03% 15.30% 3.38% 9.55% 2.72% 4.18% 50.00% 11.25% 35.68% 6.13% 35.30% 8.31% 10.57% 7.65% 25.69% 8.94%
DB1 Enrol time: 2.13s 0.77s 0.34s 0.25s 0.70s 0.16s 2.60s 1.05s 1.16s 0.17s 1.95s 0.44s 0.32s 0.67s 0.14s 0.28s 0.07s 0.63s 0.39s 0.12s 0.41s 0.19s 1.92s 0.62s 0.37s 0.28s 0.08s 0.74s 0.46s 0.10s 0.16s 7.93s 0.37s 0.26s 0.35s 1.47s 0.64s 0.35s 0.67s 2.90s 1.90s
DB1 Match time: 2.84s 0.75s 0.32s 0.23s 0.69s 0.35s 3.56s 3.26s 1.32s 0.20s 1.87s 0.40s 0.47s 0.66s 0.18s 0.26s 0.08s 0.65s 0.77s 0.12s 0.44s 0.18s 2.78s 0.61s 0.37s 0.28s 0.09s 0.75s 0.44s 3.19s 0.16s 7.99s 0.36s 0.34s 0.35s 1.24s 0.64s 0.36s 0.69s 3.85s 2.00s
DB2 EER: 7.89% 2.79% 13.63% 4.01% 5.48% 4.39% 5.75% 5.50% 1.58% 5.15% 3.49% 4.67% 3.62% 6.35% 12.92% 6.25% 8.65% 3.75% 2.59% 7.47% 5.68% 14.36% 36.02% 8.03% 6.00% 8.68% 11.77% 3.23% 8.87% 3.56% 4.99% 6.66% 7.45% 31.05% 4.83% 33.80% 5.54% 7.73% 3.17% 9.11% 6.80%
DB2 Enrol time: 0.96s 0.61s 0.29s 0.23s 0.37s 0.14s 2.24s 0.72s 0.72s 0.12s 2.12s 0.37s 0.34s 0.51s 0.12s 0.23s 0.04s 0.34s 0.32s 0.09s 0.25s 0.12s 1.77s 0.33s 0.40s 0.22s 0.07s 0.49s 0.47s 0.07s 0.12s 2.84s 0.36s 0.20s 0.30s 1.31s 0.34s 0.26s 0.33s 2.23s 1.01s
DB2 Match time: 1.36s 0.62s 0.29s 0.23s 0.37s 0.31s 3.26s 2.46s 0.83s 0.14s 2.18s 0.36s 0.34s 0.51s 0.16s 0.22s 0.04s 0.39s 0.57s 0.10s 0.30s 0.11s 2.70s 0.33s 0.43s 0.24s 0.08s 0.53s 0.48s 0.81s 0.12s 2.93s 0.36s 0.26s 0.31s 1.25s 0.34s 0.29s 0.35s 2.65s 1.08s
DB3 EER: 3.82% 1.89% 7.30% 3.74% 10.07% 2.39% 14.39% 7.07% 1.78% 4.19% 1.18% 2.70% 4.03% 5.97% 7.07% 5.94% 3.86% 3.39% 1.64% 7.11% 1.85% 7.56% 37.92% 11.41% 11.64% 8.73% 11.46% 4.16% 9.46% 1.20% 5.38% 6.77% 8.22% 19.74% 3.54% 43.95% 4.65% 9.60% 2.05% 6.15% 6.97%
DB3 Enrol time: 1.18s 0.76s 0.40s 0.30s 0.45s 0.18s 3.11s 1.05s 0.82s 0.16s 2.31s 0.45s 0.40s 0.71s 0.15s 0.28s 0.04s 0.42s 0.40s 0.08s 0.33s 0.16s 2.54s 0.40s 0.46s 0.30s 0.09s 0.17s 0.51s 0.07s 0.15s 2.60s 0.42s 0.29s 0.34s 1.61s 0.41s 0.29s 0.46s 2.79s 0.93s
DB3 Match time: 2.04s 0.80s 0.41s 0.30s 0.45s 0.39s 5.10s 2.61s 1.09s 0.19s 2.30s 0.45s 0.40s 0.78s 0.22s 0.28s 0.05s 0.47s 0.81s 0.12s 0.68s 0.16s 2.93s 0.40s 0.52s 0.34s 0.15s 0.22s 0.54s 0.85s 0.15s 2.73s 0.42s 0.40s 0.35s 1.53s 0.42s 0.33s 0.50s 3.54s 1.26s
DB4 EER: 10.59% 1.01% 12.02% 1.87% 3.99% 1.42% 3.38% 3.32% 1.07% 2.61% 1.76% 2.78% 1.40% 2.54% 8.21% 3.89% 3.68% 1.29% 0.61% 5.18% 5.99% 13.59% 31.20% 5.30% 3.06% 7.05% 12.19% 1.75% 5.09% 0.80% 2.78% 3.93% 6.60% 27.29% 1.68% 38.25% 4.13% 5.29% 1.98% 4.53% 3.53%
DB4 Enrol time: 0.87s 0.63s 0.31s 0.22s 0.34s 0.13s 2.13s 0.66s 1.35s 0.12s 1.91s 0.35s 0.31s 0.46s 0.11s 0.21s 0.04s 0.32s 0.30s 0.08s 0.22s 0.12s 2.05s 0.31s 0.35s 0.19s 0.06s 0.48s 0.35s 0.09s 0.11s 1.50s 0.34s 0.18s 0.26s 1.39s 0.31s 0.24s 0.34s 2.02s 0.79s
DB4 Match time: 1.46s 0.65s 0.32s 0.22s 0.35s 0.30s 3.20s 2.17s 1.53s 0.14s 1.93s 0.35s 0.32s 0.48s 0.14s 0.20s 0.04s 0.37s 0.53s 0.09s 0.29s 0.12s 2.79s 0.31s 0.37s 0.21s 0.09s 0.52s 0.35s 1.06s 0.12s 1.60s 0.34s 0.25s 0.26s 1.35s 0.32s 0.27s 0.37s 2.36s 0.87s
Table 21. Performance results FVC2004 (open category).

Each line below gives one column of Table 22; the values follow the order of the ID line.
ID: P006 P009 P016 P027 P041 P049 P050 P052 P065 P067 P068 P071 P072 P078 P087 P093 P097 P099 P101 P103 P104 P105 P107 P108 P111 P112
DB1 EER: 37.97% 3.89% 9.93% 15.85% 8.80% 7.93% 50.00% 50.00% 10.26% 15.66% 9.92% 7.35% 11.42% 50.00% 34.80% 15.30% 6.10% 39.03% 5.28% 4.18% 50.00% 18.03% 4.78% 6.45% 14.93% 50.00%
DB1 Enrol time: 0.33s 0.25s 0.16s 0.18s 0.11s 0.12s 0.30s 0.27s 0.16s 0.07s 0.16s 0.20s 0.12s 0.19s 0.28s 0.08s 0.21s 0.16s 0.08s 0.16s 1.56s 0.25s 0.12s 0.24s 0.25s 0.34s
DB1 Match time: 0.32s 0.21s 0.19s 0.25s 0.12s 0.19s – – 0.16s 0.07s 0.18s 0.18s 0.12s 0.18s 0.28s 0.09s 0.22s 0.17s 0.17s 0.16s – 0.25s 0.11s 0.24s 0.25s –
DB2 EER: 20.54% 4.01% 5.64% 9.78% 5.45% 6.11% 50.00% 24.86% 7.78% 8.81% 4.02% 5.63% 7.89% 14.36% 15.02% 11.77% 4.79% 12.18% 5.22% 4.99% 50.00% 14.98% 4.43% 4.25% 6.58% 50.00%
DB2 Enrol time: 0.29s 0.23s 0.16s 0.17s 0.10s 0.12s 0.30s 0.23s 0.15s 0.04s 0.16s 0.18s 0.09s 0.12s 0.22s 0.06s 0.19s 0.17s 0.09s 0.12s 1.57s 0.22s 0.13s 0.21s 0.22s 0.26s
DB2 Match time: 0.29s 0.23s 0.18s 0.22s 0.12s 0.13s – 0.22s 0.15s 0.04s 0.18s 0.18s 0.09s 0.11s 0.24s 0.08s 0.20s 0.19s 0.17s 0.12s – 0.23s 0.12s 0.21s 0.23s –
DB3 EER: 50.00% 4.24% 3.21% 10.45% 6.93% 4.40% 50.00% 50.00% 6.14% 7.59% 5.07% 4.69% 7.28% 7.56% 54.28% 11.46% 5.13% 50.00% 3.57% 5.38% 50.00% 13.57% 3.53% 2.92% 8.20% 50.00%
DB3 Enrol time: 0.40s 0.30s 0.18s 0.20s 0.11s 0.14s 0.31s 0.28s 0.15s 0.04s 0.15s 0.22s 0.09s 0.16s 0.30s 0.09s 0.17s 0.20s 0.09s 0.15s 1.56s 0.23s 0.15s 0.24s 0.23s 0.29s
DB3 Match time: 0.41s 0.24s 0.21s 0.24s 0.13s 0.15s – 0.28s 0.17s 0.05s 0.17s 0.21s 0.10s 0.16s 0.34s 0.15s 0.17s 0.21s 0.18s 0.15s – 0.24s 0.15s 0.25s 0.24s –
DB4 EER: 22.34% 1.88% 2.26% 8.64% 3.65% 4.10% 50.00% 26.03% 3.10% 4.13% 2.13% 1.99% 5.56% 13.59% 7.74% 12.19% 3.43% 6.22% 3.10% 2.78% 50.00% 13.63% 2.03% 2.25% 9.46% 50.00%
DB4 Enrol time: 0.31s 0.22s 0.16s 0.16s 0.10s 0.11s 0.27s 0.21s 0.15s 0.04s 0.18s 0.16s 0.08s 0.12s 0.19s 0.06s 0.18s 0.14s 0.10s 0.11s 1.51s 0.20s 0.11s 0.21s 0.20s 0.24s
DB4 Match time: 0.31s 0.21s 0.19s 0.23s 0.11s 0.12s – 0.21s 0.16s 0.04s 0.19s 0.16s 0.08s 0.12s 0.21s 0.09s 0.19s 0.16s 0.17s 0.12s – 0.21s 0.11s 0.22s 0.22s –
Table 22. Performance results FVC2004 (light category).

The databases used in FVC2004 were more difficult than those used in previous contests because of perturbations deliberately introduced during the collection of the fingerprint impressions (dry fingers, wet fingers, high or low pressure in the acquisition process, rotated fingers, skin distortion, etc.). For this reason, the recognition accuracy exhibited by the algorithms is markedly poorer than in previous benchmarks. Furthermore, the comparison of the open and light categories shows that imposing constraints on computing resources (time, model and memory size) drastically affects the final accuracy of the recognition system.

4.3.4. Fingerprint Verification Competition 2006 (FVC2006)

In 2006 a new FVC contest was organized by the Biometric Systems Laboratory (University of Bologna), the Pattern Recognition and Image Processing Laboratory (Michigan State University), and the U.S. National Biometric Test Center (San Jose State University), with the additional support of a fourth organizer that had not participated in the previous contests: the Biometrics Research Laboratory - ATVS (Universidad Autónoma de Madrid). FVC2006 followed the same rules as FVC2004. The open and light sub-categories were kept, but more stringent execution time limits were applied in this contest.
ID | Participant
P006 | Anonymous (Academy)
P009 | Anonymous (Independent Developer)
P015 | Anonymous (Industry)
P017 | Suprema Inc. – Korea
P022 | Anonymous (Industry)
P024 | Unicomp Technology Co., Ltd. – China
P036 | Anonymous (Academy)
P041 | Deng Guoqiang (Independent Developer) – China
P045 | Beijing BaiXinTong Electronic Tech Co., Ltd. – China
P050 | Anonymous (Academy)
P052 | Anonymous (Academy)
P053 | Anonymous (Independent Developer)
P054 | Ilya Belogin (Independent Developer) – Russia
P058 | Neurotechnologija Ltd. – Lithuania
P060 | Yang Qianbang (Independent Developer) – China
P065 | Ji Hui (Independent Developer) – China
P066 | Griaule Tecnologia Ltda. – Brazil
P067 | Siemens IT Solutions and Services Biometrics Center – Austria
P072 | Miaxis Biometrics Co., Ltd. – China
P073 | Anonymous (Academy)
P074 | Innovatrics – France
P081 | Anonymous (Industry)
P083 | Anonymous (Industry)
P085 | Anonymous (Academy)
P088 | Anonymous (Industry)
P090 | Anonymous (Industry)
P092 | Anonymous (Industry)
P095 | Anonymous (Academy)
P096 | Biometric Technologies, Ltd. – Russia
P097 | Anonymous (Industry)
P098 | Dalian University of Technology – China
P101 | Anonymous (Independent Developer)
P103 | Ilya Poshivaylo (Independent Developer) – Russia
P106 | Anonymous (Academy)
P109 | Anonymous (Industry)
P118 | Anonymous (Academy)
P119 | Anonymous (Academy)
P120 | Anonymous (Industry)
P121 | Song Yong (Independent Developer) – China
P122 | Anonymous (Independent Developer)
P123 | Anonymous (Academy)
P124 | Anonymous (Industry)
P129 | Beijing Smackbio Technology Co., Ltd. – China
P131 | Institute of Automation, Chinese Academy of Sciences – China
P133 | GuangZhou Comet Technology Development Co., Ltd. – China
P138 | Wei Wang (Independent Developer) – China
P141 | Xu Zengbo (Independent Developer) – China
P143 | LGE Institute of Technology – Korea
P148 | Anonymous (Industry)
P151 | Anonymous (Industry)
P153 | Anonymous (Industry)
Table 23. List of participants FVC2006.

The computational platforms used in the evaluation process were personal computers with Intel Pentium IV processors running at 3.2 GHz with 1.0 GB RAM under the Windows XP Professional operating system. The maximum execution time for the enrolment and matching stages was set to 5 s and 3 s respectively for the open category, and to 0.3 s and 0.1 s for the light category. Four new databases were used in this contest. The first one was collected by means of a commercial one-touch fingerprint sensor based on electric field sensing, the second with an optical sensor, the third with a thermal sweep sensor, and the last database was synthetically generated by means of the SFinGe software. Table 23 shows the list of participants in FVC2006. Table 24 and Table 25 report the performance achieved in the open and light categories respectively.
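The time limits quoted for the open and light categories of FVC2004 and FVC2006 can be gathered in a small lookup table to check whether a measured pair of enrolment and matching times would fit a given category. The following Python sketch is only an illustration: the names `TimeLimits`, `LIMITS` and `within_limits` are hypothetical, and the additional light-category constraints on memory and template size stated for FVC2004 are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeLimits:
    enrol_s: float  # maximum enrolment time, in seconds
    match_s: float  # maximum matching time, in seconds


# Limits as stated in the text for each contest and category.
LIMITS = {
    ("FVC2004", "open"): TimeLimits(enrol_s=10.0, match_s=5.0),
    ("FVC2004", "light"): TimeLimits(enrol_s=0.5, match_s=0.3),
    ("FVC2006", "open"): TimeLimits(enrol_s=5.0, match_s=3.0),
    ("FVC2006", "light"): TimeLimits(enrol_s=0.3, match_s=0.1),
}


def within_limits(contest: str, category: str,
                  enrol_s: float, match_s: float) -> bool:
    """Return True when both measured times respect the category limits."""
    limits = LIMITS[(contest, category)]
    return enrol_s <= limits.enrol_s and match_s <= limits.match_s


# DB1 times reported in Table 24 for two open-category algorithms.
print(within_limits("FVC2006", "light", 0.038, 0.039))  # P017
print(within_limits("FVC2006", "light", 0.572, 0.597))  # P015
print(within_limits("FVC2006", "open", 0.572, 0.597))   # P015
```

Under these assumptions the times measured for P017 on DB1 would also satisfy the light-category limits, whereas those of P015 only fit the open category.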
Each line below gives one column of Table 24; the values follow the order of the ID line.
ID: P006 P009 P015 P017 P022 P024 P036 P041 P045 P050 P053 P054 P058 P060 P065 P066 P067 P072 P073 P074 P081 P083 P085 P088 P090 P092 P095 P096 P097 P098 P101 P106 P109 P118 P119 P120 P124 P131 P138 P141 P148 P151
DB1 EER: 34.829% 7.370% 7.823% 5.564% 15.530% 8.255% 28.577% 9.468% 6.122% 19.284% 9.900% 11.746% 7.496% 9.124% 10.385% 5.978% 7.044% 8.887% 46.358% 7.733% 18.917% 27.804% 27.536% 8.794% 12.102% 13.208% 14.929% 11.573% 47.075% 12.309% 7.928% 38.156% 15.497% 31.354% 21.133% 17.307% 13.817% 6.922% 15.361% 11.472% 11.423% 16.763%
DB1 Enrol time: 0.028s 0.265s 0.572s 0.038s 0.234s 0.065s 0.306s 0.033s 0.074s 0.116s 0.159s 0.040s 0.103s 0.134s 0.023s 0.430s 0.108s 0.024s 0.025s 0.092s 0.011s 0.559s 0.926s 0.503s 0.044s 0.039s 0.266s 0.039s 0.265s 0.065s 0.176s 0.016s 0.121s 0.051s 0.054s 0.044s 0.042s 0.068s 0.087s 0.032s 0.106s 0.080s
DB1 Match time: 0.028s 0.304s 0.597s 0.039s 0.235s 0.067s 0.327s 0.049s 0.311s 0.119s 0.156s 0.041s 0.103s 0.136s 0.022s 0.506s 0.120s 0.035s 0.025s 0.094s 0.011s 0.563s 0.968s 0.515s 0.028s 0.041s 0.339s 0.041s 0.265s 0.067s 0.283s 0.016s 0.123s 0.057s 0.052s 0.046s 0.210s 0.205s 0.086s 0.052s 0.105s 0.249s
DB2 EER: 2.093% 0.095% 0.032% 0.486% 0.290% 0.474% 2.847% 0.548% 0.138% 1.103% 0.970% 0.807% 0.100% 2.453% 0.137% 0.122% 0.185% 0.268% 1.777% 0.248% 0.680% 4.444% 10.337% 0.021% 0.374% 0.712% 9.368% 0.707% 25.785% 2.393% 0.121% 2.288% 2.522% 10.342% 3.242% 4.644% 2.378% 0.511% 1.297% 0.237% 1.555% 10.949%
DB2 Enrol time: 0.521s 0.799s 1.434s 0.109s 0.655s 0.214s 1.119s 0.105s 0.226s 0.157s 0.831s 0.073s 0.587s 0.601s 0.091s 0.771s 0.403s 0.072s 0.342s 0.257s 0.037s 0.398s 1.630s 1.085s 0.141s 0.069s 1.044s 0.069s 1.700s 0.254s 0.727s 0.252s 0.149s 0.319s 0.132s 0.566s 1.159s 0.322s 0.306s 0.158s 0.529s 1.408s
DB2 Match time: 1.204s 0.899s 1.461s 0.110s 0.666s 0.146s 1.424s 0.113s 0.253s 0.157s 0.804s 0.072s 0.589s 0.611s 0.091s 1.002s 0.427s 0.116s 0.347s 0.270s 0.038s 0.399s 1.959s 1.100s 0.085s 0.072s 1.215s 0.072s 1.756s 0.290s 0.769s 0.344s 0.151s 0.464s 0.161s 0.570s 2.541s 0.417s 0.304s 0.168s 0.520s 1.942s
DB3 EER: 7.707% 1.645% 1.534% 2.762% 5.161% 2.335% 29.130% 2.810% 1.890% 6.378% 5.735% 4.871% 1.608% 4.778% 2.979% 2.054% 2.203% 2.135% 7.675% 1.681% 5.755% 10.589% 19.737% 2.156% 3.959% 4.877% 14.834% 4.850% 34.587% 5.619% 3.019% 7.744% 6.374% 25.998% 9.014% 8.572% 3.991% 2.615% 7.022% 3.506% 6.168% 10.332%
DB3 Enrol time: 0.320s 0.583s 1.682s 0.097s 0.494s 0.239s 0.799s 0.123s 0.194s 0.336s 0.775s 0.068s 0.307s 0.499s 0.058s 0.517s 0.255s 0.071s 0.263s 0.235s 0.032s 0.365s 1.204s 0.844s 0.130s 0.065s 0.909s 0.065s 1.382s 0.244s 0.508s 0.157s 0.148s 0.224s 0.127s 0.466s 0.325s 0.275s 0.056s 0.128s 0.475s 0.994s
DB3 Match time: 1.101s 0.641s 1.678s 0.097s 0.502s 0.169s 0.915s 0.135s 0.223s 0.349s 0.730s 0.067s 0.307s 0.507s 0.057s 0.661s 0.274s 0.114s 0.265s 0.242s 0.034s 0.366s 1.332s 0.847s 0.086s 0.068s 1.076s 0.068s 1.440s 0.261s 0.543s 0.248s 0.149s 0.319s 0.144s 0.463s 0.666s 0.437s 0.067s 0.142s 0.466s 1.876s
DB4 EER: 7.581% 0.269% 0.627% 1.112% 2.868% 1.350% 5.899% 1.486% 0.759% 4.020% 2.393% 5.756% 0.991% 2.999% 1.645% 0.466% 1.570% 1.345% 7.911% 0.453% 7.138% 14.794% 28.219% 0.891% 1.218% 5.245% 45.888% 5.432% 47.885% 3.010% 0.691% 6.268% 9.426% 26.839% 50.000% 19.979% 1.871% 0.701% 4.718% 1.687% 17.702% 10.611%
DB4 Enrol time: 0.188s 0.564s 1.085s 0.072s 0.361s 0.133s 0.684s 0.062s 0.146s 0.172s 0.403s 0.059s 0.292s 0.323s 0.050s 0.725s 0.197s 0.045s 0.174s 0.230s 0.031s 0.359s 0.783s 1.528s 0.088s 0.057s 0.638s 0.057s 0.820s 0.130s 0.636s 0.091s 0.148s 0.139s 0.089s 0.321s 0.671s 0.185s 0.205s 0.145s 0.266s 0.917s
DB4 Match time: 0.280s 0.601s 1.113s 0.073s 0.362s 0.109s 0.745s 0.080s 0.178s 0.172s 0.387s 0.059s 0.294s 0.325s 0.049s 1.023s 0.244s 0.074s 0.176s 0.238s 0.033s 0.361s 0.861s 1.549s 0.055s 0.060s 0.916s 0.060s 0.824s 0.132s 0.679s 0.111s 0.151s 0.240s 0.089s 0.321s 2.084s 0.291s 0.203s 0.155s 0.265s 1.324s
Table 24. Performance results FVC2006 (open category).
Each line below gives one column of Table 25; the values follow the order of the ID line.
ID: P017 P045 P052 P054 P058 P060 P065 P072 P081 P090 P092 P096 P101 P103 P121 P122 P123 P129 P131 P133 P138 P141 P143 P153
DB1 EER: 5.564% 6.420% 11.026% 11.746% 8.019% 50.000% 10.385% 9.412% 18.917% 12.136% 13.208% 11.573% 11.839% 43.086% 7.877% 12.732% 23.468% 5.356% 9.942% 5.888% 16.419% 10.514% 10.116% 22.264%
DB1 Enrol time: 0.037s 0.048s 0.019s 0.040s 0.075s 0.134s 0.023s 0.024s 0.011s 0.041s 0.039s 0.039s 0.079s 0.058s 0.027s 0.047s 0.045s 0.031s 0.035s 0.039s 0.087s 0.024s 0.022s 0.020s
DB1 Match time: 0.039s 0.052s 0.022s 0.041s 0.075s 0.135s 0.022s 0.032s 0.011s 0.028s 0.041s 0.041s 0.075s 0.050s 0.027s 0.047s 0.040s 0.029s 0.036s 0.036s 0.086s 0.026s 0.023s 0.026s
DB2 EER: 0.585% 0.290% 0.886% 0.807% 0.295% 50.000% 0.148% 0.586% 0.680% 0.411% 0.712% 0.707% 50.000% 7.624% 0.190% 1.972% 8.377% 0.169% 1.993% 0.158% 1.882% 0.295% 0.474% 42.499%
DB2 Enrol time: 0.109s 0.101s 0.074s 0.072s 0.181s 0.600s 0.092s 0.043s 0.037s 0.114s 0.070s 0.069s 0.208s 0.069s 0.070s 0.057s 0.146s 0.056s 0.076s 0.064s 0.072s 0.044s 0.045s 0.076s
DB2 Match time: 0.069s 0.047s 0.080s 0.071s 0.082s – 0.091s 0.061s 0.038s 0.064s 0.072s 0.072s 0.198s 0.063s 0.067s 0.058s 0.063s 0.056s 0.063s 0.066s 0.082s 0.052s 0.044s 0.114s
DB3 EER: 2.887% 2.489% 3.502% 4.871% 2.351% 50.000% 2.952% 3.205% 5.755% 4.360% 4.877% 4.850% 50.000% 18.052% 3.338% 6.928% 10.068% 1.645% 3.632% 1.634% 7.629% 3.063% 3.548% 7.491%
DB3 Enrol time: 0.091s 0.084s 0.061s 0.068s 0.121s 0.498s 0.058s 0.041s 0.032s 0.093s 0.065s 0.065s 0.208s 0.066s 0.058s 0.059s 0.100s 0.047s 0.077s 0.054s 0.055s 0.047s 0.029s 0.057s
DB3 Match time: 0.070s 0.040s 0.067s 0.067s 0.082s – 0.057s 0.058s 0.034s 0.058s 0.067s 0.067s 0.211s 0.063s 0.055s 0.059s 0.067s 0.046s 0.058s 0.056s 0.066s 0.051s 0.033s 0.082s
DB4 EER: 1.135% 0.564% 1.877% 5.756% 3.443% 50.000% 1.666% 2.024% 7.138% 1.144% 5.245% 5.432% 50.000% 44.313% 0.427% 5.937% 28.551% 0.496% 1.603% 0.522% 8.883% 0.680% 0.875% 14.507%
DB4 Enrol time: 0.071s 0.071s 0.042s 0.059s 0.116s 0.322s 0.050s 0.036s 0.031s 0.083s 0.056s 0.056s 0.343s 0.061s 0.049s 0.072s 0.058s 0.055s 0.073s 0.063s 0.040s 0.050s 0.027s 0.041s
DB4 Match time: 0.066s 0.053s 0.048s 0.059s 0.084s 0.325s 0.049s 0.053s 0.033s 0.054s 0.060s 0.059s 0.285s 0.052s 0.049s 0.073s 0.051s 0.054s 0.060s 0.062s 0.046s 0.056s 0.032s 0.059s
Table 25. Performance results FVC2006 (light category).

4.3.5. On-line Evaluation of Fingerprint Recognition Algorithms (FVC-onGoing)

To track continuously the advances in fingerprint recognition technologies, the FVC-onGoing contest, led by the Biometric Systems Laboratory (University of Bologna), was launched in June 2009. FVC-onGoing (http://biolab.csr.unibo.it/FVCOnGoing/) is a web-based on-line evaluation program for fingerprint recognition algorithms. Specific hardware and software are used to ensure a fair evaluation. As in previous FVC contests, a set of databases is used in the evaluation process.

FVC-onGoing provides various benchmarks to evaluate and compare recognition algorithms. One of them focuses on fingerprint verification algorithms, covering all the processing stages that take part in a recognition application, and another one addresses fingerprint matching only, based on minutiae features coded in the standard minutiae-based template format defined in ISO/IEC 19794-2. The results published in both categories in the period 2009-2011 are shown in Table 26 and Table 27.
In the fingerprint verification benchmark, two different databases are used:

• FV-STD-1.0, which contains fingerprint images acquired using high quality one-touch optical scanners; and
• FV-HARD-1.0, which contains fingerprint images that are more difficult to process (noisy images, distorted impressions, etc.), also acquired with one-touch optical scanners. The reduced quality of the bitmaps makes the fingerprint verification process more challenging with this database.

Algorithms submitted to this benchmark are required to enrol fingerprints into proprietary or standard template formats and to compare such templates to produce a similarity score. There are no constraints on the maximum memory allocation for the application or on the maximum template size. The only limitation imposed in this benchmark is the maximum execution time of the enrolment and matching tasks: the enrolment processing time is limited to 5 s, and the matching processing time is limited to 3 s.

In the fingerprint minutiae matching benchmark (ISO/IEC 19794-2), two other databases are used:

• FMISO-STD-1.0, which contains minutiae-based ISO templates created from fingerprint images acquired in operational conditions using high quality optical scanners; and
• FMISO-HARD-1.0, which collects a relevant number of ISO templates derived from noisier and more distorted fingerprint impressions.

Algorithms submitted to this benchmark are required to compare ISO fingerprint templates to produce a similarity score. No fingerprint enrolment is needed; only the minutiae-based matching is evaluated. The maximum matching time of two minutiae sets is limited to 3 s in this benchmark.

Based on the evolution of the results, although some progress in recognition accuracy has been made over the years, the EER is still measured in percent (%) or per mille (‰) today, and not yet in parts per million (ppm), so further research is expected in the coming years to achieve major improvements.
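Since these benchmarks rank algorithms by their EER, a minimal Python sketch of how an EER can be estimated from similarity scores may help to read the tables. It assumes that higher scores mean more similar templates and that a comparison is accepted when its score reaches the threshold. The function names and the toy scores are hypothetical and are not taken from FVC-onGoing.

```python
from typing import Dict, Sequence, Tuple


def error_rates(genuine: Sequence[float], impostor: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when a comparison is accepted if score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # impostors accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine users rejected
    return far, frr


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Estimate the EER by trying every observed score as the decision threshold."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where FAR and FRR are closest to each other.
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


def as_units(rate: float) -> Dict[str, float]:
    """Express an error rate (a fraction in [0, 1]) in %, per mille and ppm."""
    return {"percent": rate * 1e2, "per_mille": rate * 1e3, "ppm": rate * 1e6}


if __name__ == "__main__":
    # Toy scores, for illustration only.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.1, 0.2, 0.3, 0.6]
    print(as_units(equal_error_rate(genuine_scores, impostor_scores)))
```

The resolution of such an estimate is bounded by the number of comparisons: an error rate in the ppm range can only be observed if on the order of millions of comparisons are available.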
Fingerprint Verification Test

FV-STD-1.0               FV-HARD-1.0
EER     Enrol   Match    EER     Enrol   Match    Algorithm         Participant
        time    time             time    time
0.142%  0.092s  0.010s   0.722%  0.082s  0.009s   EMB9300 v1.1      AA Technology Ltd.
0.108%  0.550s  0.026s   0.687%  0.464s  0.018s   TigerAFIS v1.2ec  Tiger IT Bangladesh
0.176%  0.083s  0.010s   0.700%  0.074s  0.09s    EMB9200 v2.3      AA Technology Ltd.
0.418%  0.057s  0.003s   2.021%  0.052s  0.003s   Triple_M v1.1     UnionCommunity
0.293%  0.199s  0.109s   1.257%  0.172s  0.097s   MntModel v1.0     Institute of Automation, Chinese Academy of Sciences
6.701%  0.109s  0.006s   –       –       –        MiraFinger v2.2   Yanbing Zhang & Bao Feng Lan (Independent Dev.)
0.118%  0.120s  0.035s   0.735%  0.097s  0.028s   GBFRSW v1.3.2.0   Green Bit S.p.A.
3.649%  1.342s  0.953s   6.769%  1.323s  0.951s   SourceAFIS v1.1   Robert Važan (Independent Developer)
0.281%  0.225s  0.003s   1.528%  0.234s  0.003s   MM_FV v3.0        Neurotechnology
1.265%  0.095s  0.005s   –       –       –        STAR v1.0         Secuest Inc.
1.618%  0.108s  0.036s   –       –       –        JF_FV v1.21a      jFinger Co., Ltd.

Table 26. Published performance results FVC-onGoing (fingerprint verification) in period 2009-2011.

Fingerprint Matching (ISO) Test

FMISO-STD-1.0            FMISO-HARD-1.0
EER     Enrol   Match    EER     Enrol   Match    Algorithm                     Participant
        time    time             time    time
–       –       –        1.089%  ~0.00s  0.006s   Nitgen_ISO v1.0               NITGen&Company
0.234%  ~0.00s  0.009s   1.113%  ~0.00s  0.007s   EMB9200 v2.41                 AA Technology Ltd.
0.380%  ~0.00s  0.116s   1.588%  ~0.00s  0.082s   MntModel v1.0                 Institute of Automation, Chinese Academy of Sciences
0.234%  ~0.00s  0.003s   1.103%  ~0.00s  0.003s   Triple_M_ISO v1.2             UnionCommunity
0.258%  ~0.00s  0.018s   1.407%  ~0.00s  0.012s   SFCore v1.0                   Suprema, Inc.
1.017%  ~0.00s  0.048s   –       –       –        Authentik8 v1.0               Communik8 Ltd.
1.334%  ~0.00s  0.165s   –       –       –        SourceAFIS v1.3               Robert Važan (Independent Developer)
0.570%  ~0.00s  2.279s   2.315%  ~0.00s  2.265s   MCC v1.1                      Biometric Systems Laboratory, University of Bologna
0.559%  ~0.00s  0.009s   2.400%  ~0.00s  0.008s   Fingerprint Matcher ISO v1.0  id3 Semiconductors
0.317%  ~0.00s  0.021s   –       –       –        Tiger ISO v0.1                Tiger IT Bangladesh
0.582%  ~0.00s  0.003s   2.552%  ~0.00s  0.003s   APF_FMISO v1.1                APRO TECHNOLOGY (BANGKOK) CO., LTD.
0.598%  ~0.00s  0.003s   2.430%  ~0.00s  0.003s   MM_FMISO v3.0                 Neurotechnology

Table 27. Published performance results FVC-onGoing (minutiae matching - ISO) in period 2009-2011.

4.3.6. Fingerprint Vendor Technology Evaluation (FpVTE2003)

The FpVTE2003 was an international benchmark focused on fingerprint matching for identification and verification applications, conducted in 2003 in the United States by the National Institute of Standards and Technology (NIST) on behalf of the Justice Management Division (JMD) of the U.S. Department of Justice. A total of 18 companies participated in the evaluation. Multiple tests over different combinations of fingerprint impressions, acquired with several sensors of distinct technologies, were performed in FpVTE2003 to assess the performance of the submitted algorithms. The evaluation process covered a wide range of fingerprint impressions of different qualities. The recognition applications in charge of matching fingerprints were written by each of the participants but, unlike in the FVC contests, the matching algorithms under test were not run on common reference hardware but on each participant's own computer hardware.

FpVTE2003 was split into three separate tests: (i) the Small-Scale test, (ii) the Medium-Scale test, and (iii) the Large-Scale test.
The Small-Scale and Medium-Scale tests were evaluated with individual fingerprint images and focused especially on AFAS applications, whereas the Large-Scale test used sets of fingerprints acquired from different fingers of the same individual (e.g. two index fingers, four fingers, ten fingers, etc.) and was mostly oriented to AFIS applications.

The evaluation concluded that the variables with the clearest effect on recognition accuracy were the number of fingers used in the matching process and the inherent quality of the fingerprint impressions. The higher the number of fingers (e.g. single fingers for template and query, two fingers for template and two fingers for query, four fingers for template and four fingers for query, etc.), the higher the accuracy. The most accurate fingerprint matching system tested in FpVTE2003 with the Medium-Scale database achieved FRR = 0.60% at FAR = 0.01%, and FRR = 0.10% at FAR = 1.00%. The EER performance achieved by each of the 18 participants in the Medium-Scale test is shown in Table 28.

Medium-Scale Database Test (Classification)

Rank  EER      Participant
1st   0.20%    NEC Corporation
2nd   0.30%    Cogent Systems, Inc.
3rd   0.60%    Sagem Morpho, Inc. (M2 algorithm)
4th   0.90%    Sagem Morpho, Inc. (M1 algorithm)
5th   2.30%    Neurotechnologija Ltd.
6th   3.60%    Motorola
7th   3.90%    Ultra-Scan Corporation (M2 algorithm)
8th   4.20%    Ultra-Scan Corporation (M1 algorithm)
9th   4.40%    Identix, Inc.
10th  4.80%    NIST VTB
11th  6.50%    Biolink Technologies International, Inc.
12th  7.60%    Golden Finger Systems
13th  7.60%    Raytheon Company
14th  8.60%    Antheus Technology, Inc.
15th  11.60%   123 ID, Inc. (M2 algorithm)
16th  13.80%   The Phoenix Group, Inc.
17th  14.10%   Avalon Systems, Inc.
18th  15.10%   Technoimagia Co., Ltd.

Table 28. Performance results Medium-Scale test FpVTE2003.

4.3.7. Proprietary Fingerprint Template Evaluation (PFT)

From June 2003 to February 2010, NIST conducted the evaluation of fingerprint-based biometric matching systems from SDK vendors. The main goal behind this evaluation process was to estimate how well potential commercial products perform one-to-one matching for verification applications over a wide range of fingerprint images of varying qualities. The SDK benchmark was similar to the Medium-Scale test performed during the FpVTE2003 contest. A total of 55 SDKs from different vendors were evaluated on several databases. Table 29 shows the EER performance of some of the SDKs on four of the databases used (DHS2 from the Department of Homeland Security, DOS from the Department of State, POE from the US VISIT Point of Entry program, and POEBVA from the US VISIT Point of Entry Bio-VISA Application program).

        EER      EER     EER     EER
SDK ID  DHS2     DOS     POE     POEBVA  Participant
D       0.24%    0.58%   0.20%   0.21%   Cogent Systems, Inc.
F       0.49%    0.26%   0.10%   0.11%   Cogent Systems, Inc.
H       0.74%    0.11%   0.05%   0.04%   NEC Corporation
I       0.34%    0.10%   0.06%   0.06%   Cogent Systems, Inc.
J       0.75%    0.42%   0.15%   0.18%   Sagem Morpho, Inc.
K       0.97%    0.74%   0.32%   0.33%   Neurotechnologija Ltd.
O       0.86%    0.18%   0.07%   0.07%   NEC Corporation
P       1.33%    0.32%   0.11%   0.14%   NEC Corporation
Q       0.80%    0.23%   0.13%   0.15%   ID Solutions, Inc.
R       0.72%    0.08%   0.04%   0.04%   Cogent Systems, Inc.
U       0.77%    0.22%   0.12%   0.14%   Identix, Inc.
W       13.29%   0.57%   0.36%   0.41%   Sonateq
W2      0.59%    0.26%   0.16%   0.22%   Sonateq
X       12.69%   0.99%   0.84%   0.65%   Sonateq
X2      0.76%    0.40%   0.36%   0.43%   Sonateq
Z       47.80%   0.56%   0.30%   0.33%   Sonateq
1A      13.38%   3.31%   2.65%   2.79%   USFIS International, S.A.
1B      1.06%    1.44%   0.66%   0.68%   BioLink Solutions, Inc.
1C      0.58%    0.17%   0.09%   0.08%   Sagem Morpho, Inc.
1D      0.95%    0.68%   0.38%   0.34%   BIO-key International, Inc.
1E      0.72%    1.13%   0.43%   0.49%   BioLink Solutions, Inc.
1F      0.59%    0.23%   0.11%   0.11%   Motorola
1G      3.50%    0.59%   0.28%   0.26%   Neurotechnologija Ltd.
1H      0.62%    0.13%   0.08%   0.05%   Sagem Morpho, Inc.
1I      0.87%    0.36%   0.19%   0.23%   Thales Security Systems
1J      3.62%    0.44%   0.19%   0.18%   Neurotechnologija Ltd.
1K      0.46%    0.17%   0.07%   0.08%   BIO-key International, Inc.
1L      3.57%    0.22%   0.17%   0.18%   Neurotechnologija Ltd.
1M      0.56%    0.03%   0.03%   0.02%   NEC Corporation
1N      31.40%   6.82%   8.28%   7.26%   East Shore Technologies, Inc.
1O      0.50%    0.03%   0.04%   0.02%   NEC Corporation
1P      0.67%    0.08%   0.05%   0.07%   L1 Identity Solutions
1Q      3.36%    0.22%   0.11%   0.16%   Sonateq
1R      3.65%    0.66%   0.38%   0.40%   Eastern Golden Finger Technology Beijing Co. Ltd. D.B.A.: GFS Systems
1S      0.62%    0.06%   0.05%   0.05%   L1 Identity Solutions
1T      0.89%    0.23%   0.14%   0.13%   Neurotechnology
1U      0.58%    0.07%   0.04%   0.04%   Tiger IT Bangladesh Limited
1V      0.51%    0.12%   0.06%   0.07%   Warwick Warp Limited
1W/1X   10.86%   0.52%   0.16%   0.19%   Eastern Golden Finger Technology Beijing Co. Ltd. D.B.A.: GFS Systems
1Y      0.51%    0.04%   0.04%   0.03%   L1 Identity Solutions
1Z      1.02%    0.24%   0.12%   0.13%   Sonda Technologies Ltd.
2A      1.00%    0.08%   0.05%   0.19%   Green Bit S.p.A.
2B      0.39%    0.04%   0.04%   0.02%   Sagem Morpho, Inc.
2C      0.47%    0.12%   0.05%   0.07%   BIO-key International, Inc.
2D      13.66%   2.25%   1.98%   1.98%   Lockheed Martin Corporation
2E      0.66%    0.07%   0.03%   0.04%   Warwick Warp, Ltd.

Table 29. Performance results ongoing PFT.

4.3.8.
Fingerprint Template Evaluation II (PFTII)

The National Institute of Standards and Technology's Proprietary Fingerprint Template Evaluation II (PFTII), like the previous PFT benchmark, is an ongoing program aimed at measuring the performance of fingerprint matching software by using vendor proprietary fingerprint templates. The PFTII evaluation benchmark started in April 2010. It reports not only matching accuracy but also template size, template extraction time, and matching time for three 1K fingerprint image samples of different sizes (368×368, 500×500, 400×776, 800×800, and 412×1000 pixels), with all processing executed on a uniform hardware and operating system environment.

SDK ID  Execution time (s)        DB AZ    DB LA    DB DHS2  DB POEBVA
3A      Avalon Biometrics S.L.
        Maximum enrolment time    2.4488   1.6522   0.6042   0.9276
        Median enrolment time     1.9216   1.1506   0.5124   0.4918
        Minimum enrolment time    1.4591   0.7552   0.4046   0.3557
        Maximum matching time     0.2245   0.1888   0.0397   0.0904
        Median matching time      0.0759   0.0414   0.0103   0.0113
        Minimum matching time     0.0267   0.0097   0.0031   0.0043
3B      ID Solutions, Inc.
        Maximum enrolment time    5.9013   3.7762   1.2305   1.8059
        Median enrolment time     3.0539   2.4603   1.0260   0.9923
        Minimum enrolment time    0.8461   0.0056   0.0027   0.5310
        Maximum matching time     0.06350  0.04454  0.0174   0.0126
        Median matching time      0.00511  0.00273  0.0014   0.0012
        Minimum matching time     0.00104  0.00035  0.0002   0.0001
3C      Patrima Technology Company
        Maximum enrolment time    3.3429   2.8403   1.3596   1.3840
        Median enrolment time     1.4869   1.0129   0.4444   0.4673
        Minimum enrolment time    0.4188   0.0195   0.0076   0.1833
        Maximum matching time     0.7627   0.7373   0.7415   0.9595
        Median matching time      0.0344   0.0299   0.0070   0.0059
        Minimum matching time     0.0078   0.0000   0.0000   0.0009

Table 30. List of participants PFTII and 1K sample enrolment/matching time results distribution.

The NIST PFTII evaluation platform consists of an array of blade servers having a hardware configuration similar to:

a) Processors available:
• Quad 2.3 GHz/8 MB Cache, AMD (4-core)
• Dual 2.8 GHz/1 MB Cache, Xeon (dual-core)
• 800 MHz Front Side Bus for PE 1855

b) Memory available:
• 64-bit: 192 GB RAM
• 64-bit: 16 GB RAM
• 32-bit: 4 GB RAM

c) Operating systems available:
• Red Hat Enterprise Linux Server 5.1 (64-bit)
• Windows Server 32-bit & 64-bit.

As a continuation of PFT, PFTII remains a one-to-one verification evaluation program, and it does not report one-to-many matching results. Table 30, Table 31 and Table 32 show the evaluation results provided by NIST until October 2011.

Template size (bytes)

                  DB AZ               DB LA               DB DHS2             DB POEBVA
                  Gallery   Probe     Gallery   Probe     Gallery   Probe     Gallery   Probe
SDK ID            800×800   800×800   412×1000  400×776   368×368   368×368   500×500   368×368
3A      Maximum   23901     22262     17814     15211     6566      6796      10120     6050
        Median    11988     12121     11283     8651      4201      4275      4429      3924
        Minimum   3875      6308      6883      2788      1834      1808      2346      1738
3B      Maximum   9018      9324      5360      6529      2782      2872      3339      2830
        Median    4633      4666      2914      3072      1492      1507      1742      1284
        Minimum   1072      1946      233       191       240       191       326       241
3C      Maximum   4336      4336      4336      3728      2212      1996      2292      1584
        Median    3334      3330      2184      2250      1156      1152      1192      1088
        Minimum   780       1472      32        32        32        32        628       404

Table 31. PFTII 1K sample template size distribution.

EER     SDK 3A    SDK 3B  SDK 3C
Test 0  0.4750%   NA      0.9000%
Test 1  0.9083%   NA      1.2300%
Test 2  1.2767%   NA      1.5000%
Test 3  0.5183%   NA      0.9900%
Test 4  1.1075%   NA      1.5100%
Test 5  1.4667%   NA      1.8800%
Test 6  2.0142%   NA      6.6900%
Test 7  1.8942%   NA      5.7800%
Test 8  0.4292%   NA      0.7300%
Test 9  0.8225%   NA      1.3800%

NA: Could not obtain EER because the minimum of FNMR is greater than the maximum of FMR.

Table 32. PFTII recognition accuracy performance results.

4.3.9. Minutiae Interoperability Exchange Test (MINEX04)

Biometrics-based recognition systems focused on fingerprints must extract a set of distinctive traits, known as a template, from an enrolled fingerprint, and compare it later with another set of distinctive traits extracted on-line from another fingerprint. The matching task becomes harder when the enrolment and authentication systems are different: interoperability is then required between them, and the interchange of information is mediated by a standardized format. Difficulty may arise because template generation products may systematically interpret a common input differently.

In order to address the interoperability issue among multiple enrolment/matching systems, the Minutiae Interoperability Exchange Test (MINEX04) was organized in 2004. MINEX04 was conducted by NIST, and co-sponsored by the Justice Management Division (JMD) of the U.S. Department of Justice and the Department of Homeland Security (DHS) US-VISIT program. MINEX04 aimed at determining the feasibility of using minutiae data, rather than raw image data or other distinctive traits extracted from fingerprint impressions, for the exchange of information between the different modules that compose a fingerprint recognition system. MINEX04 focused on determining whether standard fingerprint template representations (based on minutia points according to ANSI/INCITS 378) were able to provide the same accuracy as proprietary template representations (also based on minutia points, but not compliant with such standard encoding schemes) when the template information is interchanged between dissimilar enrolment and matching systems. Up to 14 organizations took part in that contest.
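As an illustration of what such an interoperability comparison measures, the following minimal Python sketch contrasts same-vendor and cross-vendor error rates. The figures are the FNMR values at FMR = 1% listed for SDKs A, D and G in Table 34 of this section. The dictionary layout and the function names are illustrative assumptions and are not part of MINEX04, and an average over three SDKs is only indicative.

```python
# FNMR (%) at FMR = 1% for pairs of SDKs, copied from the A, D and G entries of Table 34.
# Each key is (label of the listed group, position within that group). Which of the two
# SDKs plays the enrolment role does not change the two averages computed below.
FNMR = {
    ("A", "A"): 1.36, ("A", "D"): 2.07, ("A", "G"): 3.00,
    ("D", "A"): 2.25, ("D", "D"): 1.40, ("D", "G"): 2.05,
    ("G", "A"): 4.17, ("G", "D"): 3.16, ("G", "G"): 1.29,
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def native_and_cross(fnmr):
    """Return (mean FNMR of same-SDK pairs, mean FNMR of mixed-SDK pairs)."""
    native = mean(v for (a, b), v in fnmr.items() if a == b)
    cross = mean(v for (a, b), v in fnmr.items() if a != b)
    return native, cross


if __name__ == "__main__":
    native, cross = native_and_cross(FNMR)
    print(f"same-SDK pairs: {native:.2f}%  mixed-SDK pairs: {cross:.2f}%")
```

For this excerpt the mixed-SDK average is roughly twice the same-SDK average, which is the kind of interoperability penalty the test was designed to quantify.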
Each participant was required to produce minutiae templates in accordance with its proprietary encoding scheme in order to establish a baseline performance level. Moreover, each participant was also required to produce two variants of the standard ANSI/INCITS 378 minutiae encoding scheme: (i) MIN:A, which codes minutiae in the form (x, y, φ, type, quality); and (ii) MIN:B, which complements it with ridge count, core and delta information. The test was intended to evaluate how the interoperability of templates is affected by the method used to encode the minutiae.

Four different databases were used in the evaluation process, covering fingerprint images of size 368×368 and 500×500 pixels: two high quality datasets, POE and POEBVA, and two low quality datasets, DOS and DHS2. Some of the reported results, for matching operations with a single enrolment fingerprint and a single authentication fingerprint on the POEBVA database, are presented in Table 33. It shows the accuracy achieved with each of the encoding schemes. For most participants, a degradation in performance can be observed when using the standard encoding schemes in comparison to the proprietary ones.

        FNMR at FMR = 0.1%            FNMR at FMR = 1%
SDK ID  Proprietary  MIN:A   MIN:B    Proprietary  MIN:A   MIN:B    Participant
A       1.03%        1.87%   1.80%    0.89%        1.36%   1.35%    Cogent Systems, Inc.
B       3.65%        4.62%   –        1.89%        2.51%   –        Dermalog Identification Systems GmbH
C       4.03%        4.03%   –        2.25%        2.25%   –        Bioscrypt Inc.
D       1.49%        2.18%   2.18%    0.89%        1.40%   1.40%    Sagem Morpho, Inc.
E       4.61%        6.00%   6.02%    2.51%        3.01%   2.96%    Neurotechnologija Ltd.
F       4.20%        4.07%   3.48%    3.37%        2.24%   1.99%    Innovatrics
G       0.86%        1.90%   –        0.47%        1.29%   –        NEC Corporation
H       16.18%       16.65%  –        10.04%       10.27%  –        Technoimagia Co., Ltd.
I       5.10%        5.75%   5.22%    3.29%        3.48%   3.36%    Identix, Inc.
J       23.72%       23.72%  –        15.03%       15.05%  –        Biologica Sistemas
K       2.94%        6.89%   6.34%    1.86%        4.61%   6.34%    SPEX Forensics
L       8.44%        7.93%   –        5.75%        5.24%   –        SecuGen Corporation
M       5.49%        5.50%   –        3.58%        3.59%   –        NITGen Corporation
N       8.37%        7.64%   –        4.81%        4.86%   –        Cross Match Technologies

Table 33. List of participants MINEX04 and recognition performance results under POEBVA database.

SDK ID  A       B       C       D       E       F       G       H       I       J       K       L       M       N
A       1.36%   2.18%   3.57%   2.07%   2.36%   3.59%   3.00%   4.37%   3.97%   4.03%   1.88%   4.67%   4.96%   3.68%
B       5.49%   2.51%   4.28%   3.57%   3.65%   4.30%   2.91%   13.36%  8.06%   6.02%   5.93%   5.58%   4.93%   4.36%
C       4.58%   3.85%   2.25%   3.01%   3.40%   2.22%   4.47%   12.12%  8.30%   9.39%   4.76%   7.04%   4.55%   4.28%
D       2.25%   1.73%   2.04%   1.40%   2.25%   2.06%   2.05%   6.56%   5.18%   4.55%   2.80%   4.28%   3.07%   2.93%
E       6.41%   4.02%   5.19%   4.85%   3.01%   5.22%   3.90%   18.60%  10.62%  9.87%   6.61%   8.22%   5.45%   4.58%
F       4.59%   3.82%   2.25%   3.03%   3.41%   2.24%   4.41%   12.15%  8.28%   9.43%   4.67%   7.08%   4.54%   4.28%
G       4.17%   1.92%   3.48%   3.16%   2.86%   3.45%   1.29%   13.39%  5.48%   4.89%   4.28%   4.32%   3.27%   3.93%
H       8.34%   11.36%  19.69%  9.45%   8.74%   19.67%  9.05%   10.27%  22.27%  28.52%  7.90%   16.40%  20.66%  10.19%
I       3.34%   3.36%   4.84%   3.92%   4.76%   4.85%   4.41%   7.96%   3.48%   5.42%   4.00%   4.85%   6.16%   4.97%
J       17.07%  15.01%  30.34%  20.13%  18.85%  30.38%  17.47%  100%    24.70%  15.05%  19.20%  19.01%  30.22%  19.45%
K       7.47%   15.99%  24.51%  12.18%  8.96%   24.56%  16.32%  17.48%  27.56%  73.14%  4.61%   23.75%  39.29%  8.65%
L       6.59%   5.06%   7.43%   6.55%   6.00%   7.43%   5.59%   11.81%  7.91%   7.73%   7.70%   5.24%   8.68%   6.82%
M       7.92%   4.42%   4.93%   5.51%   5.57%   4.93%   4.19%   18.18%  10.30%  10.26%  8.85%   8.66%   3.59%   6.21%
N       9.66%   5.61%   6.91%   5.82%   3.97%   6.86%   5.26%   22.96%  13.83%  11.69%  10.15%  9.38%   8.55%   4.86%

Table 34. FNMR at FMR = 1% for the POEBVA database in MIN:A encoding scheme. The SDK identified in the row produces the enrolment template. The SDK identified in each column produces the authentication template and performs the matching.

SDK ID  A       D       E       F       I       K
A       1.35%   2.06%   2.34%   3.54%   3.89%   1.89%
D       2.25%   1.40%   2.25%   2.07%   5.18%   2.81%
E       6.85%   5.54%   2.96%   8.85%   10.73%  65.35%
F       3.98%   2.64%   2.98%   1.99%   7.65%   4.63%
I       3.33%   4.17%   4.66%   4.86%   3.36%   86.15%
K       20.97%  34.80%  20.27%  87.05%  48.39%  6.34%

Table 35. FNMR at FMR = 1% for the POEBVA database in MIN:B encoding scheme. The SDK identified in the row produces the enrolment template. The SDK identified in each column produces the authentication template and performs the matching.

Table 34 and Table 35 show the recognition accuracy reached when one enrolment system generates the first template, and a different matcher system generates the second template and compares both. In those scenarios, both templates are coded in the MIN:A or MIN:B standard format, respectively, on the POEBVA database.

Specific hardware and software were used to perform the tests: 3 GHz i386 machines running Windows or Linux operating systems, depending on the participant (A to J used Windows, K to N used Linux). Table 36 and Table 37 show the execution times exhibited by each of the SDKs under test when using proprietary minutiae template encoding schemes.

DB / SDK ID  A     B     C     D     E     F     G     H     I     J     K     L     M     N
POEBVA       737   310   451   129   169   373   442   182   480   1386  699   140   167   197
DHS2         849   360   480   130   155   399   537   169   479   704   584   140   156   195
POE          753   328   453   130   173   376   449   187   481   1438  756   141   168   198
DOS          782   346   474   131   197   394   510   205   482   800   617   146   174   202

Table 36. Template generation times. Average execution time, in milliseconds, for generation of proprietary templates in each of the databases.

DB / SDK ID  A     B     C     D     E     F     G     H     I     J     K     L     M     N
POEBVA       38.8  2.0   10.1  3.0   0.7   10.1  6.0   21.4  8.4   73.9  8.4   4.5   6.2   0.9
DHS2         40.3  2.1   9.0   2.9   0.7   9.8   6.2   21.1  6.5   33.1  8.4   3.4   2.4   0.9
POE          38.0  1.7   7.5   2.7   0.6   9.3   5.0   24.4  7.8   66.1  7.6   3.8   4.9   0.8
DOS          43.6  2.0   9.7   2.7   0.7   10.5  6.4   23.5  8.7   38.7  9.4   4.4   5.6   0.9

Table 37. Template matching times. Average execution time, in milliseconds, for matching of proprietary templates in each of the databases.

A verification system includes a matcher module that compares a query or sample fingerprint template with an enrolled fingerprint template to produce a measure of similarity between both templates. The templates can be generated either by the matcher module or by other enrolment modules. When dealing with minutiae-based templates, the recognition accuracy of the system and the interoperability between different enrolment and/or matching modules are affected by the minutiae detection and extraction algorithms implemented in each of the modules, the minutiae encoding schemes (proprietary versus standard) applied in the system, and the way the minutiae extractor and matcher modules process those minutiae-based templates.

Apart from the conclusions drawn from those one-to-one matching tests based on single minutiae-based representations of fingerprints, further tests were carried out in MINEX04. These tests dealt with the matching of multiple templates per user, that is, multiple fingerprint impressions (corresponding to different fingers) per user, in order to evaluate the impact of using more distinctive information from the users.
Based on those complementary tests, it was concluded that the initial degradation in accuracy, experienced when using standard single-template minutiae-based representations instead of proprietary single-template representations, can be compensated by increasing the number of templates per user (that is, the number of fingers used to characterize one individual in the recognition system) while keeping the standard minutiae-based encoding schemes in the system. In particular, matching two standard-format templates from two different fingers of the legitimate user against the corresponding two templates of the query user (few-to-few matching) was shown to lead to similar or better performance than matching proprietary single templates (one-to-one matching).

4.3.10. Ongoing Minutiae Interoperability Exchange Test (OMINEX)

OMINEX is an ongoing evaluation program organized by NIST that tracks the advances in interoperability between systems based on fingerprint ANSI/INCITS 378 minutiae template representations. OMINEX has followed the approach of NIST's MINEX04 contest from 2005 to date. Table 38 shows the list of participants and the test completion dates.

SDK ID A B C D E F G N 1A 1B 1C 1D 1E 1F 1G 1H 1I 1J 1K 1L 1M 1N 1O 1P 1Q 1R 1S 1T 1U 1V 1W 1X 1Y 1Z Participant Cogent Systems, Inc. Dermalog Identification Systems GmbH Bioscrypt, Inc. Sagem Morpho, Inc. Neurotechnologija Ltd. Innovatrics NEC Corporation Cross Match Technologies Startek Engineering, Inc. Aware, Inc. Identix, Inc. Precise Biometrics Sagem Morpho, Inc. XTec, Inc. SecuGen Corporation Ultra-Scan Corporation Startek Engineering, Inc. BIO-key International, Inc.
Antheus Technology Motorola Aware, Inc. Sonda Technologies Ltd. Griaule Tecnologia Ltda. Precise Biometrics Aware, Inc. Startek Engineering, Inc. Secure Design KK Neurotechnologija Ltd. Ultra-Scan Corporation Precise Biometrics Aware, Inc. Secure Design KK Aware, Inc. Griaule Tecnologia Ltda. Test Completed 03/21/2006 03/21/2006 03/21/2006 03/21/2006 03/21/2006 03/21/2006 03/21/2006 03/21/2006 07/05/2006 07/05/2006 07/05/2006 08/14/2006 08/30/2006 09/11/2006 09/26/2006 09/26/2006 10/19/2006 11/16/2006 12/07/2006 12/15/2006 01/10/2007 01/10/2007 04/05/2007 03/12/2007 03/28/2007 04/17/2007 05/08/2007 05/08/2007 05/18/2007 06/29/2007 07/25/2007 07/25/2007 08/31/2007 09/19/2007 SDK ID 2A 2B 2C 2D 2E 2F 2G 2H 2I 2J 2K 2L 2M 2N 2O 2P 2Q 2R 2S 2T 2U 2V 2W 2X 2Y 2Z 3A 3B 3C 3D 3E 3F 3G 3H Participant ImageWare NITGen Co., Ltd. Startek Engineering, Inc. BioVision Co., Ltd. Aware, Inc. BioVision Co., Ltd. Suprema, Inc. FingerMatch Sonda Technologies, Ltd. Bioscrypt Enterprise Access, L-1 Identity Solutions Precise Biometrics Neurotechnologija Ltd. id3 Semiconductors Precise Biometrics Tiger IT Bangladesh Ltd. Precise Biometrics NEC Corporation Aware, Inc. CogniBio Griaule Tecnologia Ltda. Biomatch, Inc. UPEK GreenBit S.p.A. Futronic Technology Co. Ltd. Id3 Semiconductors SkyBiometrics Sagem Morpho, Inc. Innitor Biosystems Clickone Device, Inc. NITGen Co., Ltd. Patrima Technology Company DigitalPersona, Inc. Innovatrics Thales Test Completed 11/21/2007 02/06/2008 05/09/2008 05/14/2008 09/04/2008 11/03/2008 02/23/2009 02/23/2009 04/03/2009 04/16/2009 05/18/2009 06/29/2009 07/27/2009 10/28/2009 12/14/2009 12/14/2009 03/31/2010 03/31/2010 06/07/2010 04/15/2010 06/23/2010 09/01/2010 07/01/2010 09/01/2010 02/02/2011 06/02/2011 03/14/2011 05/09/2011 07/21/2011 07/21/2011 10/12/2011 Table 38. List of participants OMINEX. 4.3.11. 
Assessment of Match-on-Card Technology (MINEXII)

MINEXII is the part of the MINEX program that assesses fingerprint minutiae match-on-card technology on ISO/IEC 7816 smart cards. This test evaluates the accuracy and speed of match-on-card verification algorithms when running on ISO/IEC 7816 smart cards. The final aim is to improve the performance and interoperability of implementations of the ANSI/INCITS 378 and ISO/IEC 19794-2 fingerprint minutia standards.

MINEXII is carried out by NIST, and up to four different evaluation phases were completed from 2007 to 2011. Table 39 and Table 40 show the list of participants, the recognition accuracy, and the execution times exhibited in MINEXII Phase IV.

Participant
ID        Scenario 1  Scenario 2  Scenario 3  Scenario 4  Fingerprint Matcher Vendor       Smartcard Vendor
MX2-IV-A  1.63%       3.72%       6.17%       4.94%       Neurotechnology                  Athena
MX2-IV-B  3.88%       6.98%       15.37%      12.96%      Neurotechnology                  Athena
MX2-IV-C  1.79%       1.79%       3.84%       4.85%       Morpho                           Morpho
MX2-IV-D  1.51%       1.51%       3.25%       4.19%       Morpho                           Morpho
MX2-IV-E  3.11%       4.27%       7.07%       7.07%       ID3                              Oberthur
MX2-IV-F  3.14%       4.69%       7.13%       6.98%       ID3                              Oberthur
MX2-IV-G  1.89%       3.27%       4.89%       4.51%       Precise Biometrics               Spyrus
MX2-IV-H  1.89%       3.27%       4.89%       4.51%       Precise Biometrics               Giesecke & Devrient
MX2-IV-I  1.89%       3.27%       4.89%       4.51%       Precise Biometrics               Gemalto
MX2-IV-J  1.89%       3.27%       4.95%       4.57%       Precise Biometrics               Gemalto
MX2-IV-K  2.49%       3.48%       5.35%       5.83%       Innovatrics                      Gemalto
MX2-IV-L  2.47%       3.47%       5.34%       5.82%       Innovatrics                      Gemalto
MX2-IV-M  1.73%       2.40%       4.10%       4.74%       Micro-PackS                      Gemalto
MX2-IV-N  2.83%       4.51%       6.29%       6.29%       Dermalog                         Gemalto
MX2-IV-O  2.83%       4.51%       6.29%       6.29%       Dermalog                         PAV Card
MX2-IV-P  2.83%       4.51%       6.29%       6.29%       Dermalog                         MaskTech
MX2-IV-Q  4.41%       6.73%       9.21%       9.21%       Institute for Infocomm Research  DART

Table 39. FNMR values at FMR=0.05% in case of performing the match-on-card operation of one single enrolment fingerprint template and one authentication fingerprint template in 4 different scenarios (the enrolment and authentication templates are produced by different sources in each scenario).

Participant
ID        Verify
MX2-IV-A  0.225s
MX2-IV-B  0.148s
MX2-IV-C  0.147s
MX2-IV-D  0.517s
MX2-IV-E  0.076s
MX2-IV-F  0.085s
MX2-IV-G  0.080s
MX2-IV-H  1.272s
MX2-IV-I  0.310s
MX2-IV-J  0.317s
MX2-IV-K  1.173s
MX2-IV-L  1.338s
MX2-IV-M  0.537s
MX2-IV-N  0.377s
MX2-IV-O  0.606s
MX2-IV-P  0.718s
MX2-IV-Q  1.007s

Table 40. Median durations of the ISO/IEC 7816-4 VERIFY command (for genuine minutiae template comparisons) derived from 1210 genuine trials.

The hardware specifications for the match-on-card process are unknown: NIST did not ask for information related to card processor, memory, or cost. It is likely, however, that each participant of MINEXII submitted its most recent, capable and expensive cards to the test in order to achieve the maximum possible performance.

4.4. Research Institutions

The following sections present some of the most recognized research groups in both industry and academia. They cover research areas such as biometric algorithm development, recognition performance evaluation, and physical implementation of recognition systems.

4.4.1. Academia

Many research group initiatives have arisen in recent years dealing with the investigation of efficient algorithms and architectures for the development of biometric applications. Table 41 lists some of the most active research groups in the field of fingerprint-based recognition systems.
Research Group and Institution: Department of Computer Science and Engineering. Biometrics Research Group. Michigan State University. Michigan, USA.
Researcher: Anil K. Jain
Research Fields: Fingerprint biometrics (algorithm development); Other biometrics (algorithm development); Evaluation of biometric systems; Implementation of biometric systems on HPC platforms
URL: http://cse.msu.edu/biometrics/

Research Group and Institution: Department of Electronics, Computer sciences and Systems. Biometric Systems Laboratory. University of Bologna. Bologna, Italy.
Researcher: Dario Maio
Research Fields: Fingerprint biometrics (algorithm development); Other biometrics (algorithm development); Generation of synthetic fingerprint images; Evaluation of biometric systems
URL: http://biolab.csr.unibo.it/

Research Group and Institution: National Biometric Test Center. San José State University. San José, California, USA.
Researcher: James L. Wayman
Research Fields: Fingerprint biometrics (algorithm development); Other biometrics (algorithm development); Evaluation of biometric systems
URL: http://www.engr.sjsu.edu/biometrics/

Research Group and Institution: Area of Theory of Signs and Communications. Biometric Recognition Group. Universidad Autónoma de Madrid. Madrid, Spain.
Researcher: Javier Ortega, Julián Fiérrez
Research Fields: Fingerprint biometrics (algorithm development); Other biometrics (algorithm development); Evaluation of biometric systems
URL: http://www.eps.uam.es/esp/investigacion/index2.php?siglas=ATVS#ATVS/

Research Group and Institution: Institute of Automation. Center for Biometrics and Security Research. Chinese Academy of Sciences. Beijing, China.
Researcher: Jie Tian, Stan Z. Li
Research Fields: Fingerprint biometrics (algorithm development); Other biometrics (algorithm development); Implementation of biometric systems on embedded platforms
URL: http://www.cbsr.ia.ac.cn/english/index.asp, http://www.fingerpass.net
Center for Identification Technology Research (CITeR) & Lane Department of Computer Science and Electrical Arun Ross Engineering. West Virginia University. Morgantown, West Virginia, USA. Centre for Information Security. School of Electrical and Electronic Xudong Jiang Engineering. Nanyang Technological University. Singapore. Biometric Research Centre. David Zhang Department of Computing. The Hong Kong Polytechnic University. Dapeng Hung Hom, Kowloon, Hong Kong. Electronics Technology Department. University Group for Identification Raul Sánchez Technologies. Reillo Universidad Carlos III. Madrid, Spain. Polytechnical University of Madrid. Group of Biometrics, Biosignals and Carmen Sánchez Security. Ávila Complutense University of Madrid. Madrid, Spain. Department of Electrical Engineering. Ingrid Embedded Security Group. Verbauwhede University of California. Los Angeles, USA. Department of Computer and Systems. Microcomputers Research Group. Giovanni Danese University of Pavia. Pavia, Italy. Electronics Technology Department. Microelectronic Design and Luis Entrena Applications Group. Universidad Carlos III. Madrid, Spain. Department of Electronic, Electrical and Automatic Control Engineering. Development of Embedded Systems Enrique Cantó Research Group. Universitat Rovira i Virgili. Tarragona, Spain. Department of Electronic Engineering. Embedded Systems and Biometric Mariano López Identification Group. Universitat Politècnica de Catalunya. Vilanova i la Geltrú, Spain. Computer Engineering Department. Innovative Digital Computer Filippo Sorbello Architecture Research Group. University of Palermo. Palermo, Italy. Institute of Electronics, Communications and Information Technology. Danny Crookes School of Electronics, Electrical Engineering and Computer Science. Queen’s University Belfast. Belfast, United Kingdom. Electronics and Telecommunications Research Jang-Hee Yoo Institute (ETRI). Information Security Research. Daejeon, South Korea. 
Note: Table 41 continues in next page. Fingerprint biometrics (algorithm development) Other biometrics (algorithm development) http://citer.wvu.edu/ http://www.lcsee.cemr.wvu.edu/ Fingerprint biometrics (algorithm development) Other biometrics (algorithm development) http://www.cis.eee.ntu.edu.sg/R esearch/Pages/Biometrics.aspx Fingerprint biometrics (algorithm development) Other biometrics (algorithm development) Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (algorithm development) Other biometrics (algorithm development) Implementation of biometric systems on embedded platforms Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Implementation of biometric systems on HPC platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Implementation of biometric systems on HPC platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics 
(system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Implementation of biometric systems on HPC platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications http://www4.comp.polyu.edu.hk/ ~biometrics/ http://guti.uc3m.es/ http://sites.google.com/site/engb 2s/ http://www.emsec.ee.ucla.edu/ http://dis.unipv.it/ http://dma.uc3m.es/dma/ http://deeea.urv.cat/DEEEA/cat/ recerca/ http://petrus.upc.es/emsy/ http://www.dinfo.unipa.it/en/cont ent/research-topics/ http://www.ecit.qub.ac.uk/Resea rch/SpeechVisionSystems/ http://www.etri.re.kr/eng/ 137 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Research Group and Institution Department of Computer & Information Science. Korea University. Seoul, South Korea Department of Information Control and Instrumentation Engineering. Chosun University. Gwangju, South Korea Electronics and Telecommunications Research Institute (ETRI). Biometrics Technology Research. Daejeon, South Korea. College of Engineering. School of Electrical and Computer Engineering. University of Seoul. Seoul, Korea. Signal Processing Research Group. Escola Universitària Politècnica de Mataró. Universitat Politècnica de Catalunya. Mataró, Spain. Department of Communication and Integrated Systems. School of Science and Engineering. Tokyo Institute of Technology. Tokyo, Japan. Department of Computer Science and Engineering. Indian Institute of Technology Kanpur. Kanpur, India. Electronics Department. Signal Theory and Communications Research Group. Universidad de Mondragón. Arrasate-Modragón, Spain. Assistive Technologies Research Center (ATRC). College of Engineering and Computer Science. 
Wright State University. Dayton, Ohio, USA. Department of Computer Science and Engineering. The Chinese University of Hong Kong. Shatin, N.T., Hong Kong. Researcher Research Fields Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (algorithm development) Other biometrics (algorithm development) Implementation of biometric systems on embedded platforms Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Fingerprint biometrics (system architecture) Implementation of biometric systems on embedded platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms Implementation of biometric systems on HPC platforms Exploitation of FPGA devices in biometric applications Fingerprint biometrics (system architecture) Other biometrics (system architecture) Implementation of biometric systems on embedded platforms URL Yongwha Chung http://www.korea.edu/ Sung Bum Pan http://eng.chosun.ac.kr/ Daesung Moon http://www.etri.re.kr/eng/ Kichul Kim http://eng.snu.ac.kr/english/ Marcos FaundezZanuy 
http://www.eupmt.cat/ Hiroaki Kunieda http://www.titech.ac.jp/english/ Phalguni Gupta http://www.cse.iitk.ac.in/ http://www.mondragon.edu/en/p hs/research/researchlines/signal-andcommunications-theory/ Jon Altuna Nikolaos Bourbakis http://www.cs.wright.edu/atrc/ Yiu Sang Moon http://www.cse.cuhk.edu.hk/v6/r esearch/lab/biometrics.html Table 41. Some of the most active academy-related research groups in the field of fingerprint biometrics. 4.4.2. Industry Many companies have been grouped in the way of organizations to work together with the final aim of advancing in the field of biometrics. One example can be seen in The Biometric Consortium (http://www.biometrics.org) that serves as the US government’s focal point for research, development, testing, evaluation, and application of biometric-based personal identification/verification technology. Another example is The International Biometrics & Identification Association (http://www.ibia.org), founded in Washington, which promotes the effective use of technology to identify individuals and enhance security and privacy in current society. In addition, The European Biometric Forum, which has recently shut its doors, was an independent organization aiming at supporting the development of the biometrics industry in Europe as well. Table 42 shows some of the companies that are member of such organizations in the area of fingerprint biometrics. These are only some examples among the thousands of companies around the world actively involved in the field of biometrics. 138 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 Company Name 123ID, Inc. 360 Biometrics 3M Cogent Systems ActivIdentity AllTrust Networks Antheus Technology, Inc. Artemis Solutions Group LLC. AuthenTec, Inc. Axxis Biometrics, LLC. Bergdata Biometrics GmbH. BIO-key International BioLink Solutions Biometrika srl. Biometrix Int. Cansec Systems Ltd. 
Cross Match Technologies, Inc. Datastrip Company Name DERMALOG Identification Systems GmbH. Digent DigitalPersona, Inc. e-DATA Corp. Fingerprint Cards AB Green Bit S.p.A. Griaule Biometrics Hongda Opto-Electron Co., Ltd. IBM Corporation id3 Semiconductors IDLink Systems Pte Ltd. IDENCOM Germany GmbH. iEVO Innovatrics Integrated Biometrics Lumidigm, Inc. M2SYS LLC. Company Name Merkatum Corporation NEC Corporation Neurotechnology NITGEN & Company Co. Ltd. Precise Biometrics Safran L-1 Identity Solutions, Inc. Safran MORPHO, Inc. (MorphoTrak) SecuGen Corporation Sense Technologies Startek Engineering Inc. Suprema, Inc. TBS (Touchless Biometric Systems) TechSense Ventures Group Pte. Ltd. TSSI Systems Ltd. Ultra-Scan Corporation Verimetrics Zvetco Biometrics LLC. Table 42. Some of the most relevant companies in the field of fingerprint biometrics. 4.5. Research Disclosure The continuous advances in the fields of biometrics and embedded systems technologies, and the deployment of biometric applications are mainly disclosed through international journals and conferences all over the world. In the next sections, a general overview of the most relevant journals and conferences that help to spread the recent advances in those research areas is provided. 4.5.1. Journals In Table 43 some of the most important international scientific journals focused on topics such as pattern recognition, biometrics, digital design, and development of embedded systems are listed. 
| Acronym | Journal | Publisher | URL |
|---|---|---|---|
| AEU | International Journal of Electronics and Communications | Elsevier | http://www.sciencedirect.com/science/journal/14348411 |
| ASMP | EURASIP Journal on Audio, Speech, and Music Processing | Hindawi | http://www.hindawi.com/journals/asmp/ |
| ASP | EURASIP Journal on Advances in Signal Processing | Hindawi | http://www.hindawi.com/journals/asp/ |
| COMCOM | Computer Communications | Elsevier | http://www.elsevier.com/locate/comcom |
| COSE | Computers & Security | Elsevier | http://www.elsevier.com/locate/cose |
| CVIU | Computer Vision and Image Understanding | Elsevier | http://www.elsevier.com/locate/cviu |
| DAES | Design Automation for Embedded Systems | Springer | http://www.springer.com/engineering/circuits+%26+systems/journal/10617 |
| DSP | Digital Signal Processing | Elsevier | http://www.elsevier.com/locate/dsp |
| ES | EURASIP Journal on Embedded Systems | Hindawi | http://www.hindawi.com/journals/es/ |
| FGCS | Future Generation Computer Systems | Elsevier | http://www.elsevier.com/locate/fgcs |
| IET-BMT | IET Biometrics | IET, IEEE | http://digital-library.theiet.org/IET-BMT |
| IET-CDT | IET Computers & Digital Techniques | IET, IEEE | http://digital-library.theiet.org/IET-CDT |
| IET-CVI | IET Computer Vision | IET, IEEE | http://digital-library.theiet.org/IET-CVI |
| IET-EL | IET Electronics Letters | IET, IEEE | http://digital-library.theiet.org/EL |
| IET-IFS | IET Information Security | IET, IEEE | http://digital-library.theiet.org/IET-IFS |
| IET-IPR | IET Image Processing | IET, IEEE | http://digital-library.theiet.org/IET-IPR |
| IET-SPR | IET Signal Processing | IET, IEEE | http://digital-library.theiet.org/IET-SPR |
| IJPP | International Journal of Parallel Programming | Springer | http://www.springer.com/computer/theoretical+computer+science/journal/10766 |
| IPL | Information Processing Letters | Elsevier | http://www.elsevier.com/locate/ipl |
| IS | EURASIP Journal on Information Security | Hindawi | http://www.hindawi.com/journals/is/ |
| IVC | Image and Vision Computing | Elsevier | http://www.elsevier.com/locate/imavis |
| IVP | EURASIP Journal on Image and Video Processing | Hindawi | http://www.hindawi.com/journals/ivp/ |
| JECE | Journal of Electrical and Computer Engineering | Hindawi | http://www.hindawi.com/journals/jece/ |
| JETC | ACM Journal on Emerging Technologies in Computing Systems | ACM | http://jetc.acm.org/ |
| JETCAS | IEEE Journal on Emerging and Selected Topics in Circuits and Systems | IEEE | http://jetcas.polito.it/ |
| JLPEA | Journal of Low Power Electronics and Applications | MDPI | http://www.mdpi.com/journal/jlpea/ |
| JNCA | Journal of Network and Computer Applications | Elsevier | http://journals.elsevier.com/10848045/journal-of-network-and-computer-applications/ |
| JPDC | Journal of Parallel and Distributed Computing | Elsevier | http://www.elsevier.com/locate/jpdc |
| JRTIP | Journal of Real-Time Image Processing | Springer | http://www.springer.com/computer/image+processing/journal/11554 |
| JSA | Journal of Systems Architecture | Elsevier | http://www.elsevier.com/locate/sysarc |
| JSEE | Journal of Systems Engineering and Electronics | Elsevier | http://www.elsevier.com/locate/inca/709324 |
| JSPS | Journal of Signal Processing Systems | Springer | http://www.springer.com/engineering/signals |
| JVCIR | Journal of Visual Communication and Image Representation | Elsevier | http://www.elsevier.com/locate/jvci |
| MEE | Microelectronic Engineering | Elsevier | http://www.elsevier.com/locate/mee |
| MEJO | Microelectronics Journal | Elsevier | http://www.elsevier.com/locate/mejo |
| MICPRO | Microprocessors and Microsystems | Elsevier | http://www.elsevier.com/locate/micpro/ |
| MTA | Multimedia Tools and Applications | Springer | http://www.springer.com/computer/information+systems+and+applications/journal/11042 |
| PARCO | Parallel Computing | Elsevier | http://www.elsevier.com/locate/parco |
| PR | Pattern Recognition | Elsevier | http://www.elsevier.com/locate/pr |
| PRIA | Pattern Recognition and Image Analysis | Springer | http://www.springer.com/computer/image+processing/journal/11493 |
| PRL | Pattern Recognition Letters | Elsevier | http://www.elsevier.com/locate/patrec |
| SP | Signal Processing | Elsevier | http://www.elsevier.com/locate/sigpro |
| SPIC | Signal Processing: Image Communication | Elsevier | http://www.elsevier.com/locate/image |
| TC | IEEE Transactions on Computers | IEEE | http://www.computer.org/portal/web/tc |
| TCAD | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | IEEE | http://www.ece.umn.edu/~sachin/tcad/ |
| TCAS | IEEE Transactions on Circuits and Systems II: Express Briefs | IEEE | http://ieee-cas.org/publications/transactions-on-circuits-and-systems-part-ii-express-briefs/ |
| TCE | IEEE Transactions on Consumer Electronics | IEEE | http://www.ewh.ieee.org/soc/ces/publications_trans_ce.html |
| TCSVT | IEEE Transactions on Circuits and Systems for Video Technology | IEEE | http://ieee-cas.org/publications/transactions-on-circuits-and-systems-for-video-technology/ |
| TECS | ACM Transactions on Embedded Computing Systems | ACM | http://acmtecs.acm.org/ |
| TIE | IEEE Transactions on Industrial Electronics | IEEE | http://tie.ieee-ies.org/ |
| TIFS | IEEE Transactions on Information Forensics and Security | IEEE | http://www.signalprocessingsociety.org/publications/periodicals/forensics/ |
| TII | IEEE Transactions on Industrial Informatics | IEEE | http://tii.ieee-ies.org/ |
| TIP | IEEE Transactions on Image Processing | IEEE | http://www.signalprocessingsociety.org/publications/periodicals/image-processing/ |
| TISSEC | ACM Transactions on Information and System Security | ACM | http://tissec.acm.org/ |
| TJS | The Journal of Supercomputing | Springer | http://www.springer.com/computer/swe/journal/11227 |
| TOCL | ACM Transactions on Computational Logic | ACM | http://tocl.acm.org/ |
| TODAES | ACM Transactions on Design Automation of Electronic Systems | ACM | http://todaes.acm.org/ |
| TPAMI | IEEE Transactions on Pattern Analysis and Machine Intelligence | IEEE | http://www.computer.org/portal/web/tpami/ |
| TRETS | ACM Transactions on Reconfigurable Technology and Systems | ACM | http://trets.cse.sc.edu/ |
| TSP | IEEE Transactions on Signal Processing | IEEE | http://www.signalprocessingsociety.org/publications/periodicals/tsp/ |
| TVLSI | IEEE Transactions on Very Large Scale Integration (VLSI) Systems | IEEE | http://tvlsi.eecs.northwestern.edu/index.html |
| VLSI | Integration, the VLSI Journal | Elsevier | http://www.elsevier.com/locate/vlsi |

Table 43. List of Journals.

4.5.2. Conferences

Table 44 lists some of the most important international conferences dealing with biometrics and embedded system-based applications.

| Acronym | Conference | Publisher | URL |
|---|---|---|---|
| ACCPR | Asian Conference on Pattern Recognition | IEEE | http://www.acpr2011.org/ |
| ACSAC | Annual Computer Security Applications Conference | ACM | http://www.acsac.org/ |
| ACT | International Conference on Advances in Computing, Control, & Telecommunication Technologies | IEEE | http://act.engineersnetwork.org/2011/ |
| AHS | NASA/ESA Conference on Adaptive Hardware and Systems | IEEE | http://www.see.ed.ac.uk/ahs2011/ |
| APPT | Advanced Parallel Processing Technologies | Springer | http://ppi.fudan.edu.cn/appt2011/ |
| ARC | International Symposium on Applied Reconfigurable Computing | Springer | http://arc2011.org/ |
| ARTCom | International Conference on Advances in Recent Technologies in Communication and Computing | IEEE | http://artcom.engineersnetwork.org/2011/ |
| ASAP | International Conference on Application-specific Systems, Architectures and Processors | IEEE | http://asap-conference.org/ |
| ASIACCS | ACM Symposium on Information, Computer and Communications Security | ACM | http://www.cs.hku.hk/asiaccs2011/ |
| ASID | International Conference on Anti-Counterfeiting, Security and Identification in Communication | IEEE | http://fjic.xmu.edu.cn/ASID/ |
| CAIP | International Conference on Computer Analysis of Images and Patterns | Springer | http://congreso.us.es/caip2011/ |
| CCS | Conference on Computer and Communications Security | ACM | http://www.sigsac.org/ccs.html |

Note: Table 44 continues below.
| Acronym | Conference | Publisher | URL |
|---|---|---|---|
| CHES | Cryptographic Hardware and Embedded Systems | Springer | http://www.iacr.org/workshops/ches/ |
| CIARP | Progress in Pattern Recognition, Image Analysis and Applications | Springer | http://www.ciarp.org/ |
| CIS | International Conference on Computational Intelligence and Security | Springer | http://www.cis-lab.org/ |
| CISDA | Symposium on Computational Intelligence for Security and Defense Applications | IEEE | http://ieee-ssci.org/2011/cisda-2011/ |
| CISP | International Congress on Image and Signal Processing | IEEE | http://cisp-bmei.dhu.edu.cn/index.html |
| CISSE | International Joint Conferences on Computer, Information, and Systems Sciences, and Engineering | IEEE | http://www.cisseconference.org/2011 |
| CODASPY | ACM Conference on Data and Application Security and Privacy | ACM | http://www.codaspy.org/ |
| CODES+ISSS | International Conference on Hardware/software Codesign and System Synthesis | ACM, IEEE | http://esweek.acm.org/codesisss/ |
| CSA | International Symposium on Computer Science and its Applications | IEEE | http://www.ftrai.org/csa2011/ |
| CSAE | International Conference on Computer Science and Automation Engineering | IEEE | http://www.ieee-csae.org/ |
| CVPR | Conference on Computer Vision and Pattern Recognition | IEEE | http://www.cvpr2011.org/ |
| DAC | Design Automation Conference | ACM | http://www.dac.com/ |
| DASIP | Conference on Design & Architectures for Signal & Image Processing | IEEE | http://www.ecsi.org/dasip |
| DATE | Conference on Design, Automation and Test in Europe | European Design and Automation Association, IEEE, ACM | http://www.date-conference.com/ |
| DICTA | Digital Image Computing: Techniques and Applications | IEEE | http://itee.uq.edu.au/~dicta2011/ |
| DIM | Workshop on Digital Identity Management | ACM | http://www2.pflab.ecl.ntt.co.jp/dim2010/ |
| DSD | Euromicro Conference on Digital System Design, Architectures, Methods and Tools | IEEE | http://www.dsdconf.org/ |
| DSP | International Conference on Digital Signal Processing | IEEE | http://dsp2011.gr/ |
| EIT | International Conference on Electro/Information Technology | IEEE | http://www.eit-conference.org/ |
| ESPA | International Conference on Emerging Signal Processing Applications | IEEE | http://www.ieee-espa.org/ |
| FCCM | Annual International IEEE Symposium on Field-Programmable Custom Computing Machines | IEEE | http://fccm.org/2011/ |
| FITME | International Conference on Future Information Technology and Management Engineering | IEEE | http://fitme2010.jstu.edu.cn/ |
| FPGA | ACM/SIGDA International Symposium on Field Programmable Gate Arrays | ACM | http://www.isfpga.org |
| FPL | International Conference on Field Programmable Logic and Applications | IEEE | http://fpl2011.org/ |
| FPT | International Conference on Field Programmable Technology | IEEE | http://www.cse.iitd.ernet.in/~icfpt11/ |
| GLSVLSI | Great Lakes Symposium on VLSI | ACM | http://www.glsvlsi.org/ |
| HCW | International Heterogeneity in Computing Workshop | IEEE | http://hcw.wsu.edu/ |
| HOST | International Workshop on Hardware-Oriented Security and Trust | IEEE | http://hostsymposium.org/ |
| HiPEAC | International Conference on High Performance and Embedded Architectures and Compilers | ACM | http://www.hipeac.net/hipeac2011 |
| HPCS | International Conference on High Performance Computing and Simulation | IEEE | http://hpcs11.cisedu.info/ |
| ICAC3 | International Conference on Advances in Computing, Communication and Control | Springer | http://www.frcrce.ac.in/icac/ |
| ICAPR | International Conference on Advances in Pattern Recognition | IEEE | http://www.isical.ac.in/~icapr09/ |
| ICASSP | International Conference on Acoustics, Speech and Signal Processing | IEEE | http://www.icassp2011.com/ |
| ICCCS | International Conference on Communication, Computing & Security | ACM | http://www.nitrkl.ac.in/conference/icccs2011 |
| ICCD | International Conference on Computer Design | IEEE | http://www.iccd-conf.com/ |
| ICCE | International Conference on Consumer Electronics | IEEE | http://www.icce.org/ |
| ICCES | International Conference on Computer Engineering & Systems | IEEE | http://www.iceec.org/ |
| ICCIT | International Conference on Computer and Information Technology | IEEE | http://www.cs.ucy.ac.cy/CIT2011/ |
| ICCMS | International Conference on Computer Modeling and Simulation | IEEE | http://www.iccms.org/ |
| ICCSP | International Conference on Communications and Signal Processing | IEEE | http://www.iccsp2011.nitc.ac.in/ |
| ICCST | International Carnahan Conference on Security Technology | IEEE | http://www.iccst2011.com/ |
| ICIAR | International Conference on Image Analysis and Recognition | Springer | http://www.iciar.uwaterloo.ca/iciar11/ |
| ICIAS | International Conference on Intelligent and Advanced Systems | IEEE | http://www.utp.edu.my/icias2010/ |
| ICIC | International Conference on Intelligent Computing | Springer | http://www.ic-ic.org/2011/ |
| ICICS | International Conference on Information, Communications and Signal Processing | IEEE | http://www.icics.org/2011/ |
| ICIP | International Conference on Image Processing | IEEE | http://www.icip2011.org/ |
| ICISP | International Conference on Image and Signal Processing | Springer | http://www.uqtr.ca/~icisp/ |
| ICM | International Conference on Microelectronics | IEEE | http://www.ieee-icm.com/ |
| ICPCA | International Conference on Pervasive Computing and Applications | IEEE | http://www.theuais.org/icpca2011/ |
| ICPR | International Conference on Pattern Recognition | IEEE | http://www.icpr2012.org/ |
| IECON | Annual Conference on IEEE Industrial Electronics | IEEE | http://www.iecon2011.org/ |
| IJCB | International Joint Conference on Biometrics | IEEE | http://www.cse.nd.edu/IJCB_11/ |
| IPDPS | International Parallel and Distributed Processing Symposium | IEEE | http://www.ipdps.org/ |
| IPTA | Workshops on Image Processing Theory, Tools and Applications | IEEE | http://ipta10.ibisc.univ-evry.fr/doku.php |
| ISCAS | International Symposium on Circuits and Systems | IEEE | http://www.iscas2011.org/ |
| ISCE | International Symposium on Consumer Electronics | IEEE | http://www3.ntu.edu.sg/SCE/isce2011/ |
| ISIE | International Symposium on Industrial Electronics | IEEE | http://www.isie2011.pl/ |

Note: Table 44 continues below.
| Acronym | Conference | Publisher | URL |
|---|---|---|---|
| ISLPED | International Symposium on Low Power Electronics and Design | IEEE | http://www.islped.org/2011/ |
| ISPACS | International Symposium on Intelligent Signal Processing and Communication Systems | IEEE | http://www.ispacs2011.org/ |
| ISSPA | International Conference on Information Sciences, Signal Processing and their Applications | IEEE | http://www.isspa2010.com/ |
| MiFor | International ACM Workshop on Multimedia in Forensics and Intelligence | ACM | http://www.acmmm11.org/ |
| MIXDES | International Conference on Mixed Design of Integrated Circuits and Systems | IEEE | http://www.mixdes.org/ |
| MM&Sec | Workshop on Multimedia and Security | ACM | http://www.mmsec11.com/ |
| MWSCAS | International Midwest Symposium on Circuits and Systems | IEEE | http://www.mwscas.org/ |
| NSPW | Workshop on New Security Paradigms | ACM | http://www.nspw.org/ |
| PCSPA | International Conference on Pervasive Computing, Signal Processing and Applications | IEEE | http://www.pcspa2011.com/ |
| PSIVT | Advances in Image and Video Technology | Springer | http://www.psivt.org/ |
| RAW | Reconfigurable Architectures Workshop | IEEE | http://www.ece.lsu.edu/vaidy/raw/ |
| ReConFig | International Conference on Reconfigurable Computing and FPGAs | IEEE | http://www.reconfig.org/ |
| ReCoSoC | International Workshop on Reconfigurable Communication-centric Systems-on-Chip | IEEE | http://www.recosoc.org/ |
| SAC | ACM Symposium on Applied Computing | ACM | http://www.acm.org/conferences/sac/ |
| SAFECOMP | International Conference on Computer Safety, Reliability, and Security | Springer | http://www.safecomp2011.unina.it/ |
| SAMOS | International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | IEEE | http://samos.et.tudelft.nl/ |
| SIGMAP | International Conference on Signal Processing and Multimedia Applications | IEEE | http://sigmap.icete.org/ |
| SiPS | Workshop on Signal Processing Systems | IEEE | http://www.sips11.org/ |
| SOCC | International SOC Conference | IEEE | http://www.ieee-socc.org/ |
| SoICT | Symposium on Information and Communication Technology | ACM | http://www.fit.hut.edu.vn/~soict2010/ |
| SPL | Southern Conference on Programmable Logic | IEEE | http://www.splconf.org/spl11/ |
| TRUST | International Conference on Trust and Trustworthy Computing | Springer | http://www.trust2011.org/ |
| WESS | Workshop on Embedded Systems Security | ACM | http://www.wess-workshop.org/ |
| WIFS | International Workshop on Information Forensics and Security | IEEE | http://www.wifs11.org |
| WISA | International Workshop on Information Security Applications | Springer | http://www.wisa.or.kr/ |
| WISES | Workshop on Intelligent Solutions in Embedded Systems | IEEE | http://fbim.fh-regensburg.de/~wises2011/ |
| WISP | International Symposium on Intelligent Signal Processing | IEEE | http://www.trivent.hu/WISP2011/ |

Table 44. List of International Conferences.

4.6. Related Work

In the field of biometrics, different research lines have been established over the last decades. One direction is the study and development of reliable recognition algorithms, already covered in chapter 2. Other directions deal with the definition and physical implementation of high-performance platforms on which to run those recognition algorithms, as discussed in chapter 3 and the first sections of this chapter. Different options are possible depending on the nature of the system: (i) standard solutions based on executing the recognition algorithms on commercial personal computer workstations, (ii) application-specific chips in charge of the biometric processing, or (iii) other more flexible embedded system solutions based on cost-effective general-purpose processors, digital signal processors and/or programmable logic devices. The next sections classify the most relevant works published in the scientific literature into AFAS applications, other monomodal biometric applications, and multimodal biometric systems.

4.6.1. AFAS SW

In recent years, many works have been published on the physical implementation of AFAS applications on HPC or embedded system platforms based on general-purpose or application-specific standard processors. The recognition applications are executed purely in software, directly driven by the microcontrollers, microprocessors, digital signal processors, graphics processing units or multiprocessor devices embedded in the systems. System performance is mainly constrained by the limited computational power of the processing platforms (which affects the real-time execution performance) and by the insufficient recognition accuracy inherent to the algorithms under test (which affects the FAR/FRR/EER performance). Table 45 summarizes some of the AFAS applications based on purely software embedded platforms disclosed in literature.

| Research Work | Embedded System Architecture (1) | HW/SW | Execution Time Performance (2) | Recognition Accuracy Performance (2) |
|---|---|---|---|---|
| [1] | Whole Recognition Process: MPU (StrongArm, 206 MHz) | SW-only | 0.9 s | EER = 6.94% |
| [2] | Whole Recognition Process: MCU (8051, 25 MHz) | SW-only | < 1.0 s | EER = 4.27% |
| [3] | Whole Recognition Process: MPU (ARM9 S3C2410) | SW-only | < 0.5 s | FAR = 0.0001%, FRR = 0.1% |
| [4] | Image Enhancement, Feature Extraction & Matching: DSP (TMS320VC5510A, 200 MHz) | SW-only | > 2.0 s | FAR = 0.18%, FRR = 2.79% |
| [5] | Whole Recognition Process: DSP (TMS320C54x, 100 MHz) | SW-only | 0.665 s | FAR = 0.001%, FRR = 0.1% |
| [6] | Whole Recognition Process: DSP (TMS320F2812) | SW-only | < 1.0 s | FAR < 1%, FRR < 5% |
| [7] | Fingerprint Acquisition & Image Reconstruction: MPU (ARM LPC2106, 60 MHz); Image Enhancement, Feature Extraction & Matching: MPU (S1C33, 13 MHz) | SW-only | 1.5 s; 6.0 s | EER = 4.13% |
| [8] | Image Enhancement, Feature Extraction & Matching: MPU (S1C33, 13 MHz) | SW-only | 5.0 s | EER = 4.16% |
| [9] | Image Enhancement, Feature Extraction & Matching: MPU (S1C33, 13 MHz) | SW-only | 4.5 s | EER = 4.23% |
| [10] | Feature Matching: MPU (ARM7TDMI, 50 MHz) | SW-only | 0.9 s | EER = 6.00% |
| [11] | Whole Recognition Process: MPU (LEON-2, 50 MHz) | SW-only | 10.0 s | FAR = 0.01%, FRR = 0.5% |
| [12] | Feature Matching: MPU (Java processor, 5 MHz) | SW-only | 1.10 s; 3.28 s; 6.49 s | EER = 9.6%; EER = 7.9%; EER = 7.2% |

Note 1: The recognition algorithms, memory and processing resources used in each research work are different.
Note 2: Performance indicator values in each work are given for different databases.

[1]: [Tang et al., 2004]
[2]: [Chen and Dai, 2005]
[3]: [Ping et al., 2010]
[4]: [Shi and Xie, 2009]
[5]: [Rikin et al., 2002]
[6]: [Zhang et al., 2011]
[7]: [Su et al., 2005b]
[8]: [Chen et al., 2005]
[9]: [Su et al., 2005a]
[10]: [Gil et al., 2003]
[11]: [Hwang and Verbauwhede, 2004]
[12]: [Moon et al., 2005]

Table 45. Software-based AFAS applications disclosed in literature.
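
Table 45 reports accuracy as FAR, FRR or EER. As a reminder of how these figures relate to each other, the following minimal Python sketch estimates them from lists of genuine and impostor comparison scores. The scores, the function names and the acceptance rule (a comparison is accepted when its score is at least the threshold) are assumptions made for illustration; none of them is taken from the works listed in Table 45.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) for a given similarity-score threshold.

    Assumption: a comparison is accepted when score >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        # Keep the threshold where FAR and FRR are closest to each other.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    _, eer, threshold = best
    return eer, threshold


# Hypothetical similarity scores, invented for this example only.
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.59, 0.95, 0.72, 0.88]
impostor_scores = [0.12, 0.35, 0.41, 0.62, 0.28, 0.19, 0.55, 0.07]

eer, thr = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER ~ {eer:.3f} at threshold {thr}")
```

Because each work in Table 45 uses its own database and algorithm (see Notes 1 and 2), figures obtained this way are only comparable within a single experimental setup.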
A survey of the state of the art in purely software solutions points out four major trends in the development of AFAS applications:

a) development of AFAS applications in the form of static embedded systems
b) development of AFAS applications in the form of portable embedded systems
c) development of AFAS applications under off-the-shelf hardware embedded systems
d) development of AFAS applications under commercial personal computer platforms

Moreover, when deploying biometric applications on processing platforms with limited computational resources, a priority topic is the set of accommodation techniques used to fit the complexity of the original biometric algorithms into the available hardware resources so that, in the end, acceptable accuracy and execution time are reached. Several trade-offs exist that need to be carefully analysed in each application.

Algorithm accommodation techniques into system processing resources

Many approaches implement an AFAS application following a two-step design process. In the first step, the biometric recognition algorithm is verified on an HPC platform such as a personal computer or any other high-end workstation. Once the validity of the suggested algorithm is proven, in a second step, the algorithm is ported to a cost-effective physical platform (e.g. embedded system, smart card, ID token, etc.). However, where the final physical platform is a low-cost embedded system with poor processing power, longer execution times are normally obtained. In order to meet real-time execution performance in those scenarios, it is usual to adapt the original recognition algorithms so that their complexity fits within the limited resources of the physical platforms that are planned to host the applications.
To reach real-time performance, the computational workload of those algorithms is usually relaxed, or the number of stages needed in the recognition operation is reduced, resulting in modified versions of the original algorithms. As a consequence, some degradation in recognition accuracy is normally exhibited by the final system when compared against the original implementation on an HPC platform without processing constraints. One of the most widely used techniques to accommodate the algorithms into the available system resources is the replacement of floating-point operations by fixed-point operations; other techniques deal with the optimization of memory usage.

In [Tang et al., 2004] the authors address the implementation of a fingerprint verification application on an embedded system based on a 206 MHz StrongArm processor and the embedded Linux operating system. The recognition algorithm is mainly focused on minutia features directly extracted from greyscale fingerprint impressions. The comparison of the accuracy of the original algorithm, based on floating-point operations, and the embedded one, based on fixed-point operations, shows a slight difference: EER = 6.01% in the former versus EER = 6.94% in the latter. However, the comparison of the execution times shows a notable difference: 21 s versus 0.9 s when executed on the StrongArm processor running floating-point emulation (since no FPU is available in the processor) or fixed-point computations, respectively.

In [Yeung et al., 2005] the authors face the implementation of a real-time fingerprint authentication application on another embedded system platform.
The system is composed of one Intel PXA255 processor (commonly found in PDAs and mobile phones), limited RAM and FLASH memories, and one fingerprint sensor and one touch screen used as user interfaces. A fingerprint verification algorithm based on minutia points is also suggested in this work. It covers the whole recognition process, from the fingerprint acquisition to the authentication decision stage. The implementation is purely software-based; no hardware coprocessors are used to speed up the processing. Since embedded processors like Intel PXA255 or Motorola ARM are not equipped with FPUs, and in order to avoid the software emulation of floating-point operations, which would lead to longer execution times, slow floating-point arithmetic is migrated to faster fixed-point arithmetic. Lookup tables are used to implement trigonometric functions, as an alternative to hardware instances of CORDIC coprocessors. The execution time of the floating-point embedded system is compared against that of the fixed-point embedded system. The reduction in average verification time is notable: from 47 s to 0.97 s. The evaluation of both implementations on a custom fingerprint database composed of 1149 images (corresponding to 381 users) shows that no significant degradation of accuracy occurs in the fixed-point embedded system. This is because the transformation from floating-point to fixed-point arithmetic is carefully designed in this work: every number is coded in a 32-bit word (fifteen bits for the integer part, sixteen bits for the fractional part, and one bit for the sign) in order to reduce the quantization errors as much as possible.

In [Allah, 2005] the author addresses the implementation of a fingerprint authentication system on a computational platform based on a DSP processor.
Additional resources like 1 MB of FLASH memory and one fingerprint sensor are also part of the system. The original matching algorithm, based on image pattern matching, is adapted to allow its execution on constrained platforms with limited memory and processing resources like state-of-the-art smart cards. A TMS320VC5409 16-bit DSP processor running at 100 MHz with 64 kB of on-chip RAM is used in the implementation of the authentication system. In order to fit the authentication algorithm within the on-chip memory, dynamic memory allocation is used. The authentication algorithm is split into four stages: fingerprint acquisition (150 ms), image preprocessing (182 ms), image processing (299 ms) and fingerprint matching (16 ms); real-time performance (647 ms in total) is achieved when the authentication process is performed on that resource-limited platform.

Similarly, in [Pan et al., 2003] the authors deal with the development of a memory-efficient fingerprint matching algorithm and its integration in a smart card platform with limited resources such as one 32-bit 66 MHz CPU, 64 kB ROM, 32 kB EEPROM, and up to 8 kB RAM. This article presents a minutia-based match-on-card system that can operate in real time in resource-constrained environments like smart cards or other embedded systems. Owing to the RAM limitations of current smart cards, the authors convert the original matching algorithm (traditionally implemented on platforms without resource limits, such as personal computers) into a memory-optimized algorithm at the expense of extending the execution time of the matching process. The authors focus their work on modifying the original matching algorithm in order to fit it into the resources of standard smart cards.
By doing so, the execution time of the application worsens, but without compromising the real-time requirements and the recognition accuracy of the resulting matching system. In this way, it is possible to embed the match-on-card application in the smart card platform. The minimum RAM demanded by the matching application becomes 4.8 kB, and the matching process takes 1.6 s on one ARM7TDMI CPU when processing fingerprint images of size 248×292 pixels. The resulting execution time is about five times longer than in the original scenario, where 300 kB of RAM are used and an execution time of 0.3 s is achieved. The recognition accuracy remains the same in both implementations.

Other algorithm accommodation techniques considered when porting complex applications to constrained computational platforms involve the development of a set of custom instruction extensions for the embedded processors in charge of the computations. These techniques are only valid in those scenarios where the available hardware processing resources provide such flexibility (e.g. soft-core instances of processors in programmable logic devices). The new instructions execute the specific operations needed to speed up the time-intensive tasks involved in the recognition process. Some examples are provided in [Yang and Verbauwhede, 2004], [Gupta et al., 2005] and [Yang et al., 2005].

The work [Gupta et al., 2005] addresses the integration of a fingerprint-based user authentication application in classical resource-constrained embedded systems based on low-cost microprocessors and limited memory blocks. One T1050.3 Xtensa processor from Tensilica Inc. configured at 33 MHz is used, with 8 kB instruction and data caches, one 32-bit multiplier, and no FPU in order to reduce the total silicon area needed by the device.
Moreover, one Authentec AES3400 fingerprint sensor is used to capture fingerprint images. A minutiae-based fingerprint matching algorithm is deployed in this work. The original software implementation of the FVS algorithm (http://fvs.sourceforge.net), mainly based on floating-point operations, is transformed into a new version based on fixed-point operations. Moreover, the trigonometric (sine, cosine, etc.) and other mathematical (exponential, square root, etc.) functions used in the algorithm are implemented through a CORDIC hardware coprocessor in the suggested system. Custom instructions that implement mathematical operations through the CORDIC hardware coprocessor are instantiated in the microprocessor in order to accelerate the processing. The original algorithm is reorganized in order to minimize the transfers of the images under process from system memory to cache. All those architectural modifications of the embedded system at both hardware and software levels aim at speeding up the user authentication process. The resulting authentication system features an area overhead of about 10% compared with the area of the original processor without custom instructions. In return, the execution time improves by a factor of 10.4.

In the same direction, in [Yang et al., 2005] the authors develop a dedicated coprocessor called FV16 to be attached to an ARM processor in order to speed up application-specific operations that take place in a cryptographic and biometric authentication system. The coprocessor is based on a microcoded architecture, for which a dedicated assembler instruction set is also developed. The introduction of the microcoded coprocessor in the system speeds up the processing of the fingerprint authentication application by a factor of 83 with regard to the original implementation on the standard ARM processor alone.
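As a rough illustration of why a CORDIC unit suits an FPU-less processor such as the one described above for [Gupta et al., 2005], the following Python sketch computes sine and cosine in rotation mode using only integer additions, shifts and a small arctangent table. It is not the coprocessor of that work: the 16-bit fractional format, the 16 iterations and the function name `cordic_sin_cos` are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 16            # assumed fractional width, not taken from the cited work
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0
N_ITER = 16               # assumed number of CORDIC iterations

# Constant table atan(2^-i), precomputed offline and stored as integers.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]

# The micro-rotations stretch the vector by a fixed gain; starting from
# x = K (instead of 1.0) compensates for it without any multiplication.
K = 1.0
for i in range(N_ITER):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
K_FIXED = round(K * ONE)


def cordic_sin_cos(angle_rad):
    """Return (sin, cos) for an angle in [-pi/2, pi/2].

    The loop body uses integer shifts and additions only; floats appear
    only when converting the input and the result.
    """
    z = round(angle_rad * ONE)      # residual angle still to be rotated
    x, y = K_FIXED, 0
    for i in range(N_ITER):
        if z >= 0:                  # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:                       # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y / ONE, x / ONE


sin_30, cos_30 = cordic_sin_cos(math.pi / 6)
print(f"sin = {sin_30:.4f}, cos = {cos_30:.4f}")
```

In a hardware realization each loop iteration would map to one shift-and-add stage, which is why neither a multiplier nor an FPU is needed for the trigonometric functions.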
The FV16 design is implemented on a Xilinx Virtex-II XC2V1000 FPGA device.

AFAS development under static embedded systems

Many works published in the literature deal with the implementation of fingerprint-based recognition applications on low-cost embedded system platforms in the form of static matcher modules. A wide variety of architectures and platforms is used, covering single-core and multi-core processors.

In this sense, in [Chen and Dai, 2005] the authors implement a fingerprint recognition algorithm of reduced complexity on a low-cost and resource-constrained embedded system platform composed of an 8-bit 8051 MCU able to run at 25 MHz, and one Authentec AFS2 fingerprint sensor that acquires fingerprint impressions of size 128×128 pixels at a resolution of 250 dpi. The application is split into four main steps: fingerprint acquisition, image preprocessing, feature extraction and feature matching. The memory needs of the processing algorithm are reduced to 12 kB of program memory and 35 kB of data memory. The experimental results show the efficiency of the physical system in terms of execution time and recognition accuracy. On the one hand, the recognition accuracy is evaluated on a database of 594 fingerprints from 54 different fingers, and EER = 4.27% is reached in the evaluation test. On the other hand, the average execution time needed to match two fingerprints is less than 1 s in the suggested system.

In [Callaly et al., 2007] a real-time fingerprint authentication system for consumer electronics applications in the form of embedded systems like PDAs, mobile phones or similar devices is presented. The platform is composed of a standalone 800 MHz MPU and a capacitive fingerprint sensor MBF200 from Fujitsu.
The recognition algorithm is based on minutia points, and it is fully executed by the MPU. Execution times in the range of 3-5 s are achieved on the proposed software-based platform. No details about the achieved recognition accuracy are provided in this work.

The work [Ping et al., 2010] is another example of an application driven by MPU devices, in this case to implement real-time remote monitoring of biometrics-based access control systems. One ARM9-based S3C2410 processor performs personal recognition through fingerprint biometrics, reaching FAR = 0.0001% and FRR = 0.1%, with response times lower than 0.5 s. The fingerprint recognition module is interfaced with other processors responsible for driving a door lock, and with other modules in charge of wired/wireless communications through GSM or PSTN telephone networks.

Other references like [Kertész, 2008] deal with the acceleration of one specific fingerprint image enhancement stage derived from the techniques published in [Hong et al., 1998] and [Hong et al., 1996]. The architecture suggested for the embedded system platform that hosts the application allows the replacement of the originally accurate but time-expensive floating-point operations by less accurate but faster fixed-point operations, the minimization of the number of complex calculations such as trigonometric or exponential functions, and the evaluation of the proposed implementation in terms of recognition accuracy. The embedded system platform is based on the Fujitsu MDFP200 development kit, which is mainly composed of one Fujitsu MBF200 capacitive fingerprint sensor (of size 256×300 pixels and resolution 500 dpi), one 32-bit RISC processor MB91F302 from Fujitsu running at 68 MHz, and memory blocks of 8 MB SDRAM and 2 MB FLASH.
The experimental tests carried out in this work show that it is possible to speed up the processing at the expense of lowering the recognition accuracy:

a) the most complex enhancement algorithm, based on floating-point operations, processes fingerprint images in 54.98 s on average, providing valid minutiae detection ratios in the range of 60%;
b) the fixed-point version of the algorithm processes the same images in 8.30 s, at the expense of a lower accuracy in the range of 59%;
c) the simplest and computationally less accurate algorithm processes them much faster, in 0.22 s on average on the same platform, but the correct minutia recognition ratio is much lower in this scenario: 53%.

Therefore, based on those results, there is a clear compromise between the recognition accuracy and the execution time that can be achieved on mid- or low-performance processing platforms. The specific requirements that a biometric application must fulfil are in the end the major constraints that set the architectural choices in the physical implementation of the embedded system responsible for executing the application.

In [Survenika et al., 2009] the authors provide a basic example of an embedded system architecture from which to develop a purely software-based personal recognition application. The suggested architecture consists of a Blackfin ADSP-BF533 DSP processor from Analog Devices and the FingerChip AT77C104B thermal sweep fingerprint sensor from Atmel Corporation.
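The floating-point to fixed-point replacement that recurs in the works above can be illustrated with a minimal Python sketch of the 32-bit word reported for [Yeung et al., 2005] (one sign bit, fifteen integer bits, sixteen fractional bits), together with a lookup-table sine of the kind that work uses instead of CORDIC hardware. The two's-complement representation, the saturation on overflow, the 256-entry table size and all function names are assumptions made for illustration; no code from the cited papers is reproduced here.

```python
import math

FRAC_BITS = 16                              # sixteen fractional bits
ONE = 1 << FRAC_BITS                        # fixed-point representation of 1.0
WORD_MIN, WORD_MAX = -(1 << 31), (1 << 31) - 1   # signed 32-bit range (assumed two's complement)


def to_fixed(value):
    """Quantize a real number to the 32-bit word, saturating on overflow (assumption)."""
    raw = round(value * ONE)
    return max(WORD_MIN, min(WORD_MAX, raw))


def to_float(raw):
    """Convert a fixed-point word back to a real number (for inspection only)."""
    return raw / ONE


def fixed_mul(a, b):
    """Multiply two words; the double-width product is shifted back by 16 bits.

    On a 32-bit processor the intermediate product needs a 64-bit register.
    No saturation is applied here, to keep the sketch short.
    """
    return (a * b) >> FRAC_BITS


# Hypothetical 256-entry sine table covering one full turn, built offline.
SINE_LUT = [to_fixed(math.sin(2 * math.pi * k / 256)) for k in range(256)]


def lut_sin(angle_index):
    """Sine of an angle expressed as an index in 1/256ths of a turn."""
    return SINE_LUT[angle_index & 0xFF]


a = to_fixed(3.14159)
b = to_fixed(-2.5)
print("fixed-point product:", to_float(fixed_mul(a, b)))
print("floating-point product:", 3.14159 * -2.5)
print("quantization step:", 1 / ONE)
```

With sixteen fractional bits the quantization step is 2^-16 (about 1.5e-5), which gives an intuition for why a carefully designed fixed-point port can leave the recognition accuracy almost unchanged while removing the cost of floating-point emulation.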
Similarly, in [Shi and Xie, 2009] a fingerprint recognition system is developed on a software-based platform composed of one 16-bit fixed-point DSP processor TMS320VC5510A from Texas Instruments (operating at 200 MHz and provided with 24 kB instruction cache, 320 kB on-chip RAM and 32 kB on-chip ROM), as well as 2 MB off-chip FLASH, 8 MB off-chip SDRAM, and one Fingerprint Cards FPC1011C scanner that acquires fingerprint impressions of size 152×200 pixels. Although no details about the recognition algorithm are provided in this work, the total execution time of the authentication process (covering image processing, feature extraction and matching stages) is reported to be more than 2 s, and the reported recognition accuracy is FAR = 0.18% and FRR = 2.79%.

The work [Zhang and Xie, 2008] addresses the development of a fingerprint recognition system based on one DSP processor and one RF-card. The architecture of the system is mainly composed of one 16-bit fixed-point TMS320VC5502 DSP processor from Texas Instruments Inc. that embeds 32 kB of on-chip ROM and 64 kB of on-chip RAM and operates at up to 300 MHz, an additional 4 MB of off-chip SDRAM and 2 MB of off-chip FLASH, one Fujitsu MBF200 capacitive fingerprint sensor, and one RF-card that stores the template of the user's fingerprint. In the enrolment stage, the fingerprint characteristics of the user are saved in the RF-card. In the authentication stage, the acquired fingerprint is compared with the template (which is read from the RF-card) and the authentication result is disclosed. The proposed system is able to match fingerprints in about 1 s. No recognition accuracy figures are provided in this work.

In [Rikin et al., 2002] a standalone fingerprint authentication system based on a DSP processor TMS320C54x from Texas Instruments and one fingerprint sensor is presented.
The matching process takes into consideration the minutia points and the ridge shapes of fingerprints. The fingerprint sensor captures fingerprint bitmaps of size 224×288 pixels and, from them, templates of size 64 bytes are derived. In addition to the central DSP processor and the fingerprint sensor, the system is composed of a FLASH memory block and one FPGA used to build the interface of the DSP device with the rest of the peripherals (some switches and LEDs embedded in the platform, one RS-232 link with an external host, etc.). The whole authentication process is performed in 665 ms on average when the DSP operates at 100 MHz. The recognition accuracy exhibited by the suggested recognition system is FAR = 0.001% and FRR = 0.1%.

In [Tselios et al., 2008] the authors develop a fingerprint authentication system on a computational platform composed of one TMS320C6713 DSP processor from Texas Instruments and one fingerprint sensor AFS8600 from Authentec. A personal recognition algorithm based on the ridge-valley map information available around the core points of the fingerprint impressions is ported to the embedded system. The fingerprint sensor provides 8-bit greyscale images of size 96×96 pixels with resolution 250 dpi. For recognition accuracy evaluation purposes, a database composed of 800 fingerprint images from 40 different individuals is built with the selected sensor. A genuine authentication accuracy of 91% is reached in this approach. The application is developed purely in software running on the DSP processor, and the average authentication execution time is 745 ms in the proposed system.

Similarly, the work [Li and Qi, 2010] deals with the implementation of a fingerprint recognition system based on one TMS320VC5510 DSP processor from Texas Instruments, one fingerprint sensor FPC1011C from Fingerprint Cards, and one RF-card for template storage purposes.
Apart from the on-chip RAM (64 kB + 256 kB) embedded in the DSP device, 4 MB of SDRAM and 2 MB of FLASH memory are integrated off-chip in the computational platform. Moreover, one keyboard and one LCD display are provided as user interface. The fingerprint recognition algorithm developed in the application is based on image enhancement, binarization, thinning and minutiae extraction stages. The whole feature extraction process (image enhancement, binarization, thinning and minutiae extraction tasks) takes 1004.41 ms on average when the DSP processor runs at its maximum operating frequency of 200 MHz, and the one-to-one matching process takes 148.30 ms, which results in a real-time authentication system (1152.71 ms in total). No details about the recognition accuracy exhibited by the developed system are provided in this work.

In [Zhang et al., 2011] the authors develop a minutiae extraction and matching algorithm that works directly with greyscale fingerprint images. Binarization and thinning of the acquired images are not needed; only image preprocessing and image enhancement tasks are required before the feature extraction process takes place. The extraction algorithm is based on the work of Maio and Maltoni [Maio and Maltoni, 1997], and it is implemented on a processing platform mainly composed of the Texas Instruments TMS320F2812 DSP processor. The execution time of the whole authentication process is less than 1 s, with resulting recognition accuracies in the range of FAR < 1% and FRR < 5%.

In [Ruili and Jing, 2008] the authors develop an embedded system in charge of acquiring and enhancing fingerprint impressions. The system is based on a fixed-point TMS320VC5509A DSP processor from Texas Instruments and a capacitive fingerprint sensor MBF200 from Fujitsu.
Besides, SDRAM and FLASH memories are present in the system, as well as one communication link with a personal computer platform. The developed embedded system deals with the acquisition, enhancement, binarization and thinning stages of the recognition process, and the enhanced images can be transferred to the personal computer for further processing.

Other references that deal with the physical implementation of purely software-based recognition applications making use of single-core and low-cost MPU or DSP processors are [Hong, 2011], [Ning, 2010], [Tong, 2010], [Wan and Zhang, 2011], [Wang and Gao, 2011], [Xian-chun et al., 2011], [Zhang et al., 2010] and [Zhou and Lu, 2009].

Other works use multi-core devices or multiple processors to build the system. In this direction, [Jinhai, 2011] and [Zhang, 2011] provide some examples of fingerprint recognition applications based on a dual-core DSP in charge of the recognition process, which interfaces with an optical fingerprint scanner U.are.U4000 from Digital Persona and one general-purpose MCU P89C52. The DSP processor is exclusively in charge of the biometric recognition. The MCU, in turn, is responsible for driving the user interface (which is composed of a USB link with an external host, a keyboard, and an LCD display), and is able to handle any high-level application in addition to its role of monitoring the biometric recognition process.

In [Wang et al., 2010] an embedded system that implements an access control application based on fingerprint biometrics and wireless communications is presented. The user authentication process is managed by a fingerprint recognition module composed of one Fingerprint Cards FPC1011C scanner and one Texas Instruments TMS320VC5510A DSP processor. Additionally, one ARM S3C2410 processor is in charge of interfacing the fingerprint recognition module and the GSM/GPRS communication module, as well as interacting with the user who wants to access the system.
In case of positive recognition, the system grants the user access to the privileges of the application; in case of negative recognition, wireless alarms can be sent.

In [Wang et al., 2011a] a new example of a fingerprint recognition system is presented. The application is developed purely in software on an embedded system platform composed of one TMS320C5515 DSP processor from Texas Instruments in charge of the fingerprint recognition processing, one FPS200 fingerprint sensor from Veridicom based on solid-state capacitance technology, SRAM and FLASH memories, and one MSP430F1111A MCU from Texas Instruments responsible for the communication link between the user (through an interface based on a keyboard and one LCD display) and the biometric application.

Similarly, in [Su et al., 2010] the authors develop a fingerprint authentication system based on minutia points on an embedded system platform composed of two processing units: the ARM920T S3C2410 processor and the TMS320VC5509A DSP processor. The proposed recognition algorithm features an average execution time of 5 ms and 7 ms when matching the images of FVC2002 DB1 and DB2 respectively on a personal computer platform based on a Pentium IV processor operating at 1.6 GHz. The recognition accuracy exhibited by the algorithm is EER = 1.08% and EER = 1.23% in DB1 and DB2 respectively. The average maximum memory requirement of the algorithm is less than 2.5 kB assuming a template composed of up to 60 minutia points, with template sizes in the range of 200 bytes. The average matching time when executing the same algorithm on the proposed embedded system platform is, however, not provided.
AFAS development under portable embedded systems

Apart from static embedded systems, many works address the development of autonomous and portable systems to cover those applications where mobile biometric tokens are demanded. In this direction, in [Su et al., 2005b] a fingerprint-based authentication system is suggested to be embedded in a mobile phone as a replacement for, or complement to, the traditional PIN-based verification applications that aim at protecting the sensitive and private information stored nowadays in such devices. The biometric authentication system is split into two blocks, the front-end fingerprint capture sub-system and the back-end recognition sub-system:

a) The fingerprint capture sub-system is composed of one Atmel AT77C101B thermal sweep sensor and one general-purpose 32-bit Philips ARM LPC2106 processor. The processor features 128 kB on-chip FLASH and 64 kB on-chip RAM, and it can operate at up to 60 MHz. This sub-system is in charge of the acquisition and reconstruction of the digital impressions of users' fingerprints. This processing takes place outside the phone, in an external module, and the resulting reconstructed images are transferred to the mobile phone sub-system for further processing. The fingerprint image reconstruction takes place on-the-fly, concurrently with the acquisition of the fingerprint slices. The reconstructed images have a resolution of 8 bits and a fixed size of 256×280 pixels. The fingerprint acquisition and reconstruction process typically takes 1.5 s.

b) The fingerprint recognition sub-system is based on the BIRD E868 mobile phone platform developed by Ningbo Bird Mobile Communications Co. Ltd. It is mainly composed of one 16-bit embedded processor S1C33 from Epson operating at 13 MHz. This sub-system is in charge of receiving the template fingerprints in the enrolment phase, as well as the query fingerprints in the authentication phase.
Moreover, the recognition sub-system is responsible for the feature extraction of template and query fingerprints, the storage of the extracted features in the enrolment stage, and the matching of both template and query fingerprints in the authentication stage. The fingerprint recognition algorithm is based on minutia points. An image enhancement stage in the frequency domain helps to improve the clarity of the ridges and the valleys of the acquired print. After image enhancement, the fingerprint ridges are thinned in order to obtain the skeleton of the fingerprint impression, from which its minutiae set is derived. The size of the extracted templates is less than 128 bytes, and the average matching time of two fingerprints is about 6 s in the proposed system.

The communication between both sub-systems takes place through a serial interface based on UARTs running at a baud rate of 230400 bps. The recognition sub-system acts as the master controller of the application, and the capture sub-system works as a slave processor. Both sub-systems are based on purely software solutions, and no acceleration through hardware/software co-design techniques is presented in this work. Based on a custom fingerprint database composed of 480 images captured with the presented fingerprint sensor, the recognition accuracy achieved in this work is EER = 4.13%.

Similarly, in [Chen et al., 2005] a slightly different recognition algorithm is evaluated on the same purely software processing platform with a database composed of 300 fingerprint images of size 128×128 pixels. Accuracy results in the range of EER = 4.16% and an average matching execution time of 5 s are achieved in this work.
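To give a sense of scale for the serial link between the two sub-systems of [Su et al., 2005b], the following Python sketch estimates how long one reconstructed 256×280 image at 8 bits per pixel would need at 230400 bps. The cited work does not report the transfer time; the uncompressed transfer and the 8N1 framing (one start bit, eight data bits, one stop bit, i.e. ten bits on the wire per byte) are assumptions, as is the helper name `uart_transfer_time_s`.

```python
def uart_transfer_time_s(n_bytes, baud_rate, bits_per_frame=10):
    """Time in seconds to send n_bytes over a UART.

    bits_per_frame=10 assumes 8N1 framing (start + 8 data + stop);
    pass 8 to obtain the lower bound without framing overhead.
    """
    return n_bytes * bits_per_frame / baud_rate


BAUD_RATE = 230400                 # bps, as reported in the text
image_bytes = 256 * 280            # 8-bit pixels, one byte per pixel

with_framing = uart_transfer_time_s(image_bytes, BAUD_RATE)
without_framing = uart_transfer_time_s(image_bytes, BAUD_RATE, bits_per_frame=8)

print(f"image size: {image_bytes} bytes")
print(f"transfer time with 8N1 framing: {with_framing:.2f} s")
print(f"lower bound without framing:    {without_framing:.2f} s")
```

Under these assumptions the raw image transfer alone would take between roughly 2.5 s and 3.1 s, which is of the same order as the processing times quoted for this platform.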
The same kind of embedded system architecture is suggested in [Su et al., 2005a], with the only difference being the fingerprint sensor. A Fujitsu MBF310 capacitive sweep sensor is used instead of the Atmel AT77C101B thermal sweep sensor. Based on a custom database obtained with the new fingerprint sensor and composed of up to 480 images of size 200×200 pixels, a recognition accuracy of EER = 4.23% and a matching execution time of 4.5 s are reported.

In [Suto et al., 2004] the authors address the physical implementation of FingerToken, a compact and portable personal fingerprint verification device in the form of a personal token. It integrates the storage of the user's template as well as the processing capabilities to perform fingerprint authentication, and features one USB interface that makes it easy to use in many applications. FingerToken is provided with all the capabilities required for biometric authentication: fingerprint acquisition (a fingerprint sensor is embedded in the token), image enhancement, feature extraction, feature storage and fingerprint matching. It consists of a CPU, a capacitive sensor, a non-volatile memory block and a serial USB interface arranged as a small embedded system. FingerToken is powered by the external device to which it is connected (i.e. a personal computer USB port). The authentication process is done entirely within the device, and the fingerprint template never leaves the device, which provides additional benefits in terms of security and privacy. Since the personal information stored in FingerToken is protected by the fingerprint authentication process, only recognized users can access and use the confidential data stored in the device. No details are provided about the cryptographic protocols to be applied in the exchange of information through the USB bus, or about the specific recognition algorithm implemented in the token.
In [Mueller and Sanchez-Reillo, 2009], the design of an on-token fingerprint authentication application is presented. It is based on the Mobility Token MicroSD from Giesecke & Devrient GmbH, an integrated smart card and flash memory authentication token with USB interface and data encryption/decryption features. The fingerprint acquisition stage is carried out by a webcam installed in any personal computer or embedded device, and the rest of the authentication process is carried out internally in the portable token. The authentication algorithm is saved in the on-token flash, and the personal recognition operation is carried out by the smart card chip as a new approach to biometric embedded systems applications.
The work [Li and Chen, 2010] addresses the design of a portable authentication system with encryption features to increase the security of any personal recognition application. The proposed system is called High-Level Security Portable System (HLSPS), and embeds a fingerprint scanner and all the digital processing units needed to develop the authentication application in the form of a low-cost USB key. The private encryption key and the user’s fingerprint features are stored in the on-chip flash memory. The whole authentication process is done internally in the device, so HLSPS results in a secure on-chip authentication system.
In [Park et al., 2007] the authors evaluate the different partitioning scenarios when embedding a fingerprint verification system, fully or partially, in a smart card. Depending on the distribution of the workload between the smart card and the smart card reader, it is possible to develop store-on-card, match-on-card or system-on-card solutions.
Factors such as the security level exhibited by every verification system model and the application execution time reached in each proposal are evaluated when dealing with resource-constrained embedded systems like smart cards (typically based on one 50 MHz 32-bit ARM processor, 256 kB ROM, 72 kB EEPROM, and 8 kB RAM). A fingerprint recognition algorithm based on minutia points is used as reference, and AES cryptographic algorithms are used to protect the information exchanged between the smart card and the card reader, so that the integrity and confidentiality of the user’s sensitive data can be guaranteed. The processing power and the memory requirements needed in each solution are also considered. As a result of the evaluation process, it is concluded that, when purely software solutions under one ARM7TDMI 50 MHz 32-bit processor are considered, the fingerprint acquisition is estimated to take 1 s, the feature extraction 7.5 s, the feature alignment and matching processes 0.3 s, the AES encryption of fingerprint images (of size 140 kB) 9.7 s, their decryption about 13.8 s, the encryption of templates (of size 1 kB) 0.06 s, and their decryption 0.1 s. Therefore, although system-on-card applications are the safest solutions, the authors conclude from these execution times that system-on-card solutions cannot guarantee real-time performance under the proposed platform. Consequently, the authors suggest the implementation of match-on-card systems.
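The latency argument behind this conclusion can be made explicit with a small worked sum. The Python sketch below uses only the per-step figures quoted above for [Park et al., 2007]; the assignment of steps to each partitioning model (the ON_CARD_STEPS table) and all identifier names are assumptions introduced here for illustration, since the text gives the individual timings but not a per-model breakdown.

```python
# Per-step timings in seconds, as quoted for the 50 MHz ARM7TDMI platform.
TIMINGS_S = {
    "acquisition": 1.0,
    "feature_extraction": 7.5,
    "alignment_and_matching": 0.3,
    "image_encryption": 9.7,
    "image_decryption": 13.8,
    "template_encryption": 0.06,
    "template_decryption": 0.1,
}

# ASSUMPTION: which steps run inside the card for each model.
ON_CARD_STEPS = {
    "match_on_card": ["template_decryption", "alignment_and_matching"],
    "system_on_card": ["acquisition", "feature_extraction",
                       "alignment_and_matching"],
}


def on_card_time(model):
    """Sum the timings of the steps assumed to run on the card."""
    return sum(TIMINGS_S[step] for step in ON_CARD_STEPS[model])


for model in ON_CARD_STEPS:
    print(f"{model}: {on_card_time(model):.2f} s on the card")
```

Under these assumptions the on-card workload is about 0.4 s for match-on-card and several seconds for system-on-card, dominated by the 7.5 s feature extraction, which is consistent with the authors' preference for match-on-card.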
Based on the proposed hardware architecture, composed of one single RISC processor and no parallel hardware coprocessors, the matching process and the storage of the cardholder templates can be embedded in the smart card while meeting real-time performance (0.3 s of processing time for matching, and about 0.1 s for sensitive data decryption). This proves the feasibility of real-time match-on-card systems under purely software embedded platforms.
In [Moon et al., 2009] the authors make use of the conclusions of the previous work to analyse the implementation of a fingerprint authentication service for large-scale scenarios such as healthcare information systems based on smart cards. It is possible to develop a client-server management model of a healthcare information system through biometric smart cards used at client and server stations over the Internet. The proposed system architecture is based on low-cost embedded processors for the smart cards, and powerful processors like those embedded in personal computer platforms for the client and the server workstations.
In [Gil et al., 2003] the authors deal with the implementation of match-on-card operations based on fingerprint biometrics under resource-constrained smart card systems (which feature limited resources such as one 50 MHz 32-bit ARM7TDMI RISC MPU, 64 kB ROM, 32 kB EEPROM and 8 kB RAM). The original version of the algorithm relies on minutia points, and when it is developed under a platform without memory restrictions (up to 400 kB of RAM available in the system) it can match two fingerprints in 0.3 s. That original algorithm is evaluated under a custom database composed of 400 fingerprints of size 248×292 pixels, and it exhibits a recognition accuracy of EER = 3.8%. In order to accommodate the application within the limited resources available in the smart card platform, the algorithm has been rearranged to minimize its RAM demands.
Based on experimental results, the authors claim to reduce the RAM needs to 6.8 kB in the presented work. The proposed matching algorithm can be successfully integrated into the smart card and is able to match two fingerprint minutiae sets in real time (0.9 s), but the achieved recognition accuracy is worse than in the original scenario: EER = 6.0% under the same database. The presented work is therefore one more example of the design compromises that exist in the deployment of such systems.
In [Moon et al., 2003] and [Moon et al., 2004] the authors address the development of a personal authentication application based on fingerprint biometrics under a resource-constrained USB token system. The embedded platform (composed of one 32-bit 206 MHz StrongARM CPU, FLASH and RAM memory, and one USB controller) is responsible for performing match-on-token operations. The verification algorithm used in this work is based on minutia points. Processing tasks such as fingerprint acquisition, image enhancement and minutiae extraction are carried out outside the token, in a host PC, during the enrolment and authentication stages. The more sensitive tasks, such as the storage of the template minutiae corresponding to the legitimate user and the alignment and matching of minutiae sets, are performed within the token. A memory-efficient fingerprint recognition algorithm is developed in order to minimize the RAM requirements of the USB token system. As a result of the optimization process, only 16 kB of volatile memory is demanded by the biometric application, at the expense of up to 80 MB of non-volatile memory to store the program code. The original implementation required 300 kB of RAM and 20 MB of FLASH.
Several fingerprint images of the legitimate user are acquired and processed in the enrolment stage, and one single fingerprint image of the query user is acquired, processed and compared with the genuine template in the authentication phase. The recognition accuracy reached with the suggested platform is EER = 1.7% when evaluated under a custom fingerprint database of 400 images of size 248×292 pixels with resolution 500 dpi captured with an optical scanner. Real-time performance in the matching process is reported by the authors in this work.
In [Hwang and Verbauwhede, 2004] a portable fingerprint-based personal authentication system is implemented in the form of a physical token device. The physical token features limited battery life, processing power and memory capacity. It embeds an Authentec AFS2 live-scan fingerprint sensor, a 32-bit RISC processor running at 50 MHz, volatile and non-volatile memory blocks, and a Bluetooth wireless transceiver, all in the form of a portable device under a Xilinx Virtex-II XC2C1000 FPGA-based platform. The design of the embedded system application is mainly focused on three factors (security, performance, and energy efficiency) in order to spread the usage of biometric products in the consumer electronics arena:
a) Regarding security aspects, some studies have been carried out to determine the proper partitioning of the application between the physical token and the server. In this sense, the safest solution is proven to be the one where the user template is stored in the physical token and does not leave the token. The matching process is performed internally in the token as well, unlike traditional systems where the template is stored in the server, or transferred to the server from the external token, and the matching is done in the server itself.
The fingerprint sensor used to acquire the input fingerprints is also embedded in the token, so the acquisition stage also takes place in the token device. In this way, some weak points are mitigated and the security of the system is increased, since the fingerprint acquisition, the feature extraction, the template storage and the matching processes are all implemented in the token. Moreover, the communication between the server and the token (match/non-match result) is encrypted to add further security to the system.
b) Concerning performance factors, the verification algorithm is based on fixed-point operators; the feature extraction algorithm is based on minutia points; and the matching stage deals with the correlation of the template and the input minutiae sets. The proposed fingerprint verification algorithm exhibits a recognition accuracy of FRR = 0.5% and FAR = 0.01% under a fingerprint database composed of greyscale images of size 256×256 pixels. The total latency of the recognition system, when executed purely by software under a LEON-2 processor at 50 MHz, is less than 10 s. The estimated RAM demands are in the range of 1 MB.
c) With regard to energy and power consumption, less than 5 J is estimated to be needed in the proposed system to match fingerprints.
In [Moon et al., 2005] the authors address the implementation of a fingerprint match-on-card application under a traditional smart card system equipped with an 8-bit 5 MHz Java processor and limited memory resources (512 bytes of RAM and 32 kB of EEPROM). The fingerprint features of the cardholder (template minutiae) are stored in the smart card, and the matching of the template and a query fingerprint is conducted inside the smart card, so the sensitive biometric information of the cardholder is never released from the smart card, which provides further security to the recognition process.
The fingerprint acquisition and minutiae extraction processes are performed externally in the smart card reader, and the minutiae sets of the cardholder and the query user are transferred to the smart card in the enrolment and authentication stages respectively. All data transfers are protected by 3DES data encryption mechanisms in the communication link between the smart card reader and the smart card. The matching of the legitimate cardholder and the query user minutiae sets, as well as the authentication decision, are carried out in the smart card. In order to speed up the matching process conducted in the smart card, the minutiae sets are prealigned in the smart card reader by expressing all minutia points in polar coordinates with regard to a reference point of the fingerprint. The reference point is the one with the maximum direction change in the ridge-valley pattern. Consequently, the minutiae descriptions are translation and rotation invariant, and fingerprint alignment need not be conducted in the smart card. To minimize the transfer time of minutiae sets from the smart card reader to the smart card, each minutia is coded in binary format and normalized within 7 bytes of data. The transfer time depends on the number of minutia points transferred for each fingerprint. Average transfer times of 0.7 s or 1.28 s are achieved when transferring templates of size 10-18 or 25 minutia points respectively. A point-to-point matching process is carried out in the smart card. The accuracy of the system is evaluated by means of a custom fingerprint database composed of 1149 fingerprints from 383 users. The recognition accuracy depends on the number of minutia points used in the matching process.
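The prealignment step described for [Moon et al., 2005] can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function names, the (x, y, theta) minutia representation and, in particular, the use of a reference orientation ref_angle attached to the reference point are assumptions made here, since the text only states that minutiae are referred to a reference point in polar coordinates.

```python
import math

TWO_PI = 2 * math.pi


def to_polar(minutiae, ref_xy, ref_angle):
    """Express (x, y, theta) minutiae relative to a reference point.

    Returns (radius, polar angle, relative direction) tuples.
    ASSUMPTION: the reference point carries an orientation ref_angle.
    """
    rx, ry = ref_xy
    out = []
    for x, y, theta in minutiae:
        dx, dy = x - rx, y - ry
        radius = math.hypot(dx, dy)
        phi = (math.atan2(dy, dx) - ref_angle) % TWO_PI
        rel_theta = (theta - ref_angle) % TWO_PI
        out.append((radius, phi, rel_theta))
    return out


def rigid_move(points, angle, shift):
    """Rotate (x, y, theta) tuples about the origin, then translate them."""
    c, s = math.cos(angle), math.sin(angle)
    tx, ty = shift
    return [(c * x - s * y + tx, s * x + c * y + ty, theta + angle)
            for x, y, theta in points]


# Two hypothetical minutiae and a hypothetical reference point (x, y, angle).
minutiae = [(120.0, 80.0, 0.5), (60.0, 150.0, 2.0)]
ref = (100.0, 100.0, 0.3)

# Simulate the same finger placed with a different position and rotation.
moved = rigid_move(minutiae + [ref], angle=0.4, shift=(15.0, -7.0))

original = to_polar(minutiae, ref[:2], ref[2])
displaced = to_polar(moved[:-1], moved[-1][:2], moved[-1][2])
print(original)
print(displaced)
```

Both print statements show the same descriptors up to rounding, which is why the card can compare minutiae point to point without running an alignment stage of its own.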
EER values of 9.6%, 7.9% and 7.2% are achieved when considering 6, 10 and 18 minutia points respectively in the matching stage. The matching execution time also depends on the number of minutia points used in the matching process. Average matching times of 1.1 s, 3.28 s, and 6.49 s are achieved when considering 10, 18 and 25 minutia points respectively for template and query fingerprints.
Another example is the mobile biometric system-on-token device presented in [Ribalda et al., 2010], which is used to sign digital transactions using fingerprints from portable devices like mobile phones. The system-on-token comprises three modules: (i) the system core, which is the main module in charge of controlling the whole system, (ii) the certification module, which handles the user’s private and public keys, and (iii) the biometric module, which performs the biometric authentication of the user.
Other biometric system-on-token platforms that address biometric traits other than fingerprints can be found in the SecurePhone research project (http://www.secure-phone.info/), funded partly by the European Commission’s Framework Program 6, where embedded biometric security was planned to be integrated into mobile phone/PDA devices to deal with electronic signatures for business transactions and administrative operations directly managed from the devices. Three modalities of personal recognition (face, voice and signature) are used, and the sensitive information is properly secured in a SIM module. Biometrics, cryptography and electronic signature technologies are merged to provide the proper security in those scenarios.

AFAS development under off-the-shelf hardware embedded systems
Other alternatives deal with the integration of already available commercial embedded modules into high-level applications.
In this direction, the work [Liu et al., 2010b] proposes the integration of the commercial fingerprint recognition module SM-62, developed by Miaxis Biometrics Co., in a real access control system application based on automatic door lock control. The commercial module embeds an optical fingerprint sensor, one DSP processor, FLASH memory, and one serial RS-232 communication link. It is able to conduct fingerprint enrolment, template storage, and fingerprint matching as a stand-alone electronic module. This module is integrated in an embedded system composed of a general-purpose microprocessor C8051F020, one LCD display, one pushbutton switch array, additional EEPROM memory, one temperature sensor, one clock chip, and one electronic lock to develop a time attendance and access control application. It is a good example of the integration of biometrics in real-world consumer electronic systems.
The work [Liu et al., 2010c] is another example of a high-level application based on biometrics. A fingerprint-based personal recognition system built around the M04 fingerprint identification module, developed by Changchun Hongda Company, is in charge of managing a deposit box where customers can deposit their bags before entering a supermarket and pick them up after shopping. The personal recognition module, which is based on one optical fingerprint scanner, is a subsystem within the complete application, and it clearly shows some of the benefits of deploying biometric products.
The work [Lei et al., 2010] covers the development of a fingerprint recognition system based on one 8-bit microcontroller SST89V564 (which embeds on-chip FLASH and RAM memories and acts as the master core of the application), an embedded fingerprint acquisition and recognition module ZFM-20 (based on one DSP processor and one optical fingerprint sensor), off-chip SRAM memory, and a set composed of one LCD display and one keyboard acting as the user interface. The fingers are detected when placed on the sensor surface, and a fingerprint impression is captured and shown on the system LCD. It is another example of the integration of commercial biometric modules in real-world applications.

AFAS development under commercial personal computer platforms
Based on the aforementioned works, a normal trend when integrating fingerprint biometrics in embedded system architectures is the optimization at algorithm, hardware and software levels. This has proven not to be the right approach for applications demanding high reliability in the authentication process, since all those measures focus on resource optimization only and do not try to improve the recognition accuracy of the algorithm itself. Therefore, although most of them are able to reduce the physical resources with regard to the original implementation of the recognition algorithm under more powerful processing platforms like personal computers, the recognition performance is not improved. On the contrary, some slight degradation in performance is normally obtained in such scenarios. To overcome any potential degradation in the recognition accuracy, many works deal with the implementation of AFAS or AFIS applications directly under HPC platforms instead of under resource-constrained embedded systems.
In this direction, in [Faundez-Zanuy, 2004] and [Faundez-Zanuy and Fabregas, 2005] the authors develop a fingerprint-based personal identification system on a standard personal computer platform to implement an access control system application for a small database (less than 100 users). The hardware of the system is mainly composed of a Pentium III processor running at 800 MHz, one fingerprint sensor based on optical technology, one keyboard, and one screen on which to display some messages to the user. The application software is developed in the Visual C++ programming language under Linux or Windows operating systems. The recognition accuracy exhibited by the system after nearly 10000 accesses is 0.4% of false rejections. The false acceptance ratio is not provided.
Other works such as [Marupudi et al., 2006] deal with the development of a fingerprint verification system fusing different techniques published in literature. The implementation is done by means of Matlab on a personal computer platform, and the recognition accuracy reached is in the range of EER = 25%, quite far from the targets demanded by high-security applications.
Similarly, in [Chen and Wang, 2003] the authors develop an automatic fingerprint identification system based on minutia points under a personal computer platform composed of a K6-400 MPU, 128 MB SDRAM, and the Windows 2000 operating system. The fingerprint matching stage does not reach real-time performance; it takes about 5-9 s on average under the proposed platform.
In [Huang et al., 2007] an example of an AFIS application based on a personal computer platform is presented. A classical minutiae pattern matching algorithm (covering image enhancement, binarization, thinning, field orientation map computation, and minutiae extraction and filtering processes) is implemented under a personal computer platform with an Athlon XP 2500 MPU, 512 MB SDRAM, and the Windows XP operating system.
The resultant matching processing time is 0.2 s when dealing with 500 dpi 8-bit greyscale images of size 300×300 pixels. A recognition rate in the range from 88% to 98% is achieved depending on the quality of the input images.
In [Gil et al., 2010] the implementation of an AFIS application aimed at supporting user recognition in distance education environments is presented. The verification of the students that access a building, a Web application, or one specific content of a course are some of the examples covered by the suggested system. The proposed fingerprint recognition algorithm is based on minutia points, and is executed purely by software under personal computer platforms in the Spanish UNED university.
In [Challita et al., 2010] the authors develop an AFIS application based on fingerprint biometrics to protect the access to computer systems and networks, as a complement to those Intrusion Detection Systems (IDS) traditionally based on user names and passwords. The evaluation of the identification system with a database of 100 users under a personal computer platform based on a 3 GHz dual core CPU shows a recognition accuracy of FAR = 0.0001% and FRR = 1%, an identification execution time of less than 2 s, and a matching rate of 700 fingerprints/s.

4.6.2. AFAS HW
This section covers those AFAS applications implemented purely by hardware by means of devices like ASICs or FPGAs that accommodate one or several dedicated biometric coprocessors in charge of one portion or the complete set of processing stages that take part in the recognition application. Table 46 shows some of the published works in this area of research.
Research  Embedded System Architecture (1)                               HW/SW    Execution Time   Recognition Accuracy
Work                                                                              Performance (2)  Performance (2)
[1]       Whole Recognition Process: ASIC (57 MHz)                       HW-only  0.3 s            EER = 2.34%
[2]       Whole Recognition Process: FPGA (Xilinx Virtex E2000, 90 MHz)  HW-only  0.102 s          FAR = 1.07%, FRR = 8.33%, EER < 5%
[3]       Whole Recognition Process: FPGA (Xilinx Virtex E2000)          HW-only  0.183 s          FAR = 1.07%, FRR = 8.33%
[4]       Whole Recognition Process: FPGA (Xilinx Virtex E2000)          HW-only  0.514 s          FAR1 = 1.52%, FRR1 = 9.64%; FAR2 = 0.66%, FRR2 = 6.13%
Note 1: The recognition algorithms, memory and processing resources used in each research work are different.
Note 2: Performance indicator values in each work are given for different databases.
[1]: [Nakajima et al., 2006] [2]: [Vitabile et al., 2005] [3]: [Vitabile et al., 2007] [4]: [Militello et al., 2011]
Table 46. Hardware-based AFAS applications disclosed in literature.

Biometric applications driven by ASIC devices
Many research works have been published dealing with the physical implementation of recognition applications with dedicated ASIC devices. Among them, the work [Nakajima et al., 2006] addresses the development of an ASIC device oriented to fingerprint recognition purposes for access control systems in residential applications. The recognition algorithm is based on pattern matching, specifically on Band-Limited Phase-Only Correlation (BLPOC) theory. An optimized version is suggested by the authors in order to improve the execution time of the whole system. The system is mainly composed of one pressure-based fingerprint sensor in charge of the acquisition stage, and one ASIC device responsible for the fingerprint image processing and matching computations. Pipeline and parallelism techniques have been applied in the design of the hardware processing units instantiated in the ASIC device.
The processing time of the algorithm, when executed by the ASIC device running at 57 MHz, is 0.3 s for the whole verification process. This is clearly faster than when the algorithm is executed by a personal computer platform based on a Pentium IV microprocessor operating at 3.06 GHz. For example, the processing of a fingerprint area of size 128×128 pixels takes 8.8 ms when executed by the ASIC device at 57 MHz, whereas the same processing takes 28 ms when executed by the personal computer platform. The recognition accuracy of the proposed algorithm is evaluated under a custom database composed of 120 fingerprint images from 12 different people. The accuracy obtained is EER = 2.34%.
In [Ranganathan and Venugopal, 1994] a fingerprint image matching processor is designed as a VLSI chip by means of parallelism and pipeline techniques. The algorithm implemented in the device is based on a moment-preserving pattern-matching technique. A template image of size k×k pixels is matched against a query image of size N×N pixels, with N ≥ k. For this purpose, the template is superimposed on the query image, and for each relative position, the similarity between the overlapped areas is computed. At the end of the processing, the relative position that yields the best similarity score is identified. The computation of the similarity measure for each relative template-query position implies k² operations, which are all executed in parallel in the device. The execution time of the processor is in the range of 1.476 ms to match query images of size 512×152 pixels with template images of size 128×128 pixels, or 0.339 ms in the case of matching query images of size 256×256 pixels with template images of size 32×32 pixels. No details about the matching accuracy exhibited by the system are given in this work.
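The exhaustive search described for [Ranganathan and Venugopal, 1994] can be illustrated with a short sequential sketch in Python. The similarity measure used here is a plain sum of absolute differences (SAD), chosen only for brevity; it is not the moment-preserving measure of the original work. The function names and the restriction to fully overlapping positions are likewise assumptions of this sketch.

```python
def best_match(query, template):
    """Slide the template over the query; return (row, col, sad) of the best fit.

    ASSUMPTION: sum of absolute differences stands in for the
    moment-preserving similarity measure of the reviewed work.
    """
    n_rows, n_cols = len(query), len(query[0])
    k_rows, k_cols = len(template), len(template[0])
    best = None
    for r in range(n_rows - k_rows + 1):
        for c in range(n_cols - k_cols + 1):
            # k*k pixel comparisons per position; these are the operations
            # that a parallel hardware design can evaluate simultaneously.
            sad = sum(abs(query[r + i][c + j] - template[i][j])
                      for i in range(k_rows) for j in range(k_cols))
            if best is None or sad < best[2]:
                best = (r, c, sad)
    return best


def full_overlap_positions(n, k):
    """Number of template positions fully inside an n x n query image."""
    return (n - k + 1) ** 2


# Toy 4x5 query with distinct pixel values and a 2x2 template cut from it.
query = [[row * 5 + col for col in range(5)] for row in range(4)]
template = [[7, 8], [12, 13]]
print(best_match(query, template))
print(full_overlap_positions(256, 32))
```

In software the cost grows with the number of positions times k², which is why executing the k² comparisons of each position in parallel, as the VLSI processor does, has such a large effect on the matching time.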
Some other good examples are provided in [Anderson et al., 1991], [Fujii et al., 2002], [Galy et al., 2007], [Kim et al., 2005], [Morimura et al., 2002], [Shigematsu and Morimura, 1999], [Shigematsu et al., 1999], and [Tiri et al., 2005].

Biometric applications driven by FPGA devices
Other works deal with the development of purely hardware solutions based on FPGAs. Among them, in [Aboalsamh, 2010] the authors develop a compact fingerprint-based authentication system composed of a Fingerprint Cards FPC1011F1 scanner together with a Fingerprint Cards FPC2020 biometric processor chip, FLASH memory for storage of the application program and the fingerprint templates, and one RFID circuit, which together make it possible to build a low-cost embedded system in the form of an electronic identification card (e-ID card) for personal access system applications. The developed system is clearly oriented to wireless applications that demand personal authentication based on biometrics. The biometric processor chip provides a set of instructions in charge of executing the Distinct Area Detection recognition algorithm on the acquired user’s fingerprints. The application program is stored in FLASH memory and is downloaded to the biometric processor chip after system power up. The authentication process is carried out by the biometric processor chip. The acquired fingerprint is compared against the cardholder’s fingerprint template (previously stored during the enrolment stage in the on-board FLASH). The results of the authentication stage are transmitted by the on-card RFID circuit to the external RFID reader, which grants access to restricted resources/areas only to biometrically authorized users.
The work [Vitabile et al., 2005] is one of the first personal recognition systems based on fingerprints that is developed purely by hardware, on a platform composed of the Hamster SecuGen fingerprint sensor and the Celoxica RC1000 evaluation board equipped with one Xilinx Virtex E2000 FPGA. The fingerprint recognition algorithm used in this work is based on minutia points. The processing stages of the algorithm are image normalization, image binarization, fingerprint thinning, minutiae extraction, and matching. Parallelization and pipeline techniques are used in order to speed up the processing. The FPGA runs at 90 MHz and the on-board RAM memory block at 22.5 MHz. The specific coprocessors operate at working frequencies in the range of 21 MHz. The execution time of the whole authentication process is 101.95 ms, split into normalization (12.5 ms), binarization (18.6 ms), thinning (39.8 ms), minutiae extraction (24.3 ms) and matching (6.75 ms). During the enrolment phase, up to four instances of the user fingerprint are acquired and processed. In the authentication stage, the on-line acquired fingerprint is compared against the four previously stored templates in parallel. The proposed system has been tested with a custom database made of 384 fingerprint impressions from 96 individuals. A recognition accuracy of FAR = 1.07%, FRR = 8.33%, and EER less than 5% is reached in the suggested platform.
Similarly, in [Vitabile et al., 2007] a fingerprint-based personal recognition system is embedded in a token device that integrates the fingerprint sensor and one FPGA where all the user authentication processing is carried out. The system is mainly composed of one PB100MC capacitive fingerprint sensor from Precise Biometrics and one RC1000 FPGA-based development board from Celoxica. The Virtex E2000 FPGA from Xilinx Inc. is used to host the application-specific processors in charge of the different processing stages involved in the recognition algorithm: image normalization, image binarization, image thinning, minutiae extraction, fingerprint matching and the AES encryption/decryption algorithms that secure the transfer of data among the different components of the system. Under this architecture, the complete fingerprint processing takes 183.32 ms. The recognition accuracy is tested by means of a custom database composed of 352 greyscale fingerprint images of size 200×200 pixels, all of them obtained from 88 different people and built with the system’s fingerprint scanner. The recognition accuracy achieved is FAR = 1.07% and FRR = 8.33%.
In [Bonato et al., 2003] the authors provide an example of the acceleration achieved with dedicated hardware design on FPGA devices. A given fingerprint minutiae extraction algorithm is executed on two different platforms: a Pentium II operating at 233 MHz, and one embedded system based on an Altera FLEX10KE FPGA (EPF10K50 device) running at 27.65 MHz. The processed images feature a fixed size of 256×256 pixels, and the processing covers image filtering, field orientation map computation, ridge detection and thinning, and minutiae extraction tasks. The acceleration is notable: 11850 ms versus 302.94 ms, which leads to a speed-up of ×39.1 when the design is based on the FPGA platform.
The work [Arjona et al., 2010] provides an efficient hardware implementation of a method to save the directional image deduced from a fingerprint impression. A fuzzy model is used to define the fingerprint orientation image in a compact form, thus allowing the reduction of the size of the templates in fingerprint recognition applications.
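The timing figures quoted above for [Vitabile et al., 2005] and [Bonato et al., 2003] can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers stated in the text; the identifier names are introduced here for illustration.

```python
# Per-stage execution times (ms) reported for [Vitabile et al., 2005].
STAGES_MS = {
    "normalization": 12.5,
    "binarization": 18.6,
    "thinning": 39.8,
    "minutiae_extraction": 24.3,
    "matching": 6.75,
}

total_ms = sum(STAGES_MS.values())
shares = {name: t / total_ms for name, t in STAGES_MS.items()}


def speed_up(baseline_ms, accelerated_ms):
    """Ratio between the baseline and the accelerated execution times."""
    return baseline_ms / accelerated_ms


print(f"total: {total_ms:.2f} ms")
for name, share in shares.items():
    print(f"  {name}: {100 * share:.1f}% of the total")

# Pentium II (11850 ms) versus FPGA (302.94 ms) in [Bonato et al., 2003].
print(f"speed-up: x{speed_up(11850, 302.94):.1f}")
```

The stage times add up to the reported 101.95 ms, with thinning as the largest single contributor, and the ratio of the two execution times in [Bonato et al., 2003] reproduces the reported speed-up of ×39.1.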
Two physical developments, under one ASIC and under one FPGA device, are presented.
The work [Militello et al., 2011] aims at developing a personal recognition application based on embedded system architectures to be integrated in HPC systems in order to replace conventional and less secure authentication systems based on “username & password” approaches for accessing networks. A full hardware implementation of the authentication system is suggested in order to mitigate the vulnerabilities linked to the security of the system and the integrity of the users’ biometric information. The proposed system adds to the networked workstation a biometric sensor (Precise Biometrics PB100MC sensor), a smart card reader (Biometrika FX2000), and one processing platform prototyped via an FPGA device (Celoxica RC1000 board with Xilinx Virtex E2000 FPGA) on which the authentication algorithm is implemented. A distributed biometric database based on smart cards is proposed as an alternative to the usage of centralized databases. One AES encryption/decryption algorithm is used to protect the user template from external attacks in the exchange of information between the smart card reader and the processing platform. Hardware coprocessors described in the Handel-C language and in charge of image preprocessing, minutiae extraction, AES encryption/decryption and fingerprint matching tasks are developed in the suggested FPGA. The experimental tests provide FAR-FRR performance pairs of 1.07%-8.33%, 0.66%-6.13% or 1.52%-9.64% under one fingerprint database with images of size 200×200 pixels and two databases with images of size 296×560 pixels, respectively. The execution time of the authentication process is 183.32 ms or 514.1 ms depending on the fingerprint image size to be processed.
In order to improve the execution time and the security of the whole system, the integration of all the components (fingerprint sensor, smart card reader and processing platform) in a single chip, either a system-on-chip or an ASIC device, is suggested. In this way, the fingerprints can be acquired and processed by the same hardware device, with no biometric information transmitted outside the processing element. Moreover, the use of computationally efficient hardware architectures guarantees a low processing time for the user authentication task, which avoids any temporal bottleneck when embedding the authentication system in HPC platforms.

4.6.3. AFAS HW/SW

Most of the published works that deal with the implementation of AFAS applications on embedded system platforms make use of hardware-software codesign techniques. The programmability of FPGAs makes them ideal devices for such applications, since it is possible to fit into the logic fabric not only general-purpose processors (soft-cores) but also dedicated hardware coprocessors tailored to the application demands. All those works take advantage of the flexibility of software development and the parallelism inherent to hardware design in order to speed up the processing. Many works focus on the implementation of one or a few specific hardware coprocessors in charge of the most critical (computationally expensive) stages, which improves the application execution time until acceptable targets are reached. Other works implement the AFAS application through the synthesis of as many hardware coprocessors as needed to optimize the execution time of the whole application as much as possible. Table 47 summarizes some of the AFAS applications based on hardware/software codesign techniques disclosed in the literature.
Research Work | Embedded System Architecture (1) | HW/SW | Execution Time Performance | Speed-up versus Other Solutions
[1] | Fingerprint Acquisition: FPGA (Xilinx Spartan 3, 12 MHz) | HW | 50.99 ms | ∼ ×5 (vs HPC)
[2] | Image Normalization: FPGA (Xilinx Spartan 3A XC3SD1800A, 100 MHz) | HW | 11.10 ms | ∼ ×18 (vs HPC)
[3] | Edge Detection: FPGA (Xilinx Virtex-E, 16 MHz) | HW | 4.20 ms | ∼ ×11 (vs HPC)
[4] | Image Processing (filtering/convolution): FPGA (Xilinx Virtex5 XC5VSX50T, 100/50 MHz) | HW/SW | − | ×12.9 to ×157.3 (vs Embedded System SW-only); ×0.5 to ×2.9 (vs HPC)
[5] | Image Processing & Feature Extraction: FPGA (Xilinx Spartan 3 XCS1500FG676, 50 MHz) | HW/SW | 262.0 ms | ×370 (vs Embedded System SW-only); ×11 (vs HPC)
[6] | Image Processing (thinning): FPGA (Xilinx Virtex II XC2VP30, 100 MHz) | HW | 1.997 ms | ×49 (vs HPC)
[7] | Image Processing (thinning): FPGA (Xilinx Spartan 3 XC3S1000) | HW | 18.0 ms | ×2.6 (vs HPC)
[8] | Image Processing (thinning): FPGA (Xilinx Spartan-II XC2S100, 40 MHz) | HW | 70.0 ms | ×2 (vs HPC)
[9] | Image Processing & Feature Extraction: FPGA (Xilinx Spartan 3 XC3S200, 40 MHz) | HW/SW | 987.8 ms | ×15.5 (vs Embedded System SW-only); ×0.67 (vs HPC)
[10] | Image Processing & Feature Extraction: SOC/FPGA (SPEAR2) | HW/SW | 800.0 ms | ×408 (vs Embedded System SW-only)
[11] | Whole Recognition Process: SOC (MPU + FPGA, 33 MHz) | HW/SW | 5.431 s | ×10.5 (vs Embedded System SW-only); ×1.5 (vs HPC)
[12] | Feature Alignment & Matching: FPGA (Xilinx Virtex-II XC2V2000) | HW | 190 ms | ×4.7 (vs Embedded System SW-only)
[13] | Feature Alignment & Matching: FPGA (Xilinx Virtex4 XC4VSX55) | HW | 14.0 ms | ×31 (vs HPC)
[14] | Feature Matching: FPGA (Altera Stratix II EPS2S180F1020C4, 90 MHz) | HW | 0.590 ms | ×6.5 (vs HPC)
[15] | Feature Matching: FPGA (Altera Stratix II EPS2S60, 100 MHz) | HW/SW | 0.655 ms | ×25 (vs HPC)
[16] | Feature Matching: FPGA (Xilinx Virtex-E, 65 MHz) | HW/SW | 0.820 | ×45 (vs HPC)
[17] | Feature Matching: FPGA (Xilinx Virtex-4 XC4SX55-11, 250-400 MHz) | HW | − | ×156 to ×372 (vs HPC)
[18] | Feature Matching: FPGA (Xilinx Virtex4 XC4VLX22, 81 MHz) | HW | − | ×23.8 and ×25.4 (vs HPC)
[19] | Feature Matching: FPGA (Xilinx XC3S500E) | HW/SW | 0.5 s | ×4 (vs Embedded System SW-only)
[20] | Image Processing (thinning): HPC + FPGA (Xilinx Virtex 2 XC2V6000) | HW/SW | 1.1 s | ×54 to ×81 (vs HPC)
[21] | Feature Extraction: FPGA (Xilinx Spartan 3 XC3S-1500-4, 40 MHz) | HW/SW | 3.36 s | ×46 (vs Embedded System SW-only)
[22] | Feature Matching: FPGA (Altera Stratix II EP2S60F672C3, 100 MHz) | HW/SW | 0.660 ms | ×5.3 to ×26.2 (vs HPC)
Note 1: The recognition algorithms, memory and processing resources used in each research work are different.
[1]: [Sudiro et al., 2008]; [2]: [Martell and Abe, 2009]; [3]: [Venkatesan and Rao, 2001]; [4]: [Lindoso et al., 2008]; [5]: [López and Cantó, 2006]; [6]: [Xu et al., 2009]; [7]: [Hermanto et al., 2010]; [8]: [Hsiao et al., 2006]
[9]: [López and Cantó, 2008]; [10]: [Hepp et al., 2008]; [11]: [Chao et al., 2005]; [12]: [Pan et al., 2008]; [13]: [Lindoso et al., 2007c]; [14]: [Danese et al., 2007]; [15]: [Danese et al., 2009]; [16]: [Jiang and Crookes, 2008]
[17]: [Lindoso and Entrena, 2007]; [18]: [Lindoso et al., 2007b]; [19]: [Lim et al., 2010]; [20]: [López et al., 2005]; [21]: [Barrenechea et al., 2009]; [22]: [Danese et al., 2011]
Table 47. Hardware/software-based AFAS applications disclosed in literature.
In [Sudiro et al., 2008] the authors implement a dedicated hardware controller on a Xilinx Spartan 3 FPGA device in charge of reading fingerprint images from the Fujitsu MBF200 capacitive sensor. According to the authors, fingerprint images of size 256×300 pixels are acquired in 50.99 ms when the controller operates at 12 MHz, much faster than other existing solutions that scan fingerprint images with the same sensor through USB interfaces and feature acquisition times of at least 250 ms.
In [Martell and Abe, 2009] the authors present the development of a hardware processor in charge of the normalization of fingerprint images, making use of computational operations such as square root or division functions with fixed-point operands. The normalization processor handles 8-bit greyscale images of size 256×256 pixels. The processor computes the average and the variance of the greyscale level of the image pixels in order to perform the normalization. It is implemented on the Spartan 3A XC3SD1800A FPGA device from Xilinx Inc., which can operate at a maximum frequency of 100 MHz. The same implementation is carried out with Matlab on a personal computer platform based on an AMD Athlon Dual Core processor operating at 2.5 GHz. Very similar normalization results are achieved in both scenarios with quite different execution times: an average execution time of 11.1 ms for the FPGA implementation versus 206 ms for the personal computer platform.
The work [Venkatesan and Rao, 2001] addresses the implementation of a dedicated hardware processor in charge of the edge detection of greyscale images on a Xilinx Virtex-E device. The hardware processor operates at a frequency of 16 MHz, and it is developed by means of parallelism and pipelining techniques. The edge detector processor takes 4.2 ms to process greyscale images of size 256×256 pixels. By comparison, the same edge detection algorithm implemented in Visual C++ on a personal computer platform based on a Pentium III processor running at 1.3 GHz takes 47 ms, about one order of magnitude slower.
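The mean/variance normalization mentioned for [Martell and Abe, 2009] can be illustrated with a minimal software sketch. The formula below is the commonly used mean/variance normalization and is an assumption here, since the text does not give the exact expression implemented by the authors; the function name `normalize` and the target values `target_mean` and `target_var` are hypothetical, and floating-point arithmetic replaces the fixed-point operands of the hardware processor.

```python
import math

def normalize(image, target_mean=100.0, target_var=100.0):
    """Mean/variance normalization of a greyscale image given as a list of rows.

    Assumed formula (not taken from the cited work): each pixel is moved to
    target_mean plus or minus its deviation rescaled to target_var.
    """
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    if var == 0:
        # Flat image: every pixel maps to the target mean.
        return [[target_mean for _ in row] for row in image]
    out = []
    for row in image:
        new_row = []
        for p in row:
            # Rescaled deviation: this is where the division and the
            # square root mentioned in the text are needed.
            dev = math.sqrt(target_var * (p - mean) ** 2 / var)
            new_row.append(target_mean + dev if p > mean else target_mean - dev)
        out.append(new_row)
    return out
```

After the call, the output image has the requested mean and variance regardless of the contrast of the input, which is the purpose of the normalization stage.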
Similarly, in [Houari et al., 2010] the authors study the physical implementation of several real-time edge detector processors (based on Sobel, Prewitt or Canny operators) by means of hardware-software codesign techniques on the Virtex-II XC2V1000 FPGA from Xilinx Inc. The software processing is implemented in Visual C++, and the hardware processing is described in Handel-C. The resultant system is able to identify the edges of the images in real time, processing an image of size 720×576 pixels in 8.64 ms when the dedicated hardware cores operate at 47.96 MHz.
In [Neji et al., 2011] the implementation of a hardware CORDIC processor involved in the Fast Fourier Transform computations of a fingerprint recognition system is presented. The CORDIC processor performs 14 iterations to get accurate fixed-point results in the computation of sine, cosine, exponential and arctangent operations. The processor is implemented on the Stratix III P3SL150F1152C3 FPGA device from Altera Corp., and it is able to operate at a maximum frequency of 250 MHz.
The reference [Sagar et al., 1995] is one of the first published works that aims at accelerating the processing of a personal fingerprint recognition system through hardware-software codesign techniques. Acceleration factors of one or two orders of magnitude are achieved in tasks such as image binarization or image thinning when comparing the purely software implementations on a personal computer platform with the hardware-software implementations of the same processing algorithms on a platform based on a personal computer processor plus some hardware coprocessors instantiated in one companion FPGA device.
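To give an idea of what a CORDIC processor such as the one in [Neji et al., 2011] computes, the following is a minimal sketch of the rotation mode of the algorithm for sine and cosine with the 14 iterations quoted above. It is an illustration under stated assumptions, not the authors' design: it uses floating-point numbers instead of fixed-point operands, covers only sine and cosine, and the names `cordic_sin_cos`, `ANGLES` and `GAIN` are hypothetical.

```python
import math

ITERATIONS = 14  # iteration count quoted in the text

# Elementary rotation angles atan(2^-i), stored in a small table in hardware.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Constant that compensates the growth of the vector length over all rotations.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(theta):
    """Return (sin, cos) of theta in radians, for |theta| <= pi/2."""
    x, y, z = GAIN, 0.0, theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0   # rotate towards a zero residual angle
        shift = 2.0 ** -i             # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x
```

Each iteration needs only shifts, additions and one table look-up, which is why the algorithm maps well to FPGA logic. With 14 iterations the residual angle is bounded by atan(2^-13), so the results agree with the exact values to roughly four decimal digits.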
In [Lindoso et al., 2008] the authors address the development of an embedded system in charge of specific image processing tasks (filtering or convolution) implemented by means of one general-purpose soft-core processor and one multi-purpose hardware coprocessor, both cores instantiated in one system-on-chip device. The proposed topology is based on a 32-bit MicroBlaze processor operating at 100 MHz and one configurable coprocessor running at 50 MHz, all embedded in one Xilinx Virtex5 XC5VSX50T FPGA device. The soft-core processor controls the application flow, and the multi-purpose coprocessor is in charge of the computationally intensive image processing tasks. DMA accesses are used in the transfers of input and output data to and from the hardware coprocessor. The multi-purpose coprocessor can be configured by the soft-core processor at any time. Coarse-grain configurability is achieved by means of four 32-bit registers driven by the soft-core processor. In this way, the soft-core is able to configure in a few clock cycles the specific processing task to be executed at each moment by the coprocessor. The embedded system implementation based on hardware-software codesign techniques speeds up the processing by ×12.9 to ×157.3 when compared against the purely software implementation of the same tasks on the MicroBlaze processor alone in the embedded system platform, and by ×0.5 to ×2.9 when compared against the purely software execution of the same processing algorithms on a personal computer platform based on a Pentium IV MPU running at 3.2 GHz.
In [Alibeigi et al., 2007] authors develop a real-time hardware processor in charge of computing the orientation of fingerprint images through pipeline techniques under a Xilinx Virtex 4 XC4LX200 FPGA device. The fingerprint images are tessellated into non-overlapped blocks of size 16×16 pixels, and for each pixel in one block, the orientation is estimated based on 16 possible reference directions. Once the orientation for each of the pixels of one block is known, the dominant orientation of the block is computed. It is proven that the hardware processor permits to speed up the processing when compared against the purely software implementation of the same fingerprint field orientation extraction algorithm. In [Liu et al., 2010a] authors address the design of a pipelined hardware coprocessor in charge of the enhancement of fingerprint images through Gabor filtering. After deducing the local frequency and ridge orientation of each pixel of the image, specific Gabor filters adjusted with those local features are built to process the fingerprint impressions. As a result of the processing, a clearer bitmap with more defined ridge-valley structures can be obtained. The proposed system architecture consists of one general-purpose processor, SDRAM memory, and one application-specific Gabor coprocessor. The proposed hardware coprocessor is fully configurable: it permits to process images of different sizes, up to 1023×1023 pixels, using programmable kernels of size 3×3, 5×5, 7×7, 9×9 or 11×11 pixels. It embeds two CORDIC processors implemented in hardware, and one AMBA bus controller in order to establish a communication link between the hardware coprocessor and the rest of system resources – microprocessor and SDRAM memory–. The rest of processing stages involved in the personal recognition application are performed purely by software under the action of the general-purpose processor. 
In terms of execution performance, the hardware coprocessor is able to achieve a throughput of 2 Mpixels/s when dealing with a convolution kernel of size 11×11 and operating at 250 MHz. It can enhance a fingerprint image of size 256×256 pixels in 31.7 ms using a total amount of logic resources of about 63.8K equivalent gates.
Similarly, the work [Idros et al., 2010] deals with the implementation of a dedicated hardware processor, described in Verilog HDL, in charge of fingerprint image enhancement through Gabor filtering operations. The hardware coprocessor is instantiated in a Xilinx Spartan 3A FPGA, and it operates at 25 MHz. Two different implementations are carried out in this work:
(i) The first implementation tries to speed up the processing at the expense of hardware resources. It performs the enhancement operation through the convolution of the image with Gabor filters of size 3×3 pixels by means of many multiplication-accumulation (MAC) units running in parallel. In such a scenario, the enhancement process of one image takes 127 clock cycles, and the controller needs 5759 FPGA slices.
(ii) In order to reduce the amount of hardware resources needed to instantiate the hardware processor, a second scenario is defined where the parallel MAC units are replaced by serial MAC units. The same image convolution process with a Gabor filter of size 3×3 is carried out in 222 clock cycles in this new scenario, longer than in the first design. However, the amount of FPGA slices used to instantiate the controller is significantly reduced: only 1625 slices are needed in the new design.
Other examples of fingerprint image enhancement by means of dedicated Gabor filter processors instantiated in FPGA devices are [Rosshidi and Hadi, 2009] and [Razak and Taharim, 2009].
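The Gabor-based enhancement discussed above boils down to building an oriented kernel and applying multiply-accumulate (MAC) operations around each pixel. The sketch below illustrates this under stated assumptions: the even-symmetric Gabor expression with an isotropic Gaussian envelope is a common textbook form and is not taken from the cited works, and the names `gabor_kernel`, `mac_at`, `freq` and `sigma` are hypothetical.

```python
import math

def gabor_kernel(size, theta, freq, sigma):
    """Even-symmetric Gabor kernel of odd size (e.g. 3, 5, ..., 11).

    theta: local ridge orientation in radians; freq: local ridge frequency
    in cycles per pixel; sigma: spread of the Gaussian envelope (assumed
    isotropic in this sketch).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so that xr is the axis of the cosine wave.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-0.5 * (xr * xr + yr * yr) / (sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * freq * xr))
        kernel.append(row)
    return kernel

def mac_at(image, kernel, cy, cx):
    """Serial multiply-accumulate of the kernel centred on pixel (cy, cx).

    The kernel is point-symmetric, so this correlation equals a convolution.
    The caller must keep (cy, cx) far enough from the image border.
    """
    half = len(kernel) // 2
    acc = 0.0
    for ky, krow in enumerate(kernel):
        for kx, coeff in enumerate(krow):
            acc += coeff * image[cy + ky - half][cx + kx - half]
    return acc
```

For a 3×3 kernel, `mac_at` performs nine multiply-accumulate steps one after another, as a serial MAC unit would; a parallel design evaluates those products concurrently, trading FPGA slices for clock cycles.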
In [López and Cantó, 2006] the authors address the implementation of a fingerprint ridge extraction algorithm by means of hardware-software codesign techniques on the Spartan 3 XCS1500FG676 FPGA device from Xilinx Inc. This research work deals with 8-bit greyscale images of size 256×256 pixels, and it performs the computation of the field orientation map of the input fingerprints, as well as their binarization by Gabor convolution. The input images to be processed are read from off-chip SRAM memory, and the resultant enhanced images are written back to the off-chip SRAM memory. The system architecture consists of a soft-core MicroBlaze processor operating at 50 MHz and one application-specific hardware coprocessor also running at 50 MHz, both instantiated in the FPGA. The average execution time of the processing is 262 ms, much less than the 97 s needed by the soft-core MicroBlaze when it runs the same application alone at 50 MHz in the FPGA, and also less than the 2930 ms needed by an HPC platform based on one Intel Centrino processor operating at 1.7 GHz. Speed-ups of ×370 and ×11 respectively are therefore demonstrated in this work with the suggested system architecture based on hardware-software codesign.
The work [Kheiri et al., 2005] focuses on the design and implementation of a hardware coprocessor in charge of the fingerprint image binarization and thinning processes of a personal recognition system. The fingerprint image is divided into blocks of size 16×16 pixels with an overlapped pixel between adjacent blocks, and the mean value of the pixels in a block is used as the threshold for binarization purposes.
After binarization, one morphological dilation process with a kernel of size 2×2 is applied to the resultant image to filter any residual noise that may exist. Next, one parallel thinning algorithm is implemented to get the skeleton of the binarized image. A parallel and pipelined processor is responsible for all those tasks. It is able to achieve real-time performances: the execution time of the whole processing is reduced to 1.45ms when dealing with images of size 512×512 pixels. The hardware coprocessor is synthesized in a Xilinx Virtex II 2V8000 FPGA. The work [Xu et al., 2009] deals with the physical implementation of a hardware processor in charge of the thinning process of previously binarized fingerprint images of size 256×256 pixels. The processor is implemented under a Virtex II XC2VP30 FPGA from Xilinx Inc. The experimental results indicate that the solution based on the hardware processor running at 100 MHz can reduce the thinning time by one order of magnitude when compared against the purely software implementation of the same algorithm under a personal computer platform based on one 2 GHz Pentium IV processor and C programming language. The thinning process takes 1.997 ms in average when executed by the hardware processor, and 97.6 ms when executed by the computer workstation, so one speed-up of ×49 is achieved in this work. In [Hermanto et al., 2010] authors develop a fingerprint image thinning hardware processor under a Xilinx Spartan 3 XC3S1000 FPGA device. One acceleration factor of about ×2.6 is achieved when comparing the FPGA-based hardware implementation (execution time performance of 18 ms) with the purely software implementation (using Visual C++ programming language) of the same thinning algorithm under a personal computer platform (execution time performance of 47 ms). In [Hsiao et al., 2004] and [Hsiao et al., 2006] authors develop a hardware thinning processor suitable to be integrated in an embedded fingerprint recognition system. 
The thinning processor is implemented by means of parallelism and pipelining techniques on a Xilinx Spartan-II XC2S100 FPGA device, and it is able to thin an image of size 512×512 pixels in 0.07 s at 40 MHz, which results in a speed-up of about ×2 when compared with the purely software implementation of the same thinning procedure on a personal computer workstation based on a Pentium III processor operating at 800 MHz, or a speed-up of at least ×40 if the operating frequencies of both scenarios are normalized. Other thinning processors are developed in the works [Kim et al., 2007] and [Kim et al., 2008].
The works [Pan et al., 2006] and [Pan et al., 2007] deal with the implementation, by means of hardware-software codesign techniques, of the minutiae extraction stage of a fingerprint authentication system. An architecture based on a system-on-chip device embedding one field programmable logic array and one general-purpose microprocessor is suggested in these works. A modified ridge following algorithm derived from the original algorithm [Maio and Maltoni, 1997] has been developed in order to extract minutia points directly from greyscale images. The resultant embedded system is composed of a 32-bit ARM940T microprocessor and a Xilinx Virtex V2000EFG680 FPGA. The platform operates at 50 MHz, and is in charge of processing fingerprint images of size 248×292 pixels. A total of 20 to 50 minutiae can be extracted from the input images in 2 to 4 s.
The proposed system can be integrated in a smart card in order to increase the security of the authentication process, since the feature extraction stage (and, although not covered in this work, also the feature matching stage) can be ported to the smart card. This avoids exposing the sensitive biometric template information outside the smart card, in the smart card reader, where it would be more vulnerable to external attacks.
In [Alibeigi et al., 2009] the authors present a hardware coprocessor in charge of extracting minutia points directly from fingerprint binary images, without the need to apply a thinning process to the binary images. The implementation is done by means of pipeline techniques on a Virtex 4 XC4VLX200 FPGA from Xilinx Inc.
In [López and Cantó, 2008] the authors address the physical implementation of all the processing stages that take place in a classical fingerprint enrolment process based on minutia points, that is, normalization, segmentation, ridge extraction, image thinning, minutiae extraction and minutiae filtering. The application deals with 8-bit greyscale images of size 256×256 pixels. The suggested enrolment algorithm is first implemented purely by software on a personal computer platform based on an Intel Centrino MPU running at 1.7 GHz. The execution time of the system is 668 ms on average. Next, the application is ported to an embedded system based on one Spartan 3 XC3S200 FPGA from Xilinx Inc. and off-chip SRAM memory. A purely software implementation of the same algorithm is done by instantiating the soft-core processor MicroBlaze in the FPGA. The input images to be processed are saved in off-chip SRAM memory and the resultant images are written back to the same off-chip SRAM memory after the processing. When the processor operates at 40 MHz in the proposed platform, the enrolment process takes 15336 ms, which does not meet real-time performance.
Therefore, the fingerprint enrolment process is implemented in a second step by means of hardware-software codesign techniques in the embedded system platform. The suggested hardware architecture consists of the soft-core MicroBlaze operating at 40 MHz, and other three application-specific hardware coprocessors operating at 40 MHz and in charge of the segmentation, ridge extraction and thinning stages of the application. An outstanding execution time performance is achieved in the new system: 987.8 ms, which is slightly higher than the execution time performance exhibited by the HPC platform but leads to a speed up performance of ×15.5 when compared against the initial purely software implementation under the embedded system platform. In [Hepp et al., 2008] further investigations are carried out dealing with the search of the best hardware-software partitioning to be applied to a fingerprint recognition algorithm (initially written to be executed purely by software under a personal computer platform) when the proposed algorithm is ported to an embedded system based on SPEAR2 device. SPEAR2 refers to Scalable Processor for Embedded Applications in Real-time environments. SPEAR2 is a VHDL open-source description of a limited-performance 16-bit instruction/32-bit data path processor, equipped with registers, instruction and data memory. The partitioning of the application is mainly oriented to reach the following performances: - short execution time, - reduced FPGA resource usage, - low power consumption, - and limited memory usage, when dealing with the image enhancement and feature extraction stages (it is not considered the matching stage) of a minutia-based recognition algorithm. 
A first purely software implementation of the recognition algorithm on the suggested platform, followed by optimizations at hardware and software levels, makes it possible to improve the aforementioned features: the code size is reduced by a factor of about ×3 (the initial code composed of 29000 assembler instructions is reduced to 8757), and a speed-up of ×408 is reached (initial execution time of 326.4 s versus final execution time of 0.8 s), at the cost of an additional hardware resource usage of 182 registers and 262 ALUs with regard to the initial purely software-based embedded system platform. As in the previous reference works, this is another good example of the main advantages of hardware-software codesign techniques in resource-constrained embedded systems.
In [Chao et al., 2005] the authors address the implementation of a fingerprint authentication system on an embedded system platform composed of a system-on-chip device from Altera Corporation and the Fingertip sensor from Infineon Technologies. The authentication algorithm is based on singular points (core/delta) and minutia points (ridge endings/bifurcations), and it is implemented by means of hardware-software codesign techniques. A Nios soft-core processor is instantiated in the system-on-chip device, where it acts as master of the application and operates at 33 MHz. Additional hardware coprocessors acting as companion processing units are instantiated in the device to speed up the time-consuming processing tasks of the algorithm, such as the directional Gabor filtering. The recognition accuracy of the presented system is evaluated by means of a custom database composed of 1000 fingerprints captured from 200 different fingers. The achieved accuracy is FAR = 0.092% and FRR = 2.75%.
The execution time of the proposed algorithm is evaluated on different platforms: (i) on a personal computer workstation based on an Intel 80486 MPU operating at 50 MHz, the authentication execution time is 8.34887 s; (ii) on the proposed embedded system platform, the execution time when the algorithm is executed purely by software on the Nios processor alone is 57.23 s; (iii) and this execution time is greatly improved when applying hardware-software codesign techniques on the same computational platform: an execution time of 5.431 s is reached when parallel hardware coprocessors are added to the system. It is therefore a new example of the temporal improvements that can be achieved when implementing biometric applications on embedded system platforms by means of hardware-software codesign techniques.
In [Chung et al., 2004] and [Pan et al., 2008] the authors develop a match-on-card hardware processor on a Xilinx Virtex-II XC2V2000 device in charge of matching fingerprint images by means of the alignment and matching of minutiae sets. The rest of the processing stages involved in any biometric personal recognition system (image preconditioning and feature extraction phases) are performed outside the smart card, in the smart card reader, owing to the very limited computational resources available in the smart card. The proposed hardware solution, when added to the smart card system, achieves real-time performance in the matching process (190 ms), as well as accurate recognition results (EER = 3.8%) when evaluated on a fingerprint database. However, the purely software and memory-efficient implementation of the same algorithm in a state-of-the-art smart card (with limited resources: 32-bit 50 MHz CPU, ROM, EEPROM, and 6.8 kB RAM) takes much more time (900 ms) and performs worse (EER = 6%) when evaluated on the same database.
Both solutions are feasible in terms of resource constraints, but the hardware-based one is proven to be more efficient in terms of execution time and recognition accuracy.
In [Lindoso et al., 2007a] and [Lindoso et al., 2007c] the authors develop a fingerprint matching algorithm based on (i) a coarse alignment step (which considers translation and rotation) focused on the correlation of fingerprint orientation fields, and (ii) a matching stage based on the correlation (Zero Mean Normalized Cross Correlation method) of three selected regions (greyscale image portions of size 50×50 pixels) of the input fingerprint with three corresponding regions (greyscale image portions of size 100×100 pixels) of the template fingerprint. Some enhancement techniques based on normalization, low-frequency filtering, directional Gabor filtering and equalization are applied to the fingerprints to extract the field orientation maps used in the alignment stage, and the enhanced version of the greyscale images used in the matching process. The algorithm is evaluated on the FVC2002 DB2 public database, and an accuracy of EER = 8% is achieved. The suggested algorithm is implemented on different processing platforms:
a) When executed on a personal computer platform based on one Pentium IV 3 GHz processor, the application takes 1.15 s in total. This figure covers the input image preprocessing tasks and the matching of the input image with the template. The correlation task for alignment specifically takes 109 ms, and the correlation for matching 320 ms.
b) When the same algorithm is implemented by means of hardware-software codesign techniques on a Xilinx Virtex4 XC4VSX55 FPGA device, the tasks ported to hardware are notably accelerated.
The hardware coprocessors in charge of the correlation reach a speed-up of about ×31. The correlation for alignment takes 1 ms, the correlation for matching 8 ms, and the additional data transfer operations result in a final execution time of 14 ms, much less than the execution time (109 ms + 320 ms = 429 ms) of the same processing stages in the HPC platform.
In [Danese et al., 2007] the authors address the physical implementation of a fingerprint matching stage based on Phase Only Correlation (POC) techniques on one Altera Stratix II EPS2S180F1020C4 FPGA device. The application deals with 8-bit greyscale images of size 128×64 pixels. The execution time of the suggested algorithm on one AMD Athlon 64 processor running at 2 GHz is 3856 µs, whereas the same algorithm executed by a custom hardware coprocessor on the selected FPGA running at 90 MHz takes 590 µs, which leads to a relative speed-up of ×6.5.
Similarly, in [Danese et al., 2009] the authors study the physical implementation of a matching stage based on POC techniques on one Altera Stratix II EPS2S60 FPGA device. The application deals with 8-bit greyscale images of size 256×256 pixels in this work, and the reported accuracy of the suggested algorithm is EER = 7.7%. The execution time of the suggested algorithm, when executed purely by software on a processing platform based on one Pentium IV MPU running at 2.4 GHz, is about 16375 µs. The execution time achieved with hardware-software codesign techniques in the FPGA running at 100 MHz is 655 µs. It corresponds to a relative speed-up of ×25, and it is achieved on a processing platform based on the soft-core Nios II processor and one hardware instance of a dedicated coprocessor in charge of the 2-D discrete Fourier transformations involved in the processing.
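The speed-up factors quoted for these matching works follow directly from the ratio of the reference execution time to the accelerated one. The short sketch below reproduces that arithmetic with the figures given in the text; the helper name `speed_up` and the dictionary labels are hypothetical.

```python
def speed_up(reference_time, accelerated_time):
    """Ratio of reference to accelerated execution time (same units)."""
    if accelerated_time <= 0:
        raise ValueError("accelerated_time must be positive")
    return reference_time / accelerated_time

# Execution times quoted in the text above (reference, accelerated).
cases = {
    "[Lindoso et al., 2007c] correlation stages, ms": (109 + 320, 14),
    "[Danese et al., 2007] POC matching, microseconds": (3856, 590),
    "[Danese et al., 2009] POC matching, microseconds": (16375, 655),
}
for label, (ref, acc) in cases.items():
    print(f"{label}: x{speed_up(ref, acc):.1f}")
```

The three ratios come out at roughly ×30.6, ×6.5 and ×25.0, in line with the ×31, ×6.5 and ×25 reported by the respective authors.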
Another implementation of a fingerprint matcher system based on POC methodology is covered in [Tulabandhula et al., 2009]. The processing platform suggested in this work is composed of one software block (based on the MicroBlaze soft-core processor) and one hardware block (based on several hardware coprocessors in charge of Fast Fourier Transform computations), all implemented in an FPGA device. Different matcher system solutions are implemented, each with a different hardware-software partitioning of the application. On the one hand, the hardware coprocessors perform 1-D or 2-D (direct and inverse) Fast Fourier Transform operations, since these are the most compute-intensive tasks involved in the processing, and thus accelerate the whole matching stage. On the other hand, the soft-core processor computes the general-purpose operations and controls the communication with an external host. The system is built on a Xilinx Virtex 4 XC4VSX55 FPGA. MicroBlaze is provided with 32 kB of internal memory and operates at 100 MHz in the suggested platform. Different implementations of the hardware coprocessors operate at frequencies in the range from 86 MHz to 102 MHz. Considering images of size 64×64 pixels, the fingerprint matching is accomplished in 25 ms.

In [Kannavara and Bourbakis, 2009] the authors develop a fingerprint matching stage on an FPGA device. The proposed matching algorithm uses the Local-Global graph methodology, which combines the minutiae features and the ridge information to determine the uniqueness of fingerprints. The Local-Global graph method embeds both the local information (i.e. region descriptions) and the global information (i.e. the topology of the image and the relationship among the different regions) of the image for the purpose of authentication.
A database of synthetically generated 8-bit greyscale fingerprints of size 240×320 pixels is used to evaluate the recognition accuracy of the suggested matching algorithm. The most compute-intensive parts of the matching algorithm are implemented in hardware in a Xilinx Virtex 5 XC5VLX30 FPGA device.

In [Jiang and Crookes, 2008] the authors develop, by means of hardware-software codesign techniques, a dedicated processor responsible for fingerprint matching in biometric identification applications over large databases. The processor is instantiated in a Xilinx Virtex-E FPGA device, and it is in charge of minutia matching operations running at an operating frequency of 65 MHz. It is able to match 1 million fingerprints in 820 ms, which is up to 45 times faster than a personal computer platform. When the same matching application is written in C++ under Microsoft Visual Studio, the same processing takes 38.165 s when executed on a personal computer platform based on an Intel Celeron MPU running at 2.8 GHz.

The works [Lindoso et al., 2005] and [Lindoso and Entrena, 2007] deal with the acceleration of a correlation-based fingerprint matching algorithm that relies on Zero-Mean Normalized Cross-Correlation techniques by means of its hardware implementation on an FPGA device. Those works aim at implementing the matching algorithm in both spatial and spectral domains based on application-specific coprocessors running at frequencies in the range of 250-400 MHz in the Xilinx Virtex-4 XC4VSX55-11 FPGA. Acceleration factors of two orders of magnitude, from ×156 up to ×372, are reached when compared with the software implementation of the same algorithm on a personal computer platform based on a Pentium IV microprocessor running at 3 GHz.
This speed-up makes it possible to reach real-time image processing performance (execution times in the range of tens of ms, always below 160 ms) when dealing with greyscale images of size 256×256 pixels and correlation regions of size 12×12, 16×16, 20×20, 32×32 or 50×50 pixels. Given a target matching time for the fingerprint verification application, the development of dedicated hardware coprocessors in FPGAs proves to speed up the processing without losing accuracy with regard to the software implementation of the same algorithm. Therefore, it is possible (i) to incorporate additional processing stages into the original algorithm, or (ii) to develop more complex algorithms able to improve the authentication accuracy of the whole recognition system within the target matching time when further hardware coprocessors are added to the system architecture. No recognition accuracy results are provided in these works.

Other implementations of matching processors for fingerprint recognition applications can be found in [Chung et al., 2005] and [Lindoso et al., 2007b]. In [Lindoso et al., 2007b] the authors address the acceleration of two different minutiae-based fingerprint matching algorithms by means of the design of application-specific hardware processors synthesized in one FPGA device. Both algorithms under test are based on point pattern matching. The developed hardware matcher processors run at an operating frequency of 81 MHz, and both are implemented in the Xilinx Virtex4 XC4VLX22 FPGA device. The matching execution time is compared with the software implementation of the same algorithms on a Pentium IV-based personal computer platform operating at 3 GHz. Speed-ups of ×23.8 and ×25.4 are reached in the proposed system.
The achievement of high performance at low cost proves that the suggested system architecture based on hardware-software codesign techniques is suitable for the development of on-line fingerprint recognition applications.

Apart from acceleration, another reason to use FPGAs in the realization of embedded biometric systems is the improvement of security and privacy. In this direction, some examples can be found in [Erat et al., 2007], [Bakhteri and Hani, 2009] and [Lim et al., 2010].

a) In [Erat et al., 2007] the authors make use of one FPGA to develop a hardware circuit in charge of generating random numbers in a personal recognition system based on fingerprints. The random numbers are used as private keys in an AES-based encryption scheme in order to protect the fingerprint templates of individuals. With this, it is possible to enhance the security of the whole application in both enrolment and authentication phases. Each time a successful authentication takes place, the private key is automatically changed to increase the robustness of the system. Few details are given about the implementation of the recognition system. One sweeping sensor TouchStrip TCS3 from UPEK and one FPGA device are used. The average execution time for the verification process is less than 1 s. The recognition accuracy of the system, when evaluated on a custom database of 440 images of size 124×180 pixels obtained from 88 fingers acquired with the mentioned sensor, is EER = 8.12%.

b) The work [Bakhteri and Hani, 2009] deals with the implementation of a biometric encryption system based on fuzzy vault. The term biometric encryption system refers to the fact that the application uses the user's biometrics to grant access to a secret in the form of a crypto key.
It merges cryptography and biometric authentication to extend the security of current embedded systems. The fuzzy vault stores the crypto key, which is encoded by the fingerprint template of the legitimate user. The only way to decode the fuzzy vault is by authenticating the right user. The query fingerprint is matched with the stored template, and the crypto key is disclosed only in case of successful matching. The application is developed purely in software on an FPGA device. The soft-core Nios II is instantiated in the FPGA to perform all the processing. This work therefore covers a new field of application of biometrics: the integrity protection of keys or sensitive data used in cryptographic systems.

c) Similarly, in [Lim et al., 2010] a fingerprint matching system based on fuzzy vault theory is designed to increase the level of security in the storage of sensitive data such as users' fingerprints or users' templates. The implementation of the matching stage is done by means of hardware-software codesign techniques on one Xilinx XC3S500E FPGA device. The suggested implementation reaches real-time performance in the processing (matching execution time of about 2 s when the processing is performed in software, and of about 0.5 s when it is implemented with the support of dedicated hardware processors and hardware-software codesign techniques).

Other works make use of HPC platforms instead of embedded systems in the development of biometric applications. In this direction, the work [Ratha et al., 1995a] addresses the development of a hardware coprocessor specifically designed for fingerprint matching based on minutia points with allowable elastic deformation in the fingerprint impressions. The coprocessor is suited to identification (one-to-many matching) applications.
It can process 17.1×10⁶ point features per second and, assuming an average content of 65 minutia points per fingerprint, the matching speed can reach a theoretical performance of up to 260000 fingerprints/s. The system architecture is composed of one SUN SPARC station host and the companion coprocessor Splash 2 system. The Splash 2 system consists of an array of Xilinx 4010 FPGAs distributed over different Splash boards. Each Splash board contains up to 17 Xilinx 4010 FPGAs with a dedicated block of memory of size 16×512 kB for fingerprint processing purposes. The Splash 2 system is described in VHDL, and is connected to the host through an interface board that extends the address and data buses of the host system to the Splash boards. The original sequential algorithm, implemented purely in software in the host, is able to perform 70 matches per second, whereas the parallel algorithm, implemented by means of hardware-software codesign techniques with the additional support of the Splash 2 system running at 1 MHz, reaches 6300 matches per second. If the operating frequency is increased to the maximum allowed for the Splash 2 design, 17.1 MHz, a throughput of 110000 matches per second is expected (a performance lower than the theoretical 260000 matches per second because of some constraints in the physical implementation). Most of the processing involved in the application, such as the image enhancement and feature extraction stages, takes place in the host, and only the time-consuming operations involved repeatedly in the matching stage are implemented in hardware in the Splash 2 system. This work is a clear example of processing acceleration through hardware-software codesign with special emphasis on hardware parallelization, with the final goal of maximizing the performance of the overall matching system.
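The throughput figures quoted for the Splash 2 matcher can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the function names are hypothetical, and the assumption that throughput scales linearly with the clock frequency is ours, used only to show that the quoted 110000 matches per second is consistent with the 6300 matches per second measured at 1 MHz.

```python
def theoretical_fingerprints_per_second(point_features_per_s, minutiae_per_fingerprint):
    """Upper bound on matching speed from the raw point-feature throughput."""
    return point_features_per_s / minutiae_per_fingerprint


def scaled_matches_per_second(matches_at_base, base_mhz, target_mhz):
    """Assumption: throughput scales linearly with the operating frequency."""
    return matches_at_base * target_mhz / base_mhz


if __name__ == "__main__":
    bound = theoretical_fingerprints_per_second(17.1e6, 65)
    speedup_at_1mhz = 6300 / 70           # hardware-software vs software-only host
    expected = scaled_matches_per_second(6300, 1.0, 17.1)
    print(f"theoretical bound : {bound:,.0f} fingerprints/s")
    print(f"speed-up at 1 MHz : x{speedup_at_1mhz:.0f}")
    print(f"expected at 17.1 MHz: {expected:,.0f} matches/s")
```

Under these assumptions the theoretical bound is roughly 263000 fingerprints per second, the speed-up at 1 MHz is ×90, and the linearly scaled figure is roughly 108000 matches per second, in line with the rounded values reported above.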
In [López et al., 2005] the authors develop a fingerprint feature extraction algorithm by means of hardware-software codesign techniques on a powerful computational platform composed of: (i) a Pentium IV workstation running at 3 GHz with 2 GB of RAM, and (ii) one PCI card connected to the system through a 33 MHz PCI bus, which is equipped with a Xilinx Virtex 2 XC2V6000 FPGA device running at 65 MHz and 24 MB of on-board SRAM. The feature extraction algorithm is based on minutia points, and it is split into a set of sequential steps: fingerprint image normalization, local ridge orientation map computation, fingerprint filtering, image binarization, image thinning, minutiae extraction, and minutiae filtering. The application deals with greyscale fingerprint images of size 388×374 pixels and resolution 500 dpi. The software processing is carried out by the Pentium processor through Matlab, and a Matlab toolbox gives access to the PCI bus to send/receive 32-bit data blocks to/from the PCI card. In the PCI card, some dedicated hardware coprocessors instantiated in the FPGA device are in charge of the execution of the computationally intensive tasks of the fingerprint extraction algorithm. The fingerprint thinning process is one of the most expensive tasks: it takes between 60 s and 90 s when executed purely in software on the personal computer platform. Therefore, in order to speed up the processing, this task is described in Handel-C and ported to the FPGA. Making use of parallel design techniques, it is possible to implement a dedicated hardware processor able to execute the thinning process of fingerprint images in less than 100 ms in the FPGA.
Because of throughput limitations in the communication link between the personal computer and the FPGA, 1 s is added to the execution time of the proposed system, which results in 1.1 s for the thinning stage. The acceleration achieved in the thinning process is proven to be satisfactory in the suggested system architecture when compared against the initial purely software-based solution.

A similar work is shown in [Thomas et al., 2010], where the authors develop a fingerprint recognition system by means of hardware-software codesign techniques. The Embedded Matlab HDL tool is used to describe the hardware processors in charge of part of the processing (image binarization and thinning tasks). The software processing is executed on a 1.84 GHz Intel Core 2 Duo personal computer platform, and the hardware processing is carried out in an FPGA device. The hardware implementation of the binarization and thinning tasks is much faster than the software approach: the binarization process takes 125 µs in software and 10.38 ns in hardware, and the thinning takes 429 µs in software and 27.34 ns in hardware.

Next, some more works that deal with the implementation of the whole AFAS application on SOC or FPGA devices are cited.

In [Yang et al., 2003] and [Yang et al., 2006] the authors port a fingerprint verification algorithm, initially developed on a personal computer workstation, to an embedded system platform composed of one 50 MHz LEON-2 soft-core processor and several instances of application-specific hardware coprocessors (a DFT, Discrete Fourier Transform, operator for minutiae processing, and an AES, Advanced Encryption Standard, operator for data encryption purposes) in a Xilinx Virtex-II FPGA device.
By means of optimization and acceleration techniques applied at algorithm level, hardware level and software level, it is proven that it is possible to embed the whole application into one low-cost platform that features limited memory and computational power, and that it is feasible to achieve real-time performance. The suggested authentication system, developed by means of hardware-software codesign techniques, is able to complete the processing, that is (i) query fingerprint acquisition stage, (ii) feature extraction phase (based on minutia points), (iii) matching of the query fingerprint with one pre-stored template fingerprint, and (iv) secure communication of the matching result to the external world, in less than 4 s. The biometric application provides higher security than other recognition systems based on physical ID tokens or knowledge-based tokens (passwords). Besides, further security is added to the system by storing the user’s template in the embedded system platform, so that the biometric information is not disclosed and the amount of data that needs to be transmitted out of the embedded system is minimized. A set of custom instruction extensions is added to the embedded processor in order to optimize the execution of the recognition algorithm on the proposed hardware. In parallel, the resource optimization tasks (memory usage, processing scheduling, usage of fixed-point operations instead of floating-point operations, etc.) carried out in this work provide the needed efficiency to reach high performance at low cost.

In [Barrenechea et al., 2007] and [Barrenechea et al., 2009] the authors address the physical implementation of a fingerprint minutiae extractor and matcher system by means of hardware-software codesign techniques on one small FPGA device.
The hardware of the system mainly consists of the programmable logic device Spartan 3 XC3S-1500-4 from Xilinx Inc. and the Fujitsu MBF200 fingerprint sensor, all embedded in the GR-XC3S-1500 commercial development board. The minutia extraction algorithm used in this work corresponds to the open source software MINDTCT (NFIS2 routine from NIST), which uses a great amount of floating-point operations and leads to a penalty in time when executed on low-performance processors that do not feature FPU cores. The proposed matching algorithm corresponds to the open source software BOZORTH3 (NFIS2 routine from NIST). Both minutiae extraction and matching algorithms are adapted to the suggested embedded system architecture in this work. The recognition algorithm deals with 8-bit greyscale images of resolution 500 dpi. The application is split into the following processing stages: image quality maps evaluation (based on segmentation and field orientation map features), image binarization, minutiae extraction, minutiae filtering, minutiae quality assessment, and minutiae matching (based on the BOZORTH3 algorithm). A 32-bit fixed-point soft-core LEON-2 CPU able to operate at up to 50 MHz is instantiated in the FPGA to take care of most of the processing. Several implementations with different hardware-software partitioning approaches of the same algorithm are carried out in this work to compare the execution time achieved in each of the scenarios:

(i) Purely software implementation (8 kB of instructions and data cache, floating-point operations emulated in software with integer arithmetic on the system CPU operating at 50 MHz): minutiae extraction 157 s, matching 51.6 s.

(ii) Hardware-software implementation (8 kB cache, the system CPU runs at 31 MHz, and one FPU and one DFT processor are instantiated in hardware to accelerate some specific stages of the algorithm): minutiae extraction 12.7 s, matching 51.8 s.
(iii) Hardware-software implementation (8 kB cache, the system CPU runs at 37 MHz, and one FPU and one DFT processor are instantiated in hardware): minutiae extraction 9.5 s, matching 43.7 s.

(iv) Hardware-software implementation (4 kB cache, the system CPU runs at 40 MHz, and one FPU and one DFT processor are instantiated in hardware): minutiae extraction 9.2 s, matching 41.3 s.

When applying some optimizations at either hardware design level, or hardware and software design levels, the execution time of the minutiae extraction process improves further, down to 4.14 s and 3.36 s respectively (in both scenarios the system operates at 40 MHz). However, instantiating the FPU core in hardware in scenarios (ii), (iii) and (iv) constrains the maximum operating frequency of the CPU and the rest of the processors instantiated in the FPGA in those scenarios. Moreover, the instantiation of the FPU core affects the amount of cache that can be instantiated in each design, so the limited amount of resources available in the selected FPGA device makes it impossible to achieve real-time performance in the whole processing (extraction and matching) covered in this work. This work therefore proves that a certain minimum amount of resources is needed to guarantee the real-time performance of any application. If not enough resources are available in the system, the execution time of the application is severely affected.

In [Schaumont et al., 2005] the authors address the development of an embedded fingerprint authentication device called ThumbPod. ThumbPod is a biometrically driven electronic key that establishes a strong and secure link between the owner of the key and the key itself. The biometric features extracted from the fingerprint(s) of the owner of the key are stored in the form of a biometric template in ThumbPod’s internal memory during the enrolment stage.
In this way, the saved template makes ThumbPod a personal device. In the authentication stage, the spatial features of the user’s fingerprint are compared with the template in order to determine whether the user is the genuine owner of the key. The complete authentication process takes place in the ThumbPod device: it scans the user’s fingerprint, extracts the biometric features of the acquired fingerprint, compares them with the template, and generates a positive or negative authentication result. The authentication result is used in the context of a secure client-server protocol, where the server relies on the client (ThumbPod) to implement secure authentication. A challenge-response mechanism is implemented to guarantee the security of the system. The server sends a random challenge to ThumbPod that can only be properly answered by means of a successful fingerprint authentication. ThumbPod is developed by means of hardware-software codesign techniques on a Xilinx Virtex-II XC2V1000 FPGA device. One AES coprocessor is implemented in the FPGA to support the security protocol in the authentication procedure. A fingerprint sensor is embedded in the ThumbPod device to acquire greyscale images of size 128×128 pixels in both enrolment and authentication stages. Feature extraction and feature matching routines based on minutia points are computation-intensive, and require hardware acceleration by means of one specific DFT coprocessor.
The biometric token is therefore embedded in a platform composed of the programmable logic device (where some processing units are instantiated such as a 32-bit RISC LEON-2 processor + 2 kB data cache + 2 kB instruction cache + DFT coprocessor + AES coprocessor + 2 UARTs + volatile and non-volatile memory controllers), 1 kB ROM, 32 MB DDR RAM, the Authentec AFS2 live-scan fingerprint sensor, and one RS-232 link with the external server. In terms of performance, the authentication process takes about 4 s, and the achieved recognition accuracy is FAR = 0.5% and FRR = 0.01%. Moreover, in [Hwang et al., 2003] the authors present an example application based on the ThumbPod embedded system where cryptographic and biometric functionality is implemented. It shows the advantages of accelerating Java functions in both software (routines coded in the C programming language) and hardware/software (VHDL instances of application-specific coprocessors). A pure Java implementation of the Rijndael AES algorithm can be accelerated by a factor of ×6.8 in case of software implementation on a 32-bit soft-core LEON-2 low-cost processor, or up to ×333 when implemented through hardware/software codesign techniques in a programmable gate array device.

In [Militello et al., 2008] the authors address the physical implementation of a fingerprint-based personal authentication system based on core and delta singular points. The whole system is prototyped on the Celoxica RC203E board, which is equipped with a Xilinx Virtex-II XC2V3000 FPGA. Two specific hardware coprocessors are instantiated in the FPGA; they are in charge of extracting the singular points present in the fingerprint impressions, and of matching the singular points of template and query fingerprints based on their correspondence and relative position (distance and orientation). The dedicated coprocessors work at an operating frequency of 25 MHz.
The system is tested with the FVC2006 DB2 database, which is composed of 366 fingerprint images of size 560×296 pixels acquired by means of a photoelectric sensor. The recognition accuracy achieved in this work is similar to that of other published algorithms based on minutia points and evaluated with the same database. The developed system features FAR = 1.2% and FRR = 2.6%, with an execution time for the whole authentication process (image preprocessing, feature extraction and matching stages) of 34.82 ms on average.

The work [Wang et al., 2007b] presents the design of an embedded fingerprint recognition system based on minutia features on a SOPC Cyclone II EP2C35 device from Altera Corporation. The whole system design includes fingerprint image acquisition, fingerprint image preprocessing, minutiae extraction, template storage and minutiae matching. A Nios II RISC soft-core processor is instantiated in the SOPC device in order to execute some of the processing stages of the recognition algorithm, as well as to monitor and control the different processing steps into which the recognition algorithm is split. The compute-intensive tasks that, if executed purely in software by the Nios II processor, could compromise or excessively extend the execution time of the whole recognition process are ported to hardware. In this way, they can be executed by dedicated hardware coprocessors instantiated in the FPGA as well. This makes it possible to accelerate the critical tasks of the recognition algorithm.
The final partitioning of the application is as follows: the image preprocessing (including image normalization, frequency extraction, field orientation extraction and Gabor filtering) is executed in hardware, and the rest of the tasks involved in the recognition algorithm, such as binarization and thinning, minutiae extraction and fingerprint matching, are executed in software (Nios II). A Veridicom FPS200 fingerprint sensor of size 256×300 pixels and resolution 500 dpi is used for fingerprint image collection purposes. The final hardware-software codesign of the recognition system features the following performance: (i) FRR = 1%, (ii) fingerprint acquisition execution time = 2.3 s, and (iii) image processing and feature extraction execution time = 49.7 s. Since the matching stage is based on minutia points, its execution time depends strongly on the number of minutia points present in template and query fingerprints. The average matching execution time is not detailed in this work.

Another similar example that addresses the physical implementation of a fingerprint recognition application is presented in [Li et al., 2007]. The suggested system architecture consists of one Cyclone II EP2C35 device and the FPS200 fingerprint sensor. The proposed implementation features an accuracy of FAR < 5% and FRR < 20%, and an execution time of 11.57 s when implemented by means of hardware-software codesign techniques. Although hardware-software codesign techniques have been used in those two previously cited works, they are examples where the achieved acceleration is not sufficient to guarantee real-time performance in the recognition process.
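The FAR, FRR and EER figures used throughout this survey to compare systems can be made concrete with a short sketch. It assumes that the matcher outputs a similarity score and accepts a comparison when the score reaches a threshold; the function names and the score lists are hypothetical and are not taken from any of the cited works.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a similarity matcher that accepts score >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


def approximate_eer(genuine_scores, impostor_scores):
    """Scan the observed scores as thresholds; return (threshold, EER estimate)
    at the point where FAR and FRR are closest to each other."""
    best = None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        far, frr = far_frr(genuine_scores, impostor_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


if __name__ == "__main__":
    # Illustrative scores only (hypothetical data).
    genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
    impostor = [0.1, 0.2, 0.3, 0.5, 0.65]
    print(far_frr(genuine, impostor, 0.5))
    print(approximate_eer(genuine, impostor))
```

Raising the threshold lowers FAR at the expense of FRR, which is why works that report only one of both figures, or both at different operating points, are difficult to compare directly.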
In [Wang et al., 2004] and [Wang et al., 2005] the authors address the implementation of a fingerprint authentication system on two different embedded system architectures: one is based on discrete components and the other increases the level of integration by using a system-on-chip device. In the second scenario, the system-on-chip device embeds one 32-bit RISC processor running at 200 MHz, 8 kB + 8 kB of data and instruction caches, on-chip ROM and SRAM memories, and a bit-serial FPGA in which application-specific hardware coprocessors running at 50 MHz are instantiated. The FPGA has a serial interface, with about 60 kB of configuration memory and 600 µs of configuration time. Additionally, external FLASH memory and one fingerprint sensor are used to build the system. The same fingerprint minutiae extraction and matching algorithms are developed by means of hardware-software codesign techniques on the suggested platforms. In both scenarios one RISC processor is in charge of the image processing and feature extraction stages, and one FPGA is responsible for the matching stage. The program code of the algorithm takes 30 kB, and in terms of recognition accuracy the implemented algorithm features FAR < 0.001% and FRR < 0.1%. This work analyses the impact on execution time when (i) the embedded system is deployed with discrete components versus when (ii) the embedded system is implemented with a SOC device built with 0.18 µm CMOS technology. In scenario (i), the execution time is less than 1 s for the enrolment stage and less than 0.04 s for the fingerprint verification stage, whereas it is reduced to less than 0.3 s and 0.01 s for the enrolment and verification stages respectively in case of SOC implementation. Therefore, it is proven that additional speed-up is achieved when most of the system processing units (CPU, hardware coprocessors and memory blocks) are embedded in one single chip.
In [Danese et al., 2010] and [Danese et al., 2011] the authors develop a fingerprint identification system implemented by means of hardware-software codesign techniques. Both enrolment and matching stages are covered, but only the matching stage, which is based on the Band-Limited Phase Only spatial Correlation (BLPOC) algorithm, is implemented in hardware, since this is the critical section of the algorithm, repeated as many times as the size of the database. Up to 32 rotated versions of the template fingerprint (from -16 to +15 degrees in steps of +1 degree) are compared with the query fingerprint. The accuracy estimation of the proposed algorithm over FVC2002 databases results in EER = 6.16%. One experimental execution time evaluation test is carried out with a set of 404 templates obtained from 8-bit greyscale fingerprint images of size 256×256 pixels. When the identification application is implemented on one Intel Core2 Quad Q6600 processor running at 2.4 GHz (running 4 threads), the enrolment process of one image takes 770 ms, and the identification process with those 404 images takes 7 s, which means an average matching time of 17.3 ms per image. When using one Intel Pentium IV processor operating at 2.6 GHz (running 2 threads), one match is performed in 13 ms on average. The same processing on one AMD Quadcore Phenom 9750 processor (concurrently running 6 threads) achieves an average matching time of 3.5 ms. Finally, the implementation of the suggested application by means of hardware-software codesign techniques on one Altera Stratix II EP2S60F672C3 FPGA, where one 32-bit soft-core Nios II processor and two matching cores operating at 100 MHz are instantiated, provides a matching time of 660 µs on average. Therefore, the last solution is proven to be more efficient.
It provides a scalable system architecture based on programmable hardware that allows up to 250 parallel cores to run concurrently. This solution provides some flexibility, since it is possible to adjust the amount of hardware resources of the embedded system (selection of the size of the FPGA to be used) to the performance required by the application.

In [Rodríguez et al., 2007] the authors address the implementation of a client-server system that performs mobile biometric fingerprint identification and verification processes over the Internet. The architecture of the system is based on an XML Web Service acting as server, and a PDA with a built-in fingerprint sensor and wireless LAN connectivity acting as mobile client. In the mobile client, the acquisition process of the user’s fingerprint takes place. In the server, the whole identification/verification processing is performed by means of hardware-software codesign techniques on one FPGA device. This is a new system architecture proposal for the application of biometrics on FPGA devices. More examples can be found in [Peng et al., 2008], [Binbin et al., 2010] and [Lorrentz et al., 2009].

4.6.4. Other Biometric Systems

Although fingerprints are clearly the most studied biometrics, other biometric traits are also under active research nowadays. Among them, face, iris, hand or finger veins, hand geometry, and voice or speech recognition are becoming more and more relevant in the specialized literature. To set some examples, in [López et al., 2011] the authors address the implementation of an iris recognition algorithm based on IrisCode calculation on different processing platforms.
The application deals with 8-bit greyscale iris images of size 640×480 pixels, and the processing covers the following stages: scrubbing of specular reflections, iris localization, pupil boundary localization, detection and fitting of eyelids, fine-tuning of the iris boundary models, dimensionless sampling, eyelash removal, and IrisCode creation. The execution time reached by the same algorithm on different processing platforms, with different memory and peripheral configurations (FPU, cache, etc.), is compared: a) on a personal computer platform based on an Intel Pentium processor running at 133 MHz, the execution time is 1112 ms; b) on a personal computer platform based on an Intel Centrino processor operating at a higher frequency of 1.7 GHz, the processing takes 39.3 ms; c) when porting the application to an embedded system platform based on an ARM922 running at 160 MHz, the execution time is much longer, in the range of 3162 ms; d) when porting the application to another embedded system platform based on the soft-core processor MicroBlaze, instantiated in the Spartan 3 XC3S200 FPGA from Xilinx Inc. running at 40 MHz, the processing takes 2591 ms; and e) when applying hardware-software codesign techniques to the aforementioned FPGA platform in order to instantiate not only the soft-core MicroBlaze processor but also 4 additional dedicated hardware coprocessors, also running at 40 MHz (in charge of the compute-intensive tasks of the processing, such as scrubbing of specular reflections, iris localization, pupil boundary localization, and fine-tuning of the iris boundary models), the achieved execution time is 522.6 ms. These results prove the advantages of implementing the application by means of hardware-software codesign techniques on low-cost embedded platforms based on programmable logic, which make it possible to exploit the parallelism and pipelining techniques inherent to hardware design.

Similarly, the work [Liu-Jimenez et al., 2011] compares the performance reached when implementing the same personal authentication algorithm based on iris biometrics on three different platforms: a) a personal computer workstation running at a working frequency of 2.8 GHz; b) an embedded system based on a 32-bit ARM microprocessor device operating at 60 MHz; and c) an embedded system based on dedicated hardware coprocessors instantiated into a Virtex 4 SX35 FPGA device running at operating frequencies in the range of 112 MHz to 122 MHz. Total execution times of 5660 µs, 56600 µs and 295.459 µs to 321.674 µs are achieved by the respective platforms. A slight loss of accuracy can occur in those solutions based on fixed-point operators (on microprocessor and hardware-based platforms) when compared against the original algorithm based on floating-point operators (on a personal computer platform).

In [López-Ongil et al., 2004] the authors implement a personal recognition system based on hand geometry on a processing platform mainly based on an FPGA Virtex 2000E device from Xilinx Inc. operating at a frequency of 33 MHz. Based on estimations, the presented solution is able to reduce the processing time by three orders of magnitude with regard to low-cost microprocessor solutions thanks to the usage of parallelism and pipelining techniques. The algorithm is based on five stages: hand image capture and storage, greyscale conversion, edge detection, feature extraction and template-query feature matching. The processing is split into three basic tasks: memory reads, memory writes, and arithmetic operations.
Specific hardware coprocessors are in charge of those compute-intensive tasks, leading to a typical execution time for the complete algorithm of 47 ms, much less than the estimated execution time (of about 20 s) when the same algorithm runs on a low-cost, mid-performance microprocessor alone, able to reach 1 MIPS.

Other physical hardware-software implementations of biometric processors on FPGA devices can be found in [Kumar et al., 2007], [Tumeo et al., 2010], and [Matai et al., 2011] dealing with face recognition; [Mohd-Yasin et al., 2004], [Liu-Jimenez et al., 2005], [Liu-Jimenez et al., 2006], [Liu-Jimenez et al., 2007], [Militello et al., 2009], [Rakvic et al., 2009] and [Shafer et al., 2010] dealing with iris recognition; [Im et al., 2000], [Malki et al., 2006] and [Khalil-Hani and Eng, 2010] aiming at hand or finger vein recognition; and [Kannavara et al., 2009], [Ramos-Lara et al., 2009], [Whittington et al., 2009] and [Cheng et al., 2011] focused on voice or speech recognition, among other biometric technologies.

4.6.5. Multibiometric Systems

Multibiometric systems are those applications that merge several biometrics in order to improve the performance of the resulting systems in terms of recognition accuracy, robustness and reliability, and security and privacy. In this direction, the work [Zhu and Xie, 2007] details the development, by means of hardware-software codesign techniques, of a multimodal biometric recognition system based on fingerprints and iris. The computational platform used to develop this work is based on a 32-bit floating-point DSP TMS320C6713B from Texas Instruments and one FPGA Cyclone EP1C12 device from Altera Corporation. The selected fingerprint sensor is the Veridicom FPS200 solid-state scanner, able to acquire fingerprint images of size 256×300 pixels with a resolution of 500 dpi.
The DSP processor runs at 300 MHz, and the hardware coprocessors instantiated in the FPGA act as companion accelerators for the most compute-intensive tasks of the recognition algorithm. The execution time of the final system is improved with regard to the purely software implementation of the recognition algorithm on the DSP processor alone: the fingerprint matching time is reduced from 900 ms to 50 ms in the suggested application, which proves the efficiency of the proposed system architecture.

In [Wang et al., 2011b] the authors provide several examples of implementation of multibiometric personal authentication applications on embedded system platforms based on computational units such as MPUs, DSP processors or multi-core processors like those available in current portable cell phones or PDAs. Other research works dealing with the physical implementation of multibiometric systems, either in the form of purely software or hardware-software solutions, can be found in [Yoo et al., 2007], [Moskovitch et al., 2009], [Wang et al., 2009] and [Derawi et al., 2010].

4.7. Conclusions

Biometrics addresses the usage of physiological or behavioural human characteristics to determine or verify an individual’s identity. Physiological biometrics is based on data derived from direct measurements of a part of the human body. Fingerprints, iris-scans, retina-scans, hand geometry, and facial recognition are some examples of physiological biometrics. Behavioural biometrics, in turn, is based on measurements and data derived from an action taken by a person, and indirectly measures characteristics of the human body. Voice recognition, keystroke-scans, and signature-scans are examples of behavioural biometrics.
The individual’s biometric characteristics are captured and encoded, and then compared against previously encoded biometric data stored in an electronic database to determine or verify the individual’s identity. Because biometrics technology utilizes unchanging and unique characteristics of a person that cannot be lost, stolen, shared or forgotten, it has the capability to be more accurate and convenient than traditional methodologies based on physical ID tokens or knowledge-based IDs.

The development of reliable biometric authentication/identification algorithms and the physical implementation of personal recognition systems based on biometrics have emerged in the last decade as an active field of research. All over the world, the biometrics market is showing exceptionally high growth rates. However, much more work needs to be done to reach the demanded performance in terms of recognition accuracy, privacy and security concerns, reliability-cost trade-offs, power consumption targets and so on. Biometric devices, biometric systems, and their respective performance need to evolve over time until reaching the desired targets in order to make biometrics really pervasive in many fields of application, able to deal with physical security (systems used for granting access to buildings, offices, garages, etc.) and logical security (systems used for granting access to computers, networks, websites, etc.), and accessible at affordable costs to whoever may need it.

5. AFAS Architecture Approach with FPGA

This chapter covers most of the experimental work carried out in this thesis.
It aims at being a self-contained section that describes the physical implementation of the different stages of an automatic fingerprint authentication system. Because the advances reached in this work are intended to be applicable to industry, the quality/cost trade-off is taken into consideration from the beginning when defining the architectural approach of the system. The first pillar of the physical system is the quality aspect, understood as the whole performance exhibited by the system, not only in terms of functional requirements like the recognition accuracy but also in terms of other aspects such as the real-time responsiveness, its ease of use, the robustness and security exhibited by the system against external attacks, its low power consumption, autonomy and portability, etc. The second pillar refers to the cost aspect of the product, understood as the amount of physical resources and engineering effort required to build the system and achieve the desired performance. In this area, factors such as the level of system integration, the development time (time-to-market), the EDA tools or the maintenance costs linked to the AFAS application play an important role, as depicted in Figure 50. A good balance between both contributing factors will chart the way to the efficient implementation of the automatic fingerprint authentication application.

Figure 50. Quality & Cost balance in the development of an AFAS application. (Diagram: the AFAS product/service balances performance factors, namely recognition accuracy, real-time response, ease of use, security, power consumption and portability, against resource factors, namely engineering skills, hardware needs, software needs, EDA tools, time to market and maintenance costs.)

The suggested system architecture approach is based on programmable logic devices of type FPGA.
Over recent years, FPGA devices have gained an enormous amount of processing power and functionality thanks to the continuous advances in silicon technologies. Current FPGAs are able to embed much more memory and logic resources, as well as many DSP blocks, multiple clock management units, and large numbers of high-speed transceivers for fast communication purposes in one single device. The technology has evolved to the point that the size of today’s FPGAs is several orders of magnitude larger than that of the first FPGAs, reaching values above two million flip-flops and LUTs. The programmability of FPGAs makes them unique in the market. Those applications requiring continuous evolutions in functionality, where the functionality can be split into sequential and parallel processing, or where software- and hardware-based processing units can take care of the application, are suitable scenarios to exploit the in-field programmability of FPGAs. With such a technology available, once a product is launched to the market, if some upgrades of the application are needed later on, FPGA-based designs can accommodate such evolutions with a minimum impact on the system. FPGAs are therefore especially suited to those applications that require flexibility at both software and hardware levels.

The trends in consumer electronics point to the deployment of compact systems in charge of the execution of multiple and different kinds of processing on embedded system platforms. Let us identify such functional processing as the high level application (HLA) in this chapter. Nowadays, some of those high level applications require the integration of a personal recognition stage based on biometric features as a first filtering step needed to provide a certain level of security or customization (e.g. personalized access to laptops, mobile phones, personal digital assistants, personal computers, iPads, cameras, buildings, internet or other networks, e-commerce, health monitoring, border control, etc.). Let us identify the biometric recognition kernel as a low level application (LLA). The embedded system must be equipped with the resources needed to execute the specific application (HLA) and the personal recognition kernel (LLA). Therefore, a proper system architecture has to be defined, able to execute both applications (i) meeting the expected performance, (ii) making use of the minimum amount of resources, and (iii) at the lowest possible cost. Instead of dedicating specific resources to each application, it is natural to try to define a single system architecture able to share the same resources in the execution of both kinds of processing (HLA, LLA). In this way, the LLA will initially perform the user authentication task by making use of the available system resources, and once the user is properly verified, the HLA will take control of the system resources in order to take full advantage of the functional density of the system at any time. Only the fingerprint sensor used to capture the biometric traits of the user is seen as a specific resource needed by the biometric LLA alone; the rest of the system resources are generic and used by both the HLA and the LLA. Because of the limitations of state-of-the-art fingerprint recognition algorithms, flexible embedded system platforms are needed that permit algorithm updates in the field. This flexibility required by the LLA is therefore also available to the HLA.

Figure 51 shows the block diagram of the suggested embedded system platform. As can be deduced from the figure, the programmable logic resources available in the system are used to synthesize at least those application-specific processing units required by the biometric recognition application.
The programmable resources can also be used to develop one specific interface with the fingerprint sensor required in the application. Another option, as indicated in the figure, is to use the standard interfaces (parallel ports, SPI link, I2C link, etc.) present in most systems to establish the required interface with the fingerprint sensor.

Figure 51. Embedded system architecture. (Block diagram: processing units with data and instruction caches and FPU, memory controller with volatile and non-volatile memory, interrupt, communications, timer and other controllers, standard I/O interfaces, application user interface, fingerprint sensor, and host, network or other peripherals, all connected through the system bus as static hardware; application-specific coprocessors, and optionally the fingerprint sensor interface, implemented in programmable hardware.)

In order to achieve high processing throughputs, multiprocessor systems need to be developed. In the proposed system architecture this is done through the combination of (i) standard processing units like MPUs, MCUs, DSPs, multi-core systems, etc. equipped with additional peripherals like timers, interrupt controllers, communication controllers (UART, SPI, Ethernet, USB, CAN, etc.); and (ii) application-specific hardware coprocessors synthesized in the programmable logic resources of the system.
The system is provided with enough volatile (SRAM, DRAM, etc.) and non-volatile (ROM, FLASH, E2PROM, etc.) memory according to the needs of the applications. The memory block and other resources are shared by some of the processors available in the system, so arbitration mechanisms exist to avoid contention and critical bottlenecks in the access to the shared resources. Thus, it is possible to build any application and to embed biometric recognition in it by making use of the proposed system architecture.

Although the system block diagram takes each block as an independent entity, some or most of them can be integrated in the form of a system-on-chip device. Many discrete devices such as memories, microprocessors, multiple peripherals or even FPGAs can now be integrated into one single system-on-chip device. Therefore, system-on-chip devices can be compared to small computers: they can store a personal key in such a way that it can never be read from outside and is only used internally by the system itself, so the data are well protected against external attacks. Depending on the integration level featured by the system-on-chip device, more or fewer additional components will be needed to build the physical platform. This tendency towards higher integration levels results in a notable improvement in terms of reliability, and a dramatic reduction in terms of physical size, power consumption and cost of the embedded systems. In this direction, two physical platforms based on system-on-programmable-chip architectures that integrate programmable logic or FPGA blocks together with other hardware processing units have been evaluated to identify the advantages and disadvantages of the proposed system architecture when implementing one hybrid fingerprint recognition algorithm.
In order to prove the validity of the proposed solution, the reached performance is compared with that achieved by other classical solutions based on either HPC platforms or low-cost embedded systems equipped with single-core or multi-core processors.

5.1. Background

The skills required to develop a design based on FPGAs are more demanding than those needed to develop purely software applications. Some background in electronics and programmable logic design, as well as knowledge of one hardware description language like Verilog or VHDL, is additionally required to develop applications based on this kind of architecture. Specific EDA tools, dependent on the device vendor, are normally available to reduce the development cycles when designing with FPGA or SOPC devices, but the designer needs to get familiar with the processing flow of each automated tool. Additionally, and similarly to what happens with software programming languages and their libraries of functions, some libraries of IPs, which embed certain popular functions/circuits, are available to speed up the development of designs based on programmable logic.

As presented in chapter 4, some research works, carried out by academic institutions or commercial and industrial companies, have already exploited the acceleration provided by FPGA or SOPC devices in the implementation of some of the stages that take place in a biometric authentication application. One commonality of most of those research works is the fact that they use small devices with a reduced amount of programmable logic resources, which constrains the functionality that can be synthesized in programmable hardware in those designs. The main reason for this is that, for a given technology, the cost of the FPGA increases as a function of its size: the larger the amount of resources available in the device, the higher its cost, as graphically shown in Figure 52.
Therefore, in those embedded applications oriented to the consumer electronics arena, which are strongly limited by the target costs, the usage of large and expensive devices is usually not feasible. For this reason, the size of the device has to be limited and consequently only a reduced part of the functionality can be accelerated by dedicated hardware. In order to avoid such limitations in the development of real-world applications, some FPGA or SOPC devices provide run-time reconfigurability. Although this topic is not in the scope of this thesis, the suggested platforms that have been used in the experimental tests carried out in this thesis feature FPGA or SOPC devices with run-time reconfiguration capability.

Run-time reconfiguration of FPGAs is the ability to dynamically modify the functional content of at least some portions of the device while the remaining portions continue to operate without interruption. In addition to the capability of being programmed in the field, run-time reconfiguration is a new feature unique to FPGAs and SOPCs. The main benefits of run-time reconfiguration are the reduction of physical resources (and therefore cost) provided by the time-multiplexing of the functionality, the reduction of dynamic power consumption that can be achieved by loading functions in the reconfigurable regions of the device on demand (keeping them empty or unused when not needed), and the consequent increase of flexibility. The main limitation is the reconfiguration overhead: the additional resources needed to change the functional content of the reconfigurable regions on demand, as well as the reconfiguration latency that takes place each time the reconfigurable regions are updated with new functional circuits.
The device can be split into one static region, and one or more reconfigurable regions. The flexible use of some of the programmable logic resources (corresponding to the reconfigurable regions) while the rest (corresponding to the static region) remains permanently fixed makes it possible to partition one application in such a way that the critical functionalities can operate without interruption and without loss of the communication link with external modules or external hosts, whereas the non-permanent functionalities can be swapped on-the-fly in the reconfigurable regions of the chip. The functional circuits that are implemented in both the static and dynamic regions of the device are stored in the form of bitstreams. The bitstreams that reside in the configuration memory of the device at a given time define its functionality. During power up, one bitstream that defines the functional content of the static region and the initial content of the reconfigurable regions is downloaded into the device. After the power up stage, partial bitstreams with new definitions of the functional content of each reconfigurable region can be downloaded on demand during the execution of the application. This capability offers valuable advantages in those applications that can be scheduled as a set of mutually exclusive processing stages, since flexible and dynamic reconfiguration of the programmable hardware makes it possible to host different processing units at different times using the same physical resources of one single device.

Figure 52. Relationship between the size (amount of resources) of one FPGA device and its cost. Cost details are approximate (only for reference). (Chart for the Spartan-6 XC6SLX family, "Optimized for Lowest Cost Logic, DSP, and Memory". Size axis: LX4, LX9, LX16, LX25, LX45, LX75, LX100, LX150. Cost labels, in ascending order: 2.00$, 3.75$, 5.35$, 7.80$, 11.80$, 23.00$, 52.00$, 60.00$.)
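The trade-off described above, fewer resources through time-multiplexing against the cost of reconfiguration latency, can be illustrated with a small model. The following Python sketch rests on assumptions and is not a model taken from this thesis: the stage names, resource counts, execution times and the per-swap reconfiguration latency are all hypothetical values. It compares a static design, where every coprocessor is resident at once, with a run-time reconfigurable design, where mutually exclusive stages share one reconfigurable region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    luts: int        # logic resources needed by the stage coprocessor (hypothetical)
    exec_ms: float   # execution time of the stage in hardware (hypothetical)


def static_design(stages):
    """All coprocessors resident at once: resources add up, no reconfiguration."""
    return sum(s.luts for s in stages), sum(s.exec_ms for s in stages)


def reconfigurable_design(stages, reconfig_ms):
    """Stages time-share one region sized for the largest coprocessor.

    Assumes one partial bitstream download per stage, each costing reconfig_ms.
    """
    luts = max(s.luts for s in stages)
    time_ms = sum(s.exec_ms + reconfig_ms for s in stages)
    return luts, time_ms


# Hypothetical stage figures, chosen only to make the trade-off visible.
stages = [
    Stage("enhancement", luts=6000, exec_ms=20.0),
    Stage("extraction", luts=4000, exec_ms=10.0),
    Stage("matching", luts=3000, exec_ms=5.0),
]

static_luts, static_ms = static_design(stages)
rtr_luts, rtr_ms = reconfigurable_design(stages, reconfig_ms=2.0)

if __name__ == "__main__":
    print(f"static:         {static_luts} LUTs, {static_ms:.1f} ms")
    print(f"reconfigurable: {rtr_luts} LUTs, {rtr_ms:.1f} ms")
```

In this toy case the reconfigurable design needs less than half of the logic at the price of a few extra milliseconds. Whether that exchange pays off in a real device depends on the actual bitstream sizes and configuration port throughput, which the sketch does not model.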
The research work carried out in this thesis is focused on the search for an efficient system architecture suitable for the development of biometric applications on embedded system platforms. SRAM-based devices like FPGAs or SOPCs offer the ability to exploit parallelism and to self-adapt to a wide variety of algorithms. Their hardware architecture is flexible, and it can be tailored to the needs of the application. Some benefits are proven to exist when undertaking the implementation of biometric applications with such a technology, at the expense of the development time invested in it.

5.2. Design Flow

The physical implementation of the AFAS application on the suggested system architecture is a complex process that has been split into several stages. The development flow that has been carried out in this work is as follows:

i) The first stage addresses the study of the fingerprint recognition algorithm to be used in the AFAS application. The proposed fingerprint recognition algorithm is not developed from scratch but based on best-in-class reference biometric algorithms and known techniques described in the literature. The fingerprint recognition process is composed of a set of image processing and pattern matching operations carried out sequentially. Given one template fingerprint and one query fingerprint, the authentication process consists of the transformation of both images into the spatial distribution of two sets of distinctive features derived from them. Those sets of features become the discriminatory information that unequivocally defines each of the fingerprints. Those sets of features are homogenized through the fingerprint alignment process so that such information can be ported to the same space for fast comparison.
A similarity score is derived from the comparison of both fingerprint impressions and weighted in the range [0,1]. The final step consists of the decision whether or not both fingerprints belong to the same finger (user). The authentication result is determined through the comparison of the previously derived similarity score with a certain threshold that depends on the reliability demands of the final application. Each one of the processes in the chain (fingerprint acquisition, image enhancement, feature extraction, feature sets alignment, feature sets matching and authentication result) aims at specific purposes. All the processes have been studied and defined separately under a top-down design methodology, and once each of the processes is established, their composition permits the development of both enrolment and authentication stages in the recognition application, as indicated in Figure 53.

ii) The next step consists of the implementation of the suggested algorithm. For such a purpose, high-level computing and programming languages like Matlab and C/C++ have been used for processing analysis, numeric computation, data visualization and performance evaluation. The recognition application has been initially coded in C/C++ interfacing with specific add-on toolboxes of Matlab on a personal computer platform in order to speed up the exploration of the proposed algorithm. After the initial composition of the algorithm, a set of iterative loops is carried out in order to tune the algorithm to the properties of the input fingerprints. A reduced set of fingerprint images, characterizing the different quality conditions of the acquired fingerprints in a real application, has been used to tune the algorithm.
Because the main purpose of this thesis is not the development of very accurate algorithms but their efficient implementation on low-cost computational platforms, the suggested algorithm has only been through a few tuning loops, until acceptable recognition accuracy rates were reached. The proposed algorithm handles most of the processing steps carried out by typical fingerprint-based recognition algorithms today; therefore, it can be considered representative of the computational effort needed in state-of-the-art systems for the purpose of evaluating alternative system architectures based on low-cost embedded systems.

iii) Once the algorithm is tuned and its recognition performance (when tested with a reduced but representative set of fingerprints) is accepted, the next step consists of the performance evaluation of the resultant algorithm when submitted to a large database of fingerprints. A public database is used for such a purpose. It is important to note that the fingerprint impressions used in the performance evaluation stage are different from those initial samples used in the algorithm tuning phase. The validity of the suggested algorithm is proven by comparing the achieved performance against the published performance results of other algorithms, coming from both academia and industry, submitted to evaluation under the same public fingerprint database.

iv) A new version of the proposed fingerprint recognition algorithm, entirely coded in the C programming language, provides a compact version of the algorithm that can be easily ported and executed on personal computer platforms or microprocessor-based embedded systems for timing performance evaluation purposes.
Different computational platforms have been used in order to analyze the processing effort demanded of the physical systems, as well as to measure the execution times reached by the application on each of the platforms. Hard or soft real-time performance is usually demanded of AFAS applications. Therefore, those computational platforms under evaluation that do not reach such timing performance must be discarded.

Figure 53. Fingerprint recognition stages: composition of Enrolment and Authentication processes. (Diagram: the algorithm stages are acquisition (ACQ), enhancement (ENH), feature extraction (EXTR), alignment (ALIGN), matching (MATCH) and authentication result (AUTH). Enrolment takes the template fingerprint of the legitimate ID through ACQ, ENH and EXTR to obtain the template features set. Authentication takes the query fingerprint of the claimed ID through ACQ, ENH and EXTR to obtain the query features set, which is then passed with the template features set through ALIGN, MATCH and AUTH to give a Match / Non Match result.)

v) A new system architecture approach, driven by flexibility and high performance at low cost, is addressed at this stage, trying to improve on the results achieved on the previous platforms. In order to achieve all three demands (flexibility, high processing power, and low cost) a topology based on embedded systems is suggested. It consists of at least one general-purpose microprocessor unit, acting as the master or main processor of the system, and one programmable (and run-time reconfigurable) logic region or device of type FPGA in which to embed highly flexible hardware coprocessors that act as secondary or slave companion cores oriented to the execution of application-specific tasks. Moreover, volatile and non-volatile memory blocks are needed in order to store the program code to be executed by the master CPU, as well as the application data and the bitstreams that define the different coprocessors instantiated in the FPGA.
These are the key components of the suggested embedded system; they can be either instantiated as discrete components or grouped together in a system-on-programmable-chip device.

Two different evaluation platforms, based on FPGAs from Altera Corporation and Xilinx Inc., have been used in this work to evaluate the proposed system architecture. The fingerprint recognition algorithm is first implemented purely in software, running on the master CPU of each evaluation board, in order to identify the computationally expensive tasks that constrain the real-time performance of the application on the suggested platforms. Those critical tasks are then ported to hardware in further design loops, by means of VHDL descriptions of made-to-measure coprocessors instantiated in the programmable logic portion of the embedded systems. The original software-only application is thus partitioned into hardware and software tasks through hardware-software co-design techniques. In this way, the proposed system merges sequential processing (CPU-related) with the efficiency of parallel hardware (programmable logic-related). As many tasks as required are ported to hardware in order to achieve the real-time performance demanded of the application. The integration of hardware and software tasks on the evaluation boards, together with the hardware-software co-verification of the resulting AFAS application, confirms that the embedded system meets the execution time and recognition accuracy targets. The AFAS development workflow addressed in this thesis is depicted in Figure 54.
The last step, the development of a physical prototype adjusted to the specific requirements of the real application (memory size, peripherals, programmable resources, etc.) for validation purposes prior to the release of a final product to the market, has been left out of this work.

Figure 54. AFAS application development workflow. [Flow diagram. CONCEPT: Algorithm Definition; MODELING: Algorithm Coding (high-level programming, MATLAB & C/C++, personal computer workstation); VERIFICATION: Accuracy Evaluation; PORTING: Embedded Implementation (embedded programming, C & ASM, PCs & embedded systems, vendor development boards); HW-SW PARTITIONING, HW-SW INTERFACING, HW-SW CO-DESIGN (SW programming in C/ASM, HW synthesis in VHDL, embedded systems, vendor development boards); FUNCTIONAL CO-VERIFICATION; PHYSICAL PROTOTYPING.]

5.2.1. Authentication Algorithm: Processing Stages

The design of reliable, efficient, and automated personal recognition systems based on fingerprint biometrics is a real challenge in the current technological age. The personal recognition system aims at verifying the identity claimed by an individual through the analysis of his/her fingerprint characteristics. The user verification process is usually implemented as a checkpoint in those applications where a certain security level is required before granting or denying any individual access to restricted areas, confidential information or constrained resources. The user who requests access to the application must first present his/her identity to the recognition system.
The personal verification or authentication system is then responsible for confirming that the user really is the person he/she claims to be, by comparing the ID traits of the user against the ID traits of the claimed person, before granting him/her the proper application privileges.

The user verification process is split into two main stages: (i) Enrolment and (ii) Authentication. Each stage is structured into a set of sequential processing steps in the suggested application, as indicated in Figure 55.

Figure 55. Processing stages of the suggested fingerprint verification system. [Flow diagram with two parallel chains, A: Legitimate Fingerprint (Enrolment Stage) and B: User Fingerprint (Authentication Stage). Stage groups: Image Acquisition, Image Enhancement, Feature Extraction, Feature Storage, Feature Sets Alignment, Feature Sets Matching, Authentication Result. Steps common to both chains: Fingerprint Acquisition & Image Reconstruction, Image Segmentation, Image Normalization, Image Isotropic Filtering, Field Orientation Computation, Filtered F.O. Computation, Image Directional Filtering, Image Binarization, Image Smoothing, Image Thinning, Minutiae Extraction, Minutiae Filtering. Enrolment ends with Field Orientation Map & Minutiae Set Storage (A': Template ID). Authentication (B': Query ID) continues with F.O. Maps Alignment, Minutiae Sets Alignment, Region of Interest Retrieval, F.O. Maps (RoI) Matching, Minutiae Sets (RoI) Matching, Similarity Score Computation & Authentication Result Decision, and the result "A = B or A ≠ B ?".]

The enrolment stage must be an inherently secure process. The reliability of the whole recognition system depends to a great extent on the reliability of the enrolment of legitimate users.
The recognition system bases its modus operandi on the fact that only users enrolled in the system hold the proper application privileges. The enrolment stage is normally implemented offline, and it associates the identity of an individual with a certain set of features derived from impressions of his/her genuine fingerprint(s). Owing to the uniqueness and permanence of human fingerprints, the link between the user and his/her identity is established by means of the set of distinctive features extracted from his/her fingerprint impressions, which ultimately acts as the user's identity token (Template ID). This information is stored by the recognition system either in a secure database (centralized AFAS) or in a smart card (decentralized or individualized AFAS).

Once the user is enrolled in the recognition system, he/she becomes a valid holder of the application privileges. To make use of those privileges, the user needs to claim his/her identity and present his/her fingerprint to the recognition system in the authentication stage. The AFAS application is then responsible for extracting the biometric features from the acquired fingerprint (query fingerprint) and matching them against the template features corresponding to the claimed identity (stored in the system during the enrolment stage). The comparison of both feature sets determines whether or not both biometric samples belong to the same user. In case of a positive match, the authentication system accepts the individual as the genuine user and grants him/her access to the application. Otherwise, the individual is judged to be an impostor and access to the application is denied.
As a human expert would do, deciding whether or not two fingerprint impressions correspond to the same user requires determining those salient features of the finger impressions that discriminate between different identities while remaining invariant across several images of the same finger. A hybrid fingerprint matching algorithm has been implemented for this purpose. Among the different features used in the literature (coarse-level features such as ridge maps or field orientation maps, fine-level features such as minutia points, or very-fine-level features such as ridge pores), the proposed algorithm uses the minutiae set and the field orientation map as the personal identifier features [Maltoni et al., 2009].

The processing steps that make up the proposed fingerprint recognition algorithm are briefly discussed next. As can be deduced from Figure 55, some of the processing steps are common to both stages.

Fingerprint acquisition

The first step in any automated fingerprint recognition system, in both the enrolment and authentication stages, is the acquisition of at least one digital impression of the user's fingertip. As already discussed in section 2.2, many techniques and sensing devices for acquiring fingerprints are available on the market. Among them, solid-state fingerprint sensors are small, low-cost devices suitable for integration in embedded systems [Mainguet et al., 2000]. Accordingly, a solid-state fingerprint sensor based on sweeping technology is selected in this work.

Image reconstruction

Sweeping sensors feature a sensing surface as wide as that of one-touch sensors but only a few pixels tall. In the acquisition stage, the user sweeps his/her finger over the sensing surface and the acquisition system collects a series of small slices of the fingerprint.
An additional image reconstruction stage is then needed to deliver the acquired fingerprint impression in the same format as one-touch sensors do. To make image reconstruction independent of the finger sweeping speed, the sensing area has enough pixel lines per slice, and slices are acquired fast enough, to guarantee overlap between consecutive slices. A specific processing algorithm, adjusted to the features of the selected sensor, is developed to reconstruct the fingerprint image on the fly, while the finger bitmap is being acquired.

Image segmentation

After fingerprint image acquisition and reconstruction, the next stage is the fingerprint image enhancement process [Hong et al., 1998]. It aims at improving the quality of the acquired fingerprint image as much as possible in order to ease the subsequent feature extraction stage. Basically, the fingerprint image enhancement algorithm tries to satisfy two conditions: (i) to improve the clarity and distinctiveness of the ridge-valley structures, and (ii) to remove the undesired noise that multiple factors may introduce in the original fingerprint impression. For this purpose, the image enhancement process is normally composed of a set of sequential steps.

The segmentation of the acquired and reconstructed fingerprint image is the first of the pre-conditioning steps in the suggested algorithm. It aims at isolating the valid fingerprint area (foreground) from the rest of the image (background).
Non-ridge regions and unrecoverable low-quality fingerprint regions (smudged or noisy areas) are excluded as image background, whereas the remaining good-quality areas, where valid information is available and from which true fingertip features can be extracted, are kept as image foreground. Only the image foreground is considered for further processing in the next steps.

Image normalization

The next preprocessing step is the normalization of the previously segmented image. The aim of the normalization process is to even out the variations in grey-level pixel intensity along ridges and valleys across the different regions of the fingerprint. Fingerprint impressions that are too dark or too light because of the conditions present during acquisition (too dry fingers, too wet fingers, prints with smudges, etc.) can thus be compensated. As a result of the image normalization process, a more homogeneous greyscale fingerprint impression is obtained.

Image isotropic filtering

After image normalization, an isotropic filtering step based on Gaussian filters is applied to remove some of the random noise that may still be present in the ridge-valley pattern. This filtering stage reduces the influence of noisy pixels by modifying the intensity level of each image pixel according to that of the pixels in its local neighbourhood. The proposed filtering technique is inspired by enhancement methods based on dyadic scale space theory [Cheng and Tian, 2004].

Field orientation map computation

A fingerprint image can be seen as an oriented texture defined by the ridge-valley structure of the fingertip. The orientation field of a fingerprint image is represented by the set of angular directions of the ridges and valleys in each local neighbourhood. This work relies on the widely employed gradient-based methods to compute the field orientation map of fingerprints.
The suggested algorithm is adapted from [Rao and Schunck, 1989]. The input image is tessellated into square blocks of 8×8 pixels and the dominant direction of the ridges and valleys in each block is estimated. The result of the field orientation map computation is a two-dimensional matrix that represents the orientation of ridges and valleys across the fingerprint pattern.

Filtered field orientation computation

An additional low-pass filtering process is applied to the previously computed field orientation map to help compensate for incorrect estimates of the local ridge orientation, which may be caused by factors such as large noisy regions in the bitmap or corrupted ridge and valley structures that have not been segmented out. The resulting two-dimensional matrix defines the ridge-valley orientation map and is used in the next processing steps for further enhancement of the fingerprint.

Image directional filtering

A set of up to 180 directional Gabor filters, covering every possible angular orientation of the ridge-valley pattern in the range [0°, 180°) with a resolution of 1°, is used to reinforce the spatial distribution of ridges and valleys in each valid region of the fingerprint impression [Hong et al., 1998]. The fingerprint image is adaptively enhanced by convolving it with directional Gabor filters oriented according to the dominant direction of the ridge-valley pattern in every local neighbourhood. This directional filtering step, driven by the previously computed field orientation map, removes some of the noise that could still be present in the input image while preserving the true ridge-valley structures.
Image binarization

The convolution of the enhanced image with the oriented Gabor filters provides a direct measurement of the strength exhibited by the ridges and valleys in each local region of the fingerprint. Comparing this measurement, pixel by pixel, with an adaptive threshold makes it possible to discern whether the pixel under consideration belongs to a valley or to a ridge. As a result of this computational stage, a coarse-level binary ridge map that preserves the local orientation information is derived from the greyscale input fingerprint.

Image smoothing

The black-and-white image obtained in the binarization process can contain a large number of ridge pixels labelled as non-ridge pixels because of noise, spurious valleys, abrupt creases, smudges, spikes, etc. An additional processing step is usually needed to improve the binary ridge-valley pattern by correcting those deficiencies and imperfections. A directional smoothing process, adapted from [Ratha et al., 1995b], is applied to filter the binarized fingerprint. The aim of this processing stage is to remove isolated valley clusters and noisy branches, and to improve ridge bifurcations, ridge endings, and the clarity of the complete image.

Image thinning

Image thinning, or image skeletonization, is the process of reducing the fingerprint ridge pattern to a one-pixel-wide thin-line representation that preserves the topological and geometric properties of the binary image and maintains its connectivity [Lam et al., 1992]. From the thinned image, it is easy to extract characteristic points such as ridge endings and ridge bifurcations (minutiae). The proposed algorithm is adapted from [Gao and Hall, 1989]; it iteratively and selectively deletes ridge pixels until the width of the ridges is reduced to a single pixel.
Apart from retaining the significant features of the original image, the thinning algorithm is intended to eliminate local noise without introducing distortions into the thinned pattern.

Minutiae extraction

Besides the field orientation map, the proposed algorithm uses minutia points as distinctive and discriminatory information that characterizes a fingerprint. A specific processing step scans the thinned version of the fingerprint looking for local discontinuities (ridge endings and ridge bifurcations). The fingerprint field orientation map and the spatial distribution of minutia points along the ridge-valley pattern define the identity of the user [Mital and Teoh, 1996], [Jiang and Yau, 2000].

Minutiae filtering

The image binarization and image thinning stages are critical for proper minutiae extraction. Ideally, a good binarization algorithm should provide an image consisting of black ridges of uniform width on a white background, which can easily be transformed into a thinned image. However, problems can appear when the binary image deviates from this ideal case. Factors such as noise or poor contrast produce artefacts in the binary image that lead to noisy structures in the thinned image, which can cause either the appearance of fake minutia points or the loss of valid ones. For this reason, all minutia points found in the feature extraction phase are considered potential candidates only, and specific examination rules are applied to each minutia separately to evaluate its validity. This verification process is called the minutiae filtering stage; it ensures that only legitimate minutia features are used to characterize the user's fingerprint, so that only truly distinctive traits are taken into consideration in the next stages.
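The scan of the thinned image for ridge endings and bifurcations described under "Minutiae extraction" is often implemented with the crossing-number test. The sketch below assumes that technique; the text does not name the method actually used, so the function names and the pixel encoding (1 = ridge, 0 = background) are hypothetical.

```c
#include <assert.h>

enum minutia_type { NOT_MINUTIA = 0, RIDGE_ENDING = 1, RIDGE_BIFURCATION = 3 };

/* Crossing number of skeleton pixel (x, y): half the number of 0/1
 * transitions met while walking once around its 8 neighbours.
 * skel is a row-major thinned image (non-zero = ridge pixel).
 * Returns 0 for background pixels and for pixels on the image border. */
int crossing_number(const unsigned char *skel, int width, int height,
                    int x, int y)
{
    /* Neighbours in circular order; the first one is repeated at the end. */
    static const int dx[9] = { 1, 1, 0, -1, -1, -1, 0, 1, 1 };
    static const int dy[9] = { 0, -1, -1, -1, 0, 1, 1, 1, 0 };
    int transitions = 0;

    assert(skel != 0);
    if (x < 1 || y < 1 || x >= width - 1 || y >= height - 1)
        return 0;
    if (!skel[y * width + x])
        return 0;
    for (int i = 0; i < 8; ++i) {
        int a = skel[(y + dy[i]) * width + (x + dx[i])] != 0;
        int b = skel[(y + dy[i + 1]) * width + (x + dx[i + 1])] != 0;
        transitions += (a != b);
    }
    return transitions / 2;
}

/* Crossing number 1 marks a ridge ending, 3 a bifurcation; every other
 * value (isolated pixel, ridge continuation, crossing) is not reported. */
enum minutia_type classify_minutia(const unsigned char *skel, int width,
                                   int height, int x, int y)
{
    int cn = crossing_number(skel, width, height, x, y);
    if (cn == 1)
        return RIDGE_ENDING;
    if (cn == 3)
        return RIDGE_BIFURCATION;
    return NOT_MINUTIA;
}
```

Every point returned by such a scan would still be only a candidate: the filtering rules described above decide which candidates survive.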
Feature sets (field orientation map & minutiae) storage

In the enrolment stage, the extracted features, composed of the minutiae set and the field orientation map of the user's fingerprint, are saved together with any other relevant information about the user (his/her name, PINs or passwords, details of his/her bank accounts, etc.) in a secure database or a personalized smart card. These reliable and genuine feature sets, inherent to the user, are linked to the user's identity in the recognition system.

Field orientation maps alignment

In the authentication stage, the template and query fingerprints need to be properly aligned before matching the corresponding feature sets. Among the alignment methods discussed in the specialized literature, this work uses methods based on field orientation maps because they outperform methods based on singular points (core and delta) or minutia points [Wakahara et al., 2007]. The field orientation map provides information about the complete fingerprint impression, not only about partial regions as can happen with minutia points or singular points, so it overcomes the problem of partial prints that share no common features. Moreover, it tolerates the deformations inherent to the elasticity of the skin better than those methods. The proposed algorithm takes the field orientation maps of the template and query prints and performs a correlation analysis to find the best possible alignment, taking into consideration any relative displacement, rotation and elastic deformation between both prints. Alignment by orientation field is possible even for partially overlapped prints.
A correlation analysis of both two-dimensional feature matrices, in which all possible relative orientations (ΔΦ) and relative positions in the horizontal (ΔX) and vertical (ΔY) directions are considered, determines the best alignment between both prints. For each relative position, a similarity score of both field orientation maps in the overlapped area is computed; among all possible alignments, the one with the best similarity score that satisfies certain minimum requirements is selected as the alignment result. If none of the candidate alignments meets the minimum requirements needed to guarantee a valid alignment, the query fingerprint is discarded and considered a non-matching print.

Minutiae sets alignment

When the template and query fingerprint impressions have been successfully aligned, the transformation parameters (ΔX, ΔY, ΔΦ) that align both fingerprints are known from the previous step. Those parameters are now used to align the template and query minutiae sets. Both the field orientation maps and the minutia points are therefore aligned for proper matching in the next processing steps.

Region of interest retrieval

Before determining the degree of similarity between the template and query fingerprints, the overlapped region between both prints must be identified; it becomes the region of interest (RoI) for further inspection. Of all the features derived from the template and query prints, only the subsets available in the RoI (normally some portions of the field orientation maps and some minutiae subsets) are considered in the matching stage.
Feature sets (field orientation map & minutiae) matching

Once the template and query fingerprints are aligned and the RoI is identified, the matching process quantifies the level of correspondence between the pairs of feature sets available in the RoI: the field orientation maps and the sets of minutia points. A partial similarity score, ScoreFO, is derived from the analysis of the field orientation maps in the RoI, and another partial similarity score, ScoreM, is computed from the comparison of the spatial distribution of the template and query minutiae sets in the RoI.

Similarity score computation

The matching score generated from the comparison of the local and global minutiae structures in the RoI (ScoreM) is combined with the matching score linked to the field orientation maps (ScoreFO) to obtain an overall similarity score (ScoreMatch). The resulting matching score is computed as:

\[
\mathrm{Score}_{Match} =
\begin{cases}
\eta_1 \cdot \mathrm{Score}_{FO} + \eta_2 \cdot \mathrm{Score}_{M}, & \text{if } \mathrm{Score}_{FO} \neq 0 \text{ and } \mathrm{Score}_{M} \neq 0 \\
0, & \text{otherwise}
\end{cases}
\]

where $\eta_1 + \eta_2 = 1$, $\mathrm{Score}_{FO} \in [0,1]$, $\mathrm{Score}_{M} \in [0,1]$ and $\mathrm{Score}_{Match} \in [0,1]$.

Authentication result decision

The authentication result decision is made by comparing the similarity score ScoreMatch, in the range [0,1], with a pre-specified threshold ThresholdMatch, also in the range [0,1] and driven by the requirements of the final application, as indicated in the processing flow of Figure 56.

    // Match and threshold score ranges: 0 to 100%, i.e. [0, 1]
    if (ScoreMatch >= ThresholdMatch) {
        Result = 1;    // Authentication OK
    } else {
        Result = 0;    // Authentication NOK
    }

Figure 56. Authentication result decision.

5.2.2. Recognition Accuracy Evaluation

Proving the validity of the suggested fingerprint recognition algorithm requires evaluating its accuracy on a large fingerprint database. The fingerprint recognition algorithm needs to be properly tuned to the environmental conditions of the real application, including the characteristics of the fingerprint sensor that is going to be used. For this reason, the database selected for evaluation in this work was collected with the same kind of fingerprint sensor that is planned for the physical embedded AFAS. Selecting the database in this way yields more objective results about the expected performance of the suggested algorithm in the real application. Accordingly, the selected database is DB3 of the FVC2004 contest. This public database comprises 110 fingers with 8 samples per finger, that is, 880 fingerprint images. All the images were collected with a FingerChip FCD4B14 thermal sweeping sensor from Atmel Corporation. The complete database is split into two subsets, A and B. Subset A is composed of 100 fingers (800 images) and subset B of 10 fingers (80 images). Subset B is used first to adjust some of the parameters of the algorithm to the properties of the fingerprint images acquired with the selected sensor; once the algorithm is properly tuned, subset A is used to verify the real performance of the application.

The performance evaluation procedure follows the same criteria as the FVC contests:

(i) In order to obtain the impostor distribution, one sample of each finger in subset A is taken. A total of 100 images are used, and each image is matched against the others to compute the False Match Rate (FMR), or False Acceptance Rate (FAR), distribution.
If g is matched against h, the symmetric match (i.e., h against g) is not executed in order to avoid correlation. A total of 4950 matches are carried out.

(ii) In order to obtain the genuine distribution, each sample of a finger is matched against the other samples of the same finger. As in the impostor case, if g is matched against h, the symmetric match (i.e., h against g) is not executed in order to avoid correlation. The number of genuine tests amounts to 2800, and from them the False Non-match Rate (FNMR), or False Rejection Rate (FRR), distribution is computed.

Figure 57. Genuine and Impostor distributions. [Plot: % population (0 to 12) versus threshold t (0 to 1); curves G(t) and I(t).]

Given one template and one query fingerprint, the recognition algorithm provides a similarity score between both images within [0,1]. Similar images, understood as images belonging to the same finger, have scores close to 1, while dissimilar images, understood as images from different fingers, have scores close to 0. After performance evaluation with subset A, the algorithm features an EER = 4.162%. The Genuine and Impostor distributions, G(t) and I(t); the performance indicator rates FMR and FNMR as a function of the similarity threshold score t, FMR(t) and FNMR(t); and the Receiver Operating Characteristic (ROC) curve of the tested algorithm are shown in Figure 57, Figure 58 and Figure 59 respectively.

Figure 58. False Match and False Non-match distributions. [Plot: % population (0 to 100) versus threshold t (0 to 1); curves FMR(t) and FNMR(t).]
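The match counts quoted in the protocol above follow directly from counting unordered pairs: 100·99/2 = 4950 impostor matches, and 100 fingers times 8·7/2 = 28 pairs each, that is, 2800 genuine matches. The sketch below reproduces that arithmetic and shows one simple way to estimate the EER from two score lists. It assumes that a match is declared when the score is greater than or equal to the threshold (as in Figure 56) and that the EER is approximated on a uniform threshold grid; the function names and the grid search are illustrative, not the evaluation code used in the thesis.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Number of unordered impostor pairs when one sample per finger is used. */
long impostor_pairs(long fingers)
{
    return fingers * (fingers - 1) / 2;
}

/* Number of unordered genuine pairs over all fingers. */
long genuine_pairs(long fingers, long samples_per_finger)
{
    return fingers * (samples_per_finger * (samples_per_finger - 1) / 2);
}

/* FMR(t): fraction of impostor scores accepted at threshold t. */
double fmr(const double *impostor, size_t n, double t)
{
    size_t accepted = 0;
    for (size_t i = 0; i < n; ++i)
        if (impostor[i] >= t)
            ++accepted;
    return n ? (double)accepted / (double)n : 0.0;
}

/* FNMR(t): fraction of genuine scores rejected at threshold t. */
double fnmr(const double *genuine, size_t n, double t)
{
    size_t rejected = 0;
    for (size_t i = 0; i < n; ++i)
        if (genuine[i] < t)
            ++rejected;
    return n ? (double)rejected / (double)n : 0.0;
}

/* EER estimate: scan steps + 1 thresholds in [0, 1] and report the mean
 * of FMR and FNMR where the two rates are closest to each other. */
double estimate_eer(const double *genuine, size_t ng,
                    const double *impostor, size_t ni, int steps)
{
    double best_gap = 2.0, eer = 1.0;
    assert(steps > 0);
    for (int i = 0; i <= steps; ++i) {
        double t = (double)i / (double)steps;
        double a = fmr(impostor, ni, t);
        double b = fnmr(genuine, ng, t);
        double gap = fabs(a - b);
        if (gap < best_gap) {
            best_gap = gap;
            eer = 0.5 * (a + b);
        }
    }
    return eer;
}
```

With real score lists of 2800 genuine and 4950 impostor values, a finer grid (or the sorted scores themselves as candidate thresholds) gives a closer approximation of the crossing point between FMR(t) and FNMR(t).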
The EER is the main indicator used to evaluate the performance of recognition algorithms in the FVC contests. Compared with the results published in FVC2004 for the same database, the proposed algorithm would rank 17th among the 41 participants in the open category, where the winning algorithm achieved an EER = 1.18% and the last-ranked algorithm an EER = 43.95%; and 5th among the 26 participants in the light category, where the winning algorithm achieved an EER = 2.92% and the last-ranked algorithm an EER = 54.28%, as indicated in Table 48 and Table 49 respectively.

Figure 59. Receiver Operating Characteristic curve. [Log-log plot of FNMR versus FMR, both axes from 1.0E-03 to 1.0E+00, with the EER line.]

Position     EER      FMR100   FMR1000  ZeroFMR  REJENROLL  REJMATCH
1st          1.18%    1.68%    2.68%    4.89%    0.00%      0.00%
2nd          1.20%    1.32%    3.11%    3.79%    0.00%      0.00%
3rd          1.64%    1.86%    4.96%    9.89%    0.00%      0.04%
4th          1.78%    2.32%    5.64%    99.61%   0.00%      0.00%
5th          1.85%    1.89%    50.79%   53.96%   0.00%      0.00%
6th          1.89%    2.61%    7.14%    8.75%    0.00%      0.00%
7th          2.05%    3.11%    7.68%    9.57%    0.00%      0.00%
8th          2.39%    3.29%    6.50%    8.25%    0.00%      0.00%
9th          2.70%    4.04%    8.36%    14.04%   0.00%      0.00%
10th         3.39%    6.64%    11.75%   19.57%   0.00%      0.00%
11th         3.54%    4.89%    7.39%    7.96%    0.00%      0.00%
12th         3.74%    6.46%    9.93%    11.68%   0.00%      0.00%
13th         3.82%    7.00%    14.04%   18.07%   0.00%      0.00%
14th         3.86%    7.00%    11.96%   19.21%   0.00%      0.00%
15th         4.03%    7.43%    12.64%   14.32%   0.00%      0.00%
16th         4.16%    6.36%    8.79%    15.43%   0.00%      0.00%
(proposed)   4.162%   29.85%   50.12%   61.86%   0.00%      0.00%
17th         4.19%    6.07%    9.25%    11.11%   0.00%      0.00%
18th         4.65%    9.89%    17.04%   23.89%   0.00%      0.00%
19th         5.38%    8.39%    11.46%   14.04%   0.00%      0.00%
20th         5.94%    8.71%    13.86%   14.93%   0.00%      0.00%
21st         5.97%    12.00%   20.54%   25.89%   0.00%      0.00%
22nd         6.15%    17.07%   35.07%   54.21%   0.00%      0.00%
23rd         6.77%    12.07%   19.11%   26.36%   0.00%      0.00%
24th         6.97%    11.64%   17.89%   21.96%   0.00%      0.15%
25th         7.07%    15.71%   25.46%   28.14%   0.00%      0.12%
26th         7.07%    15.79%   27.46%   50.93%   0.00%      0.00%
27th         7.11%    12.21%   19.96%   25.21%   0.14%      0.18%
28th         7.30%    14.32%   22.32%   40.25%   0.00%      0.00%
29th         7.56%    19.57%   22.71%   29.96%   0.00%      0.00%
30th         8.22%    14.96%   22.46%   31.96%   0.14%      0.09%
31st         8.73%    16.68%   26.79%   41.00%   0.00%      0.00%
32nd         9.46%    17.82%   26.79%   32.29%   0.00%      0.00%
33rd         9.60%    17.25%   31.68%   32.25%   0.00%      0.00%
34th         10.07%   15.46%   17.86%   21.96%   0.00%      0.00%
35th         11.41%   22.75%   22.75%   26.46%   0.00%      72.05%
36th         11.46%   23.54%   35.57%   60.46%   0.00%      0.00%
37th         11.64%   23.14%   24.11%   26.64%   0.00%      72.14%
38th         14.39%   24.96%   27.25%   28.86%   0.00%      32.90%
39th         19.74%   47.25%   60.96%   63.46%   0.00%      0.00%
40th         37.92%   92.32%   97.71%   99.68%   0.00%      0.00%
41st         43.95%   89.54%   96.04%   99.50%   0.00%      0.00%

Table 48. Performance comparison of the proposed algorithm against FVC2004 DB3 Open Category contest results.
Position     EER      FMR100   FMR1000  ZeroFMR  REJENROLL  REJMATCH
1st          2.92%    4.43%    9.11%    11.04%   0.00%      0.00%
2nd          3.21%    4.68%    8.50%    14.39%   0.00%      0.00%
3rd          3.53%    4.64%    6.32%    8.71%    0.00%      0.00%
4th          3.57%    4.86%    6.43%    8.14%    0.00%      0.00%
(proposed)   4.162%   29.85%   50.12%   61.86%   0.00%      0.00%
5th          4.24%    7.04%    11.11%   14.29%   0.00%      0.00%
6th          4.40%    8.07%    12.39%   18.14%   0.00%      0.00%
7th          4.69%    7.43%    11.25%   21.11%   0.71%      0.53%
8th          5.07%    10.39%   17.04%   19.25%   0.00%      0.00%
9th          5.13%    7.32%    10.93%   13.14%   0.00%      0.00%
10th         5.38%    8.39%    11.46%   14.04%   0.00%      0.00%
11th         6.14%    10.04%   18.46%   22.93%   0.00%      0.00%
12th         6.93%    11.43%   14.46%   20.61%   0.00%      0.00%
13th         7.28%    13.07%   20.64%   24.36%   0.14%      0.18%
14th         7.56%    19.57%   22.71%   29.96%   0.00%      0.00%
15th         7.59%    11.86%   15.32%   18.71%   0.00%      0.00%
16th         8.20%    16.25%   26.68%   42.79%   0.14%      0.09%
17th         10.45%   29.04%   53.50%   68.36%   0.00%      0.00%
18th         11.46%   23.54%   35.57%   60.46%   0.00%      0.00%
19th         13.57%   25.46%   32.61%   41.43%   0.14%      0.09%
20th         50.00%   96.25%   96.36%   97.89%   0.00%      92.08%
21st         50.00%   100.00%  100.00%  100.00%  100.00%    100.00%
22nd         50.00%   70.32%   71.07%   71.64%   67.71%     65.34%
23rd         50.00%   100.00%  100.00%  100.00%  0.14%      100.00%
24th         50.00%   100.00%  100.00%  100.00%  100.00%    100.00%
25th         50.00%   100.00%  100.00%  100.00%  100.00%    100.00%
26th         54.28%   59.29%   64.00%   70.39%   0.57%      43.72%

Table 49. Performance comparison of the proposed algorithm against FVC2004 DB3 Light Category contest results.

Other indicators used to evaluate the performance of the algorithms in the FVC contests, also shown in Table 48 and Table 49, are the following:
a) FMR100: the lowest FNMR for FMR <= 1%.
b) FMR1000: the lowest FNMR for FMR <= 0.1%.
c) ZeroFMR: the lowest FNMR for FMR = 0%.
d) REJENROLL: fingerprints rejected during the enrolment process (reported in the tables as a percentage).
e) REJMATCH: fingerprints rejected during the authentication or matching process (reported in the tables as a percentage).
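Indicators a) to c) share one definition: the lowest FNMR reachable while the FMR stays at or below a limit (1%, 0.1% or 0%). A minimal sketch of that computation is given below, under the same assumptions as before: a match is declared when the score is greater than or equal to the threshold, and candidate thresholds lie on a uniform grid in [0, 1]. The function name `lowest_fnmr_at` and the grid are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of scores in s[0..n-1] that are >= t. */
static double fraction_at_or_above(const double *s, size_t n, double t)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (s[i] >= t)
            ++count;
    return n ? (double)count / (double)n : 0.0;
}

/* Lowest FNMR over all grid thresholds whose FMR does not exceed max_fmr.
 * max_fmr = 0.01 gives FMR100, 0.001 gives FMR1000 and 0.0 gives ZeroFMR.
 * Returns 1.0 if no threshold on the grid satisfies the FMR limit. */
double lowest_fnmr_at(const double *genuine, size_t ng,
                      const double *impostor, size_t ni,
                      double max_fmr, int steps)
{
    double best = 1.0;
    assert(steps > 0);
    for (int i = 0; i <= steps; ++i) {
        double t = (double)i / (double)steps;
        double fmr_t = fraction_at_or_above(impostor, ni, t);
        double fnmr_t = 1.0 - fraction_at_or_above(genuine, ng, t);
        if (fmr_t <= max_fmr && fnmr_t < best)
            best = fnmr_t;
    }
    return best;
}
```

Because the FMR limit tightens from FMR100 to ZeroFMR, the three values can only stay equal or grow in that order, which is the pattern visible in each row of Table 48 and Table 49.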
The first implementation of the recognition algorithm uses floating-point operands in order to be as accurate as possible in the different computations carried out along the recognition process (trigonometric computing, square root calculation, statistical parameters such as variance and standard deviation, etc.). After proving the validity of the proposed algorithm, a new version is developed by replacing those floating-point operations with fixed-point operations, in order to reduce the complexity of the processing and the computational demands on the physical platforms where the AFAS application is finally implemented. A new performance evaluation loop of the modified version yields very similar results (the EER changes from 4.162% to 4.242%). Therefore, the new version of the algorithm is also accepted and used in the next stage as the reference to be implemented on low-cost and low-performance microprocessor units without an FPU in embedded system platforms.

5.2.3. Processing Speed Evaluation under Software-based Platforms

Nowadays, most applications that exploit biometrics-based personal recognition demand a fast response time from the physical systems in charge of the processing. In the case of fingerprint-based authentication systems, soft real-time performance is normally required. In this specific context, soft real-time is understood as providing the proper recognition response within a reaction time short enough to go unnoticed by the user. This reaction time covers the interval from the moment the user presents his identity credentials to the system and puts his finger on the sensing surface of the scanner device until the moment the automatic authentication system provides the result of the verification process.
Reaction times in the range between 1.5 s and 3.5 s are usually accepted as normal and valid authentication response times for any AFAS application. Therefore, this section focuses on the evaluation of the execution time performance of the proposed fingerprint recognition algorithm when implemented on different computational platforms.

In order to perform a fair comparison among platforms, the same template and query fingerprints have to be used in all scenarios. Among the different images of the FVC2004 DB3 database, two fingerprint impressions taken from the same finger have been selected as template and query fingerprints respectively, so that representative enrolment and authentication processes can be built and used as reference for evaluation purposes. These two greyscale images, of size 268×460 pixels, with 8 bits per pixel and a resolution of 500 dpi, are used as reference so that the same processing effort is compared in all scenarios.

Different computational platforms addressing the execution of software-based applications have been selected for processing speed evaluation purposes. The scope ranges from high-cost and high-performance personal computer platforms to low-cost and mid-performance embedded system platforms. One personal computer and three embedded system platforms have been evaluated, as indicated in Table 50. The evaluation procedure makes it possible to point out the advantages and disadvantages in timing performance of each of the suggested architectures.
| Technical Parameters | Personal Computer Platform | Embedded System Platform 1 | Embedded System Platform 2 | Embedded System Platform 3 |
|---|---|---|---|---|
| Platform | Acer Aspire 9420 | Altera Excalibur EPXA10 | Xilinx Spartan 3AN | Xilinx Virtex4 ML401 |
| Family | MPU Intel Core 2 Duo | SOPC EPXA10F1020C1 | FPGA XC3S700AN | FPGA XC4VLX25 |
| Processor | Intel Core 2 Duo T5600 | ARM922T | MicroBlaze | MicroBlaze |
| Processor data bus | 64 bits | 32 bits | 32 bits | 32 bits |
| Number of cores | 2 | 1 | 1 | 1 |
| Type of core | Hard-core | Hard-core | Soft-core | Soft-core |
| Technology | 65 nm | 180 nm | 90 nm | 90 nm |
| Clock speed | 1.83 GHz | 200 MHz | 66.667 MHz | 100 MHz |
| Bus speed | 667 MHz | 200/100 MHz | 133.333/66.667 MHz | 200/100 MHz |
| Cache | 2 MB L2 | 8 kB Inst. Cache | 8 kB Inst. Cache, 8 kB Data Cache | 32 kB Inst. Cache, 64 kB Data Cache |
| Operating system | Windows XP | - | - | - |
| AFAS program code | DDR2 SDRAM (2 GB) | SOPC SRAM (256 kB) | DDR2 SDRAM (64 MB) | DDR SDRAM (64 MB) |
| AFAS application data | DDR2 SDRAM (2 GB) | DDR SDRAM (128 MB) | DDR2 SDRAM (64 MB) | DDR SDRAM (64 MB) |
| SDRAM/SRAM data bus | 64 bits | 32 bits | 16 bits | 32 bits |
| SDRAM frequency | ≥ 200 MHz | 125 MHz | 133.333 MHz | 100 MHz |

Table 50. Computational platforms used in the execution time performance evaluation process.

The execution time performance reached in each of the platforms, in both enrolment and authentication stages, is presented in Table 51 and Table 52 respectively.
| Task ID | Processing Stage | Personal Computer Platform | Embedded System Platform 1 | Embedded System Platform 2 | Embedded System Platform 3 |
|---|---|---|---|---|---|
| Task 1 | Image segmentation | 2.810 ms | 1083.219 ms | 299.578 ms | 227.035 ms |
| Task 2 | Image normalization | 0.470 ms | 178.940 ms | 46.960 ms | 32.772 ms |
| Task 3 | Image isotropic filtering | 7.030 ms | 5304.010 ms | 719.703 ms | 467.329 ms |
| Task 4 | Field orientation | 2.190 ms | 834.062 ms | 344.651 ms | 244.916 ms |
| Task 5 | Filtered field orientation | 0.620 ms | 97.061 ms | 26.646 ms | 17.294 ms |
| Task 6 | Image directional filtering and binarization | 13.440 ms | 3792.712 ms | 860.133 ms | 609.518 ms |
| Task 7 | Image smoothing | 12.350 ms | 1536.114 ms | 360.012 ms | 229.732 ms |
| Task 8 | Image thinning | 1.250 ms | 1695.930 ms | 547.847 ms | 404.085 ms |
| Task 9 | Minutiae extraction and minutiae filtering | 0.630 ms | 76.626 ms | 35.404 ms | 23.982 ms |
| | Total Execution Time: | 40.790 ms | 14598.674 ms | 3240.934 ms | 2256.663 ms |

Table 51. Enrolment process execution time performance.

The enrolment process of the template fingerprint and the authentication process of the query fingerprint with the enrolled template are evaluated. The authentication execution times are longer than the enrolment times, since authentication repeats the image processing and feature extraction tasks on the query fingerprint and adds the alignment and matching tasks (Task A and Task B in Table 52). Special attention needs to be paid to the authentication stage since, unlike the enrolment stage, the authentication process is normally carried out on-line in the real application, so real-time response is usually requested. The enrolment stage tends to be less critical since it is normally carried out off-line (under the supervision of application staff to guarantee the reliable enrolment of the user in the system), so no real-time performance is usually demanded. As can be deduced from the tables, the real-time performance requested of the application is not achieved in all the scenarios.
The personal computer platform is able to meet the requested performance, but the scenarios based on low-cost and low-performance embedded processors running at low operating frequencies are far from the requested timing performance. The larger latency exhibited by embedded system platform 1 with regard to the other two embedded system platforms (2 and 3) is explained by the fact that no data cache is enabled in that scenario, which severely affects the final timing performance of the application.

| Task ID | Processing Stage | Personal Computer Platform | Embedded System Platform 1 | Embedded System Platform 2 | Embedded System Platform 3 |
|---|---|---|---|---|---|
| Task 1 | Image segmentation | 2.810 ms | 1083.219 ms | 299.578 ms | 227.035 ms |
| Task 2 | Image normalization | 0.470 ms | 178.940 ms | 46.960 ms | 32.772 ms |
| Task 3 | Image isotropic filtering | 7.030 ms | 5304.010 ms | 719.703 ms | 467.329 ms |
| Task 4 | Field orientation | 2.500 ms | 987.089 ms | 407.445 ms | 289.661 ms |
| Task 5 | Filtered field orientation | 0.620 ms | 113.959 ms | 30.987 ms | 20.171 ms |
| Task 6 | Image directional filtering and binarization | 15.940 ms | 4460.569 ms | 1014.939 ms | 720.095 ms |
| Task 7 | Image smoothing | 14.220 ms | 1752.322 ms | 412.503 ms | 261.745 ms |
| Task 8 | Image thinning | 1.410 ms | 1767.383 ms | 552.091 ms | 402.946 ms |
| Task 9 | Minutiae extraction and minutiae filtering | 0.630 ms | 93.783 ms | 45.002 ms | 29.487 ms |
| Task A | Field orientation maps alignment | 3224.530 ms | 279636.069 ms | 210269.854 ms | 138208.006 ms |
| Task B | Minutiae alignment, feature sets matching and authentication decision | 4.220 ms | 370.712 ms | 161.973 ms | 107.972 ms |
| | Total Execution Time: | 3274.380 ms | 295748.055 ms | 213961.035 ms | 140767.219 ms |

Table 52. Authentication process execution time performance.

On the one hand, although the powerful processor embedded in the personal computer platform is able to reach the requested performance, its cost is excessive for pervasive consumer applications demanding biometric recognition.
On the other hand, although the embedded system platforms tested in this work meet the system costs targeted in the consumer applications arena, their execution time performance is clearly insufficient. Therefore, alternative system architectures able to meet both key requirements, high performance and low cost, need to be found.

5.2.4. Physical Platform: High-Performance and Low-Cost Driven Design

The successful spread of products and services that exploit the advantages of fingerprint biometrics in both public and private sectors depends on several factors today. Although the universality, distinctiveness and permanence of human fingerprints are proven facts that make them reliable signs of identity, the acceptance of automated fingerprint-based personal recognition systems, focused on either identification or authentication purposes, is constrained by social and technical factors. Among the social factors, the most important ones are the security (protection) and privacy (integrity) concerns related to the user’s information. Among the technical factors, the most relevant ones are the recognition accuracy exhibited by the system and the high cost of the physical platforms in charge of the processing.

Privacy concerns

Fingerprint authentication systems rely on “the fingers as passwords” for human recognition purposes. They aim at providing a fast and foolproof answer to the question “Are you the person you claim to be?”. One of the advantages of fingerprint biometrics is that it bases recognition on an intrinsic aspect of a human being. However, those intrinsic human traits, which come from one or several fingertips, need to be intentionally provided to some computational systems during the enrolment and authentication stages.
Unlike other personal recognition systems based on physical tokens or knowledge-based tokens, personal recognition systems based on biometric tokens disclose consistent and accurate information about the user’s identity to the system. The usage of that inherent information, in the form of biometric identifiers, by third parties without the individual’s knowledge or consent would imply, to some extent, an invasion of the user’s privacy.

On the one hand, this raises some concerns with regard to the privacy rights of the users of those recognition systems. Users want to have some control over who can sense their personal traits, and they also want to have control over the intrinsic information provided to the recognition systems. Prior to making use of those systems, potential users have an interest in determining how, when, why and to whom information about themselves will be disclosed. Because technology normally moves faster than the law, the first biometrics-based recognition systems lacked any legitimate expectation of privacy for the sensitive information that the user voluntarily provides to the recognition system. Nowadays, however, biometrics privacy legislation is needed, not only for public or governmental applications that make general use of biometrics but also for the private and industrial sectors that exploit biometric systems worldwide. Adopting privacy protections which guarantee that recognition systems make proper use of the sensitive information provided by the users, and that such information is not disclosed to third parties, is strongly encouraged in order to permit the acceptance of this technology by society.

On the other hand, biometric systems can also be seen as a method to improve privacy when they are deployed as a security safeguard to prevent consumer fraud in either identity management (retirement reimbursement, ATM transactions, tax payments, etc.)
or information access (medical information, financial accounts, etc.) applications. Although absolute security does not exist, biometrics-based systems make identity theft more difficult, so the user’s privacy is improved to some extent.

Security concerns

Personal recognition systems that are based on something other than biometric traits are not always secure. Systems based on physical ID tokens, that is, things that you must physically possess to prove your identity, feature a weak security level because those tokens can be lost, duplicated, stolen or forgotten. Systems based on things that you must know, such as passwords, secret codes or PINs, are also compromised by the fact that those knowledge-based ID tokens can easily be forgotten, shared or observed. Biometric technologies are not susceptible to those issues because biometrics relies on things you are. In this sense, fingerprint-based recognition requires the person to be present at the point of authentication. Inherently, biometrics-based ID tokens in general, and fingerprints in particular, have a substantially higher information content than PINs or passwords. This improves the security level of those recognition systems with regard to traditional (non-biometrics-based) systems. However, other security concerns arise in connection with fingerprint recognition systems:

(i) Latent fingerprints are unintentionally left on the surfaces you touch with your fingers because of the grease present on the skin. Therefore, it is possible to retrieve a copy of your fingerprints in order to try to forge your identity in any application by means of latent fingerprints recovered on paper, or by the physical reproduction of fake fingers made of silicone, gelatine, glue, wax, etc.
Some countermeasures to protect the systems against such fraudulent attacks consist in developing systems able to detect the liveness of the fingers submitted to the acquisition process.

(ii) Other fraudulent attacks are based on intercepting the digital information delivered by the fingerprint sensor to the processing system, or the information exchanged between the different modules of the system (fingerprint acquisition unit, feature extractor, feature matcher, database, host, etc.), or on directly retrieving the template information from the system database. Some countermeasures are based on developing cryptographic systems to provide encrypted data through the internal (recognition system) and external (user-host) communication channels. If fingerprint data are encrypted and the hacker who attempts to attack the system cannot decrypt them, those data are not usable. In addition, for security reasons, most personal recognition systems based on fingerprints do not store complete fingerprint images, but only the sets of distinctive features (the template) extracted from them. In this way, in case of fraudulent template retrieval, it is more difficult to deduce the fingerprint from the template, so security is improved.

(iii) A legitimate user can be coerced by the hacker to hand over his fingerprint data, or to be directly authenticated by the system so that the hacker gains fraudulent and illegitimate access. Also, the administrator of the recognition system can be coerced by the hacker to hand over the templates of one or all users, or to enrol the hacker as an authorized user. No specific countermeasure exists to prevent these attacks, which also exist in automatic recognition systems based on physical or knowledge-based tokens.
The trends in personal authentication point to the design of compact electronic systems that integrate the fingerprint sensor, the processing unit and volatile and non-volatile memory blocks, resulting in a closed system able to perform all the stages involved in the authentication process. In this direction, embedded systems based on smart cards, system-on-chip or stand-alone devices have been developed. They store the individualized biometric template of the user and perform the acquisition, feature extraction and matching processes internally in the device (on-chip), so the template does not leave the device, there is no need to access external databases, and the template is not exposed to attacks in the communication channels. Thus, some critical security and privacy concerns can be overcome.

The security of the entire system is only as good as its weakest link. Although some attacks can fool an automatic fingerprint recognition system, some of them could not fool a human expert since the human mind is extremely complex. Therefore, the automatic system needs to be turned into a more complex engine, closer to the human mind, that takes advantage of the processing power of today’s computational platforms to increase security. In summary, improvements in security imply additional complexity and processing power for the physical platforms in charge of the processing.

User convenience concerns

Another point of criticism related to biometric systems refers to user convenience in the acquisition of the biometric traits. This is especially true for systems based on traits like iris, retina, DNA, etc., where some invasive actions are involved in the acquisition process.
Such a concern is strongly mitigated when dealing with fingerprints thanks to the existence of user-friendly acquisition systems based on small and low-cost fingerprint sensors able to acquire a digital impression of the user’s fingertip in a split second. The user only has to keep his finger in a specific position in the case of touchless devices, place it on the sensing surface in the case of one-touch devices, or sweep it over the sensing surface in the case of sweep sensors. Non-invasive techniques to image finger surface details or even sub-surface details (by means of sensors based on ultrasonic waves, or solid-state sensors based on electric fields) are available. Even portable embedded systems that integrate the fingerprint sensor in the form of key rings or smart cards exist, as already discussed in chapter 4, which ease the fingerprint acquisition process in many daily use applications (PDAs, USB pen drives, laptops, mobile phones, etc.).

Accuracy concerns

The limiting factors of the recognition accuracy exhibited by current AFAS applications can be classified in three groups:

a) The first and most important factor is the inherent difficulty of the matching process, caused by the large intra-class and small inter-class variabilities of human fingerprints. Complex algorithms are needed to emulate the human mind when matching fingerprints. Although the development of foolproof automated algorithms is seen as a big challenge today, meticulous and precise operations can be performed with computer systems to help reach performances similar to those achieved by teams of human experts.

b) The quality level of the fingerprint images submitted to the recognition process is the second factor.
Both natural and technical agents influence the quality of fingerprint images acquired with live-scan devices. It is estimated that about 90% of the population have “good” quality fingerprints: wet and flexible skin with well-defined ridges and valleys. The other 10% of the population, however, have “bad” quality fingerprints: dry skin and poor or worn ridges due to factors such as their job (miners, farmers, etc.) or age. The final quality of the acquired fingerprints is also affected by the environmental conditions during acquisition (temperature, light, moisture causing dry or wet fingers, etc.), by the habituation or expertise of users with the acquisition process (image deformation, irregular impressions, etc.), and by the characteristics of the fingerprint sensors themselves (robustness against external factors, technology issues, temperature effects, image contrast, image resolution, ageing effects on quality, etc.).

c) The performance of the physical platforms in charge of the processing also limits the final accuracy of the recognition application: the allowed resolution of the intermediate processed images (multi-colour, greyscale, black and white, etc.), the precision of the mathematical operations (floating-point versus fixed-point operations, rounding, etc.), and others. It is proven that imposing constraints on the system computing resources (limited execution time, system memory size, operating frequency, etc.) causes a drastic decrease in recognition accuracy with regard to the scenario in which the same input images are processed on computational platforms without such constraints.

Therefore, more research work is needed in order to improve the recognition accuracy of the algorithms in charge of matching fingerprints.
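The precision trade-off named in factor c), floating-point versus fixed-point operations, can be illustrated with a short Python sketch. It computes the variance and standard deviation of a few pixel values once in floating point and once with integer-only arithmetic. The Q16.16 format, the helper names and the sample pixel values are assumptions chosen for this example; the word lengths actually used in the fixed-point version of the algorithm are not stated in this section.

```python
import math

FRAC_BITS = 16          # assumed Q16.16 format (hypothetical choice)
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to the assumed Q16.16 integer representation."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a Q16.16 integer back to a float (for comparison only)."""
    return q / ONE


def fx_mul(a, b):
    """Multiply two Q16.16 values, truncating the extra fractional bits."""
    return (a * b) >> FRAC_BITS


def fx_sqrt(a):
    """Square root of a non-negative Q16.16 value using an integer sqrt."""
    return math.isqrt(a << FRAC_BITS)


def variance_float(pixels):
    """Reference variance computed with floating-point operands."""
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def variance_fixed(pixels):
    """Variance computed with integer operations only, result in Q16.16.

    Python integers never overflow; on a 32-bit target the accumulator
    width would have to be checked explicitly.
    """
    n = len(pixels)
    mean = (sum(pixels) << FRAC_BITS) // n
    acc = 0
    for p in pixels:
        d = (p << FRAC_BITS) - mean
        acc += fx_mul(d, d)
    return acc // n
```

Comparing `to_float(variance_fixed(pixels))` with `variance_float(pixels)` for 8-bit pixel values shows how small the truncation error of such a format is, which is in line with the small EER change reported when moving from floating-point to fixed-point operations.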
It is expected that more compute-intensive stages will be added to the existing recognition algorithms in order to improve the matching accuracy, so more processing power will be demanded from the physical platforms in charge of the processing. A good balance between the computational power demands and the system costs needs to be achieved in the design of those systems.

Cost concerns

The costs linked to the design, manufacturing, distribution and maintenance of a product or service constrain its selling price to some extent. The costs linked to an AFAS are key parameters that limit the deployment and spread of cost-effective fingerprint-based recognition applications accessible to any user worldwide. Time-to-market pressures and cost constraints are pushing embedded systems to new levels of flexibility and system integration.

At system level, higher integration means lower costs. Using fewer components to build an electronic circuit improves the reliability of the final product and minimizes its manufacturing costs. In this direction, the usage of FPGA or SOPC devices together with a reduced set of companion chips such as memory blocks and peripherals is a promising architecture with regard to system costs. It can be understood as a first step towards the ideal system-on-chip architecture, where one single device is able to host the complete application. Scheduling the AFAS application into a set of sequential and mutually exclusive tasks makes it possible to exploit run-time reconfigurable computing technologies, which also leads to cost savings in terms of physical system resources.

Flexibility at both hardware and software levels also means lower maintenance costs.
Basing the system architecture on programmable and run-time reconfigurable FPGA or SOPC devices minimizes the costs of issues that can appear once the product is released to the market, such as bug fixes, algorithm upgrades, etc.

The trends in the biometrics market point to a progressive increase, in the coming years, of the research efforts devoted to the development of biometric recognition algorithms. A gradual improvement of the FMR/FNMR accuracy indicators is thus foreseen, driven by the growing interest of society in reliable and secure personal identification/authentication systems. As a consequence of this trend, it is important to design automatic personal recognition systems that allow upgrades of their functionality without affecting aspects such as the architecture and the costs of the physical systems, which should remain invariant to functional changes of the biometric recognition algorithm once the product is released to the market. In this sense, the exploitation of embedded systems based on flexible or dynamically reconfigurable hardware provides valuable advantages, since it permits functional changes of the biometric recognition application (hardware and software content) with minor effects on the system costs. Once the physical system that supports the application is defined (type of biometric sensors to be used, size of the programmable logic device, capacity of the configuration memory, type and amount of peripherals, etc.), the maintenance tasks or application algorithm upgrades mainly consist of modifying the content of one non-volatile configuration memory.
The application upgrades can be done in the field by updating the content of the non-volatile memory that stores the configuration of the whole system (application program code and HDL descriptions of those hardware processors that need to be instantiated in the static and dynamically reconfigurable regions of the programmable logic device along the processing time), without the need to modify the physical board, provided that the new program code and hardware instances fit into the existing memory.

At device level, SOPC and FPGA vendors supply libraries of pre-verified IPs to reduce risk in the development phase, as well as automated EDA tools featuring simple and smart methodologies with integrated capabilities to debug and verify the performance of any design, in order to improve efficiency and time-to-market.

Processing speed concerns

Soft real-time performance is usually demanded of biometric applications. Therefore, in order to speed up the processing as much as possible while minimizing the processing resources required by the embedded system, hardware-software co-design methodologies along with hardware acceleration techniques based on parallelism and pipelining are the means suggested in this work to meet the execution time requirements. The AFAS application is scheduled and partitioned into a set of tasks, and each task is ported to hardware or software depending on its execution time demands. The compute-intensive tasks that would constrain the application if executed in software are identified and ported to hardware, to be executed by application-specific coprocessors. The other, less critical tasks remain as software tasks executed by the system processor. As many tasks as needed are ported to hardware in order to reach the requested soft real-time performance.
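The partitioning idea described above can be sketched as a simple greedy loop: starting from an all-software implementation, the slowest remaining task is moved to hardware until the total execution time fits the time budget. The software times below are the Embedded System Platform 3 authentication figures from Table 52, and the 3.5 s budget is the upper end of the reaction times quoted in Section 5.2.3. The uniform hardware speed-up factor and the function name are purely hypothetical assumptions made for illustration; they are not results of this work.

```python
# Software execution times (ms) of the authentication tasks on
# Embedded System Platform 3, copied from Table 52.
SW_TIMES_MS = {
    "Task 1": 227.035, "Task 2": 32.772, "Task 3": 467.329,
    "Task 4": 289.661, "Task 5": 20.171, "Task 6": 720.095,
    "Task 7": 261.745, "Task 8": 402.946, "Task 9": 29.487,
    "Task A": 138208.006, "Task B": 107.972,
}


def partition(sw_times_ms, budget_ms, assumed_speedup):
    """Greedy hardware/software partitioning sketch.

    Tasks are visited from slowest to fastest. Each visited task is
    moved to hardware, where it is assumed (hypothetically) to run
    `assumed_speedup` times faster, until the total time fits the
    budget. If the budget is never met, every task ends up in hardware
    and the returned total still exceeds the budget.
    """
    hw_tasks = []
    total = sum(sw_times_ms.values())
    by_time = sorted(sw_times_ms.items(), key=lambda kv: kv[1], reverse=True)
    for name, t in by_time:
        if total <= budget_ms:
            break
        total -= t - t / assumed_speedup   # time saved by the coprocessor
        hw_tasks.append(name)
    return hw_tasks, total


# Hypothetical speed-up of 100x per ported task, 3.5 s reaction-time budget.
hw_tasks, total_ms = partition(SW_TIMES_MS, budget_ms=3500.0, assumed_speedup=100.0)
```

Under these assumptions the loop ports the field orientation maps alignment (Task A) first, since it accounts for most of the authentication time in Table 52. A real partitioning would use a measured or estimated speed-up per coprocessor and would also weigh the logic resources each one consumes.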
Power consumption concerns

Other concerns linked to the development of AFAS applications, especially in the case of autonomous or portable embedded systems, refer to the power consumption of the physical platforms responsible for the execution of the recognition process. The continuous evolution of CMOS semiconductor technology, with smaller geometries (65 nm, 40 nm, 28 nm, 22 nm) and lower voltages (1.2 V, 1 V, 0.9 V), reduces the power consumption at component level, but additional low-power strategies are needed at system level to lower the average power consumption of the whole application.

When thinking about the physical implementation of an AFAS application, the system architecture of the computational platform in charge of the processing therefore needs to address all the previously discussed requirements: (i) high security, (ii) high accuracy, (iii) high speed, (iv) low power consumption, and (v) low cost. All these factors are barriers against the broad adoption of biometric systems worldwide. If fingerprint recognition technology continues to mature, and efficient and reliable systems able to overcome all those barriers can be designed, automated fingerprint-based recognition can have a profound influence on the way we conduct our daily business in the near future.

The development of an AFAS application under an embedded system architecture is suggested in this work.
Among the embedded system architectures studied in the scientific literature:

- Match-on-Card, where the storage of the user template and the biometric matching process are carried out on a secure platform well protected against external attacks, in the form of a smart card or another type of embedded system; and
- System-on-Device, System-on-Board, System-on-Card or System-on-Chip architectures, where all the processes involved in the recognition algorithm (fingerprint acquisition, image preprocessing, feature sets extraction, fingerprints matching, and communication of the authentication result to the external host) are embedded in a secure platform, either in the form of one single chip (System-on-Chip or System-on-Card, for integrated circuits or smart cards that embed the fingerprint sensor, respectively) or in the form of a small number of ICs on a small and highly integrated electronic board (System-on-Device or System-on-Board);

the author has focused this work on System-on-Board architectures under FPGA- or SOPC-based platforms. Both system architectures are shown in Figure 60 and Figure 61, respectively.

In the Match-on-Card architecture, once the template is stored in the smart card, it does not leave the smart card, so the template is well secured. The communication of the smart card with the host is normally done through encrypted communication protocols. The input data transferred from the host to the Match-on-Card system are the query feature sets, whereas the output datum transferred from the matcher unit to the host is at least the authentication result, also in encrypted form in the case of applications demanding high security.

[Figure 60 block labels: sensor; enrolment/authentication host; fingerprint acquisition unit; image processor unit; feature extractor unit; encryption unit; decryption unit; high-level application; user’s smart card; user’s template; feature matcher unit; decryption unit; encryption unit.]

Figure 60. Match-on-Card system architecture.
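The data flow of the Match-on-Card architecture can be summarized with a toy Python model: the card keeps the template internally, receives the query feature set from the host, and returns only the authentication result. The class name, the minutia format, the exact-match similarity measure and the threshold are all hypothetical simplifications introduced for this sketch. Encryption of the host-card channel is omitted, and a real matcher would tolerate displacement and rotation of the minutiae.

```python
class MatchOnCard:
    """Toy model of a smart card that stores a template and matches on-card.

    Features are modelled as (x, y, angle) tuples, an assumed format.
    Python name mangling only discourages external access to the
    template; a real card relies on tamper-resistant hardware.
    """

    def __init__(self, template, threshold=0.5):
        self.__template = frozenset(template)   # never returned to the host
        self.__threshold = threshold

    def authenticate(self, query_features):
        """Return only the authentication result for a query feature set."""
        query = set(query_features)
        if not query or not self.__template:
            return False
        shared = len(self.__template & query)
        score = shared / max(len(self.__template), len(query))
        return score >= self.__threshold


# Host side: the query features are extracted outside the card and sent
# to it; only a boolean comes back.
card = MatchOnCard([(10, 20, 90), (30, 40, 180), (50, 60, 270), (70, 80, 0)])
accepted = card.authenticate([(10, 20, 90), (30, 40, 180), (50, 60, 270), (5, 5, 45)])
rejected = card.authenticate([(1, 2, 3), (4, 5, 6)])
```

The only public operation is `authenticate`, which mirrors the statement that the output transferred to the host is at least the authentication result while the template stays on the card.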
The selected system architecture is shown in Figure 61. A system-on-programmable-chip device featuring most of the processing resources, additional memory devices, and some peripherals such as one fingerprint sensor and one serial communication transceiver are the basic components embedded in a platform able to compute all the processing stages of the recognition algorithm. Only one input-output interface with the external world, based on the live-scan fingerprint sensor and the encrypted communication channel with the host, is established in the proposed system.

Nowadays, some limitations exist with regard to the maximum volatile and non-volatile memory that can be embedded in the system-on-chip devices available in the market, as well as to the on-chip integration of the fingerprint sensor. Therefore, the experiments carried out in this work follow the cited System-on-Board architecture as an alternative to the ideal biometric System-on-Card or System-on-Chip architectures, which are not fully accessible to modest developers.

In both scenarios, Match-on-Card and System-on-Board architectures, the decentralization of the users’ database in the form of individualized smart cards or customized embedded devices that store the template of the user reduces risk and offers personalization, since the user holds his own fingerprint template in tamper-proof hardware instead of it being kept in a centralized database.

[Figure 61 block labels: input: user’s finger; sensor; embedded system; fingerprint acquisition unit; image processor unit; feature extractor unit; user’s template; feature matcher unit; encryption unit; output: authentication result.]

Figure 61. Authentication-on-Board embedded system architecture.
The experimental tests are carried out on commercial evaluation platforms based on FPGA or SOPC devices interfaced with volatile and non-volatile memory chips, one fingerprint sensor, and transceiver ICs used to establish a bidirectional communication link with an external PC acting as the application host. Devices from the two market leaders in the FPGA industry, Altera Corporation and Xilinx Inc., are the main focus of this study. Starter kits or development boards from both vendors are used to evaluate the performance of the suggested system architecture in different scenarios. In all of them, the embedded system architecture relies on five key factors to meet the challenging demands:

a) General-purpose microprocessor system
b) Programmable logic device embedded in the system
c) Hardware-software co-design techniques
d) Run-time reconfigurable FPGA
e) System-on-programmable-chip platform

General-purpose microprocessor system

As in most embedded systems in the market today, the use of low-cost, mid-performance microprocessors (32-bit, with operating frequencies in the range of 200-800 MHz) provides the flexibility required in any application. Software-based solutions have additional advantages, such as rapid application development through libraries of application-specific functions, which makes it unnecessary to write the application software from scratch and provides a cost-effective solution. However, in applications demanding high computational power and real-time performance, developing the entire application on purely software platforms, based on either a single processor or multicore processor systems, is constrained by the inherent limitations of those standard products: working frequency, restricted datapath, shared resources, sequential workflow execution, and reduced parallelism.
Programmable logic device embedded in the system

When purely software-based systems are not enough to meet the real-time performance expected in an application, the use of hardware-based accelerator devices as complementary processor units has proven to be an efficient solution. Programmable logic devices such as FPGAs are much more flexible than semi-custom or custom devices like ASSPs or ASICs. ASSPs and ASICs have a fixed peripheral set that limits the number of applications in which they can be used efficiently, whereas FPGAs allow implementing custom peripherals and made-to-measure glue logic tailored to the requirements of any application. The continuous improvements in the semiconductor field reduce the cost of FPGAs, making them more and more competitive. The flexibility offered by FPGAs eliminates the long design cycle associated with ASICs, and the use of IP libraries written in standard hardware description languages, together with automated design and verification tools, reduces the development cycle of applications based on programmable logic devices. Programmable logic therefore becomes a valid alternative to be studied in this work instead of other solutions based on multiprocessor systems or GPUs.

Hardware-software co-design techniques

The use of a general-purpose MPU with an FPGA as a companion chip offers a much greater degree of flexibility and allows the development of any application by means of hardware-software co-design techniques. The same applies when a soft-core processor synthesized in the FPGA acts as the system MPU. This system architecture gives flexibility at two levels: at software level, with the MPU-based execution of the application; and at hardware level, with the design of modular cores synthesized in the FPGA.
Introducing one FPGA in the system as a general-purpose device in which to instantiate the application-specific hardware coprocessors required to speed up critical tasks allows implementing an adaptive and highly integrated multiprocessor system oriented to the development of real-time applications. Apart from the inherent flexibility of the MPU, the programmable logic device provides additional flexibility. In the FPGA it is possible to instantiate either additional microprocessors (e.g. VHDL instances of soft-cores) or made-to-measure VLSI hardware accelerators in charge of specific tasks, with the aim of offloading the algorithm-intensive operations from the MPU. With an improved bandwidth among the MPU, the FPGA, the memory resources and the rest of the peripherals available in the embedded system, soft and hard real-time applications can be successfully developed.

[Figure 62 diagram: the application is a chain of sequential stages, Stage 1 to Stage n, in which the output of each stage is the input of the next (I = i1, i2 = o1, ..., O = on); stage i is expanded into tasks that are executed either sequentially or in parallel.]
Figure 62. Development of one application as a set of sequential stages, and partitioning of each of the stages into hardware and software tasks that can be executed either sequentially or in parallel.

Run-time reconfigurable FPGA

The FPGA device embedded in the system allows exploiting the parallelism and acceleration features of programmable logic design, so it is possible to meet real-time performance by spreading the functionality across the different core resources available in the system (MPU and parallel hardware coprocessors), as depicted in Figure 62. However, the resources available in the FPGA are limited, and the cost of the device increases with the amount of resources. Therefore, it is convenient to reduce the size of the FPGA in the design to reach an affordable cost for the complete system. Because the proposed biometrics-based personal recognition applications are sequential in nature (the personal recognition algorithm consists of a set of mutually exclusive image processing tasks executed one after the other), it is possible to exploit the reconfigurability available in some FPGA devices in order to minimize the hardware needs of the system. Flexible and dynamically reconfigurable hardware allows a more efficient use of the system resources by having hardware present in the FPGA only when it is in use. Thus, given a fixed size for the FPGA, it is possible to instantiate specific coprocessors at a given time and to remove them after they have been used, so that further coprocessors can be instantiated on the same FPGA resources in the following stages of the application, as depicted in Figure 63. This technique reduces the overall hardware size of the system at almost no cost other than the FPGA reconfiguration overhead.

Figure 63. Comparison of static FPGA-based design concept (left side) and run-time reconfigurable-FPGA-based design concept (right side).

In this work, the use of FPGAs with dynamic reconfigurability provides the proper balance to minimize cost while delivering the processing power required by the application.

System-on-programmable-chip platform

The use of a general-purpose MPU together with programmable and reconfigurable logic gives a high level of flexibility to the system and provides the mechanisms to achieve real-time performance. However, higher integration means lower costs. Therefore, the integration of those main resources and other key peripherals such as memory, timers, interrupt controllers, etc.
on a single chip provides an efficient way of optimizing the whole system cost. Embedded biometric recognition is therefore possible by making use of highly integrated platforms. Additional benefits of system integration are the improvements in reliability and security. For this reason, the use of SOPC devices or system-on-chips embedding an FPGA is especially encouraged in the experimental tests carried out in this work.

The suggested system architecture is depicted in Figure 64. At least one run-time reconfigurable region is present in the programmable logic device to synthesize the flexible application-specific hardware coprocessors that can be dynamically instantiated on demand during the execution of the application. One specific reconfiguration controller is in charge of this task, supervised by the master processing unit.

There are endless uses for embedded systems based on SOPC or FPGA devices in the consumer, military, aerospace, automotive, communications, and industrial markets worldwide. As computer technology continues to advance and economies of scale reduce costs, fingerprint biometrics based on the suggested topology can become a more efficient and cost-effective means of personal verification in both the public and private sectors. Although multibiometric systems that combine multiple biometric traits, or multifactor systems that combine different modalities (several biometric traits with passwords and/or physical tokens), provide greater recognition accuracy, security and privacy protection than fingerprint technology alone, the optimization of each single biometric technology is always welcome. In the next sections, the physical implementation of each of the stages of the proposed fingerprint recognition algorithm is discussed in detail.

[Figure 64 block labels: processing unit with data and instruction caches, interrupt controller, timer controller, memory controller, communications controller, static coprocessors, reconfigurable region with dynamic coprocessors, and reconfiguration controller, all attached to the system bus; external volatile memory, non-volatile memory, fingerprint sensor, application user interface, and host, network or other peripherals.]
Figure 64. Embedded system architecture suggested in this work for the physical implementation of the AFAS application.

5.3. Fingerprint Acquisition Approach

The fingerprint acquisition stage is the first step of any personal recognition process. It provides the inputs to the verification system in both the enrolment and authentication phases. Automated personal recognition systems need electronic fingerprint sensors instead of alternatives such as manual acquisition through ink-and-paper techniques. The fingerprint acquisition process, at least in the authentication stage, is performed on-line through the voluntary action of the user, who places one or several fingers on the sensing surface of the fingerprint scanner. The acquisition stage starts when the recognition system detects the presence of the finger(s) on the sensing surface, and ends once a digital impression of the user's fingerprint(s) is properly stored in the recognition system. Real-time acquisition performance is demanded of the recognition system.

The fingerprint acquisition process is normally handled by a processing unit in charge of driving the fingerprint sensor. The sensor usually comprises an array of sensing elements with analog-to-digital conversion and memory capability to image the fingerprint via different technologies. The fingerprint sensor is provided with a specific I/O interface to be driven externally.
The sensor is able to deliver a digital impression of the user's fingerprint through this I/O interface, which is either serial or parallel. Depending on the characteristics of the fingerprint sensor, either greyscale or binary representations of the acquired impressions are provided. Typical pixel depths are 1 bit (black and white), 4 bits (16 grey levels) or 8 bits (256 grey levels) per pixel. Ridges in fingerprint impressions are represented as dark pixels in either greyscale or binary images, and valleys are represented as clear or white pixels.

5.3.1. Selection of the Sensing Technique

As presented in chapters 2 and 4, many technologies and many sensors exist to acquire fingerprint impressions automatically. The selection of the sensing technology and the fingerprint sensor device is driven by the requirements of the final application. Multiple factors need to be taken into consideration to select the proper sensor for each application: the sensing technique (one-touch, sweeping or touchless), the image resolution, the available communication interface with an external processor, acquisition performance aspects such as image quality or image contrast, durability and ergonomics, the operating supply range, the physical size and package, and the cost.

This work is focused on the development of personal authentication systems in the form of embedded system architectures mainly oriented to large-volume markets in the consumer electronics arena (PDAs, mobile phones, access control systems, smart cards, keyless entry systems, e-commerce, etc.). Therefore, when selecting the fingerprint sensor device to be used in the application, apart from technical features like image quality, durability or ergonomics, other factors such as device size, integration capabilities and cost have to be considered.
For these reasons, solid-state sensors have been selected in this work instead of other available technologies such as ultrasound or optical devices. In general, independently of the technology used, the larger the sensing area, the greater the device cost. This is especially true for silicon-based sensors, where the area of the device is extremely important because silicon manufacturing is a model of collective fabrication and, as a rule of thumb, the cost is proportional to the silicon area. The smaller the area of each individual die, the more dies fit on a silicon wafer and the higher the yield, so the lower the cost per die.

Figure 65. In the semiconductor industry, the size of the chip drives its cost. Design of a chip of big dimensions (36 chips per wafer, left side), design of a chip of smaller dimensions (164 chips per wafer, central side), and real wafer photo (right side).

The sensing surface of sweeping sensors is dramatically smaller than that of one-touch sensors, as depicted in Figure 65. Therefore, solid-state sensors based on sweeping techniques have been selected as the proper option for the proposed embedded system. Apart from the cost optimization, they offer other advantages over one-touch silicon sensors:

- Sweeping sensors eliminate the risk of latent fingerprints. No latent fingerprint image can remain on the sensing surface as a result of the oil residue of the skin after a scan, as happens with one-touch sensors. The sweeping action leaves no more than a slice-sized residue, which improves the security of the system against fraudulent attacks.
- Their small size allows fitting them into small devices such as mobile phones, PDAs, smart cards, etc., and their lower power consumption is an added value in portable devices.
However, the sweeping technique requires one additional processing stage compared with one-touch sensors: the reconstruction of the fingerprint image from the set of acquired slices. To acquire the fingerprint, the user needs to sweep his finger vertically over the sensing window at a fairly constant speed. The sweeping sensor then captures a set of successive image frames as the finger is swept, and transmits that information to the processing unit in charge of the reconstruction process. In order to have acquisition times similar to those of one-touch sensors, the reconstruction process is normally performed on-the-fly, in parallel with the slice acquisition process. In this way, once the last slice of the fingerprint is acquired, the reconstructed image is already stored in the system memory, and the next processing stages (image pre-conditioning and image enhancement tasks) can follow.

Among the different fingerprint sweeping sensors available in the market, the FingerChip FCD4B14 device, which is based on the thermal effect, has been selected in this work. Its main features are shown in Table 53 and Figure 66:

Supplier: Atmel Corporation
Commercial Part Number: FCD4B14
Sensing technology: Thermal sensor
Sensing technique: Sweeping
Pixel size: 50 µm x 50 µm
Image resolution: 500 dpi
Sensing window: 0.4 mm x 14 mm
Sensing area: 8 x 280 pixels
Dynamic range: 4 bits/pixel
Acquisition performance: Adjustable image contrast
Interface: 8-bit parallel interface (2 pixels)
Maximum transfer speed: 2 MHz, up to 1780 frames per second
Durability: 1 million finger sweeps
Operating temperature range: 0ºC to 70ºC
Operating supply range: 3.0V to 3.6V, 4.5V to 5.5V
Power consumption: 20mW at 3.3V, 1MHz, 25ºC
Automatic fingerprint detection: No
Encryption: No
Die size: 1.7 mm x 17.3 mm
Package: COB (chip-on-board) or DIP20

Table 53. FCD4B14 fingerprint sensor characteristics.

Figure 66. FCD4B14 fingerprint sensor.

The FCD4B14 has a linear shape. It is able to acquire and deliver externally, through its I/O interface, a programmable number of sub-images (or slices) of 8×280 pixels per second. The programmable acquisition and transmission speed is driven by an input clock signal of up to 2.0 MHz. The transmission bus is 8 bits wide and each pixel is coded in 4 bits, so 2 pixels are transmitted in one single clock pulse.

[Figure 67 diagram: sensor signals VCC, CLK, ACK, DATA 7-0, /OE and RST. The 8×280 pixel array is numbered by columns: Pixel 1 to Pixel 8 run from top to bottom of the first column, Pixel 9 starts the second column, and Pixel 2233 to Pixel 2240 form the last column. In the timing diagram, clocks 1 to 1120 carry Pixel 1, 3, 5, ..., 2239 on DATA 3-0 and Pixel 2, 4, 6, ..., 2240 on DATA 7-4; clocks 1121 to 1124 carry Data 1, 3, 5, 7 on DATA 3-0 and Data 2, 4, 6, 8 on DATA 7-4.]
Figure 67. Frame transmission sequence (1124 clocks).

A fingerprint slice is composed of 8×280 pixels. The slice is transmitted column by column from left to right, and within each column row by row from top to bottom, as depicted in Figure 67.
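The frame layout described above can be illustrated with a minimal Python sketch that unpacks one received frame into an 8×280 slice. It is not the implementation used in this work: the function name `unpack_slice` is hypothetical, and the assignment of the low nibble (DATA 3-0) to the odd-numbered pixel and of the high nibble (DATA 7-4) to the even-numbered pixel is an assumption read from the layout of Figure 67 that should be checked against the sensor datasheet. The frame length of 1124 bytes (1120 pixel bytes plus 4 trailing bytes) is taken from Figure 67.

```python
from typing import List

ROWS, COLS = 8, 280               # slice geometry from Table 53
PIXEL_BYTES = ROWS * COLS // 2    # 1120 bytes, two 4-bit pixels per byte
FRAME_BYTES = PIXEL_BYTES + 4     # 1124 bytes including the trailing data bytes


def unpack_slice(frame: bytes) -> List[List[int]]:
    """Turn one raw frame into slice[row][col] with 4-bit grey levels.

    Assumption: the low nibble holds the odd-numbered pixel (Pixel 1, 3, ...)
    and the high nibble holds the even-numbered pixel (Pixel 2, 4, ...).
    """
    if len(frame) != FRAME_BYTES:
        raise ValueError("a frame must contain exactly 1124 bytes")
    slice_ = [[0] * COLS for _ in range(ROWS)]
    for n, byte in enumerate(frame[:PIXEL_BYTES]):
        pair = ((2 * n, byte & 0x0F), (2 * n + 1, byte >> 4))
        for k, value in pair:
            # Pixels are numbered down each column, columns left to right.
            col, row = divmod(k, ROWS)
            slice_[row][col] = value
    return slice_
```

The four trailing bytes are ignored in this sketch; a real driver would also have to generate the sensor clock and respect the regular integration time.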
A total of 1120 clocks are needed to transmit a complete slice (Pixel 1 to Pixel 2240 in the figure). In addition, 4 bytes of information are appended at the end of each slice (Data 1 to Data 8 in the figure), so a complete frame is composed of 1124 bytes and is transmitted in 1124 clocks.

Each pixel is a sensor in itself. The sensitivity of the pyroelectric sensing layer allows detecting any temperature differential between the beginning of the acquisition and the reading of the information. This period of time is called the integration time, and it must be regular in order to obtain consistent fingerprint slices. If the integration time is not regular, the contrast of the slices can vary from one frame to another. It is possible to introduce some waiting time between each set of 1124 clock pulses, but the overall time of one frame must remain regular, as shown in Figure 68. This additional waiting time is generally the time needed by the external processor to perform some calculations on the acquired frames (to scan the frames to detect the presence of the finger, or to reconstruct the fingerprint image on-the-fly from the acquired slices).

[Figure 68 diagram: consecutive frames n-1, n, n+1 and n+2; each frame consists of a burst of 1124 clocks within one constant integration time.]
Figure 68. Fingerprint slices transmission sequence.

In order to guarantee the right reconstruction of the fingerprint image from the acquired slices, some overlap between consecutive slices needs to exist. The integration time has to be short enough to guarantee this overlap for any reasonable finger sweeping speed. After several tests, it has been confirmed that an acquisition rate of 200 frames per second is a good balance to guarantee slice overlap over the whole range of reasonable finger sweeping speeds.

5.3.2. Hw/Sw Partitioning

The implementation of the fingerprint acquisition stage heavily depends on the characteristics of the fingerprint sweeping sensor selected to build the system. Image acquisition consists in obtaining a bitmap of the fingerprint at an adequate resolution. There are two possible ways to reconstruct the fingerprint image from the acquired slices:

(i) In off-line methods, the acquisition stage and the reconstruction stage take place sequentially, one after the other. First, a set of fingerprint slices is collected and stored in memory. After the acquisition stage, the overlap between consecutive slices allows reconstructing the fingerprint image. This solution needs additional time and resources, because all finger slices have to be stored during the acquisition stage and because acquisition and reconstruction are sequential tasks. It can be used in applications running on physical systems with large amounts of memory and without real-time constraints.

(ii) In on-line methods, the acquisition and reconstruction stages take place concurrently while the user sweeps his/her fingertip over the sensing surface. The main advantage of these techniques is that fewer memory resources are needed, since only the reconstructed image needs to be permanently stored. Moreover, the reconstruction process takes place on-the-fly, so once the user finishes the sweeping motion, the fingerprint composition is already available in the system memory and the further processing stages of the recognition application can proceed. This solution is the one selected in most embedded systems requiring real-time performance.

In this work, the on-the-fly reconstruction methodology has been implemented.

As shown in Figure 66, the selected sweeping sensor is embedded in a plastic cover with a longitudinal cavity that guides the movement of the finger along the sensing surface. It is assumed that two adjacent fingerprint slices can feature a significant vertical displacement between them, but that only slight horizontal and rotational displacements can be present because of those mechanical guides (larger horizontal and rotational displacements might happen only if the user produces them intentionally). Hence, by computing the vertical translational shift and identifying the image overlap between consecutive slices, it is possible to reconstruct the fingerprint impression.

The reconstruction of the fingerprint image starts after the proper acquisition of the first two slices. A line comparator establishes the level of similarity between the bottom line of the second slice and each of the 8 lines of the first slice. The similarity analysis compares the greyscale intensity levels of all 280 pixels of a line. A total of 8 similarity scores are obtained, one per line, and the line with the best similarity score, provided that the score is greater than a certain similarity threshold, is taken as the overlapped line. The reconstructed image is then built by fusing both slices at the overlapped line, as graphically shown in Figure 69. If none of the 8 similarity scores is above the specified threshold, it is understood that no overlap exists between the two slices, and the images are fused by placing the second slice directly above the first one. The described reconstruction process is repeated for each of the acquired slices.

[Figure 69 diagram: slice N holds lines A to H (A at the bottom) and slice N+1 holds lines I to P (I at the bottom). The similarity analysis compares the reference line I with lines A to H and finds the overlap at E = I. The reconstructed image from slices N and N+1 is, from top to bottom: P, O, N, M, H=L, G=K, F=J, E=I, D, C, B, A.]
Figure 69.
Fingerprint image reconstruction process from the consecutive acquired slices.

The application is split into three stages:

a) Fingerprint sensor initialization phase, which configures the sensor device and leaves it ready for the slice acquisition process.

b) Finger detection phase, where the sensing surface is periodically scanned in order to detect when the user places one of his fingers on the sensor. Consecutive frames are captured by the system in this stage. Because a thermal sensor is used, white or clear slices are obtained when no finger is on the sensing surface. As soon as one finger is placed on the sensor, the acquired slices reflect the difference of temperature between the ridges of the fingertip and the surface of the sensor, so dark and clear pixels are present in the captured slices. Once the presence of the finger on the surface of the sensor is confirmed, a finger detection event (DET) is raised by the scanning system in order to switch to the next processing stage: the fingerprint image acquisition process. To make the finger detector robust, some filtering is applied to the acquired frames to confirm the presence of the finger before raising the detection event and moving to the fingerprint acquisition phase.

c) Fingerprint acquisition and reconstruction phase, based on the acquisition of a fixed number of slices and the reconstruction of the fingerprint image in parallel. A total of 100 slices are acquired after the DET event. Since each slice has a size of 280×8 pixels, the sensed area is up to 280×800 pixels.

The complete fingerprint image acquisition process flow is shown in Figure 70.

[Figure 70 flowchart: Start. Initialization phase: fingerprint sensor initialization. Finger detection phase: read frame and test for finger detection; repeat while no finger is detected. Fingerprint acquisition and reconstruction phase: read 1st frame and store the fingerprint image; then read next frame, run the fingerprint slices similarity analysis, reconstruct and store the image, and repeat until the end of the acquisition. End.]
Figure 70. Fingerprint image acquisition process flow.

The temporal scheduling of the process is shown in Figure 71.

[Figure 71 timing diagram: after START, Task i runs until the user places the finger on the sensing surface and DET is raised; the acquisitions ACQ #1 to ACQ #M then follow at intervals of tIT, each one running Task ii and Task iii, until STOP, so that tACQ = M · tIT.]
START: fingerprint acquisition start command
DET: finger detection
ACQ #x: slice acquisition command
STOP: fingerprint acquisition stop command
TASK i: fingerprint sensor polling for finger detection
TASK ii: frame/slice acquisition
TASK iii: fingerprint image reconstruction from acquired slices
tIT: integration time (fingerprint frames acquisition process)
tACQ: acquisition time (fingerprint frames acquisition and image reconstruction on-the-fly)
Figure 71. Fingerprint acquisition tasks scheduling.

The execution time of Task i depends on the time elapsed from the initialization of the system until the moment the user places one of his fingers on the sensing surface. As soon as the finger is put on the sensing surface, this condition is automatically detected by the system, which raises the DET event. Once the finger detection event takes place, the subsequent acquisition and reconstruction processes take a fixed execution time, which is determined by the number of slices acquired (M = 100 slices, which have proven to be enough to cover a reasonable region of the fingerprint ridge-valley pattern) and the integration time between consecutive acquisitions (tIT = 5 ms, which has proven to guarantee the overlap between consecutive slices for any sliding speed of the fingertip across the sensor). Therefore, the application acquisition time is in the range of 500 ms.
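The line-comparison reconstruction described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the hardware/software implementation of this work: the similarity measure (one minus the normalized mean absolute difference), the threshold value of 0.9 and the function names are all hypothetical choices, since the text does not specify them. Rows are stored from top to bottom, and each new slice is added on top of the image, following Figure 69.

```python
from typing import List, Optional

Row = List[int]    # one line of 4-bit grey levels (0..15)
MAX_LEVEL = 15


def line_similarity(a: Row, b: Row) -> float:
    """Assumed measure: 1.0 for identical lines, 0.0 for opposite ones."""
    total = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - total / (MAX_LEVEL * len(a))


def find_overlap(prev_slice: List[Row], new_slice: List[Row],
                 threshold: float) -> Optional[int]:
    """Index of the line of prev_slice that best matches the bottom line of
    new_slice, or None when no score exceeds the threshold."""
    ref = new_slice[-1]
    scores = [line_similarity(ref, line) for line in prev_slice]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > threshold else None


def merge_slice(image: List[Row], new_slice: List[Row],
                threshold: float = 0.9) -> List[Row]:
    """Fuse new_slice into image, whose top rows are the previous slice."""
    height = len(new_slice)
    match = find_overlap(image[:height], new_slice, threshold)
    if match is None:
        # No overlap found: place the new slice directly above the image.
        return new_slice + image
    # Overlap found: the new slice replaces the rows down to the matched one.
    return new_slice + image[match + 1:]
```

Calling `merge_slice` once per acquired slice, starting from an image equal to the first slice, yields the reconstructed fingerprint; with M = 100 slices the result has at most 800 rows.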
Based on the task scheduling shown in Figure 71, the following condition needs to be met in the proposed application:

tTASK ii + tTASK iii ≤ tIT

The execution time of Task ii is bounded by the 1124 clocks needed to acquire a complete frame. The maximum acquisition frequency allowed by the sensor is 2 MHz. For example, with a 1 MHz sensor clock the 1124 clocks of Task ii take about 1.12 ms, which leaves roughly 3.9 ms of each 5 ms integration period for Task iii. Depending on the physical platform in charge of the processing and on its available resources, the fingerprint acquisition frequency and the execution time of Task iii need to be balanced to meet the above timing condition. If the condition is met, on-the-fly reconstruction is possible and, once the presence of the finger on the sensing surface is detected, the execution time of the application is deterministic, in the range of 500 ms.

The size of the acquired image, however, is not fixed. It depends on the level of overlap between consecutive slices, which mainly depends on the speed of the user's sweeping movement. So that the next processing stages of the recognition process deal with images of a fixed size, the reconstructed image is finally adjusted to a size of 268×460 pixels. Bigger reconstructed images are cropped to that size, whereas smaller reconstructed images are padded with white (background) pixels to reach it.

Figure 72. Fingerprint image acquisition and reconstruction on-the-fly process.

5.3.3. Physical Implementation

Three different fingerprint acquisition approaches have been implemented in this thesis. Each approach corresponds to a different computational platform.
All the platforms have in common the fact that the main component is a SRAM-based monolithic SOPC/FPGA device featuring a generalpurpose microprocessor and a programmable logic region where to instantiate made-to-measure hardware accelerators. Some details about each of the platforms are shown in Table 54. The list of devices, operating frequencies and HW/SW partitioning options are indicated. The overall workload is balanced among those application-specific hardware coprocessors instantiated in the programmable logic and those general-purpose microprocessing units present in each platform. In all the implementations, fingerprint acquisition rates of 200 frames per second are achieved so the fingerprint acquisition process takes around 500 ms. Reconstruction on-the-fly performances are reached in all the platforms. Live-scan SW HW Sensor Processor Processor Atmel FPSLIC (350 nm) Atmel FINGERCHIP 8-bit AVR AT40K Made-to-measure prototype AT94K40AL SOPC 400 kHz 12.5 MHz 25 MHz Altera EXCALIBUR (180 nm) Atmel FINGERCHIP 32-bit ARM9 APEX 20KE Evaluation board EPXA10 EPXA10F1020C1 SOPC 1.0 MHz 200 MHz 48 MHz Xilinx VIRTEX4 (90 nm) Atmel FINGERCHIP 32-bit MicroBlaze VIRTEX4 Evaluation board ML401 XC4VLX25 FPGA 1.0 MHz 100 MHz 100 MHz Note 1: The slice similarity analysis task is implemented by HW in the FPGA, and the storage of the reconstructed implemented as a SW task and executed by the system CPU. Platform SOPC/FPGA Task i SW HW SW Task ii HW HW SW Task iii HW/SW1 SW SW image in the system memory is Table 54. Physical implementation in different platforms. To set an example, Figure 72 shows the reconstruction process of a fingerprint image. The fingerprint impression on the right is the result of the reconstruction task over the set of up to 100 acquired slices shown on the left. 5.4. 
Image Enhancement Approach

The second stage of the suggested fingerprint recognition algorithm, in both the enrolment and the authentication phases, is the enhancement of the original fingerprint image. The acquired bitmap can present deficiencies caused by multiple factors, such as the user’s skin condition (too wet, too dry, cuts, ridges worn by work or age, etc.), the environmental conditions (sensor noise, temperature or humidity influences, etc.), the acquisition process (incorrect finger pressure, non-uniform sliding speed, etc.), or other deficiencies in the reconstruction process that can affect its quality.

The enhancement process is an intermediate stage that aims at improving the quality of the acquired image as much as possible. A set of sequential image processing tasks is applied to the original fingerprint in order to transform it into a clearer representation of the ridge-valley map. The result is a new version of the bitmap from which the inherent and distinctive characteristics can be easily extracted. Although the input image is a greyscale representation of the user’s fingerprint, the enhanced version is a black and white representation of the input print. The genuine ridge-valley map is kept as foreground, with black ridges and white valleys, surrounded by a white background.

5.4.1. Algorithm Description

Among the different fingerprint enhancement methodologies detailed in chapter 2, the author has selected some of the most relevant techniques described in the literature to improve the clarity of the ridge-valley structures. A complex and accurate algorithm has been developed to serve as a representative example of the computational effort required by state-of-the-art fingerprint enhancement algorithms.

The fingerprint recognition algorithm suggested in this work uses (i) the field orientation map of the ridge-valley pattern, and (ii) the set of minutia points extracted from the ridge map as the inherent features that characterize a fingerprint. The enhancement algorithm is designed to strengthen these traits accurately, so that true features are neither lost nor altered while false features are removed. Subsequent recognition will only be effective if the enhanced representation of the fingerprint is clean and true, i.e. it possesses the real properties of the user’s fingertip. The fingerprint image enhancement algorithm is composed of the sequential processing stages depicted in Figure 73.

[Figure 73 flowchart, fingerprint image enhancement phase: START → image segmentation → image normalization → image isotropic filtering → field orientation map → filtered field orientation map → image directional filtering and image binarization → image smoothing → END]

Figure 73. Fingerprint image enhancement algorithm: processing stages.

Image segmentation

The segmentation of the original image is the first of the pre-conditioning stages. It aims at isolating the background areas of the bitmap from the true fingerprint. Apart from isolating the portions of the image that do not correspond to the fingerprint, the algorithm also separates the fingerprint regions of very low quality (noisy regions with high distortion in the ridge-valley structures), which cannot be recovered as valid fingerprint regions.

The segmentation approach is based on the computation of the directional gradients of the image. The gradient of each pixel of the image is computed for two main purposes in the enhancement algorithm:
- to proceed with the image segmentation process, and
- to compute the field orientation map of the ridge-valley pattern.
Given a fingerprint image f(x,y) of size N×M pixels, x=1…N, y=1…M, the directional gradients Gx(x,y) and Gy(x,y) of each pixel p(x,y) are computed by applying Sobel operators of size 5×5 as shown in Figure 74.

\[
S_x^{5\times 5} =
\begin{bmatrix}
-1 & -1 & 0 & 1 & 1 \\
-1 & -1 & 0 & 1 & 1 \\
-2 & -2 & 0 & 2 & 2 \\
-1 & -1 & 0 & 1 & 1 \\
-1 & -1 & 0 & 1 & 1
\end{bmatrix}
\qquad
S_y^{5\times 5} =
\begin{bmatrix}
-1 & -1 & -2 & -1 & -1 \\
-1 & -1 & -2 & -1 & -1 \\
 0 &  0 &  0 &  0 &  0 \\
 1 &  1 &  2 &  1 &  1 \\
 1 &  1 &  2 &  1 &  1
\end{bmatrix}
\]

Figure 74. Sobel mask operators: a) Sobel mask used to compute Gx, b) Sobel mask used to compute Gy.

Each pixel of the original image is convolved with the orthogonal Sobel operators Sx(x,y) and Sy(x,y) in order to deduce the direction of the maximum intensity change. The gradient vector of pixel p(x,y) is represented by:

\[ G_p(x,y) = G_x(x,y) + j \cdot G_y(x,y) = |G_p(x,y)| \, \angle\theta_p(x,y) \]

where:

\[ G_x(x,y) = f(x,y) * S_x(x,y) = \sum_{i=-2}^{+2} \sum_{j=-2}^{+2} S_x(i,j) \cdot f(x+i,\, y+j) \]

\[ G_y(x,y) = f(x,y) * S_y(x,y) = \sum_{i=-2}^{+2} \sum_{j=-2}^{+2} S_y(i,j) \cdot f(x+i,\, y+j) \]

\[ |G_p(x,y)| = \sqrt{G_x^2(x,y) + G_y^2(x,y)} \]

\[ \theta_p(x,y) = \tan^{-1} \frac{G_y(x,y)}{G_x(x,y)} \]

As can be deduced from the above formulas, the gradient of a pixel p(x,y) depends on the greyscale intensity distribution of the pixels in a local neighbourhood of size 5×5 centred on pixel p(x,y). In order to reduce the computational load required to calculate the gradient of the image pixels, it is usual to approximate the gradient magnitude with absolute values:

\[ |G_p(x,y)| \approx |G_x(x,y)| + |G_y(x,y)| \]

The segmentation of the image is carried out at block level in the proposed algorithm, because any image can be seen as a set of clusters with different intensity and quality levels.
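The gradient computation above can be sketched in a few lines. This is only an illustration under stated assumptions: the masks are transcribed from Figure 74, the names `gradient_at` and `magnitude_approx` are hypothetical, and border pixels are simply left out because the text does not say how they are handled.

```python
# 5x5 Sobel-like masks transcribed from Figure 74 (rows index j, columns index i).
SX = [
    [-1, -1, 0, 1, 1],
    [-1, -1, 0, 1, 1],
    [-2, -2, 0, 2, 2],
    [-1, -1, 0, 1, 1],
    [-1, -1, 0, 1, 1],
]
# Sy is the transpose of Sx.
SY = [list(col) for col in zip(*SX)]


def gradient_at(image, x, y):
    """Return (Gx, Gy) at pixel (x, y); image[y][x] is a greyscale value.

    The caller must keep (x, y) at least 2 pixels away from the border;
    border handling is an open choice that this sketch does not cover.
    """
    gx = gy = 0
    for j in range(-2, 3):
        for i in range(-2, 3):
            value = image[y + j][x + i]
            gx += SX[j + 2][i + 2] * value
            gy += SY[j + 2][i + 2] * value
    return gx, gy


def magnitude_approx(gx, gy):
    """Cheap magnitude estimate |Gx| + |Gy| used instead of the square root."""
    return abs(gx) + abs(gy)


# Horizontal ramp: intensity grows by one grey level per column.
ramp = [[x for x in range(7)] for _ in range(7)]
gx, gy = gradient_at(ramp, 3, 3)
```

On the ramp image the response is purely horizontal: each unit of slope contributes the sum of the positive mask weights times their column offsets, so Gx is 36 and Gy is 0.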
Pixels in good- or mid-quality ridge-valley regions present high gradient magnitudes because of the good contrast between ridges and valleys. The background areas, where no fingerprint exists, present low contrast and nearly uniform intensity levels, so the gradient magnitude tends to be homogeneous and small in all directions. A similar behaviour is observed in low-quality clusters of true fingerprint regions: corrupted areas affected by smudges, too dry or too wet skin, creases, unevenly pressed regions and other elastic distortions also present small gradient magnitudes.

To proceed with the segmentation, the original image is tessellated into non-overlapping square blocks of 8×8 pixels. The input fingerprints have a fixed size of 268×460 pixels. A frame six pixels wide is discarded around the image, so a sub-image of 256×448 pixels remains, which can be tessellated into 32×56 blocks. The block size has been chosen according to the resolution of the images (500 dpi) so that each block contains both ridges and valleys. Foreground and background clusters are discriminated by means of the average gradient magnitude of each image block, Gb(u,v). The average gradient magnitude of block (u,v), whose top-left corner pixel is located at coordinates (x,y), is computed as follows:

\[
G_b(u,v) = \frac{\sum_{i=0}^{7} \sum_{j=0}^{7} |G_p(x+i,\, y+j)|}{64}
\approx \frac{\sum_{i=0}^{7} \sum_{j=0}^{7} \left( |G_x(x+i,\, y+j)| + |G_y(x+i,\, y+j)| \right)}{64}
\]

Because the fingerprint area is rich in edges due to the ridge/valley alternation, the average gradient response is high in the clusters corresponding to the fingerprint, whereas it is small in the background regions as well as in degraded fingerprint blocks.
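A minimal sketch of the block-level averaging just described, assuming the per-pixel gradients Gx and Gy are already available as 2-D lists indexed [row][column]. The function names and the `margin` parameter are hypothetical; `margin` stands for the frame that is discarded before tessellation.

```python
def block_average_gradient(gx, gy, block=8, margin=0):
    """Average |Gx| + |Gy| over non-overlapping block x block tiles.

    gx and gy are equally sized 2-D lists indexed [row][column]. `margin`
    pixels are skipped on every side before the image is tessellated.
    """
    rows = (len(gx) - 2 * margin) // block
    cols = (len(gx[0]) - 2 * margin) // block
    averages = []
    for v in range(rows):
        row_out = []
        for u in range(cols):
            top = margin + v * block
            left = margin + u * block
            total = 0
            for j in range(block):
                for i in range(block):
                    total += abs(gx[top + j][left + i]) + abs(gy[top + j][left + i])
            row_out.append(total / (block * block))
        averages.append(row_out)
    return averages


def grid_size(width, height, block=8, margin=6):
    """Number of blocks along each axis once the frame has been removed."""
    return (width - 2 * margin) // block, (height - 2 * margin) // block


# Two adjacent 8x8 blocks: a weak-gradient one and a strong-gradient one.
demo_gx = [[10] * 8 + [-50] * 8 for _ in range(8)]
demo_gy = [[0] * 16 for _ in range(8)]
demo_avg = block_average_gradient(demo_gx, demo_gy)
```

With the image size and frame quoted in the text, `grid_size(268, 460)` gives the 32×56 block grid.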
By setting an appropriate threshold GThreshold for the average gradient of a block, i.e. the mean of |Gx| + |Gy| over its 64 pixels, it is possible to segment the original fingerprint impression accurately:

\[
\text{Block}(u,v) \in
\begin{cases}
\text{foreground}, & \text{if } G_b(u,v) \ge G_{Threshold} \\
\text{background}, & \text{otherwise}
\end{cases}
\]

The useless regions are thus identified and isolated from the valid fingerprint area. An example of the fingerprint segmentation process is shown in Figure 75. As can be seen, the original image contains two distinct zones: the central area corresponds to the foreground, whereas the borders correspond to the background. In the segmented image only the ridge-valley pattern is present, and the background regions are transformed to white pixels.

Figure 75. Fingerprint image segmentation: a) original image (greyscale, 256 levels), b) segmented image (average block threshold GThreshold = 700).

Image normalization

The next step is the normalization of the original fingerprint based on the segmentation results. The aim of the normalization process is to even out the variations of grey level intensities along ridges and valleys in the different regions of the image. As a result, any fingerprint impression that is originally too dark or too light, or that presents noisy regions because of abnormal conditions in the acquisition process (too dry fingers, too wet fingers, prints with smudges, gaps, etc.), can be compensated so that its appearance is similar to that of a well-acquired impression. The normalization process is a pixel-wise operation.
It adjusts the mean and variance of a greyscale output image from the knowledge of the mean and variance of the greyscale input image. Given the original fingerprint f(x,y) and its previously deduced segmentation matrix, a new image h(x,y) can be obtained by computing the normalized greyscale value of each pixel p(x,y). According to [Hong et al., 1998], the normalized image can be computed as follows:

\[
h(x,y) =
\begin{cases}
m_0 + \sqrt{\dfrac{\sigma_0^2 \cdot (f(x,y) - m)^2}{\sigma^2}}, & \text{if } f(x,y) > m \\[2ex]
m_0 - \sqrt{\dfrac{\sigma_0^2 \cdot (f(x,y) - m)^2}{\sigma^2}}, & \text{otherwise}
\end{cases}
\]

where m0 and σ0² are the desired mean and variance values, respectively, for the normalized image h(x,y); and m and σ² are the mean and variance values of the input image f(x,y), deduced considering only the non-segmented blocks of the original fingerprint. Given a fingerprint f(x,y) of size N×M pixels, x=1…N, y=1…M, with K valid (non-segmented) pixels, K ≤ (N×M), the mean and variance of f(x,y) are computed as follows:

\[ m = \frac{\sum_K f(x,y)}{K} \]

\[
\sigma^2 = \frac{\sum_K (f(x,y) - m)^2}{K}
= \frac{\sum_K \left( f^2(x,y) + m^2 - 2 \cdot m \cdot f(x,y) \right)}{K}
= \frac{\sum_K f^2(x,y)}{K} - m^2
\]

Therefore, by computing ∑ f(x,y) and ∑ f²(x,y), both parameters m and σ² can be easily deduced. Some optimizations can be applied to the normalization equations in order to ease the implementation of the algorithm on a processing unit. The two initial expressions of h(x,y) can be grouped into one single formula:

\[ h(x,y) = m_0 + \sqrt{\frac{\sigma_0^2}{\sigma^2}} \cdot (f(x,y) - m) \]

Once the parameters m and σ² of the input fingerprint have been deduced, and the parameters m0 and σ0² of the output fingerprint have been established, the value of the normalized image at pixel p(x,y) is a linear function of the intensity level of pixel p(x,y) in the input fingerprint f(x,y).
It can be expressed as:

\[
h(x,y) = \left( m_0 - m \cdot \sqrt{\frac{\sigma_0^2}{\sigma^2}} \right) + \sqrt{\frac{\sigma_0^2}{\sigma^2}} \cdot f(x,y) = A + B \cdot f(x,y)
\]

where:

\[ A = m_0 - m \cdot \sqrt{\frac{\sigma_0^2}{\sigma^2}} \equiv \text{constant} \qquad B = \sqrt{\frac{\sigma_0^2}{\sigma^2}} \equiv \text{constant} \]

Given m0, σ0² and the input image f(x,y), the parameters A and B are constant, and the normalization process can be easily implemented on any processing system as a pixel-wise operation based on one product and one addition. It is important to remark that the resultant image has to keep the resolution of the original image, so the result of the mathematical operations is limited accordingly (e.g. for greyscale images with 256 pixel intensities, the normalized value of a pixel must remain in the range [0,255]). Figure 76 presents an example of image normalization.

Figure 76. Fingerprint image normalization: a) input image (m = 111, σ² = 100²), b) normalized image (m0 = 127, σ0² = 127²).

Isotropic filtering

The next step filters the normalized image in order to remove the random noise that may still be present on the ridge-valley pattern. The proposed filtering technique is adapted from [Cheng and Tian, 2004]. The normalized image is convolved with an isotropic Gaussian filter in order to smooth the intensity of each pixel according to the intensity of its neighbours. In this way, isolated noisy pixels or small noisy clusters can be compensated in the new version of the image.
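Before going into the filter itself, the normalization stage described above (h = A + B·f, clamped to the greyscale range) can be summarized in code. This is a minimal sketch under stated assumptions: B is taken as the square root of the variance ratio, as in the definition of [Hong et al., 1998]; the function names are hypothetical; and the list of valid pixels is assumed not to be constant, since a zero variance would make B undefined.

```python
import math


def normalization_constants(pixels, m0, var0):
    """Return (A, B) so that h = A + B * f maps mean/variance to m0/var0.

    `pixels` holds the grey levels of the valid (non-segmented) pixels only.
    """
    k = len(pixels)
    mean = sum(pixels) / k
    # Variance from the two running sums: sum(f^2)/K - m^2.
    variance = sum(p * p for p in pixels) / k - mean * mean
    b = math.sqrt(var0 / variance)
    a = m0 - mean * b
    return a, b


def normalize_pixel(value, a, b, max_level=255):
    """Apply h = A + B * f, round, and clamp to [0, max_level]."""
    result = int(round(a + b * value))
    return max(0, min(max_level, result))


# Three valid pixels with mean 111; target mean 127 and variance 100.
A, B = normalization_constants([91, 111, 131], m0=127, var0=100)
normalized = [normalize_pixel(p, A, B) for p in (91, 111, 131)]
```

Only the two sums ∑f and ∑f² are needed to obtain A and B, after which each pixel costs one multiplication and one addition, which is the property the text highlights.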
Given the input fingerprint image h(x,y) obtained from the normalization process, h(x,y) is convolved with a Gaussian filter Gf:

\[
G_f(x,y,\theta) = \frac{1}{2 \pi \sigma_x \sigma_y} \cdot
\exp\left\{ -\frac{1}{2} \left[ \frac{(x \cos\theta + y \sin\theta)^2}{\sigma_x^2} + \frac{(-x \sin\theta + y \cos\theta)^2}{\sigma_y^2} \right] \right\}
\]

Setting the parameters σx = σy = √2, Gf is discretized to a 7×7 matrix of integer values as shown in Figure 77:

\[
G_f = \frac{1}{1024} \cdot
\begin{bmatrix}
1 & 3 & 7 & 9 & 7 & 3 & 1 \\
3 & 11 & 23 & 30 & 23 & 11 & 3 \\
7 & 23 & 49 & 63 & 49 & 23 & 7 \\
9 & 30 & 63 & 81 & 63 & 30 & 9 \\
7 & 23 & 49 & 63 & 49 & 23 & 7 \\
3 & 11 & 23 & 30 & 23 & 11 & 3 \\
1 & 3 & 7 & 9 & 7 & 3 & 1
\end{bmatrix}
\]

Figure 77. Discrete isotropic filter Gf (matrix form and surface plot of 1024·Gf over the 7×7 neighbourhood of pixel (x0, y0)).

The convolution process results in a smoothed image l(x,y), in which the effect of the noise is reduced:

\[ l(x,y) = h(x,y) * G_f(x,y) = \sum_{i=-3}^{+3} \sum_{j=-3}^{+3} G_f(i,j) \cdot h(x+i,\, y+j) \]

By comparing the original image h(x,y) with the smoothed image l(x,y), a new image o(x,y) can be obtained:

\[ o(x,y) = h(x,y) - l(x,y) \]

The differential image o(x,y) captures some of the noise present in the image, but it also contains some true fingerprint information that was removed together with the noise. For this reason, and in order to recover the true information present in o(x,y), o(x,y) is smoothed again by convolution with the discrete Gaussian filter Gf.
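The integer coefficients of Figure 77 can be regenerated directly from the Gaussian formula, which is a convenient check on the transcription. The sketch below assumes σx = σy = √2, a scale factor of 1024 and rounding to the nearest integer; the function names are hypothetical and border handling is not covered.

```python
import math


def gaussian_kernel(sigma_sq=2.0, half=3, scale=1024):
    """Integer (2*half+1)^2 kernel: round(scale * G_f) with sigma_x = sigma_y."""
    norm = scale / (2.0 * math.pi * sigma_sq)
    return [
        [int(round(norm * math.exp(-0.5 * (i * i + j * j) / sigma_sq)))
         for i in range(-half, half + 1)]
        for j in range(-half, half + 1)
    ]


def smooth_pixel(image, x, y, kernel, scale=1024):
    """l(x, y): weighted sum of the neighbourhood divided by the scale."""
    half = len(kernel) // 2
    acc = 0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            acc += kernel[j + half][i + half] * image[y + j][x + i]
    return acc / scale


KERNEL = gaussian_kernel()
flat = [[100] * 7 for _ in range(7)]
l_centre = smooth_pixel(flat, 3, 3, KERNEL)   # smoothed value l(x, y)
o_centre = flat[3][3] - l_centre              # differential image o(x, y)
```

One detail this makes visible: the rounded coefficients add up to 997 rather than 1024, so a perfectly flat region comes out slightly darker after smoothing unless the result is rescaled. Whether the original implementation compensates for this is not stated in the text.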
A new detail image d(x,y) is thus obtained, which contains the valid information previously removed in the smoothing process:

\[ d(x,y) = o(x,y) * G_f(x,y) = \sum_{i=-3}^{+3} \sum_{j=-3}^{+3} G_f(i,j) \cdot o(x+i,\, y+j) \]

It is possible to build a smoothed version e(x,y) of the original image h(x,y) by considering the true information recovered from both filtering processes:

\[
e(x,y) = l(x,y) + d(x,y) = 2 \cdot \left[ h(x,y) * G_f(x,y) \right] - \left[ \left( h(x,y) * G_f(x,y) \right) * G_f(x,y) \right]
\]

The resultant image e(x,y) is used as reference in the next step to obtain the field orientation map of the ridge-valley pattern of the input fingerprint. Figure 78 shows an example of isotropic filtering of a fingerprint.

Figure 78. Image isotropic filtering: a) normalized image, b) filtered image.

Field orientation map

The proposed fingerprint recognition algorithm considers the field orientation map extracted from the user’s fingerprint impression as part of the discriminatory information used to prove the uniqueness and distinctiveness of the user. The method used in this work to compute the field orientation map is based on the directional gradients of the image, and it is adapted from well-known methodologies described in the literature [Rao and Schunck, 1989]. Given the enhanced version of the fingerprint image e(x,y), the elementary ridge direction vector at pixel p(x,y) is assumed to be perpendicular to the gradient vector [Gx(x,y), Gy(x,y)]T, where Gx(x,y) and Gy(x,y) are the gradient components, which can be computed with the Sobel operators Sx(x,y) and Sy(x,y) of size 5×5, as was done in the image segmentation stage.
The elementary gradient vector of pixel p(x,y) can be expressed as a complex number:

\[ G_p(x,y) = G_x(x,y) + j \cdot G_y(x,y) = |G_p(x,y)| \, \angle\theta_p(x,y) \]

where:

\[ |G_p(x,y)| = \sqrt{G_x^2(x,y) + G_y^2(x,y)} \qquad \theta_p(x,y) = \tan^{-1} \frac{G_y(x,y)}{G_x(x,y)} \]

Since pixel direction vectors can be easily corrupted by local noise, the size of the Sobel masks is fixed to 5×5 in this work, so the gradient orientation vector of a pixel is computed by taking into account not only the pixel itself but also its 24 local neighbours. In order to improve the robustness of the orientation estimation process, and to attenuate the influence of small noisy regions that may exist in the image, the fingerprint field orientation map is computed at block level instead of at pixel level.

There is a trade-off in the selection of the block size. If the block is small, the orientation estimate is more accurate but also more susceptible to noise. If the block is large, the estimate of the orientation field in regions of high ridge curvature is not very accurate. Preliminary experimentation has verified that a block size of 8×8 pixels is a good compromise for images acquired at 500 dpi. Therefore, the same non-overlapping 8×8 blocks into which the original fingerprint was tessellated in the segmentation process are now used to compute the field orientation map.

The average direction of all the pixels in a block is taken as the dominant direction of the complete block. The field orientation map is then the representation, by means of a two-dimensional matrix, of the dominant direction of ridges and valleys in each image block.
When computing the average ridge direction of a block, however, the elementary pixel orientation vectors cannot be added directly. Owing to the duality of orientations when managing angles in the range [0,2π), two opposite directions θ and (θ + π) indicate the same ridge-valley orientation in a fingerprint. If directly added, they would cancel each other out instead of strengthening the dominant direction. Therefore, the elementary orientations cannot be averaged directly. To overcome such errors in the computation of the average orientation, the suggested algorithm makes use of the squared gradient vector Gs,p(x,y) of the image pixels. The squared gradient vector of pixel p(x,y) is expressed as:

\[
G_{s,p}(x,y) = |G_{s,p}(x,y)| \, \angle\beta(x,y) = \left( |G_p(x,y)| \, \angle\theta(x,y) \right)^2 = |G_p(x,y)|^2 \, \angle 2\theta(x,y)
\]

\[
G_{s,p} = (G_x + j \cdot G_y)^2 = (G_x^2 - G_y^2) + j \cdot (2 \cdot G_x \cdot G_y)
\]

In the equations above, not only is the angle of the gradient vector doubled, but the magnitude of the gradient vector is also squared:

\[ |G_{s,p}| = |G_p|^2 \qquad \beta_{s,p} = 2 \cdot \theta_p \]

The average squared gradient in each block Gs,b(u,v) is then calculated as:

\[
G_{s,b}(u,v) = \sum_{\forall (x,y) \in \text{block}(u,v)} G_{s,p}(x,y)
= \sum_{\forall (x,y) \in \text{block}(u,v)} (G_x^2 - G_y^2) + j \cdot \sum_{\forall (x,y) \in \text{block}(u,v)} (2 \cdot G_x \cdot G_y)
\]

and the average ridge orientation in each block (u,v) is represented by:

\[
\Phi_{block}(u,v) = \frac{\pi}{2} + \frac{1}{2} \tan^{-1}
\frac{\sum_{\forall (x,y) \in \text{block}(u,v)} (2 \cdot G_x \cdot G_y)}
     {\sum_{\forall (x,y) \in \text{block}(u,v)} (G_x^2 - G_y^2)}
\]

Although the previous formula is expressed in radians, the suggested field orientation calculator encodes the dominant ridge orientation of a block in degrees, in the range [0º,180º), with a resolution of 1º. The angular information is coded in one byte, in the range 0-179, and segmented blocks are coded with the invalid value 255.
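The squared-gradient averaging above can be illustrated with a short sketch. It assumes a four-quadrant arctangent (`math.atan2`) where the text writes tan⁻¹, and rounding to the nearest degree; the function name, the `valid` flag and the `INVALID` constant are hypothetical names introduced for the example.

```python
import math

INVALID = 255  # code reserved for segmented (background) blocks


def block_orientation(gradients, valid=True):
    """Dominant ridge orientation of one block, in degrees within [0, 180).

    `gradients` is an iterable of (Gx, Gy) pairs for the pixels of the block.
    """
    if not valid:
        return INVALID
    real = 0.0  # running sum of Gx^2 - Gy^2
    imag = 0.0  # running sum of 2 * Gx * Gy
    for gx, gy in gradients:
        real += gx * gx - gy * gy
        imag += 2 * gx * gy
    # Four-quadrant arctangent: an assumption, the text only writes tan^-1.
    phi = math.pi / 2 + 0.5 * math.atan2(imag, real)
    return int(round(math.degrees(phi))) % 180
```

Two opposite gradients such as (5, 0) and (-5, 0) square to the same vector, so they reinforce each other and the block is reported at 90º, perpendicular to the gradient, instead of cancelling out.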
Filtered field orientation map

In order to improve the reliability of the field orientation map when dealing with low-quality fingerprint impressions, one additional low-pass filtering step is applied to the previously computed field orientation map [Hong et al., 1998]. This filtering step is computed at block level, in order to compensate for any incorrect estimate of the local ridge orientation. Given the original field orientation map Φb(u,v) of size Nb×Mb blocks, u=1…Nb, v=1…Mb, a new filtered field orientation map is built by means of a block-level kernel of size 5×5, as indicated in the formula below:

\[
\Phi'_{block}(u,v) = \frac{1}{2} \tan^{-1}
\frac{\sum_{i=-2}^{+2} \sum_{j=-2}^{+2} \sin\left( 2 \cdot \Phi_b(u+i,\, v+j) \right)}
     {\sum_{i=-2}^{+2} \sum_{j=-2}^{+2} \cos\left( 2 \cdot \Phi_b(u+i,\, v+j) \right)}
\]

The angular range and resolution of the filtered field orientation map are the same as those of the original field orientation map. Segmented blocks are not considered in the computation of the average field orientation map. The resultant two-dimensional matrix is kept as part of the discriminatory information that characterizes the user’s fingerprint. Additionally, this information is used in the next image enhancement steps to improve the clarity of the ridge-valley pattern.

Figure 79 shows the filtered field orientation map computed for the previously enhanced image. Blocks that were initially segmented do not have any defined orientation vector, and blocks with a high ridge/valley curvature present a smoothed dominant orientation.

Figure 79. Orientation maps computation: a) field orientation map, b) filtered field orientation map.
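The block-level low-pass filter can be sketched as follows. The assumptions are labelled in the code: orientations are stored in degrees with 255 marking segmented blocks, a four-quadrant arctangent is used, and neighbours that fall outside the map are simply ignored, which is a choice the text does not specify. The function name is hypothetical.

```python
import math

INVALID = 255  # code of segmented blocks


def filter_orientation_map(phi, half=2):
    """Low-pass filter a block orientation map given in degrees [0, 180)."""
    rows, cols = len(phi), len(phi[0])
    out = [[INVALID] * cols for _ in range(rows)]
    for v in range(rows):
        for u in range(cols):
            if phi[v][u] == INVALID:
                continue  # segmented blocks keep the invalid code
            s = c = 0.0
            for j in range(-half, half + 1):
                for i in range(-half, half + 1):
                    vv, uu = v + j, u + i
                    # Assumption: off-map and segmented neighbours are skipped.
                    if 0 <= vv < rows and 0 <= uu < cols and phi[vv][uu] != INVALID:
                        angle = math.radians(2 * phi[vv][uu])
                        s += math.sin(angle)
                        c += math.cos(angle)
            filtered = 0.5 * math.degrees(math.atan2(s, c))
            out[v][u] = int(round(filtered)) % 180
    return out
```

Averaging the doubled angles is what makes the filter behave well at the wrap-around: neighbouring blocks at 10º and 170º are nearly parallel, and the filter returns 0º for both, whereas a plain arithmetic mean would give 90º.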
Directional filtering and image binarization

The next step makes use of the filtered field orientation map Φ’b(u,v) of size Nb×Mb blocks, u=1…Nb, v=1…Mb, and of the smoothed version e(x,y) of the original fingerprint f(x,y) in order to improve the definition of the ridges and valleys in each local neighbourhood of the image. A set of directional Gabor filters is constructed in order to reinforce the ridge-valley pattern. The Gabor filter can be expressed as a function of the filtered ridge orientation Φ’, the ridge frequency f, and the standard deviations σx and σy of the Gaussian envelope along axes x and y, respectively, in each local neighbourhood:

\[
H_f(x,y,\Phi',f,\sigma_x,\sigma_y) =
\exp\left\{ -\frac{1}{2} \left[ \frac{(x \cos\Phi' + y \sin\Phi')^2}{\sigma_x^2} + \frac{(-x \sin\Phi' + y \cos\Phi')^2}{\sigma_y^2} \right] \right\}
\cdot \cos\left( 2 \pi f \cdot (x \cos\Phi' + y \sin\Phi') \right)
\]

It is hard to adjust each parameter to its optimal value in each region of the bitmap, because there are many parameters and they affect each other. As a reasonable design choice, the following values have been used in this work for images of 500 dpi: Gabor filter of size 7×7 pixels, σx = σy = 4, and typical ridge frequency f = 0.2. An initial field orientation Φ’ = 0º is used to construct the reference Gabor filter Hf,0º, as depicted in Figure 80. The reference filter Hf,0º is then rotated in order to obtain the set of oriented filters covering the full range [0º,180º) with a resolution of 1º. A total of 180 directional Gabor filters are built to enhance each of the regions of the fingerprint impression. Each non-segmented pixel p(x,y) of the input image e(x,y), p(x,y) ∈ block(u,v), is convolved with the directional Gabor filter Hf,Φ’ that matches the dominant orientation of its block, Φ’(u,v).
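As a sanity check on these parameters, the reference mask can be regenerated from the Gabor expression with σx = σy = 4, f = 0.2 and a scale factor of 256. The sketch below is an illustration, not the original implementation: the function name is hypothetical, the filter for each angle is obtained by evaluating the formula at rotated coordinates rather than by rotating a sampled mask, and the row/column layout is an assumption chosen so that Φ’ = 0º reproduces the coefficients printed in Figure 80.

```python
import math


def gabor_kernel(phi_deg, freq=0.2, sigma=4.0, half=3, scale=256):
    """Integer 7x7 Gabor mask round(scale * H_f) for ridge orientation phi_deg.

    Rows carry the x offset and columns the y offset; this layout is an
    assumption made so that phi_deg = 0 matches the matrix of Figure 80.
    """
    phi = math.radians(phi_deg)
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    kernel = []
    for x in range(-half, half + 1):
        row = []
        for y in range(-half, half + 1):
            xr = x * cos_p + y * sin_p    # axis carrying the cosine wave
            yr = -x * sin_p + y * cos_p   # axis along which the mask only decays
            envelope = math.exp(-0.5 * (xr * xr + yr * yr) / (sigma * sigma))
            wave = math.cos(2 * math.pi * freq * xr)
            row.append(int(round(scale * envelope * wave)))
        kernel.append(row)
    return kernel


H0 = gabor_kernel(0)
```

Because the Gaussian envelope is circular (σx = σy), the mask for 90º comes out as the transpose of the reference mask, which is a quick way to check the rotation convention.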
Hence, a new enhanced version k(x,y) of the input image e(x,y) is obtained:

\[
k(x,y) = e(x,y) * H_{f,\Phi'}(x,y) = \sum_{i=-3}^{+3} \sum_{j=-3}^{+3} H_{f,\Phi'}(i,j) \cdot e(x+i,\, y+j)
\]

\[
H_{f,0º} = \frac{1}{256} \cdot
\begin{bmatrix}
-118 & -138 & -152 & -156 & -152 & -138 & -118 \\
-138 & -161 & -177 & -183 & -177 & -161 & -138 \\
58 & 68 & 74 & 77 & 74 & 68 & 58 \\
193 & 226 & 248 & 256 & 248 & 226 & 193 \\
58 & 68 & 74 & 77 & 74 & 68 & 58 \\
-138 & -161 & -177 & -183 & -177 & -161 & -138 \\
-118 & -138 & -152 & -156 & -152 & -138 & -118
\end{bmatrix}
\]

Figure 80. Discrete directional filter Hf,0º (matrix form and surface plot of 256·Hf,0º over the 7×7 neighbourhood of pixel (x0, y0)).

The convolution of the enhanced image e(x,y) with the directional Gabor filter oriented along the dominant direction given by the filtered field orientation map measures the strength of the ridges and valleys in every local neighbourhood. Comparing this measure with an adaptive threshold makes it possible to discern whether the pixel under consideration belongs to a valley or to a ridge. The adaptive threshold is computed as the average value of the image pixels in the local neighbourhood of the pixel under analysis, multiplied by the equivalent weight of the oriented Gabor filter. As a result of this computing stage, each pixel is assigned either to the fingerprint ridge map (black pixel, coded as ‘1’) or to the valley area (white pixel, coded as ‘0’), so a coarse-level binary bitmap bin(x,y) is deduced:

\[
bin(x,y) =
\begin{cases}
1 \ (\text{black/ridge pixel}), & \text{if } \displaystyle\sum_{i=-3}^{+3} \sum_{j=-3}^{+3} H_{f,\Phi'}(i,j) \cdot e(x+i,\, y+j) <
\left[ \sum_{i=-3}^{+3} \sum_{j=-3}^{+3} H_{f,\Phi'}(i,j) \right] \cdot \frac{\displaystyle\sum_{i=-3}^{+3} \sum_{j=-3}^{+3} e(x+i,\, y+j)}{49} \\[3ex]
0 \ (\text{white/valley pixel}), & \text{otherwise}
\end{cases}
\]

Figure 81. Image directional filtering and binarization: a) greyscale image, b) binarized image.

The resultant binary image preserves the local orientation information of the input greyscale fingerprint. Figure 81 shows an example of the binarization process.

Image smoothing

After the image binarization process, ridges and valleys are identified. However, some deficiencies and imperfections can remain in the resulting image. The black and white bitmap can have a large number of ridge pixels labelled as non-ridge pixels because of the presence of noise, spurious valleys, abrupt creases, smudges, spikes, etc. An additional processing step is needed in order to obtain a good-quality image with a well-defined structure of binarized ridges and valleys.

Figure 82. Image directional smoothing: a) binary image, b) smoothed image.

A directional smoothing process is applied in order to filter the binarized ridge-valley pattern, as depicted in Figure 82. The aim of this processing stage is to remove isolated valley clusters and noisy branches, and to improve ridge endings, ridge bifurcations and the clarity of the complete image. Neighbouring ridges and valleys are usually oriented in the same direction. Therefore, a smoothing kernel oriented in the direction of the ridges in each local neighbourhood is useful to point out ridge pixels that were not previously identified.

\[
S_{\Phi'=0º} =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\qquad
S_{\Phi'=45º} =
\begin{bmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 1
\end{bmatrix}
\]

Figure 83. Directional smoothing filters: a) Ridge orientation Φ’ = 0º, b) Ridge orientation Φ’ = 45º.
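A sketch of how an oriented mask such as SΦ’=0º from Figure 83 can be applied to one valley pixel. The decision rule used here, converting the pixel to a ridge when more than half of the mask positions hold ridge pixels, is the one the text gives for this stage; the function name is hypothetical and pixels near the image border are not handled.

```python
# Mask for ridge orientation 0 degrees, transcribed from Figure 83a.
S0 = [
    [0] * 7,
    [0] * 7,
    [1] * 7,
    [1] * 7,
    [1] * 7,
    [0] * 7,
    [0] * 7,
]


def smooth_valley_pixel(binary, x, y, mask):
    """Return the new value of pixel (x, y): 1 = ridge, 0 = valley.

    Only valley pixels are re-evaluated; ridge pixels are returned unchanged.
    """
    if binary[y][x] == 1:
        return 1
    half = len(mask) // 2
    valid = ridge = 0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            if mask[j + half][i + half]:
                valid += 1
                ridge += binary[y + j][x + i]
    # Majority vote over the positions selected by the mask.
    return 1 if 2 * ridge > valid else 0


# A three-pixel-thick horizontal ridge with a one-pixel hole in the middle.
thick = [[1 if 2 <= r <= 4 else 0 for _ in range(7)] for r in range(7)]
thick[3][3] = 0
# A one-pixel-thin horizontal line with the same hole.
thin = [[1 if r == 3 else 0 for _ in range(7)] for r in range(7)]
thin[3][3] = 0
```

In the first image the hole is filled because 20 of the 21 mask positions are ridge pixels; in the second only 6 of 21 are, so the pixel stays a valley.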
As in [Ratha et al., 1995b], directional filters oriented according to the local direction of the ridge flow are applied to each valley pixel of the binarized image. An oriented filter mask composed of values ‘0’ and ‘1’ is used to identify which pixels need to be evaluated. Given an oriented filter centred on a valley pixel, only the neighbouring image pixels whose mask value is ‘1’ are evaluated. Figure 83 shows two of the 7×7 directional filter masks used in this work. A total of 180 directional smoothing filters, covering the whole ridge orientation map, are used to enhance the binary fingerprint impression.

The convolution of each valley pixel in the non-segmented blocks of the binary image with a 7×7 kernel mask, oriented along the direction given by the filtered field orientation map, results in an improved version of the binary image. The procedure consists in counting the ridge pixels of the binarized image that coincide with the valid pixels of the oriented mask. If the number of ridge pixels in the local neighbourhood of the valley pixel under evaluation is more than 50% of the total number of valid mask pixels (pixels with value ‘1’ in the oriented mask), the central pixel is converted to a ridge; otherwise it is kept as a valley. Once the directional smoothing process has been applied to all valley pixels, the resulting image is taken as the valid ridge-valley pattern for further processing in the next stages of the recognition algorithm.

5.4.2.
Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration

The initial software-only implementation of the suggested recognition algorithm on different computational platforms, detailed in section 5.2.3, confirmed that most of the image enhancement tasks are compute-intensive. When those tasks are executed in software on the low-cost, mid-performance microprocessors of embedded system platforms, they constrain the execution time of the whole application. Compared with the execution time achieved on a personal computer platform, the embedded system platforms are at least one or two orders of magnitude slower. Application-specific hardware coprocessors have therefore been devised to accelerate each of the demanding tasks on embedded system platforms featuring programmable logic resources, as an alternative to the use of HPC platforms.

Nowadays, typical embedded system platforms embed 8-bit, 16-bit or 32-bit microprocessor units. Because of cost limitations, 64-bit microprocessor units are still rare in embedded system platforms, 32-bit microprocessors being the most usual option. Therefore, the embedded system platforms selected for evaluation purposes in this work feature 32-bit microprocessor units.
From a general point of view, the proposed embedded system architecture is structured around three main components:
- the system CPU, understood as the main processing unit of the embedded system, which plays the role of application master controller;
- some application-specific hardware accelerators, acting as companion coprocessors of the system CPU, which execute the time-critical image processing tasks to off-load the system CPU; and
- the memory resources of the whole system, acting as shared resources accessible to any of the previously cited processors.

From a conceptual standpoint, the enhancement algorithm has been structured as a sequence of mutually exclusive image processing tasks, as described in the previous section. Each task receives some input data previously stored in the system memory, applies some operations to them, and generates output data that are stored in the system memory. The next task starts once the previous task has finished and its output data are available in the system memory, so each new task takes as input at least some of the output data generated in the previous stage.

Every task can be implemented either as a software task, executed by the system CPU, or as a hardware task, executed by a specific coprocessor instantiated in the programmable logic resources of the embedded system. For hardware tasks, the application-specific coprocessors are provided with DMA controllers in order to access (read/write) the system memory without interrupting the activity of the system CPU.

The topology of the system is depicted in Figure 84, which presents the specific case of a hardware task that processes one intermediate image of the enhancement process.
The hardware coprocessor is based on a three-level modular design. The dedicated processing Core is located at the bottom of the proposed architecture, below the System Bus Interface and the Data Bandwidth Adaptation blocks. The input image is transferred to the hardware coprocessor, which applies a transformation to obtain an enhanced version of the image that is finally stored back in the system memory.

Figure 84. Embedded system topology. (The system CPU, the memory and three hardware coprocessors share the W-bit system bus; each coprocessor stacks a System Bus Interface, a Data Bandwidth Adaptation block and a Core with an Ni-bit input and an Mi-bit output.)

As the figure shows, the shared resources, namely the memory and the system bus, are potential bottlenecks of the presented architecture, since the different processors (the system CPU and the hardware coprocessors) can try to access the same shared resource at the same time. Each embedded system platform provides arbitration mechanisms to mitigate as much as possible the latency in the accesses to the shared resources. The tasks that require acceleration are ported to hardware, and specific coprocessors adapted to the system bus characteristics of each platform have been developed (System Bus Interface block in Figure 84). In order to exploit the parallelism inherent to the hardware design, the system bus is adapted to wider internal buses to cope with the operational bandwidth demands of each processing step (Data Bandwidth Adaptation block in Figure 84).
No standard IPs have been used; instead, fully- or partially-pipelined hardware coprocessors, tailored to the application requirements and developed from scratch, have been implemented. Each hardware coprocessor is described in the VHDL hardware description language. The processor descriptions are technology-independent to ensure their portability and reusability on any full-custom, standard cell or gate array technology. Modular system design techniques have also been used to improve the independence between processors. Any physical platform featuring a programmable logic device can thus embed the proposed accelerator modules once their interfaces are properly connected.

Figure 85 presents a detailed view of the modular design of the hardware coprocessors in charge of the image enhancement tasks. As can be deduced from the figure, three different clock domains exist. The System Bus Interface module and the internal FIFOs adapt the clock domain of the hardware processors (f3) to the clock domain of the system bus (f1).

System Bus Interface module

The System Bus Interface module acts as the master controller in charge of starting either read or write transactions through the system bus. Burst transactions are performed when possible in order to improve the communication throughput when transferring large amounts of data, such as the input images to be processed by the hardware accelerators or the resultant enhanced versions to be stored in on-chip or off-chip memories. The data transfers through the bus are fitted to the characteristics of the system (clock frequency, data bandwidth, communication protocol, supported burst transfer sizes, etc.).
The System Bus Interface module interacts, on the one hand, with the system bus and, on the other hand, with two FIFO memories used as buffers to accommodate the different working frequencies of the different processors, as depicted in Figure 85.

Data Bandwidth Adaptation module

The Data Bandwidth Adaptation module establishes the proper widths of the data buses in the interface between the hardware Cores and the system bus. Several dual-port memories are provided, which act as buffers that allow wider data buses to be delivered to the Core modules. In this way, the input data bandwidth of the Core modules is independent of the data bandwidth of the system bus. On the one hand, the input data bandwidth of each Core module is configurable and supports data bus widths of up to 160 bits. This is a key feature for designing dedicated processors that take advantage of parallelism in the processing. On the other hand, the width of the output data bus of the Core module is fixed to 32 bits for all coprocessors in this work.

Figure 85. Three-level hardware accelerator modular design. (The system bus, of width W = 8, 16, 32 or 64 bits, runs in the f1 clock domain; the System Bus Interface works in the f2 clock domain; an IN FIFO and an OUT FIFO sit between the f2 and f3 clock domains; in the f3 clock domain, five 32-bit dual-port RAMs and their output multiplexers deliver the data D[159:0] as K×32 bits, K = 1…5, to the pipeline-based coprocessor Core, which returns 32-bit output data.)

Core module

The Core module performs the specific processing requested in each of the enhancement tasks. Pipeline techniques are used in order to reach the required performance. The images are transferred horizontally to the processors, as depicted in Figure 86, in a sequence of vertical slices of different widths.
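The relation between the slice width and the number of pixels processed in parallel can be illustrated with a small software model. The sketch below is only a minimal illustration under stated assumptions: the function names `parallel_windows` and `slice_plan` are hypothetical, only complete ("valid") kernel windows are counted, and consecutive slices are assumed to advance by exactly the number of windows computed per slice, so that they overlap by the kernel width minus one. The three width/kernel pairs are the ones quoted for the coprocessors of this chapter (8 and 5, 16 and 13, 10 and 7).

```python
def parallel_windows(slice_width, kernel_width):
    """Number of complete kernel windows that fit side by side in one slice."""
    return slice_width - kernel_width + 1


def slice_plan(extent, slice_width, kernel_width):
    """Return (start, stop) ranges of partially overlapped slices.

    extent: image size along the slicing direction (hypothetical parameter).
    Each slice advances by the number of windows computed per slice, so two
    consecutive slices overlap by kernel_width - 1 positions (assumption).
    """
    step = parallel_windows(slice_width, kernel_width)
    if step <= 0:
        raise ValueError("slice must be at least as wide as the kernel")
    starts = range(0, extent - kernel_width + 1, step)
    return [(s, min(s + slice_width, extent)) for s in starts]


if __name__ == "__main__":
    for width, kernel in [(8, 5), (16, 13), (10, 7)]:
        print(width, kernel, parallel_windows(width, kernel))
    print(slice_plan(268, 8, 5)[:3])
```

In this model all three slice widths yield four windows per slice, which agrees with the four pixels per clock that the Core modules are stated to process.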
Through the dual-port memories and the output multiplexers available in the Data Bandwidth Adaptation module, the images can be shifted horizontally in vertical slices that are progressively input to the Core module. The width of the slices is adjusted in each task according to its parallel processing demands. Several pixels can be processed in parallel while one slice is transferred to the processor. After one vertical slice has been transferred and processed, a new slice, partially overlapping the previous one, is transferred and processed. In this way, the Core module is able to process the complete image. As a result of the processing, thinner vertical slices are obtained, which make up the enhanced output version of the image to be stored in the system memory.

In order to mitigate any bottleneck in the accesses to the shared resources (system memory or system bus), the enhancement tasks have been partitioned so as to minimize the number of accesses (RD/WR) to the system memory. Each task is performed with one single access loop to the system memory (one single read of the input image and one single write of the output result), without the need to store other partial or intermediate results in the system memory along the task.

Figure 86. Image processing tasks scheduling. (The 268×460 input image is shifted in X and Y and delivered as K×32-bit data to the pipeline-based hardware accelerator, which returns the 268×460 output image as 32-bit data.)

5.4.3. Physical Implementation – Design Development – Design Implementation

This section details the design of the application-specific hardware accelerators in charge of each of the time-demanding tasks involved in the fingerprint image enhancement process.
The general concept described in the previous section, based on a modular design with three different levels (System Bus Interface, Data Bandwidth Adaptation and Core), is followed. Although the fingerprint sensor selected in the application features an image resolution of 4 bits per pixel, in general the greyscale input images to be processed can present pixel resolutions of 1 bit, 4 bits or 8 bits. In order to develop standard coprocessors able to handle any of these resolutions, the interfaces of all of them have been adjusted to 8 bits per pixel.

Apart from the input and output data buses that interface the Core module with the Data Bandwidth Adaptation module, the Core processor is provided with some input and output registers used either for configuration purposes or to store some results of the processing. Those registers are accessible to the system CPU in order to monitor and control each of the hardware coprocessors. The rest of this section covers the implementation details of the Core modules involved in each of the enhancement tasks.

Image segmentation

The segmentation coprocessor is in charge of the following tasks:
- to read the fingerprint image to be processed, previously stored in the system memory;
- to tessellate the input image of size 268×460 pixels into 32×56 square blocks of size 8×8 pixels, keeping an external frame 6 pixels wide around the set of blocks;
- to compute the directional gradients of each of the pixels in the blocks, and to compute the global gradient of each block;
- to determine the segmentation result of every block by comparing its gradient value with a programmable threshold (GThreshold) stored in one specific register (and accessible to the system CPU).
  The segmentation result of each block is coded in one bit (if the block is segmented, its corresponding bit is coded as '0'; otherwise it is coded as '1'); and
- to pack the segmentation results of the 32 blocks present in one image row in a 32-bit word to be transferred to the system memory. The output matrix corresponding to the segmentation map of the input image is thus coded in 56 words.

Figure 87. Computation of the directional gradient Gx of the image pixels p(j,i). Only shifts and additions of integer operands are needed to compute the gradient at pixel level. (The pixels of rows i+2 and i+1 are weighted by +1, +1, +2, +1, +1, those of row i by 0, and those of rows i−1 and i−2 by −1, −1, −2, −1, −1; all weighted terms are added to obtain Gx(j,i).)

The computation of the gradients at pixel level (e.g. the directional gradient Gx) is based on convolutions of the image with directional Sobel operators of size 5×5:

$$
\begin{bmatrix}
P_{j-2,i-2} & P_{j-2,i-1} & P_{j-2,i} & P_{j-2,i+1} & P_{j-2,i+2}\\
P_{j-1,i-2} & P_{j-1,i-1} & P_{j-1,i} & P_{j-1,i+1} & P_{j-1,i+2}\\
P_{j,i-2} & P_{j,i-1} & P_{j,i} & P_{j,i+1} & P_{j,i+2}\\
P_{j+1,i-2} & P_{j+1,i-1} & P_{j+1,i} & P_{j+1,i+1} & P_{j+1,i+2}\\
P_{j+2,i-2} & P_{j+2,i-1} & P_{j+2,i} & P_{j+2,i+1} & P_{j+2,i+2}
\end{bmatrix}
\ast
\begin{bmatrix}
-1 & -1 & 0 & 1 & 1\\
-1 & -1 & 0 & 1 & 1\\
-2 & -2 & 0 & 2 & 2\\
-1 & -1 & 0 & 1 & 1\\
-1 & -1 & 0 & 1 & 1
\end{bmatrix}
$$

$$
= \sum
\begin{bmatrix}
-1 \cdot P_{j-2,i-2} & -1 \cdot P_{j-2,i-1} & 0 \cdot P_{j-2,i} & 1 \cdot P_{j-2,i+1} & 1 \cdot P_{j-2,i+2}\\
-1 \cdot P_{j-1,i-2} & -1 \cdot P_{j-1,i-1} & 0 \cdot P_{j-1,i} & 1 \cdot P_{j-1,i+1} & 1 \cdot P_{j-1,i+2}\\
-2 \cdot P_{j,i-2} & -2 \cdot P_{j,i-1} & 0 \cdot P_{j,i} & 2 \cdot P_{j,i+1} & 2 \cdot P_{j,i+2}\\
-1 \cdot P_{j+1,i-2} & -1 \cdot P_{j+1,i-1} & 0 \cdot P_{j+1,i} & 1 \cdot P_{j+1,i+1} & 1 \cdot P_{j+1,i+2}\\
-1 \cdot P_{j+2,i-2} & -1 \cdot P_{j+2,i-1} & 0 \cdot P_{j+2,i} & 1 \cdot P_{j+2,i+1} & 1 \cdot P_{j+2,i+2}
\end{bmatrix}
\equiv G_x(j,i)
$$

However, since all the non-zero coefficients of the Sobel masks are powers of two in magnitude, the products in the convolution operation can be replaced by shifts of the integer operands. The convolution of the image with the Sobel masks can thus be implemented by means of shifts and additions in the hardware version. As an example, Figure 87 shows the circuit topology used to compute the directional gradient Gx of one pixel.

Moreover, some additional computations that depend on the segmentation result are carried out in this stage in order to speed up the next image processing stage (image normalization). The following parameters are computed in parallel with the segmentation process, so that the normalization process does not need additional readings of the original fingerprint image:
- the parameter K, which corresponds to the number of pixels of the image that belong to non-segmented blocks, and can be computed as K = 64 · Σ SGMNTi, where SGMNTi is the segmentation result ('1'/'0') of each image block;
- the parameter Σ f, which is the sum of the intensity levels of all the pixels of the image that belong to non-segmented blocks; and
- the parameter Σ f², which is the sum of the squared greyscale values of all the pixels of the image that belong to non-segmented blocks.

The block diagram of the segmentation coprocessor is depicted in Figure 88 and Figure 89.

Figure 88. Image segmentation hardware coprocessor (part I). (Eight 8-bit input pixels P7…P0 feed a parallelization bus and the 5×5×8 kernel windows; for the four pixels Pj,i … Pj+3,i the datapath computes Gx, Gy and their absolute values, the pixel value and its square, and adds them into Σ|Gy| + Σ|Gx|, ΣP and ΣP².)
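The replacement of the Sobel products by shifts and additions can be checked with a short software model. The following is a minimal sketch, not the VHDL datapath of the coprocessor: the names `gx_shift_add`, `gx_reference`, `SOBEL_GX` and `WEIGHT_SHIFT` are hypothetical, and the window is assumed to be indexed as `win[a][b]` with `a` running over j−2…j+2 and `b` over i−2…i+2, as in the matrix above.

```python
# 5x5 directional Sobel mask for Gx, rows j-2..j+2, columns i-2..i+2.
SOBEL_GX = [
    [-1, -1, 0, 1, 1],
    [-1, -1, 0, 1, 1],
    [-2, -2, 0, 2, 2],
    [-1, -1, 0, 1, 1],
    [-1, -1, 0, 1, 1],
]

# Row weights 1, 1, 2, 1, 1 expressed as left-shift amounts.
WEIGHT_SHIFT = (0, 0, 1, 0, 0)


def gx_reference(win):
    """Multiply-accumulate form: sum of element-wise products with the mask."""
    return sum(SOBEL_GX[a][b] * win[a][b] for a in range(5) for b in range(5))


def gx_shift_add(win):
    """Same result using only additions, subtractions and shifts."""
    total = 0
    for a in range(5):
        # Columns i+1 and i+2 enter with a plus sign, i-2 and i-1 with a minus sign.
        diff = (win[a][3] + win[a][4]) - (win[a][0] + win[a][1])
        total += diff << WEIGHT_SHIFT[a]  # x2 on the central row is a 1-bit shift
    return total


if __name__ == "__main__":
    window = [[(17 * a + 29 * b * b) % 256 for b in range(5)] for a in range(5)]
    print(gx_reference(window), gx_shift_add(window))
```

Both functions return the same integer for any 8-bit window, which is the property that allows the hardware version to avoid multipliers in this stage.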
A total of 8 pixels are transferred to the segmentation processor every clock cycle, so the core module processes vertical image slices 8 pixels wide. A shift register matrix of 8×5 pixels is built, which allows up to 4 kernels of size 5×5 pixels to work in parallel and the directional gradients of 4 pixels to be computed at a time. Since the image is processed in vertical slices and 4 pixels of one block are processed in parallel, one internal buffer (implemented as a dual-port memory in Figure 89) is provided in the segmentation processor in order to store the intermediate results (Σ |Gy| + Σ |Gx|) of the 32 partial blocks present in one vertical slice. When the next slice is processed, the previously stored intermediate results are added to the new results to obtain the final results of each block. A shift register composed of 32 1-bit flip-flops shifts the segmentation result of each of the 32 blocks present in one row of the image. After 2 vertical slices have been processed, the 32 segmentation results of the 32 blocks in an image row are available, so they can be packed in a 32-bit word and transferred out to the system memory.

The segmentation result of each block drives other parallel computations, namely the parameters K, Σ f and Σ f² required in the normalization process. Two additional internal buffers (implemented as dual-port memories in Figure 89) store the partial sums of the greyscale values and of the squared greyscale values of the non-segmented pixels in each block (Σ P and Σ P²). Three output registers are continuously updated with the running totals of the parameters K, Σ f and Σ f² of the non-segmented blocks of the image.
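The accumulation of partial block results over two slices, the threshold comparison and the packing into a 32-bit word can be modelled as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the first block of the row is assumed to end up in the most significant bit (the actual bit ordering of the shift register is not specified in the text), and the comparison is taken as "gradient greater than or equal to threshold gives '1'", following the CMP (A ≥ B) block of Figure 89.

```python
def pack_row(partial_first, partial_second, threshold):
    """Pack the 32 block decisions of one image row into a 32-bit word.

    partial_first / partial_second: per-block gradient sums obtained from the
    first and second slice. A block is coded '1' (non-segmented) when its total
    gradient reaches the threshold, '0' otherwise.
    """
    if len(partial_first) != 32 or len(partial_second) != 32:
        raise ValueError("one row holds exactly 32 blocks")
    word = 0
    for first, second in zip(partial_first, partial_second):
        bit = 1 if (first + second) >= threshold else 0
        word = (word << 1) | bit  # assumption: first block lands in bit 31
    return word


def non_segmented_pixels(words):
    """K = 64 * (number of '1' bits), computed with a 6-bit left shift."""
    ones = sum(bin(word).count("1") for word in words)
    return ones << 6


if __name__ == "__main__":
    first = [0] + [10] * 31
    second = [0] + [5] * 31
    row_word = pack_row(first, second, threshold=12)
    print(hex(row_word), non_segmented_pixels([row_word]))
```

The shift by 6 bits reflects the 64 pixels of each 8×8 block, as in the `<<6` element of the block diagram.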
At the end of the processing, all three registers contain the final values of the parameters K, Σ f and Σ f² for the resultant segmented image.

Figure 89. Image segmentation hardware coprocessor (part II). (Three dual-port RAMs accumulate the per-block partial sums Σ{|Gy| + |Gx|}, Σ{P} and Σ{P²}; a comparator (A ≥ B) checks the block gradient against the threshold to produce SGMNT (1/0), which gates the accumulation of K (through a <<6 shift, 64 pixels per block), Σf and Σf² into three output registers and is shifted into the 32-bit output word SGMNT[31:0].)

The resulting pipeline-based coprocessor computes the segmentation matrix while the image is being received. The inputs of the segmentation stage are the original greyscale image and the segmentation threshold (GThreshold) used to discriminate between foreground and background. The outputs of the segmentation stage are the segmentation matrix and the parameters K, Σ f and Σ f² deduced from the segmented image.

Image normalization

The normalization processor applies a linear transformation to the original fingerprint image f(x,y) in order to build a normalized version of the image h(x,y). From the segmentation output results K, Σ f and Σ f², it is easy to deduce the parameters Offset (A) and Slope (B) to be applied in the normalization process:

$$
h(x,y) = A + B \cdot f(x,y)
$$

where:

$$
\text{Offset} \equiv A = m_0 - \frac{\sum f_i}{K} \cdot \sqrt{\frac{\sigma_0^2}{\dfrac{\sum f_i^2}{K} - \left(\dfrac{\sum f_i}{K}\right)^2}}
\qquad
\text{Slope} \equiv B = \sqrt{\frac{\sigma_0^2}{\dfrac{\sum f_i^2}{K} - \left(\dfrac{\sum f_i}{K}\right)^2}}
$$

The constant parameters $m_0 = 127$ and $\sigma_0^2 = 127^2$ are the desired mean and variance values for the normalized image in the suggested image enhancement algorithm.
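The computation that the system CPU would carry out between the segmentation and normalization stages can be sketched in a few lines. This is a minimal floating-point illustration, assuming the usual mean/variance normalization in which the slope is the ratio between the desired and the measured standard deviation; the function name `normalization_params` is hypothetical, and a real implementation would probably convert both parameters to the fixed-point format expected by the coprocessor registers.

```python
import math


def normalization_params(k, sum_f, sum_f2, m0=127.0, var0=127.0 ** 2):
    """Derive (offset A, slope B) from the segmentation outputs K, sum f, sum f^2."""
    if k <= 0:
        raise ValueError("no non-segmented pixels available")
    mean = sum_f / k
    variance = sum_f2 / k - mean ** 2
    if variance <= 0:
        raise ValueError("flat foreground: slope is undefined")
    slope = math.sqrt(var0 / variance)
    offset = m0 - mean * slope
    return offset, slope


if __name__ == "__main__":
    pixels = [100, 120, 140, 160]
    a, b = normalization_params(len(pixels), sum(pixels), sum(p * p for p in pixels))
    print(a, b, [a + b * p for p in pixels])
```

Applying h = A + B·f with these parameters to the foreground pixels yields a mean of m0 and a variance of σ0² before any saturation of the result.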
The normalization coprocessor receives the values of the Offset and Slope parameters to be applied in two specific configuration registers. Those parameters can be easily computed by the system CPU once the segmentation stage has finished. The normalization coprocessor reads the original fingerprint image from the system memory, applies the cited transformation in order to obtain a new version of the image, and stores it in the system memory. The block diagram of the normalization coprocessor is depicted in Figure 90.

Figure 90. Image normalization hardware coprocessor. (Four 8-bit input pixels P3…P0 are each multiplied by the NORM Slope input register, added to the NORM Offset input register and saturated to the range [0, 255] to give the output pixels P′3…P′0.)

The fully-pipelined normalization core module processes 4 pixels in parallel. Because the application deals with 8-bit greyscale images, the normalization result of each pixel is adjusted to the range [0,255]: if the mathematical operations at pixel level provide a value greater than 255, it is adjusted to 255; similarly, negative values are adjusted to 0.

Isotropic filtering

The next processing stage applies an isotropic filter to the normalized image. The greyscale value of each pixel is modified taking into consideration the greyscale values of its neighbours in a local region around the pixel. Therefore, the new value of the pixel depends not only on the pixel itself but also on its closest neighbours.
It is possible to build a smoothed version e(x,y) of the input image h(x,y) by applying two convolution processes to the input image with the Gaussian filter Gf of size 7×7:

$$
e(x,y) = 2 \cdot \left[ h(x,y) \ast G_f(x,y) \right] - \left[ \left( h(x,y) \ast G_f(x,y) \right) \ast G_f(x,y) \right]
$$

However, for the hardware implementation of the filtering coprocessor in the proposed system architecture, the initial equation is rearranged in order to reduce the processing to one single convolution with a bigger Gaussian filter G'f of size 13×13:

$$
e(x,y) = h(x,y) \ast G'_f(x,y)
$$

With this implementation, only one read cycle of the input image is needed to transfer the image from the system memory to the hardware coprocessor, and only one write cycle is needed to save the resultant enhanced version of the image in the system memory. The accesses to shared resources such as the memory or the system bus are thus minimized, at the expense of using more computational resources in the programmable region to perform the convolution of size 13×13. Another possible solution would have been to perform the isotropic filtering in two steps, with two convolutions with the initial filter mask of size 7×7, saving the intermediate results in the system memory (since not enough memory is embedded in the programmable logic fabric). This solution has been discarded because its higher usage of the shared resources of the system would have lengthened the execution time of this task.

In the proposed implementation, the convolution of the image with a 13×13 kernel implies a large number of multiply-add operations: up to 169 products and 168 two-operand additions if all the coefficients of the filter mask are different.
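The reduction to one single convolution follows from the linearity and associativity of convolution: the two-step expression equals a convolution with the kernel 2·Gf (zero-padded to 13×13) minus Gf ∗ Gf, and convolving a 7×7 mask with itself gives a 13×13 mask. The sketch below verifies this numerically. It is only an illustration under stated assumptions: the helper names are hypothetical, a 7×7 integer binomial mask stands in for the actual Gaussian coefficients (which are not reproduced here), and "full" convolutions are used so that no border policy has to be assumed.

```python
def conv2d_full(a, b):
    """Full 2-D convolution of two matrices given as lists of lists."""
    rows, cols = len(a) + len(b) - 1, len(a[0]) + len(b[0]) - 1
    out = [[0] * cols for _ in range(rows)]
    for i, row_a in enumerate(a):
        for j, val_a in enumerate(row_a):
            for u, row_b in enumerate(b):
                for v, val_b in enumerate(row_b):
                    out[i + u][j + v] += val_a * val_b
    return out


def combined_kernel(g):
    """Single kernel equivalent to 2*(h*g) - (h*g)*g, i.e. 2*g - g*g."""
    gg = conv2d_full(g, g)              # 13x13 when g is 7x7
    pad = (len(gg) - len(g)) // 2       # 3 positions on each side
    kernel = [[-value for value in row] for row in gg]
    for i, row in enumerate(g):
        for j, value in enumerate(row):
            kernel[i + pad][j + pad] += 2 * value
    return kernel


if __name__ == "__main__":
    binomial = [1, 6, 15, 20, 15, 6, 1]   # stand-in for the Gaussian profile
    g7 = [[p * q for q in binomial] for p in binomial]
    k13 = combined_kernel(g7)
    print(len(k13), len(k13[0]))
```

With integer coefficients the one-step and two-step results agree exactly; with the real, normalized Gaussian coefficients the same identity holds up to rounding.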
However, because the Gaussian convolution filter G'f is isotropic and symmetric, its coefficients are repeated along the two-dimensional matrix, which reduces the number of products to be computed when specific pre-adders are applied. Figure 91 gives an example for isotropic filters of size 3×3, where a pre-adder stage reduces the number of products from 9 to 3:

$$
e(j,i) = h_{3\times 3}(j,i) \ast G'_{3\times 3} =
\begin{bmatrix}
P_{j-1,i-1} & P_{j-1,i} & P_{j-1,i+1}\\
P_{j,i-1} & P_{j,i} & P_{j,i+1}\\
P_{j+1,i-1} & P_{j+1,i} & P_{j+1,i+1}
\end{bmatrix}
\ast
\begin{bmatrix}
a & b & a\\
b & c & b\\
a & b & a
\end{bmatrix}
$$

$$
= a \cdot \sum \left( P_{j-1,i-1}, P_{j-1,i+1}, P_{j+1,i-1}, P_{j+1,i+1} \right)
+ b \cdot \sum \left( P_{j-1,i}, P_{j,i-1}, P_{j,i+1}, P_{j+1,i} \right)
+ c \cdot \sum \left( P_{j,i} \right)
$$

Figure 91. Image convolution with symmetric filters. (Each sum Σ(·) is a pre-adder, each product by a, b or c is a partial product, and the final additions form the adder stage.)

For isotropic kernels of size 13×13, the number of products is reduced to 28 when the convolution is performed with a pre-adder stage. After the pre-adder stage, the next stage computes the partial products, and the last stage adds the partial products. The block diagram of the isotropic filtering hardware coprocessor is shown in Figure 92. The modular design processes up to 4 pixels of the input image in parallel.

Figure 92. Image isotropic filtering hardware coprocessor. (Sixteen 8-bit input pixels P15…P0 feed the 13×13×8 kernel windows of the four pixels Pj,i … Pj+3,i; each of the four lanes comprises pre-adders, LUT-stored coefficients and multipliers for the partial products, a final adder stage and saturation to an 8-bit result P′.)
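The counts of 3 products for a 3×3 mask and 28 products for a 13×13 mask can be reproduced by grouping the kernel positions that share a coefficient. The following minimal sketch assumes that "isotropic and symmetric" means invariance under mirroring about both axes and both diagonals, so that each position is identified by the sorted pair of absolute offsets from the centre; the names `symmetry_classes` and `convolve_with_preadders` are hypothetical.

```python
def symmetry_classes(size):
    """Group the offsets of a size x size kernel that share one coefficient."""
    half = size // 2
    classes = {}
    for r in range(-half, half + 1):
        for c in range(-half, half + 1):
            key = (max(abs(r), abs(c)), min(abs(r), abs(c)))
            classes.setdefault(key, []).append((r, c))
    return classes


def convolve_with_preadders(window, coefficients):
    """One output pixel: pre-add pixels per class, then one product per class.

    window: size x size pixel neighbourhood; coefficients: dict keyed like
    symmetry_classes (assumed layout of the coefficient LUT).
    """
    half = len(window) // 2
    total = 0
    for key, offsets in symmetry_classes(len(window)).items():
        pre_sum = sum(window[r + half][c + half] for r, c in offsets)  # pre-adder
        total += coefficients[key] * pre_sum                           # partial product
    return total


if __name__ == "__main__":
    print(len(symmetry_classes(3)), len(symmetry_classes(13)))
    window = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    # a -> corners (1, 1), b -> edges (1, 0), c -> centre (0, 0)
    print(convolve_with_preadders(window, {(1, 1): 1, (1, 0): 2, (0, 0): 3}))
```

Under this assumption a kernel of side 2m+1 has (m+1)(m+2)/2 distinct coefficients, which gives 3 for m = 1 and 28 for m = 6.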
The input stage is composed of a 2-D shift register array of 16×13 pixels. The constant coefficients of the filter G'f are stored in LUTs as convolution operands for the input images. The different stages of the processing are pipelined: the pre-adder stage, the partial product computation stage, and the final stage that adds the partial results. The convolution result is limited to the range [0,255] in order to provide 8-bit greyscale images to be finally stored in the system memory.

Different implementations of the partial products stage are possible: either to use specific hardware multipliers in those FPGAs provided with dedicated DSP blocks, or to synthesize generic descriptions of multipliers (e.g. the vector multiplier method) using the general resources available in FPGAs. Since the design of the coprocessors is modular, each of the blocks can be tuned to the resources available in each programmable logic device.

Field orientation map

At this stage, the system memory stores an enhanced version of the original fingerprint. The next task deduces the dominant ridge-valley orientation of each of the non-segmented image blocks. One specific hardware coprocessor is in charge of this activity: it reads the input image and computes the directional gradients Gx and Gy of the pixels in each image block, as well as the coefficients Σ (2 · Gx · Gy) and Σ (Gx² − Gy²) of each block. Based on those coefficients, the dominant direction of the ridges and valleys present in each block is determined as follows:

$$
\Phi_{block} = 90^\circ + \frac{1}{2} \tan^{-1} \left( \frac{\sum_{block} \left( 2 \cdot G_x \cdot G_y \right)}{\sum_{block} \left( G_x^2 - G_y^2 \right)} \right), \qquad \Phi_{block} \in [0^\circ, 180^\circ)
$$

The block diagram of the hardware coprocessor is depicted in Figure 93.
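A floating-point reference for the orientation formula is useful when validating the hardware results. The sketch below is such a reference under stated assumptions: the function name `block_orientation_deg` is hypothetical, the per-pixel gradients of one block are passed as two flat sequences, and the arctangent is evaluated as a four-quadrant `atan2`, which is what is needed for the result to cover the whole interval [0º,180º).

```python
import math


def block_orientation_deg(gx_values, gy_values):
    """Dominant orientation of one block, in degrees within [0, 180).

    gx_values / gy_values: per-pixel directional gradients of the block.
    """
    y = sum(2 * gx * gy for gx, gy in zip(gx_values, gy_values))
    x = sum(gx * gx - gy * gy for gx, gy in zip(gx_values, gy_values))
    phi = 90.0 + 0.5 * math.degrees(math.atan2(y, x))
    return phi % 180.0  # 180 degrees wraps to 0


if __name__ == "__main__":
    print(block_orientation_deg([5, 5, 5], [0, 0, 0]))   # gradient along x only
    print(block_orientation_deg([0, 0, 0], [5, 5, 5]))   # gradient along y only
    print(block_orientation_deg([3, 3, 3], [3, 3, 3]))   # equal components
```

The case in which both sums are zero is not handled here; `atan2(0, 0)` returns 0 in Python, whereas a hardware implementation would have to flag that block explicitly.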
A two-dimensional shift register array of size 8×5 pixels makes it possible to compute the directional gradients of 4 pixels of the image in parallel. Four multiplier blocks and four made-to-measure ALU operators compute the coefficients (2 · Gx · Gy) and (Gx² − Gy²) of each pixel from directional Sobel masks of size 5×5. Since the image is transferred and processed in vertical slices, and 4 pixels of one block are processed at a time, two consecutive vertical slices need to be processed to obtain the results of the 32 blocks in one row. Two dual-port memories act as accumulator buffers that store the partial results Σ (2 · Gx · Gy) and Σ (Gx² − Gy²) of each of the 32 blocks in one row while the first slice is being processed. When the second vertical slice is processed, the partial results are added to the new results to compute the final results of the blocks in the row.

An 8-iteration pipelined CORDIC processor computes the dominant ridge orientation of an image block from its previously computed coefficients Σ (2 · Gx · Gy) and Σ (Gx² − Gy²). The CORDIC processor codes the orientation of the ridge-valley pattern of each image block in one byte, in the range [0º,180º) with a resolution of 1º. If the angle of a block is indeterminate (both coefficients Σ (2 · Gx · Gy) and Σ (Gx² − Gy²) are null), the result is set to 255 (decimal value), which indicates that the corresponding block needs to be disregarded and segmented.

Once all the vertical slices of the image have been processed, the field orientation map deduced from the image is available to be stored in the system memory. Therefore, the input of this stage is the enhanced greyscale image of size 268×460 pixels, and the output is a field orientation vector deduced from the 32×56 blocks of the image, resulting in up to 1792 orientations.
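The operation of a CORDIC processor in vectoring mode can be sketched in software. The model below is a minimal floating-point illustration and not the pipelined fixed-point design of the coprocessor: the function names, the quadrant pre-rotation and the rounding to the nearest degree are assumptions. Each of the 8 iterations rotates the vector by ±arctan(2^−k), which in hardware requires only shifts and additions plus a small table of angles.

```python
import math

# Elementary rotation angles arctan(2^-k), in degrees, for 8 iterations.
ATAN_TABLE = [math.degrees(math.atan(2.0 ** -k)) for k in range(8)]


def cordic_atan2_deg(y, x):
    """Four-quadrant angle of (x, y) in degrees; None if both inputs are null."""
    if x == 0 and y == 0:
        return None
    angle = 0.0
    if x < 0:  # pre-rotate by 90 degrees into the right half-plane (assumption)
        if y >= 0:
            x, y, angle = y, -x, 90.0
        else:
            x, y, angle = -y, x, -90.0
    for k, step in enumerate(ATAN_TABLE):
        factor = 2.0 ** -k  # a k-bit right shift in fixed-point hardware
        if y > 0:
            x, y, angle = x + y * factor, y - x * factor, angle + step
        else:
            x, y, angle = x - y * factor, y + x * factor, angle - step
    return angle


def orientation_byte(sum_2gxgy, sum_gx2_minus_gy2):
    """Block orientation coded in one byte: 0..179 degrees, or 255 if undefined."""
    angle = cordic_atan2_deg(sum_2gxgy, sum_gx2_minus_gy2)
    if angle is None:
        return 255
    return int(round(90.0 + 0.5 * angle)) % 180


if __name__ == "__main__":
    print(cordic_atan2_deg(50, 50), orientation_byte(0, 100), orientation_byte(0, 0))
```

After 8 iterations the residual angle error is bounded by arctan(2^−7), about 0.45º, so the halved angle used for the orientation stays well within the 1º output resolution.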
Figure 93. Field orientation map hardware coprocessor. (Eight 8-bit input pixels P7…P0 feed the 5×5×8 kernel windows; four lanes compute Gx and Gy, the products Gx·Gy through multipliers and the differences Gx² − Gy² through ALUs; adders, a <<1 shift and two dual-port RAMs accumulate Σ{2·Gx·Gy} (Y) and Σ{Gx² − Gy²} (X) per block, and a CORDIC block outputs the 8-bit angle Φ = ½ · tan⁻¹(Y/X) + 90º.)

Filtered field orientation map

The data from the segmentation matrix, deduced in the segmentation stage, and the field orientation map, computed in the previous stage, are merged in a new matrix with the updated information of each block. All the blocks previously segmented (either in the segmentation stage or in the field orientation map computation stage) have their field orientation coded with the decimal value 255, which identifies the background of the image. The rest of the blocks have their orientations coded with decimal values in the range [0,180) and constitute the image foreground.

The field orientation map filtering processor obtains a smoothed version of the field orientation map by performing a convolution, at block level, of a continuous version of the input orientation field.
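A continuous version of the orientation field is needed because orientations wrap around at 180º: two neighbouring blocks at 179º and 1º are almost parallel, yet their arithmetic mean is 90º. Averaging the vectors (cos 2Φ, sin 2Φ) instead removes this discontinuity. The sketch below illustrates the idea on a block map stored as a list of lists. It is a minimal model under stated assumptions: the function name is hypothetical, floating-point `sin`/`cos`/`atan2` stand in for the LUTs and the CORDIC processor, neighbours outside the map are simply skipped, and the null test uses a small tolerance instead of an exact integer comparison.

```python
import math

BACKGROUND = 255  # code of segmented blocks


def smooth_orientation(field, u, v, eps=1e-9):
    """Average orientation around block (u, v) over a 5x5 block neighbourhood."""
    sin_sum = cos_sum = 0.0
    for i in range(-2, 3):
        for j in range(-2, 3):
            r, c = u + i, v + j
            if 0 <= r < len(field) and 0 <= c < len(field[0]):
                phi = field[r][c]
                if phi != BACKGROUND:  # segmented neighbours are disregarded
                    sin_sum += math.sin(math.radians(2 * phi))
                    cos_sum += math.cos(math.radians(2 * phi))
    if abs(sin_sum) < eps and abs(cos_sum) < eps:
        return BACKGROUND  # undetermined angle: the block is segmented
    half_angle = 0.5 * math.degrees(math.atan2(sin_sum, cos_sum))
    return int(round(half_angle)) % 180


if __name__ == "__main__":
    field = [[BACKGROUND] * 5 for _ in range(5)]
    field[2][1], field[2][3] = 179, 1
    print(smooth_orientation(field, 2, 2))  # close to 0 degrees, not 90
```

The expression evaluated in the last lines of the function is the block-level average that is formalised next.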
A kernel of size 5×5 blocks is used to compute the average field orientation map as follows:

$$
\Phi'_{block}(u,v) = \frac{1}{2} \tan^{-1} \left( \frac{\sum_{i=-2}^{+2} \sum_{j=-2}^{+2} \sin \left( 2 \cdot \Phi_b(u+i, v+j) \right)}{\sum_{i=-2}^{+2} \sum_{j=-2}^{+2} \cos \left( 2 \cdot \Phi_b(u+i, v+j) \right)} \right)
$$

The trigonometric functions sin(2·Φb) and cos(2·Φb) need to be computed for each of the blocks. Two different kinds of LUTs are implemented in the hardware coprocessor to obtain sin(2·Φb) and cos(2·Φb) from the input Φb, the field orientation of one block. Two independent adder blocks compute the coefficients Σ5×5 sin(2·Φb) and Σ5×5 cos(2·Φb) over the neighbourhood of size 5×5 around one block. Both coefficients are the inputs of an 8-iteration pipelined CORDIC processor that computes the resultant smoothed field orientation of each block of the image (disregarding in the computation those neighbour blocks that have been segmented). The output of the processing is a new matrix with the updated information of the filtered field orientation map for all the 1792 blocks of the fingerprint image. If the CORDIC processor receives both input coefficients Σ5×5 sin(2·Φb) and Σ5×5 cos(2·Φb) as null when computing the tan⁻¹() of one block, the block under consideration is segmented (its filtered field orientation is coded as 255).

At the end of the processing, a new matrix of size 32×56 words is obtained. The segmented blocks have their filtered field orientation coded as 255, whereas the non-segmented blocks feature a value coded in the range [0,180).

Directional filtering and image binarization

One hardware coprocessor has been developed in order to perform both the image directional filtering and the image binarization tasks in one single stage.
The filtered field orientation map previously computed is stored in a dedicated memory internal to the hardware coprocessor and is used along the processing to indicate the orientation of the pixels under evaluation. The hardware coprocessor is in charge of the following tasks:
- To read the enhanced image e(x,y) stored in the system memory and to transfer it sequentially to the core module in charge of the directional filtering and binarization processes.
- While the image is being transferred and processed, to use the filtered field orientation map to know the dominant orientation of the region of the image that is being processed. Gabor filters that match the orientation of the pixels under evaluation are built and used.
- To perform the convolution of each portion of the image with its corresponding oriented Gabor filter of size 7×7 pixels.
- To compute, in parallel, the value of the adaptive binarization thresholds, which depend on the directional Gabor filter and on the portion of the image that is convolved with that Gabor filter at a given time.
- To compare the convolution results with those adaptive thresholds to determine the binarization result of each pixel of the image.
- And finally, to save the resultant binary image in the system memory.

As a result of the processing, a binary version bin(x,y), deduced from the input greyscale image e(x,y) and its filtered field orientation map, is obtained. The binarization process takes into consideration the segmentation results: the image pixels that belong to segmented blocks are automatically converted into white pixels in the binarized version of the image. The block diagram of the hardware coprocessor is depicted in Figure 95.
An input shift register matrix of size 10×7 pixels is used to transfer the greyscale image to the core module. Since the greyscale image needs to be convolved with Gabor filters of size 7×7, this matrix allows up to 4 pixels of the image to be computed in parallel. The 4 pixels processed at one time belong to the same image block; therefore, all 4 pixels share the same dominant orientation and can be convolved with the same Gabor filter. A dual-port memory is used as a buffer to store the filtered field orientation map of the image to be processed. By knowing the block to which the pixels under process belong, it is possible to identify their orientation Φ' and to build the directional Gabor filter Hf,Φ' that matches it. Since the Gabor filters are symmetric about their centre coefficient, it is possible to store, in two dedicated buffers instantiated in the design (implemented as ROM memories in Figure 95), the coefficients of the oriented Gabor filters and the sums of those coefficients respectively, as shown in Figure 94. This enables a fast computation of (i) the convolution and (ii) the dynamic thresholds used in the binarization process. As shown in the figure, the orientation of the block is used as the address to access this information.

$$H_{f,\Phi'}^{7\times7} = \begin{bmatrix}
a & b & c & d & e & f & g \\
h & i & j & k & l & m & n \\
o & p & q & r & s & t & u \\
v & w & x & y & x & w & v \\
u & t & s & r & q & p & o \\
n & m & l & k & j & i & h \\
g & f & e & d & c & b & a
\end{bmatrix}$$

Coefficients of Hf,Φ' = { a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y } (memory addressed by Φ'BLOCK(u,v))

Σ coefficients of Hf,Φ' = y + 2 · Σ( a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x ) (memory addressed by Φ'BLOCK(u,v))

Figure 94. Symmetry of Gabor filters. Construction of directional Gabor filters Hf,Φ' from the field orientation data.
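The symmetry in Figure 94 can be illustrated with a short Python sketch. It only models the idea of storing 25 coefficients and their precomputed sum; the function names and the placeholder coefficient values are assumptions made for illustration, not the ROM contents of the actual design.

```python
def build_symmetric_kernel(coeffs):
    """Expand the 25 stored coefficients (a..y) into a 7x7 kernel.

    The kernel is mirrored through its centre, as in Figure 94:
    entry k in row-major order equals entry 48 - k.
    """
    if len(coeffs) != 25:
        raise ValueError("expected 25 coefficients (a..y)")
    flat = [0] * 49
    for k, c in enumerate(coeffs):
        flat[k] = c        # first half of the kernel, centre included (k = 24)
        flat[48 - k] = c   # mirrored position
    return [flat[r * 7:(r + 1) * 7] for r in range(7)]


def kernel_sum(coeffs):
    """Sum of all 49 kernel entries: y + 2 * (a + b + ... + x)."""
    return coeffs[24] + 2 * sum(coeffs[:24])


# Placeholder values (hypothetical, not real Gabor coefficients).
coeffs = list(range(1, 26))
kernel = build_symmetric_kernel(coeffs)
total = kernel_sum(coeffs)
```

Storing the sum next to the coefficients means the threshold term of the binarization never requires adding the 49 kernel entries at run time.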
That information is used to build the Gabor filter applied in the convolution process, as well as to compute the dynamic thresholds used in the binarization task:

$$bin(x,y) = \begin{cases}
1 \text{ (black/ridge pixel)}, & \text{if } 49\cdot\sum_{i=-3}^{+3}\sum_{j=-3}^{+3} H_{f,\Phi'}(i,j)\cdot e(x+i,y+j) < \sum_{i=-3}^{+3}\sum_{j=-3}^{+3} H_{f,\Phi'}(i,j)\cdot\sum_{i=-3}^{+3}\sum_{j=-3}^{+3} e(x+i,y+j) \\
0 \text{ (white/valley pixel)}, & \text{otherwise}
\end{cases}$$

By means of shifts and additions of the convolution result Σ (Hf,Φ' · e), it is possible to compute the term 49 · Σ (Hf,Φ' · e) of the comparison:

$$z = \sum_{i=-3}^{+3}\sum_{j=-3}^{+3} H_{f,\Phi'}(i,j)\cdot e(x+i,y+j) \;\rightarrow\; 49\cdot z = (32+16+1)\cdot z = (2^5\cdot z)+(2^4\cdot z)+(1\cdot z)$$

The other term (Σ Hf,Φ' · Σ e) of the comparison is also computed on-line. Its first factor Σ Hf,Φ' is pre-stored in one of the dedicated memory blocks, and its second factor Σ e is computed on-line. The product of both factors provides the dynamic threshold (Σ Hf,Φ' · Σ e) used in the binarization of the pixels of the image.

[Figure 95: block diagram. The input data (pixels P9 to P0, 8 bits each) feed a 7×7×8 window of e(x,y). Pre-adders, partial-product multipliers and adders compute the convolution with the coefficients Hf,Φ'(i,j), read from a ROM addressed by the block orientation Φ'BLOCK held in a DP-RAM. Shifts and additions produce A = 49 · Σ49{Hf,Φ'(i,j) · e(x+i,y+j)}; multipliers produce B = Σ49 Hf,Φ'(i,j) · Σ49 e(x+i,y+j). Four comparators (A < B) perform the pixel binarization, and a demultiplexer with DP-RAMs packs the 4-bit results into the 32-bit output BIN[31:0].]

Figure 95. Image binarization hardware coprocessor.
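As a software illustration of the comparison above, the following Python sketch binarizes the centre pixel of one 7×7 window. It is a behavioural model only, under the assumption that the kernel and the window are given as 7×7 lists of integers; the function names are hypothetical and no claim is made about the fixed-point formats used in the coprocessor.

```python
def times_49(z):
    """Multiply by 49 using shifts and additions only: 32*z + 16*z + z."""
    return (z << 5) + (z << 4) + z


def binarize_pixel(window, kernel):
    """Return 1 (black/ridge) or 0 (white/valley) for the centre pixel.

    window: 7x7 greyscale neighbourhood e(x+i, y+j)
    kernel: 7x7 oriented filter H(i, j) for the block of the pixel
    """
    conv = sum(kernel[i][j] * window[i][j] for i in range(7) for j in range(7))
    # Dynamic threshold: (sum of coefficients) * (sum of window pixels).
    threshold = sum(map(sum, kernel)) * sum(map(sum, window))
    return 1 if times_49(conv) < threshold else 0
```

Dividing both sides of the comparison by 49 shows the design choice: the filter response is compared against the sum of the coefficients times the mean grey level of the window, without any division in hardware.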
As a result of the binarization process, each 8-bit greyscale pixel is transformed into one binary bit associated with either the fingerprint ridge or the valley area. Thanks to the parallelism and pipeline strategies used in this design, 4 pixels are transformed into 1 nibble of the binary image on each rising edge of the coprocessor clock. The binary image is packed into 32-bit words before being transferred to the system memory. For this purpose, several dual-port memories are instantiated in the design to temporarily collect the processed nibbles before the 32-bit data are transferred, in the desired order and format, to the system memory map. In this way, the resultant binary image is 8 times smaller than the original 8-bit greyscale image, so the memory resources needed to store it are considerably reduced.

Image smoothing

The smoothing hardware coprocessor is in charge of convolving the binary image bin(x,y) with specific filters SΦ' of size 7×7 pixels oriented according to the dominant direction Φ' of the pixels in each local region. The processing that is carried out is as follows:

$$smooth(x,y) = \begin{cases}
1 \text{ (black/ridge pixel)}, & \text{if } bin(x,y) = 0 \text{ and } 2\cdot\sum_{i=-3}^{+3}\sum_{j=-3}^{+3} S_{\Phi'}(i,j)\cdot bin(x+i,y+j) > \sum_{i=-3}^{+3}\sum_{j=-3}^{+3} S_{\Phi'}(i,j) \\
bin(x,y), & \text{otherwise}
\end{cases}$$

The block diagram is shown in Figure 96. A shift register array of size 38×7 pixels makes it possible to smooth up to 32 pixels of the binary image in parallel. As in the binarization stage, the smoothing processor makes use of a dedicated memory where the filtered field orientation map of the fingerprint image is stored in a specific format.
By reading the content of this memory while the processor receives the vertical slices of the binary image, it is possible to know the orientation of the pixels under process. This information is used to determine the directional filters to be applied in the convolution operation. Up to 32 pixels, corresponding to 4 different blocks, are processed at a given time. The 8 pixels of one block are convolved with the directional filter that matches the orientation of the block. A set of ROM memories is used to store (i) the coefficients of the directional binary filter masks of size 7×7, and (ii) the sum of the coefficients of each oriented filter. This makes possible the fast computation of the convolution, as well as the calculation of the adaptive thresholds used in the smoothing evaluation process. Dealing with binary images and binary filters simplifies the convolution process, which is reduced to the addition of logical AND operations carried out at bit level, as follows:

$$bin(j,i) * S_{\Phi'} = \begin{bmatrix}
P_{j-3,i-3} & P_{j-3,i-2} & P_{j-3,i-1} & P_{j-3,i} & P_{j-3,i+1} & P_{j-3,i+2} & P_{j-3,i+3} \\
P_{j-2,i-3} & P_{j-2,i-2} & P_{j-2,i-1} & P_{j-2,i} & P_{j-2,i+1} & P_{j-2,i+2} & P_{j-2,i+3} \\
P_{j-1,i-3} & P_{j-1,i-2} & P_{j-1,i-1} & P_{j-1,i} & P_{j-1,i+1} & P_{j-1,i+2} & P_{j-1,i+3} \\
P_{j,i-3} & P_{j,i-2} & P_{j,i-1} & P_{j,i} & P_{j,i+1} & P_{j,i+2} & P_{j,i+3} \\
P_{j+1,i-3} & P_{j+1,i-2} & P_{j+1,i-1} & P_{j+1,i} & P_{j+1,i+1} & P_{j+1,i+2} & P_{j+1,i+3} \\
P_{j+2,i-3} & P_{j+2,i-2} & P_{j+2,i-1} & P_{j+2,i} & P_{j+2,i+1} & P_{j+2,i+2} & P_{j+2,i+3} \\
P_{j+3,i-3} & P_{j+3,i-2} & P_{j+3,i-1} & P_{j+3,i} & P_{j+3,i+1} & P_{j+3,i+2} & P_{j+3,i+3}
\end{bmatrix} * \begin{bmatrix}
a & b & c & d & e & f & g \\
h & i & j & k & l & m & n \\
o & p & q & r & s & t & u \\
v & w & x & y & x & w & v \\
u & t & s & r & q & p & o \\
n & m & l & k & j & i & h \\
g & f & e & d & c & b & a
\end{bmatrix}$$

$$= \sum\big(\mathrm{AND}(P_{j-3,i-3}, a),\ \mathrm{AND}(P_{j-3,i-2}, b),\ \mathrm{AND}(P_{j-3,i-1}, c),\ \ldots,\ \mathrm{AND}(P_{j,i}, y),\ \ldots,\ \mathrm{AND}(P_{j+3,i+1}, c),\ \mathrm{AND}(P_{j+3,i+2}, b),\ \mathrm{AND}(P_{j+3,i+3}, a)\big)$$

Comparing the image convolution results with the dynamic thresholds yields a smoothed version smooth(x,y) of the binary image in which the noise is reduced and the clarity of the fingerprint ridges is improved. A fully-pipelined hardware coprocessor, able to deliver 32 pixels per clock, is in charge of the processing. The resultant image is saved in the system memory and is used in the next processing stages of the recognition algorithm as the basis image from which the distinctive features of the fingerprint (the minutiae) are extracted.

[Figure 96: block diagram. The input data (1-bit pixels P37 to P0) feed a shift register array; a DP-RAM supplies the orientations Φ'0 to Φ'3 of the 4 blocks (4×8 pixels) under process; ROMs addressed by these orientations provide the 7×7×1 filter masks SΦ'n(i,j) and their sums Σ49 SΦ'n(i,j); a binary convolver computes 2 · Σ49{SΦ'n(i,j) · bin(x+i,y+j)} for each pixel; 32 comparators perform the pixel smoothing and deliver the 32-bit output SMTH[31:0].]

Figure 96. Image smoothing hardware coprocessor.
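The bit-level convolution and the smoothing rule can be modelled in a few lines of Python. This is a minimal sketch, assuming that the 7×7 window and the 7×7 binary mask are each packed row by row into a 49-bit integer; the packing order and the function names are choices made for this example, not details taken from the coprocessor.

```python
def pack_bits(rows):
    """Pack a 7x7 matrix of 0/1 values into a 49-bit integer, row-major."""
    value = 0
    for row in rows:
        for bit in row:
            value = (value << 1) | (bit & 1)
    return value


def smooth_pixel(window_bits, mask_bits):
    """Apply the smoothing rule to the centre pixel of a packed 7x7 window."""
    centre = (window_bits >> 24) & 1          # entry 24 of 49 is the centre
    if centre == 1:
        return 1                              # ridge pixels are kept as they are
    # Binary convolution: bitwise AND followed by a population count.
    overlap = bin(window_bits & mask_bits).count("1")
    mask_weight = bin(mask_bits).count("1")   # pre-stored in ROM in the design
    return 1 if 2 * overlap > mask_weight else 0
```

A valley pixel is therefore turned into a ridge pixel only when more than half of the mask positions along the local ridge direction are already ridge pixels.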
5.5. Feature Extraction Approach

The third stage of the suggested fingerprint recognition algorithm, in both the enrolment and authentication phases, is the extraction of the discriminatory information available in the fingerprint impression. The present work focuses on hybrid matching techniques, which means that several features of different nature are used to characterize and match fingerprints. One of the important features used in this work to discriminate among fingerprints is the field orientation map. Its main advantage is that the ridge orientation is robust against distortions; it can be reliably extracted even from poor-quality fingerprints. Its weak point, however, is its low distinctiveness. To compensate for this limitation, the minutiae are also used as fingerprint identifiers in the suggested recognition algorithm. In contrast to the field orientation map, the spatial distribution of minutia points along the fingerprint pattern is highly discriminatory.
Nevertheless, the main disadvantage of minutiae is that the accuracy of minutiae extraction algorithms drops drastically when dealing with noisy fingerprint impressions. Therefore, it is the combination of both features that improves the recognition performance of the whole system and gives better results than handling each feature individually. The inputs of this stage are the enhanced binary version of the original fingerprint and its filtered field orientation map; the output is the discriminatory information deduced from the fingerprint, ordered, encoded and packed in a specific format. Both feature descriptors (field orientation map and minutiae) are used as the inherent and reliable information linked to the individual’s identity. In the enrolment stage, this information, together with any other personal data of the user, is stored in an individualized database (e.g. a customized smart card) or a centralized one (e.g. a laptop logon system for the members of a family, parents and children, where the application privileges depend on the user that is enrolled) to provide the user with the proper rights in any application. In the authentication stage, the extracted features of the user are matched against the features recorded in the system database for the individual the user claims to be. The matching process either confirms that the user who attempts to access the system is the person he claims to be, or identifies the user as an impostor who attempts to access the system fraudulently.

5.5.1. Algorithm Description

Although the field orientation map was already obtained in the enhancement process detailed previously, the binary image needs further processing before the fingerprint minutiae can be properly extracted. In this work, an image thinning process is carried out in order to transform the binarized ridge-valley pattern into a ridge skeleton one pixel wide.
From the thinned image, it is possible to deduce the ridge endings and ridge bifurcations that characterize the fingerprint. The fingerprint feature extraction algorithm is split into a set of sequential processing stages, as shown in Figure 97.

[Figure 97: flow diagram of the fingerprint feature extraction phase: START, image thinning, minutiae extraction, minutiae filtering, formatting of the feature sets (minutiae and filtered field orientation map), END.]

Figure 97. Fingerprint feature extraction algorithm: processing stages.

Image thinning

Among the different methods developed in the literature for thinning binarized fingerprint images, the author has selected those based on the iterative deletion of ridge pixels through pixel neighbourhood analysis. The image thinning process thus consists in reducing the thickness of the fingerprint ridges until a ridge skeleton one pixel wide is obtained, while keeping the geometry and the connectivity of the original ridge pattern. The proposed algorithm successively deletes layers of ridge pixels on the boundary of the ridge pattern until only a skeleton remains. The deletion or retention of a ridge (black) pixel depends on the configuration of pixels in its local neighbourhood. Given a ridge pixel p, the set of neighbours of p in a local kernel of size a×b is termed the neighbourhood of p, N(p). For a kernel of size 3×3, the neighbourhood of p is depicted in Figure 98.

p[7]  p[0]  p[1]
p[6]   p    p[2]
p[5]  p[4]  p[3]

Figure 98. Image pixel neighbourhood definition (kernel 3×3).

The neighbours p[0], p[1], … , p[7] are called the 8-neighbours of p. The weight number of p, WN(p), is defined as:

$$WN(p) = \sum_{k=0}^{7} p[k]\cdot 2^{k}$$

where ridge (black) pixels are coded as ‘1’ and valley (white) pixels as ‘0’. Therefore, for a local 3×3 neighbourhood, the weight numbers are in the range [0,255].
The neighbour number of p, NN(p), is the number of nonzero neighbours of p:

$$NN(p) = \sum_{k=0}^{7} p[k]$$

The connection number of p, CN(p), is defined as:

$$CN(p) = \sum_{k=0,2,4,6} \bar{p}[k]\cdot\big(p[k+1] \cup p[k+2]\big)$$

where:

$$\bar{p}[k] = 1 - p[k], \qquad p[8] = p[0]$$

A ridge pixel p is called globally removable if CN(p) = 1 and NN(p) > 1. For local neighbourhoods of size 3×3, the set of globally removable pixels (GRS) is composed of a total of 108 weight numbers:

GRS_{3×3} = {3, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23, 24, 28, 29, 30, 31, 48, 52, 53, 54, 55, 56, 60, 61, 62, 63, 65, 67, 69, 71, 77, 79, 80, 81, 83, 84, 86, 88, 89, 91, 92, 94, 96, 97, 99, 101, 103, 109, 111, 112, 113, 115, 116, 118, 120, 121, 123, 124, 126, 129, 131, 133, 135, 141, 143, 149, 151, 157, 159, 181, 183, 189, 191, 192, 193, 195, 197, 199, 205, 207, 208, 209, 211, 212, 214, 216, 217, 219, 220, 222, 224, 225, 227, 229, 231, 237, 239, 240, 241, 243, 244, 246, 248, 249, 251, 252, 254}

A ridge pixel p is called irreducible if p is not a globally removable point. The thinning process consists in building the skeleton of the binary image by keeping only the irreducible pixels while preserving the connectivity of the image. The proposed algorithm makes use of both sequential and parallel thinning techniques:
a) In sequential thinning techniques, the pixels are examined in a fixed sequence in each iteration (e.g. from left to right, and from top to bottom), and the deletion or retention of pixel p in the nth iteration depends on all the operations performed up to that moment, that is, on the result of the (n-1)th iteration as well as on the pixels already processed in the nth iteration.
b) In parallel thinning techniques, by contrast, the deletion or retention of a pixel p in the nth iteration depends only on the result that remains after the (n-1)th iteration.
Therefore, in such a case all pixels can be examined concurrently in each iteration. These techniques are suitable for implementation on parallel processors, and the pixels satisfying the set of removal conditions can be deleted simultaneously.

The suggested algorithm is split into two steps. In the first step, a 2-subcycle parallel thinning process, abstracted from [Gao and Hall, 1989], is iteratively applied to the input binary image. All the ridge pixels of the image are examined for deletion or retention in parallel, based on the results of the previous iteration. However, each iteration is split into two sub-iterations, and in each sub-iteration only a subset of the contour pixels is considered for removal. The criterion to preserve the connectivity of the image pattern while iteratively deleting pixels is as follows:
(i) In the first sub-iteration, only the globally removable pixels in the set GH1 are removed. Once the first sub-iteration is done and the changes are applied, a new image is obtained.
(ii) In the second sub-iteration, the new image is analysed, and only the globally removable pixels in the set GH2 are removed. As a result of the second sub-iteration, a new image is obtained for further processing.

The above strategy, based on sub-iterations 1 and 2, is iteratively applied to the new images until no more ridge pixels can be deleted. The removable sets GH1 and GH2 are as follows:

GH1_{3×3} = {28, 56, 60, 65, 67, 80, 81, 83, 88, 89, 92, 97, 99, 112, 113, 115, 120, 121, 124, 131, 193, 195, 208, 209, 211, 216, 217, 220, 224, 225, 227, 240, 241, 243, 248, 249, 252}

GH2_{3×3} = {5, 7, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30, 31, 52, 53, 54, 55, 56, 60, 61, 62, 63, 131, 133, 135, 141, 143, 149, 151, 157, 159, 193, 195, 197, 199, 205, 207}

From the first step, a quasi-thinned image is obtained.
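Before moving on to the second step, the definitions of WN(p), NN(p) and CN(p) given earlier can be sketched in Python. The sketch takes the eight neighbours p[0..7] as a list of 0/1 values in the order of Figure 98 and uses a logical OR for the ∪ operator; the function names are hypothetical. It is meant to illustrate how a weight number maps to the "globally removable" condition, not to reproduce the LUT contents of the hardware.

```python
def weight_number(p):
    """WN(p): sum of p[k] * 2**k over the 8 neighbours."""
    return sum(bit << k for k, bit in enumerate(p))


def neighbour_number(p):
    """NN(p): number of nonzero neighbours."""
    return sum(p)


def connection_number(p):
    """CN(p): sum over k = 0, 2, 4, 6 of (1 - p[k]) * (p[k+1] OR p[k+2])."""
    q = list(p) + [p[0]]                      # p[8] = p[0]
    return sum((1 - q[k]) * (q[k + 1] | q[k + 2]) for k in (0, 2, 4, 6))


def is_globally_removable(p):
    """A ridge pixel is globally removable if CN(p) = 1 and NN(p) > 1."""
    return connection_number(p) == 1 and neighbour_number(p) > 1


def neighbours_from_weight(wn):
    """Inverse of weight_number: recover p[0..7] from a value in [0, 255]."""
    return [(wn >> k) & 1 for k in range(8)]
```

For instance, the weight number 3 (p[0] and p[1] set) satisfies the condition, whereas 9 (p[0] and p[3] set) gives CN = 2 and is therefore not removable; both results agree with the membership of these values in the GRS set listed earlier.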
The second and last computational step consists of a single sequential thinning process responsible for detecting the globally removable pixels that may still be present in the partially thinned image. The remaining ridge pixels are examined in a fixed sequence (from left to right, and from top to bottom) against the complete set of removable pixels GRS. At the end of the iteration, an irreducible bitmap with the ridge skeleton of the fingerprint is obtained. The general processing flow of the suggested thinning algorithm is detailed in Figure 99. From the resultant image, it is easy to extract the set of minutia points in the next processing stages.

    Input: binary image I
    Output: skeleton S of I

    S = I;                                          // output image S initialization
    // processing step 1
    do {
        removed_pixels = 0;
        // subcycle 1
        for all pixel p in S {                      // parallel processing
            if (p is ridge pixel) {                 // ridge pixel identification
                if (p is removable pixel & p matches pattern GH1) {
                    convert p to a valley pixel in S;
                    removed_pixels = 1;
                }
            }
        }
        // subcycle 2
        for all pixel p in S {                      // parallel processing
            if (p is ridge pixel) {                 // ridge pixel identification
                if (p is removable pixel & p matches pattern GH2) {
                    convert p to a valley pixel in S;
                    removed_pixels = 1;
                }
            }
        }
    } while (removed_pixels ≠ 0);
    // processing step 2
    for r := upper row of S to r := bottom row of S {            // sequential processing
        for c := left column of S to c := right column of S {    // sequential processing
            if (p[r, c] is ridge pixel) {           // ridge pixel identification
                if (p is removable pixel & p matches pattern GRS) {
                    convert p to a valley pixel in S;
                }
            }
        }
    }
    return S;                                       // skeleton S of I

Figure 99. Image thinning process flow.
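The flow of Figure 99 can be modelled in software as follows. This Python sketch uses the GH1, GH2 and GRS sets exactly as listed in the text, but several details are assumptions made for the example: the neighbour ordering (p[0] above the pixel, then clockwise, as read from Figure 98), the treatment of pixels outside the image as valleys, and all function names. It is a behavioural illustration, not the coprocessor implementation.

```python
GH1 = {28, 56, 60, 65, 67, 80, 81, 83, 88, 89, 92, 97, 99, 112, 113, 115, 120,
       121, 124, 131, 193, 195, 208, 209, 211, 216, 217, 220, 224, 225, 227,
       240, 241, 243, 248, 249, 252}
GH2 = {5, 7, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30, 31, 52, 53, 54, 55, 56, 60,
       61, 62, 63, 131, 133, 135, 141, 143, 149, 151, 157, 159, 193, 195, 197,
       199, 205, 207}
GRS = {3, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23, 24, 28, 29, 30, 31, 48, 52,
       53, 54, 55, 56, 60, 61, 62, 63, 65, 67, 69, 71, 77, 79, 80, 81, 83, 84,
       86, 88, 89, 91, 92, 94, 96, 97, 99, 101, 103, 109, 111, 112, 113, 115,
       116, 118, 120, 121, 123, 124, 126, 129, 131, 133, 135, 141, 143, 149,
       151, 157, 159, 181, 183, 189, 191, 192, 193, 195, 197, 199, 205, 207,
       208, 209, 211, 212, 214, 216, 217, 219, 220, 222, 224, 225, 227, 229,
       231, 237, 239, 240, 241, 243, 244, 246, 248, 249, 251, 252, 254}

# (row, column) offsets of p[0]..p[7]; ordering assumed from Figure 98.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def weight(img, r, c):
    """Weight number of pixel (r, c); pixels outside the image count as valley."""
    wn = 0
    for k, (dr, dc) in enumerate(OFFSETS):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(img) and 0 <= cc < len(img[0]) and img[rr][cc]:
            wn |= 1 << k
    return wn


def parallel_pass(img, lut):
    """One sub-iteration: all decisions use the image as it was on entry."""
    hits = [(r, c) for r in range(len(img)) for c in range(len(img[0]))
            if img[r][c] and weight(img, r, c) in lut]
    for r, c in hits:
        img[r][c] = 0
    return bool(hits)


def thin(image):
    """Two-step thinning: iterative GH1/GH2 passes, then one sequential GRS pass."""
    s = [list(row) for row in image]
    changed = True
    while changed:                                # processing step 1
        changed = parallel_pass(s, GH1)
        changed = parallel_pass(s, GH2) or changed
    for r in range(len(s)):                       # processing step 2
        for c in range(len(s[0])):
            if s[r][c] and weight(s, r, c) in GRS:
                s[r][c] = 0                       # visible to later pixels
    return s
```

The difference between the two steps shows in the code: `parallel_pass` collects all removable pixels before deleting any of them, whereas the final loop deletes in place so that each decision takes the previous deletions into account.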
Figure 100 shows the thinned version of the binarized and smoothed input fingerprint image after application of the proposed thinning algorithm.

Figure 100. Image skeletonization: a) smoothed image, b) thinned image.

Minutiae extraction

The scientific community has accepted minutia points as efficient fingerprint characteristics, and their high discriminating power has been proven across a long list of biometric matching algorithms [Maltoni et al., 2009]. It is easy to extract the minutiae set from the thinned fingerprint image. The most commonly employed method of minutiae extraction is based on the crossing number (CrN) concept. The crossing number at point p is expressed as a function of the pixel values in a 3×3 local neighbourhood, as depicted in Figure 98. Given a ridge pixel p, the crossing number at point p is defined as half of the cumulative successive differences between pairs of adjacent pixels belonging to the 8-neighbourhood of p:

$$CrN = \frac{1}{2}\sum_{i=0}^{7}\left|P_i - P_{i+1}\right|, \qquad P_8 = P_0$$

The ridge point p is considered a candidate ridge ending when CrN = 1, a candidate ridge bifurcation when CrN = 3 or CrN = 4, and a non-minutia point otherwise, as shown in Table 55.

CrN   Pixel type              Minutia/Non-minutia
0     Isolated pixel          Non-minutia
1     Ridge ending            Minutia
2     Ridge connector pixel   Non-minutia
3     Ridge bifurcation       Minutia
4     Ridge crossing          Minutia

Table 55. Ridge pixels type definition.

By applying the above criterion to all ridge pixels of the thinned image, it is possible to identify the candidate minutia points present in the fingerprint. For each extracted candidate, the following information is recorded: (i) x and y coordinates, (ii) orientation of the associated ridge segment at the minutia point, and (iii) type of minutia (ridge ending or ridge bifurcation/crossing), as shown in Figure 101.
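The crossing number criterion and the classification of Table 55 can be written compactly in Python. The sketch below takes the eight neighbours P0..P7 of a ridge pixel as a list of 0/1 values in cyclic order; the function and dictionary names are hypothetical and chosen only for this illustration.

```python
PIXEL_TYPE = {
    0: "isolated pixel",
    1: "ridge ending",
    2: "ridge connector pixel",
    3: "ridge bifurcation",
    4: "ridge crossing",
}


def crossing_number(p):
    """CrN = half the sum of |P_i - P_(i+1)| for i = 0..7, with P_8 = P_0."""
    transitions = sum(abs(p[i] - p[(i + 1) % 8]) for i in range(8))
    return transitions // 2   # the number of 0/1 transitions on a cycle is even


def is_minutia_candidate(p):
    """Candidates are ridge endings (1), bifurcations (3) and crossings (4)."""
    return crossing_number(p) in (1, 3, 4)
```

A neighbourhood with a single ridge neighbour gives CrN = 1 (ridge ending), while three mutually non-adjacent ridge neighbours give CrN = 3 (ridge bifurcation).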
[Figure 101: two example minutiae in the X-Y image plane: a ridge ending m0 at (x0, y0) with orientation Φ'0, and a ridge bifurcation m1 at (x1, y1) with orientation Φ'1.]

Figure 101. Minutia descriptor definition.

Minutiae filtering

The proposed fingerprint acquisition, image enhancement and feature extraction algorithms are not infallible. Factors such as low-quality fingerprint impressions, inefficient image enhancement processes, or inaccurate extraction algorithms affected by noise, contrast deficiencies or other agents can produce structures in the thinned image that cause either the appearance of fake minutia points or the removal of true minutia points. For this reason, all minutia points deduced in the previous stage are considered potential candidates only, and each of them must be verified before being accepted as a true minutia point and a genuine trait of the user’s fingerprint. This postprocessing stage is known as minutiae filtering or minutiae verification, and it ensures that only legitimate minutiae are taken into account as fingerprint descriptors in the enrolment and authentication stages. The suggested minutiae filtering stage does not aim at recovering true minutia points that may have been lost during the processing. Instead, the verification stage aims at identifying and rejecting the false minutia points extracted from the thinned version of the fingerprint.
The rules applied in the proposed algorithm to filter out the false minutia points and keep only the true features are based on a local analysis of the neighbourhood around each minutia candidate:
- if many minutiae are detected in a small area (close ridge endings and/or ridge bifurcations), they are discarded because they are likely caused by noise;
- ridge-ending pixels around the border of the image are also ignored because they are not true minutiae;
- minutia points close to segmented regions are discarded too; and
- close ridge ending pairs with similar orientations are also discarded because they are most probably caused by cuts or broken ridges.

The aim of the minutia verification process is therefore to overcome the imperfections that may still exist in the thinned image, in order to improve the recognition performance of the whole authentication system. Figure 102 shows the set of minutia points accepted as legitimate features from the thinned fingerprint impression.

Figure 102. Minutiae extraction and verification: a) thinned image, b) extracted minutiae.

Feature sets (filtered field orientation map and minutiae) formatting

A specific format is used to store the distinctive traits extracted from a fingerprint impression:
a) The two-dimensional matrix corresponding to the filtered field orientation map of size 32×56 is stored as one 1792-byte vector, with values in the range [0,180) for valid blocks and 255 for segmented blocks.
b) The set of minutia points is stored in quadruplets ( xn, yn, Φ'n, mn ), where xn, yn and Φ'n correspond to the spatial position and orientation of the nth true minutia with regard to the x and y axes of the image, and mn is the type of minutia (ridge ending, coded as ‘0’, or ridge bifurcation, coded as ‘1’).
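The two feature sets described in items a) and b) can be represented in software as in the following Python sketch. Only the 1792-byte length, the value ranges and the quadruplet layout come from the text; the container (a dictionary), the class and function names, and the absence of any field-width encoding for the minutiae are assumptions made for this example.

```python
from dataclasses import dataclass, astuple

MAP_SIZE = 32 * 56      # 1792 blocks, one byte per block
SEGMENTED = 255         # code used for segmented blocks


@dataclass(frozen=True)
class Minutia:
    x: int
    y: int
    orientation: int    # coded orientation of the associated ridge segment
    kind: int           # 0 = ridge ending, 1 = ridge bifurcation


def pack_orientation_map(values):
    """Return the filtered field orientation map as a 1792-byte vector."""
    data = bytes(values)
    if len(data) != MAP_SIZE:
        raise ValueError("the orientation map must contain 1792 blocks")
    if any(v >= 180 and v != SEGMENTED for v in data):
        raise ValueError("valid blocks are coded in [0,180), segmented as 255")
    return data


def build_template(values, minutiae):
    """Hypothetical template layout: orientation vector plus quadruplets."""
    return {
        "orientation_map": pack_orientation_map(values),
        "minutiae": [astuple(m) for m in minutiae],
    }
```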
All this information is either stored as part of the user’s template during the enrolment stage, or used to match the user’s fingerprint against a template in the authentication stage.

5.5.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration

An early software-only implementation of the algorithm on several embedded system platforms provided a fast and easy way of identifying the time-consuming tasks that need to be ported to hardware in order to speed up the processing. Profiling the application with the software resources available in each embedded system determines the final hardware-software partitioning needed to meet the application execution time requirements. All compute-intensive tasks are ported to hardware, exploiting the acceleration offered by programmable logic devices, while the less expensive tasks remain as software tasks executed by the system CPU. The final partitioning of the application depends on the specific resources of the embedded system platform selected to build the system; this topic is covered in detail in section 5.8 (System Integration) for different embedded system platforms.

The topology of the hardware coprocessors in charge of the feature extraction stage is the same as that of the coprocessors in charge of the image enhancement process. The hardware coprocessors are based on a three-level modular design concept composed of the System Bus Interface block, the Data Bandwidth Adaptation block, and the Core module in charge of the specific tasks, as depicted in Figure 84 and Figure 85 of section 5.4.
The main difference with regard to the image enhancement coprocessors is that the image thinning and minutiae extractor coprocessors deal with binary images only, so each pixel is encoded with a single bit instead of one byte. This allows the width of the communication data buses in the interface between the different modules to be reduced. The maximum width of the input data bus of the feature extractor cores is limited to 36 bits (instead of the configurable 160 bits available in the enhancement coprocessors). As in the enhancement coprocessors, the output data bus remains 32 bits wide for all the extractor coprocessors. Another difference is that, unlike the image enhancement coprocessors, where the input image is read only once and the resultant output image is written only once, the thinning coprocessors perform iterative operations over the input images that require the intermediate results of the thinning process to be saved in the system memory. Therefore, several read and write cycles of the processed images are needed and, consequently, this task makes heavier use of the shared system resources. However, since binary images contain less data than greyscale ones, fast task execution times can still be achieved.

5.5.3. Physical Implementation – Design Development – Design Implementation

This section gives the implementation details of the core modules in charge of the fingerprint image thinning and minutiae extraction processes.
As already discussed in previous sections, apart from the high-bandwidth input and output data buses used in the interface between the Core module and the Data Bandwidth Adaptation module, each Core module is provided with general-purpose registers so that the system CPU can configure the application-specific hardware coprocessors in charge of the feature extraction stage and monitor the processing flow. The design of the image thinning and minutiae extractor coprocessors is covered next.

Image thinning

The flow diagram of the suggested thinning algorithm is detailed in Figure 103. Two hardware coprocessors, called the Iterative Thinning processor and the Irreducible Thinning processor, have been developed to perform the two steps into which the thinning process is split.

[Figure 103: flow diagram. Iterative Thinning processor: START, read image, GH1 removable pixels processing, GH2 removable pixels processing, save image, end of iterations? (No: repeat from read image; Yes: continue). Irreducible Thinning processor: read image, GRS removable pixels processing, save image, END.]

Figure 103. Thinning process application flow.

The Iterative Thinning processor is in charge of reading the input image, initially stored in the system memory, and applying two consecutive transformations (GH1 and GH2) to the image ridge map in order to detect the sets of removable pixels that can be converted to valley pixels. The resultant image is stored in the system memory, and the process is repeated iteratively, as many times as needed, until no removable pixels within the sets GH1 and GH2 can be found in the processed image. In each iteration, all the image ridge pixels are evaluated, and only the ridge pixels that do not disturb the connectivity of the image pattern are removed, that is, converted into valley pixels. In this way, the ridge map is successively eroded in each iteration until only a skeleton remains.

[Figure 104: block diagram. The input data (1-bit pixels P35 to P0) feed a 3×3×1 shift register stage and a bank of LUTs performing the GH1 thinning, which delivers the pixels P'34 to P'1; a second 3×3×1 stage and a bank of LUTs perform the GH2 thinning, which delivers the pixels P''33 to P''2 as the 32-bit output THIN[31:0].]

Figure 104. Iterative image thinning hardware coprocessor.

A simplified block diagram of the coprocessor is shown in Figure 104. The Iterative Thinning processor evaluates up to 32 pixels of the image in parallel. One iteration is split into two sub-iterations. In the first sub-iteration, the input data bus is 36 pixels wide.
A shift register matrix of size 36×3 pixels is used to transfer those vertical slices of the image to the hardware coprocessor. Since the thinning evaluation of one pixel is based on neighbourhood kernels of size 3×3, the two pixels at the ends of the 36-pixel slice lack a complete neighbourhood, so up to 34 pixels can be evaluated at one time. A total of 34 LUTs are used to evaluate each of these pixels against the removable pixels set GH1. If a ridge pixel meets the removal conditions, it is automatically converted to a valley pixel. Those 34 partial results are fed to the second sub-iteration stage, which consists of a new shift register matrix of size 34×3 pixels. Up to 32 LUTs are used to evaluate a total of 32 pixels against the removable pixels set GH2. After one complete iteration over all pixels of the image, a thinner ridge map is obtained, which is stored in the system memory. As illustrated in the figure, the Iterative Thinning processor is pipelined. One specific flag is available in the Core module in order to judge, at the end of each complete iteration, whether any ridge pixel has been converted to a valley pixel. If some ridge thinning has taken place, the processing continues by reading the previously stored image and applying a new thinning iteration to the intermediate image. Otherwise, the iterative processing stops and the Irreducible Thinning processor takes over.

A simplified block diagram of the Irreducible Thinning processor is shown in Figure 105. The Irreducible Thinning processor reads once more the intermediate thinned image previously stored in the system memory, performs a sequential processing stage over the image, and generates an irreducible version of the fingerprint ridge pattern that is again stored in the system memory.
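The two-stage, LUT-driven evaluation described above can be sketched in software as follows. This is only an illustrative model: the section does not spell out the GH1 and GH2 conditions, so the sketch assumes they correspond to the two sub-iteration tests of the Guo-Hall parallel thinning algorithm. The function names (`guo_hall_removable`, `build_lut`, `apply_lut`, `thinning_iteration`) and the row-major 9-bit kernel encoding are hypothetical choices, not taken from the thesis.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ridge pixel, 0 = valley pixel


def guo_hall_removable(code: int, sub_iteration: int) -> bool:
    """Assumed stand-in for GH1/GH2: Guo-Hall test on a 9-bit kernel code.

    Bits are row-major (bit 8 = north-west ... bit 0 = south-east, bit 4 = centre).
    """
    p9, p2, p3 = (code >> 8) & 1, (code >> 7) & 1, (code >> 6) & 1
    p8, p4 = (code >> 5) & 1, (code >> 3) & 1
    p7, p6, p5 = (code >> 2) & 1, (code >> 1) & 1, code & 1
    c = (((1 - p2) & (p3 | p4)) + ((1 - p4) & (p5 | p6))
         + ((1 - p6) & (p7 | p8)) + ((1 - p8) & (p9 | p2)))
    n1 = (p9 | p2) + (p3 | p4) + (p5 | p6) + (p7 | p8)
    n2 = (p2 | p3) + (p4 | p5) + (p6 | p7) + (p8 | p9)
    if sub_iteration == 0:
        m = (p6 | p7 | (1 - p9)) & p8
    else:
        m = (p2 | p3 | (1 - p5)) & p4
    return c == 1 and 2 <= min(n1, n2) <= 3 and m == 0


def build_lut(sub_iteration: int) -> List[int]:
    """Precompute the new centre value for each of the 512 possible 3x3 kernels."""
    lut = []
    for code in range(512):
        centre = (code >> 4) & 1
        removable = centre and guo_hall_removable(code, sub_iteration)
        lut.append(0 if removable else centre)
    return lut


def kernel_code(img: Image, r: int, c: int) -> int:
    """Pack the 3x3 neighbourhood of (r, c) into 9 bits; outside the image is valley."""
    code = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < len(img) and 0 <= cc < len(img[0])
            code = (code << 1) | (img[rr][cc] if inside else 0)
    return code


def apply_lut(img: Image, lut: List[int]) -> Tuple[Image, bool]:
    """Evaluate every pixel against the same input image (parallel semantics)."""
    out = [[lut[kernel_code(img, r, c)] for c in range(len(img[0]))]
           for r in range(len(img))]
    return out, out != img


def thinning_iteration(img: Image, lut1: List[int], lut2: List[int]) -> Tuple[Image, bool]:
    """One iteration: the stage-1 result feeds stage 2; the flag reports any change."""
    stage1, changed1 = apply_lut(img, lut1)
    stage2, changed2 = apply_lut(stage1, lut2)
    return stage2, changed1 or changed2
```

A caller would repeat `thinning_iteration` while the returned flag is true, mirroring the role of the flag in the Core module. The 512-entry table plays the part of one 3×3×1 LUT; the hardware instantiates 34 and 32 of them side by side rather than looping over pixels.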
This coprocessor is in charge of evaluating all the ridge pixels present in the image in one specific order (from top to bottom, and from left to right), looking for any removable ridge pixel that matches the removable pixel set GRS. If a removable pixel is found, it is automatically converted to a valley pixel, and that conversion is taken into account when evaluating its neighbour pixel in the defined processing sequence. Pipeline strategies are also applied in this stage. Similarly to the Iterative Thinning processor, the Irreducible Thinning processor evaluates up to 32 pixels of the image in parallel based on kernel neighbourhoods of size 3×3, implemented through combinational logic circuits synthesized as ROMs. The 32-bit result word processed in one clock cycle is taken into account in the processing of the next word, in the following clock cycle.

Both coprocessors are responsible for reducing the thickness of the fingerprint ridges to one pixel wide while keeping the geometry and the connectivity of the original ridge pattern. The thinned version of the input fingerprint is irreducible, and it is stored in the system memory at the end of the thinning stage.

Minutiae extraction

The minutiae extraction hardware coprocessor is in charge of reading the thinned image and identifying the potential minutia points in it. A partially pipelined processor receives the thinned image and processes up to 32 pixels of the image in parallel, looking for potentially valid ridge endings and ridge bifurcations. The block diagram of the minutiae extraction processor is depicted in Figure 106. Since the evaluation of the ridge pixels is based on kernels of size 3×3, a shift register array of size 34×3 is implemented to transfer the vertical slices of the thinned image to be processed. A total of 32 LUTs evaluate up to 32 pixels of one row in parallel.
A finite state machine core is responsible for identifying those potential minutia points present in the thinned image and transferring to the system memory the information (x,y,Φ′,m) related to any potential minutia.

Figure 105. Irreducible image thinning hardware coprocessor. [Block diagram: input data bits P0 to P33 are routed through a bank of multiplexers to 3×3×1 ROMs (GRS thinning) that produce P′1 to P′32, delivered as D[31:0] on the THIN[31:0] output data bus.]

Once a minutia point is identified, a 32-bit word is created that stores the x and y coordinates of the minutia relative to the whole image, the type of minutia (ridge ending or ridge bifurcation), and the specific image block it belongs to, so that the orientation of the corresponding block in the filtered field orientation map can be assigned to the minutia point. The information about the detected minutia points is transferred to the system memory. At the end of the processing, a vector of 32-bit words is saved in the system memory with all the minutia candidates extracted from the image. One additional processing step is needed to filter out those unreliable candidates that could be present in the image.

Figure 106. Minutiae extraction hardware coprocessor. [Block diagram: input data bits P0 to P33 feed a bank of 3×3×1 LUTs (minutiae extraction) that produces P′1 to P′32 (D[31:0]); a minutiae formatting block generates the output word MNT[31:0] with the fields Pixel X, Pixel Y, Type, Block X and Block Y.]
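The candidate detection and word formatting steps can be illustrated with a short software sketch. Two assumptions are made loudly here: first, the section does not state which 3×3 rule the LUTs encode, so the sketch uses the classical crossing-number test (1 for a ridge ending, 3 for a bifurcation) as a stand-in; second, the bit widths of the MNT[31:0] fields are not given, so the layout below (9-bit x, 9-bit y, 1-bit type, 6-bit block x, 6-bit block y) is hypothetical. All function names are illustrative.

```python
from typing import List, Tuple

ENDING, BIFURCATION = 0, 1
BLOCK = 8  # block size in pixels, as used by the field orientation map


def crossing_number(img: List[List[int]], r: int, c: int) -> int:
    """Half the number of 0/1 transitions around the 8-neighbour ring of (r, c)."""
    ring = [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
            img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]
    return sum(abs(ring[k] - ring[(k + 1) % 8]) for k in range(8)) // 2


def pack_minutia(x: int, y: int, kind: int) -> int:
    """Hypothetical layout: x[30:22] y[21:13] type[12] block_x[11:6] block_y[5:0]."""
    bx, by = x // BLOCK, y // BLOCK
    return (x << 22) | (y << 13) | (kind << 12) | (bx << 6) | by


def unpack_minutia(word: int) -> Tuple[int, int, int, int, int]:
    """Inverse of pack_minutia: returns (x, y, type, block_x, block_y)."""
    return ((word >> 22) & 0x1FF, (word >> 13) & 0x1FF, (word >> 12) & 1,
            (word >> 6) & 0x3F, word & 0x3F)


def extract_candidates(img: List[List[int]]) -> List[int]:
    """Scan interior ridge pixels of a thinned image and pack each candidate."""
    words = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            if img[r][c]:
                cn = crossing_number(img, r, c)
                if cn == 1:
                    words.append(pack_minutia(c, r, ENDING))
                elif cn == 3:
                    words.append(pack_minutia(c, r, BIFURCATION))
    return words
```

Storing the block indices alongside the pixel coordinates is redundant with an 8×8 block size, but it matches the idea in the text of carrying the block reference with each candidate so that the orientation lookup needs no further arithmetic.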
Minutiae filtering

Since the minutiae verification stage exhibits a low latency on any low- or mid-performance microprocessor, this stage has been kept as a software task to be executed by the system CPU in each of the tested embedded system platforms. Its execution time depends heavily on the total number of potential minutia points deduced in the minutiae extraction phase. Given a vector of N quadruplets (x,y,Φ′,m) that define the minutia candidates of a fingerprint impression, a set of heuristic rules is applied to each of the minutia pairs to reject the unreliable minutia points while keeping those that pass the consistency checks.

Feature sets (filtered field orientation map and minutiae) formatting

Similarly to the minutiae verification stage, the formatting of the reliable information extracted from the fingerprint is kept as a software task, to be executed by the system CPU in each of the embedded system platforms evaluated in this work.

5.6. Fingerprints Alignment Approach

Among the fingerprint alignment techniques described in the literature, alignment methods based on field orientation maps have attracted increasing attention because of several advantages:
- the field orientation maps tend to be more robust than other features such as minutia points, singular points, ridge pores, etc. when dealing with noisy images and mid- or low-quality fingerprint impressions;
- the field orientation maps better tolerate the deformations that the elasticity of the skin introduces between two acquisitions of the same finger;
- the field orientation map gives information about the complete fingerprint impression, and not only about partial regions, as can happen with minutia points or singular (core and delta) points.

Singular points and minutia points are not uniformly distributed in the fingerprint. Depending on the size of the scanner sensing area, and the relative position of the finger on the sensing surface, only a few minutia points may be available in a fingerprint impression. It is even possible that most of the minutia points are located in a relatively small area, or that no singular points are detected in the acquisition process, which gives reduced discriminatory information about the global print. The usage of the field orientation map, however, overcomes the problem of partial prints that do not share common features. For that reason, the proposed recognition algorithm bases its fingerprints alignment stage on the correlation analysis of the filtered field orientation maps deduced from template and query fingerprints.

5.6.1. Algorithm Description

The alignment process flow is depicted in Figure 107.

Figure 107. Fingerprints alignment algorithm: processing stages. [Flowchart: START; template-query field orientation maps alignment; if an alignment is found, template-query minutiae sets alignment; END.]

As can be deduced from the figure, the suggested fingerprint alignment algorithm is split into two stages: the first one is the alignment of the fingerprints proper, based on the correlation analysis of the field orientation maps; the second one corresponds to the alignment of template and query minutiae sets once the transformation offsets (∆X, ∆Y, ∆Φ) to be applied to align the fingerprints are known. If the correlation of the field orientation maps does not reveal any possible alignment, the process stops automatically and template and query fingerprints are considered as non-matching.

Field orientation maps alignment

The proposed algorithm makes use of the filtered field orientation maps of template and query fingerprints, TFO and QFO, which are represented in the form of two-dimensional matrices of size NT×MT blocks and NQ×MQ blocks respectively, as follows:

\[ \text{Template ridge map} \equiv T_{FO} \equiv \varphi_T(u,v) \quad \forall u \in 1 \ldots N_T,\ v \in 1 \ldots M_T \]

\[ \text{Query ridge map} \equiv Q_{FO} \equiv \varphi_Q(u,v) \quad \forall u \in 1 \ldots N_Q,\ v \in 1 \ldots M_Q \]

Each component in the field orientation matrix contains information about the dominant direction of the fingerprint ridges and valleys in a local neighbourhood of size 8×8 pixels of the image. The orientations are defined as vectors tangent to the fingerprint ridges; they are coded in the range [0°,180°) with a resolution of ±1°, and packed in one byte. For a segmented block, the corresponding field orientation value in the matrix is coded to 255, outside the specified valid range, as an indication of void orientation.

An example of template and query field orientation maps to be aligned is depicted in Figure 108. As can be noted, the segmented blocks in the background of the images, whose matrix values are coded to 255, remain without red orientation lines in the figures.

Figure 108. Filtered field orientation maps to be aligned: a) Template fingerprint, b) Query fingerprint.

In order to find the best possible alignment between template and query fingerprints, template and query field orientation maps are correlated.
A set of requirements is established in order to discriminate between valid and non-valid alignments. Those alignments that meet the requirements (valid alignments) are weighted by means of one specific alignment score. Based on those alignment requirements and score rules, a brute-force analysis of template and query field orientation matrices evaluates a wide range of possible alignments in order to find the one that best meets the requirements.

In the correlation analysis, the query field orientation matrix (QFO) is taken as reference, and the template field orientation matrix (TFO) is moved over the query field orientation matrix, totally or partially, as depicted in Figure 109. A fine-grain correlation analysis is carried out. The granularity of the correlation is one block, that is, the template is moved over the query, either vertically or horizontally, in steps of one block.

Figure 109. Field orientation matrices correlation process. [Diagram: template matrix T displaced over query matrix Q along the x and y axes.]

As indicated in Figure 110, for each relative position of template and query field orientation matrices, the rectangular overlapped region OvFO between TFO and QFO is computed:

\[ \text{Overlapped region} \equiv Ov_{FO} \equiv T_{FO} \cap Q_{FO} \equiv Ov_{FO}(i,j) \quad \forall i \in 1 \ldots N_{Ov},\ j \in 1 \ldots M_{Ov} \]

Figure 110. Overlapped region. [Diagram: template matrix T with cells φT(u,v), query matrix Q with cells φQ(u,v), and their overlapped region Ov.]
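As a small illustration of the overlap computation, the sketch below derives the overlapped rectangle for a given displacement and enumerates the displacements that satisfy a minimum overlap. The coordinate convention (the template origin placed at column `x`, row `y` of the query, with N counting rows and M counting columns) and the function names are assumptions made for this example only.

```python
from typing import Iterator, Tuple


def overlap_region(t_rows: int, t_cols: int, q_rows: int, q_cols: int,
                   x: int, y: int) -> Tuple[int, int, int, int]:
    """Return (row0, col0, n_ov, m_ov) of the overlap in query coordinates.

    The template's top-left cell is assumed to sit at query column x, row y;
    negative offsets place part of the template outside the query.
    """
    col0, col1 = max(0, x), min(q_cols, x + t_cols)
    row0, row1 = max(0, y), min(q_rows, y + t_rows)
    n_ov, m_ov = max(0, row1 - row0), max(0, col1 - col0)
    return row0, col0, n_ov, m_ov


def candidate_offsets(t_rows: int, t_cols: int, q_rows: int, q_cols: int,
                      threshold_ov: int) -> Iterator[Tuple[int, int]]:
    """Yield every one-block displacement whose overlap has enough cells."""
    for y in range(-t_rows + 1, q_rows):
        for x in range(-t_cols + 1, q_cols):
            _, _, n_ov, m_ov = overlap_region(t_rows, t_cols, q_rows, q_cols, x, y)
            if n_ov * m_ov >= threshold_ov:
                yield x, y
```

For instance, with a 56-row by 32-column template and a query of the same size, a horizontal shift of 16 blocks leaves a 56×16 overlap, while a shift of 40 blocks leaves none.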
A minimum amount of overlap ThresholdOv between both matrices is defined in order to avoid the situation where a high similarity score is obtained from an alignment with little or almost no overlap:

\[ N_{Ov} \times M_{Ov} \geq \mathrm{Threshold}_{Ov} \]

The similarity between two corresponding blocks in the overlapped region OvFO can be deduced from the difference in the field orientation maps as follows:

\[ \forall (i,j) \in Ov_{FO}: \quad \Delta\varphi(i,j) = \min\left( \left|\varphi_T(i,j) - \varphi_Q(i,j)\right|,\ 180 - \left|\varphi_T(i,j) - \varphi_Q(i,j)\right| \right) \]

The above equation can be further improved so that the alignment algorithm tolerates the elastic deformations inherent to fingerprint impressions:

\[ \forall (i,j) \in Ov_{FO}: \quad \Delta\varphi(i,j) = \min_{\substack{K_i = -2 \ldots +2 \\ K_j = -2 \ldots +2}} \min\left( \left|\varphi_T(i,j) - \varphi_Q(i+K_i, j+K_j)\right|,\ 180 - \left|\varphi_T(i,j) - \varphi_Q(i+K_i, j+K_j)\right| \right) \]

Each template cell is compared not only against its corresponding query cell, but against a query neighbourhood kernel of size 5×5 cells centred on the corresponding query cell. Segmented blocks (segmented cells) are disregarded in the computation.

For each relative position of template and query field orientation matrices, the number of non-segmented overlapped cells KOv, KOv ≤ (NOv×MOv), as well as the average and the standard deviation of the difference between both field orientation maps in the region of interest OvFO, are computed:

\[ \overline{\Delta\varphi_{Ov}} = \frac{\sum_{(i,j)} \Delta\varphi(i,j)}{K_{Ov}} \]

\[ \sigma_{Ov} = \sqrt{\frac{\sum_{(i,j)} \left( \Delta\varphi(i,j) - \overline{\Delta\varphi_{Ov}} \right)^2}{K_{Ov}}} \]

Therefore, one similarity score SFO of both field orientation maps in the overlapped area can be deduced as follows:

\[ S_{FO}^{Ov} = \overline{\Delta\varphi_{Ov}} + \sigma_{Ov} \]

The lower the similarity score, the higher the similarity between both maps.
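The score computation above can be condensed into a short reference model. It is a sketch under stated assumptions: the offset convention (template cell (u, v) facing query cell (u + y, v + x)), the rule that a template cell counts when it is valid and at least one valid query cell lies in its kernel, and the names `delta_phi` and `similarity_score` are illustrative and not taken from the thesis.

```python
import math
from typing import List, Optional, Tuple

VOID = 255  # code used for segmented (void) blocks


def delta_phi(a: int, b: int) -> int:
    """Orientation difference folded into [0, 90] degrees."""
    d = abs(a - b)
    return min(d, 180 - d)


def similarity_score(t: List[List[int]], q: List[List[int]], x: int, y: int,
                     reach: int = 2) -> Tuple[Optional[float], int]:
    """Return (S_FO, K_Ov) for the template displaced by (x, y) over the query.

    reach=2 gives the 5x5 tolerance kernel; reach=0 compares cell to cell.
    """
    diffs = []
    for u, row in enumerate(t):
        for v, phi_t in enumerate(row):
            qi, qj = u + y, v + x
            if not (0 <= qi < len(q) and 0 <= qj < len(q[0])):
                continue  # template cell falls outside the overlapped region
            if phi_t == VOID:
                continue  # segmented template cell is disregarded
            best = None
            for ki in range(-reach, reach + 1):
                for kj in range(-reach, reach + 1):
                    a, b = qi + ki, qj + kj
                    if 0 <= a < len(q) and 0 <= b < len(q[0]) and q[a][b] != VOID:
                        d = delta_phi(phi_t, q[a][b])
                        best = d if best is None else min(best, d)
            if best is not None:
                diffs.append(best)
    k_ov = len(diffs)
    if k_ov == 0:
        return None, 0
    mean = sum(diffs) / k_ov
    sigma = math.sqrt(sum((d - mean) ** 2 for d in diffs) / k_ov)
    return mean + sigma, k_ov
```

As a worked case with `reach=0`: a template row (0°, 0°) against a query row (10°, 30°) gives differences 10 and 30, hence a mean of 20, a standard deviation of 10, and a score of 30.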
In addition, a similarity threshold ThresholdFO is defined in order to avoid the alignment of dissimilar field orientation maps:

\[ S_{FO} < \mathrm{Threshold}_{FO} \]

With the aim of properly aligning fingerprints even when the original template and query impressions present different orientations, the algorithm computes several rotated versions of the query field orientation matrix and correlates all of them against the template field orientation matrix. The query field orientation matrix can be rotated θ degrees prior to computing the alignment with the template matrix. As graphically indicated in Figure 111, the cell (i,j) in the original query matrix is transformed to a new cell (i′,j′) after rotating the query matrix by an angle θ:

\[ \begin{pmatrix} i' \\ j' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} i \\ j \end{pmatrix} \]

The field orientation at cell (i′,j′) in the rotated query matrix can be deduced as follows:

\[ \varphi'_Q(i',j') = \varphi_Q(i,j) + \theta \]

Figure 111. Field orientation map rotation process. [Diagram: cell (i,j) with orientation φ(i,j) mapped to cell (i′,j′) with orientation φ′(i′,j′) after a rotation θ in the x-y plane.]

Up to five rotated versions of the query field orientation map, θ = {−10°, −5°, 0°, +5°, +10°}, are computed and correlated with the template field orientation map in the alignment stage. By minimizing the score SFO, it is possible to obtain the optimal shift (∆X, ∆Y) and the optimal rotation (∆Φ) between both images, considering not only a rigid body transformation but also tolerating non-linear and elastic deformations in local neighbourhoods of size 5×5 blocks of the acquired fingerprints. The parameters (∆Φ, ∆X, ∆Y) that lead to the minimum score SFO provide the highest level of similarity between field orientation maps, so they are selected as the alignment result.
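A possible software rendering of the rotation step is sketched below. The thesis gives the forward mapping only; the sketch assumes a rotation about the matrix centre, an inverse mapping with nearest-neighbour sampling so that every destination cell is filled at most once, orientations wrapped modulo 180, and i treated as the column index and j as the row index. The function name `rotate_map` and the output-size parameters are hypothetical.

```python
import math
from typing import List

VOID = 255  # code used for segmented (void) blocks


def rotate_map(q: List[List[int]], theta_deg: float,
               out_rows: int, out_cols: int) -> List[List[int]]:
    """Rotate an orientation map by theta about its centre (nearest neighbour)."""
    th = math.radians(theta_deg)
    cos_t, sin_t = math.cos(th), math.sin(th)
    rows, cols = len(q), len(q[0])
    cy, cx = (rows - 1) / 2, (cols - 1) / 2            # source centre
    ocy, ocx = (out_rows - 1) / 2, (out_cols - 1) / 2  # destination centre
    out = [[VOID] * out_cols for _ in range(out_rows)]
    for jp in range(out_rows):
        for ip in range(out_cols):
            xp, yp = ip - ocx, jp - ocy
            # Inverse rotation: find the source cell that lands on (ip, jp).
            x = cos_t * xp + sin_t * yp
            y = -sin_t * xp + cos_t * yp
            i, j = round(x + cx), round(y + cy)
            if 0 <= j < rows and 0 <= i < cols and q[j][i] != VOID:
                # The orientation itself rotates with the map.
                out[jp][ip] = round(q[j][i] + theta_deg) % 180
    return out
```

Cells of the destination matrix that have no source cell stay coded to 255, so they are ignored by the correlation in the same way as segmented blocks.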
The overlapped region OvFO corresponding to the alignment result becomes the region of interest (RoI) for further processing. These outputs, together with the number of non-segmented overlapped cells in the RoI (KRoI), are used as reliable inputs to determine the positive or negative correspondence between fingerprints in the next matching stage.

The fingerprints alignment methodology is shown in Figure 112. Up to five different correlation analysis loops of the query field orientation map against the template field orientation map are performed. At the end of the processing, the alignment parameters that best meet the imposed alignment requirements, and therefore define the best alignment found, are stored in the system memory.

Figure 112. Fingerprints alignment methodology. [Block diagram: the query F.O. matrix is rotated to θ = −10°, −5°, 0°, +5° and +10°; each version is correlated with the template F.O. matrix (field orientation analyses 1 to 5); the template-query best alignment selection outputs the translation offsets ∆X, ∆Y, the rotation offset ∆Φ, the alignment score SFO(RoI) = mean ∆φ(RoI) + σ(RoI), and the number of overlapped cells KRoI.]

The field orientation matrices of template and query fingerprints, drawn on the thinned versions of the images, are shown in Figure 113, and the alignment result is depicted in Figure 114. In this example, the overlapped region, or region of interest (RoI), deduced in the alignment process corresponds to most of the foreground region of the template fingerprint.

Figure 113. Field orientation maps: a) Template fingerprint, b) Query fingerprint.

Figure 114. Template and query field orientation maps alignment result.
Minutiae sets alignment

The field orientation maps alignment stage yields the transformation parameters (∆X, ∆Y, ∆Φ) to be applied in order to align template and query fingerprints. These transformation parameters are used at this stage to align both minutiae sets.

Given the template minutiae set Tm composed of N minutia points:

\[ \text{Template minutiae set} \equiv T_m = \left\{ (x_1^T, y_1^T, \beta_1^T),\ (x_2^T, y_2^T, \beta_2^T),\ \ldots,\ (x_N^T, y_N^T, \beta_N^T) \right\} \]

and the query minutiae set Qm composed of M minutia points:

\[ \text{Query minutiae set} \equiv Q_m = \left\{ (x_1^Q, y_1^Q, \beta_1^Q),\ (x_2^Q, y_2^Q, \beta_2^Q),\ \ldots,\ (x_M^Q, y_M^Q, \beta_M^Q) \right\} \]

it is possible to align Tm and Qm by applying the following transformation to Qm:

\[ \begin{pmatrix} x' \\ y' \\ \beta' \end{pmatrix} = \begin{pmatrix} \cos\Delta\Phi & -\sin\Delta\Phi & 0 \\ \sin\Delta\Phi & \cos\Delta\Phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ \beta \end{pmatrix} + \begin{pmatrix} \Delta X \\ \Delta Y \\ \Delta\Phi \end{pmatrix} \]

In this way, a new vector Q′m of triplets (x′,y′,β′) is obtained, which is aligned with Tm:

\[ Q'_m = \left\{ (x_1'^Q, y_1'^Q, \beta_1'^Q),\ (x_2'^Q, y_2'^Q, \beta_2'^Q),\ \ldots,\ (x_M'^Q, y_M'^Q, \beta_M'^Q) \right\} \]

Figure 115 shows both minutiae sets after fingerprint alignment. Minutiae sets Tm and Q′m share the same absolute coordinate axes x and y. Therefore, it is easy to identify those minutia points in Tm and Q′m that lie within the overlapped region RoI. Furthermore, it is possible to find the corresponding minutia pairs Tm – Q′m in the overlapped region.

Figure 115. Template and query minutiae sets alignment result.

It is important to remark that the presented alignment transformation considers template and query minutiae sets, Tm and Qm, as rigid bodies. However, as already indicated in the field orientation maps alignment stage, a certain flexibility needs to be allowed when matching the global structures of template and query minutiae sets.
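The rigid transformation above translates directly into a few lines of code. This is a minimal sketch: the function name `align_minutiae` and the use of degrees for the angle arguments are assumptions, and the angle β′ is left unwrapped, exactly as in the equation.

```python
import math
from typing import List, Tuple

Minutia = Tuple[float, float, float]  # (x, y, beta in degrees)


def align_minutiae(qm: List[Minutia], dx: float, dy: float,
                   dphi_deg: float) -> List[Minutia]:
    """Apply the rotation dphi and the shift (dx, dy) to every query minutia."""
    th = math.radians(dphi_deg)
    c, s = math.cos(th), math.sin(th)
    return [(c * x - s * y + dx, s * x + c * y + dy, beta + dphi_deg)
            for x, y, beta in qm]
```

For example, the minutia (1, 0, 30°) transformed with ∆Φ = 90°, ∆X = 2 and ∆Y = 3 moves to approximately (2, 4) with an angle of 120°.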
This is done in order to overcome the deformations that the elasticity of the skin introduces in the spatial positioning of legitimate minutia points between two consecutive acquisitions of one finger. A minutiae-related similarity score between both fingerprints, which tolerates non-linear deformations, is deduced in the next matching stage. The minutiae matching score is computed from the number of corresponding minutia points of Tm and Q′m found in the overlapped region (RoI).

5.6.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration

The alignment process is implemented by means of hardware-software codesign techniques with the support of two main cores: the system CPU and one application-specific hardware coprocessor in charge of the compute-intensive operations of the alignment stage. The alignment process flow is shown in Figure 116.

Figure 116. Field orientation maps alignment algorithm: processing stages. [Flowchart: START; system initialization (ThresholdOv, ThresholdFO, X_0, Y_0, θ_0); read template and query F.O. matrices; template matrix translation (X, Y) and query matrix rotation (θ); template-query overlap computation (NOv × MOv); if NOv × MOv > ThresholdOv, SFO computation; if SFO < ThresholdFO, alignment result update (ThresholdFO = SFO, ∆X = X, ∆Y = Y, ∆Φ = θ); template matrix update (X, Y) until the template translation ends; template matrix re-initialization (X_0, Y_0) and query matrix update (θ) until the query rotation ends; if an alignment was found, SFO = ThresholdFO, otherwise SFO = 0; END.]
Given the original template field orientation matrix, and one of the five versions of the query field orientation matrix under evaluation, the correlation analysis is done by moving the template matrix over the query matrix by horizontal and vertical displacement offsets X and Y, respectively. X and Y cover the range of values needed to generate every possible overlap between both matrices of size equal to or larger than ThresholdOv components. For each valid relative position of template and query matrices, the overlapped region is identified, and each component of the template field orientation matrix in the overlapped region is evaluated against the 5×5 corresponding components in the query field orientation matrix. This process, performed on all the components in the overlapped region, makes it possible to weigh the level of similarity between both field orientation maps in that relative position. The process is repeated for each relative position, and all five rotated versions of the query field orientation matrix are evaluated. The template field orientation map has a fixed size of 32×56 components. The query field orientation map, however, can have three different sizes: 32×56, 36×58, or 41×60 components, depending on whether the rotation angle is 0°, ±5°, or ±10°, respectively.

At the end of the processing, the relative position of template and query field orientation maps that leads to the best similarity score (smaller than a configurable threshold ThresholdFO) is selected as the alignment result. The X, Y and θ parameters of that relative position become the alignment factors ∆X, ∆Y and ∆Φ to be used to align the original template and query fingerprints, and its similarity score SFO becomes the quantitative measurement of the correspondence between template and query field orientation maps once both fingerprints are aligned.

The alignment process is a brute-force analysis that demands high computational power from the physical system in charge of the processing. It is therefore suggested that the correlation analysis of template and query field orientation maps be performed in hardware. Since the correlation analysis operates on 2-D matrices, the parallelism offered by hardware coprocessors is a suitable way to speed up the processing. Moreover, while the correlation of the template with one version of the query matrix is performed, the system CPU can compute a new rotated version of the query matrix to be correlated later on. In this way, the alignment stage is implemented by means of hardware-software codesign techniques. The temporal partitioning of the application is depicted in Figure 117.

Figure 117. Alignment process: hardware-software partitioning. [Timing diagram along the execution time axis. System CPU: system initialization, then query rotations for θ = −5°, +5°, −10° and +10°, then alignment result. Hardware coprocessor: template vs query correlation analyses for 0°, −5°, +5°, −10° and +10°, each one overlapping the next query rotation on the CPU.]

In order to speed up the processing in the suggested implementation of the algorithm, the original template and query field orientation maps are correlated first. While the original query field orientation matrix is evaluated by the hardware coprocessor, a rotated version of the query field orientation matrix can be computed in parallel by the system CPU, to be processed in the next stage.
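The three query map sizes quoted above are consistent with the bounding box of a rotated 32×56 rectangle, truncated to whole blocks. The thesis does not state that the sizes were derived this way, so the following is only a plausibility check; the function name `rotated_size` is hypothetical.

```python
import math
from typing import Tuple


def rotated_size(a: int, b: int, theta_deg: float) -> Tuple[int, int]:
    """Bounding box of an a-by-b rectangle rotated by theta, truncated to blocks."""
    th = math.radians(abs(theta_deg))
    first = a * math.cos(th) + b * math.sin(th)
    second = a * math.sin(th) + b * math.cos(th)
    return math.floor(first), math.floor(second)


for theta in (0, 5, 10):
    print(theta, rotated_size(32, 56, theta))
```

For θ = 5° the untruncated box is about 36.76 × 58.58 blocks, and for θ = 10° about 41.24 × 60.71 blocks, which truncate to 36×58 and 41×60 respectively.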
Once all five correlation processes have taken place, the application either returns the best alignment found (which meets the alignment requirements ThresholdOv and ThresholdFO), or reports that no alignment compliant with the imposed requirements has been found.

The CPU acts as the master of the processing, and is in charge of (i) configuring the hardware controller, (ii) computing the rotated versions of the original query matrix to be correlated against the template matrix, (iii) transferring to the hardware coprocessor the original template matrix as well as the original and rotated versions of the query matrix to be processed, (iv) supervising the alignment process flow, and (v) capturing the alignment results for each template-query case under study. The hardware coprocessor acts as a slave controller, and is in charge of (vi) correlating the specified template and query matrices, and (vii) indicating, in each scenario (θ), the best alignment similarity score SFO found, as well as the corresponding spatial alignment parameters (X, Y).

5.6.3. Physical Implementation – Design Development – Design Implementation

Of the two processing stages that take part in the alignment (field orientation maps alignment and minutiae sets alignment), the one with the higher computational demand is clearly the field orientation maps alignment, even when both fingerprints present a large number of minutia points.

Field orientation maps alignment

A total of five versions of the query field orientation map are correlated with the template field orientation map in the alignment stage. Only one version of the query field orientation map can be correlated with the template field orientation map at a time.
The hardware coprocessor in charge of the correlation analysis is provided with the following input circuitry:
- two dual-port memories in which to store the field orientation maps to be processed; and
- some specific registers through which the system CPU manages the correlation process (Start correlation command flag, ThresholdOv parameter, and the size of the query field orientation matrix to be correlated against the template field orientation matrix, given as number of rows and columns QRows and QColumns, among others).

For each correlation analysis, the hardware coprocessor generates the following output information, saved into dedicated registers:
- the displacement offsets X and Y that need to be applied to the template field orientation matrix in order to find the best correspondence between the template and query field orientation matrices under evaluation;
- the similarity score SFO of that best alignment;
- the number of non-segmented overlapped blocks KRoI that result from that alignment; and
- one specific flag (End command) that is set once the correlation analysis of both input field orientation matrices is finished.
Figure 118. Alignment hardware coprocessor. [Block diagram: 32-bit input data written to the input registers (Start, ThresholdOv, QRows, QColumns, QContext) and to the banks of dual-port RAMs; template read path (Select_TRows, Select_TColumns) through a 64×8 to 32×8 multiplexer, with φ′T[47:16] holding the template row and φ′T[63:48] and φ′T[15:0] each holding 16 × "255", so that φT[31:0] = φ′T[31+Z:Z], Z ∈ [0,31]; query read path (Select_QRows, Select_QColumns, QContext) through a 48×8 to 32×8 multiplexer, so that φQ[31:0] = φ′Q[31+K:K], K ∈ [0,16]; 32 ALUs computing MIN(|A−B|, 180−|A−B|) on each pair φQ[i], φT[i], followed by 32 ALUs computing MIN(C,D) and a "≠255?" check that produce ∆φi and ni; adder trees and accumulators for Σni, Σ∆φi and Σ∆φi²; F.O. maps alignment core; output registers X, Y, SFO, KRoI, End.]

The alignment coprocessor is seen by the system CPU as a memory-mapped peripheral.
The system CPU first needs to configure the application by initializing the coprocessor configuration registers, as well as downloading the field orientation maps of template and query fingerprints into its dual-port memories. As soon as the original template and query field orientation matrices are loaded into the internal memories of the hardware coprocessor, and the configuration registers are set up, the correlation analysis can be started by setting the Start command flag in one of the configuration registers. The coprocessor notifies the end of the processing to the system CPU either through the activation of the End flag in one output register, or through the generation of one dedicated interrupt. The alignment stage is split into five correlation processes. At the end of each correlation process, the system CPU evaluates the partial alignment result SFO. The best alignment out of the five correlation processes becomes the alignment result of template and query fingerprints.

The block diagram of the alignment hardware coprocessor is depicted in Figure 118. A dual-port memory block of size 32×64 bytes is enough to store the template field orientation matrix (of fixed size 32×56 components, where each component is encoded in one byte). A dual-port memory block of size 48×128 bytes, split into a dual-context memory of size 2×(48×64) bytes, is used to store two different versions of the query field orientation matrix (with a maximum size of 41×60 components). Using a dual-context dual-port memory for the query fingerprint allows the system CPU to compute and save in one context one rotated version of the query field orientation map while the hardware coprocessor correlates the template matrix with another version of the query matrix that is saved in the other context.
In this way, from the hardware coprocessor perspective, the correlation process is performed in the foreground, while the query rotation process runs in the background, driven by the system CPU. This technique speeds up the processing since, as soon as one correlation analysis is finished, a new version of the query can already be available for further processing in the other context of the memory. By switching the context, the alignment stage continues without any extra latency. One specific bit, QContext, in one of the registers of the input interface lets the system CPU select which context stores the version of the query matrix to be correlated with the template at any time. To avoid any latency, it has been checked that the rotation of the query matrix carried out by the system CPU takes less time than the correlation process performed by the hardware coprocessor (when correlating any of the query matrices).

The alignment coprocessor correlates one row, that is, up to 32 components of the template and query field orientation matrices, in parallel. To allow this parallel processing, multiplexers connected downstream of the dual-port memory blocks select which 32 components of the template and query matrices are going to be correlated. Since the width of the template matrix is just 32, and in order to allow the processing of partial overlaps between the template and query matrices, the template multiplexer can expand one row of the template with up to 16 components on the left or on the right, which allows the correlation of partial overlaps at least 16 blocks wide. The complementary components that are added, either on the left or on the right, are coded as 255 to indicate non-valid field orientations, which are disregarded in the processing.
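The template row expansion described above can be modelled in software as follows. This is only a minimal sketch: the function name `template_window`, the offset argument `z` and the choice of which side counts as "left" are assumptions made for illustration, not details taken from the hardware design.

```python
INVALID = 255  # code used in the text for non-valid field orientations


def template_window(row, z, pad=16, width=32):
    """Select a 32-component window from a template row padded with 255.

    The row is extended with `pad` invalid components on each side, and the
    window starts at offset `z` of the padded row (hypothetical convention).
    """
    if len(row) != width:
        raise ValueError("a template row must have exactly 32 components")
    if not 0 <= z <= 2 * pad:
        raise ValueError("offset outside the padded row")
    padded = [INVALID] * pad + list(row) + [INVALID] * pad
    return padded[z:z + width]


row = list(range(32))
full_overlap = template_window(row, 16)     # window equals the original row
partial_overlap = template_window(row, 0)   # 16 invalid + first 16 components
```

With `z = 16` the window is the unmodified row; moving `z` towards 0 or 32 leaves only a 16-component partial overlap, the minimum width mentioned in the text.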
Moreover, the query matrix is extended with a frame of 2×2 components, also coded as 255, in order to allow the processing of the 5×5 kernels for each of the components, even those on the edges or in the corners of the matrices.

The partial evaluation results of 32 template and 32 query components are computed in one clock. Up to 25 clocks are needed to correlate each of the 32 components in one row of the template matrix against its 5×5 corresponding components in the query matrix. After 25 system clocks, up to 32 partial results are obtained, consisting of:
- the orientation difference factor ∆φi, ∆φi ∈ ([0,90] ∪ {255}), where 255 identifies segmented blocks present in either the template or the query matrix; and
- the valid block factor, ni ≡ {1 if ∆φi ∈ [0,90]; 0 otherwise}.

A total of 16 system clocks are needed to add up each set of 32 partial values corresponding to one overlapped row. Specific accumulators are used to compute the partial results Σ ni, Σ ∆φi and Σ ∆φi² obtained in the processing of each of the rows of one overlapped region.

A partially-pipelined coprocessor has been developed, as indicated in the processing flow diagram of Figure 119.
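The per-component arithmetic described above (orientation difference, 5×5 kernel and valid block factor) can be sketched as a software model. The function names are hypothetical, and taking the minimum over the 25 kernel values is an assumption based on the MIN(C,D) stage shown in Figure 118.

```python
INVALID = 255  # marks segmented or padded blocks


def angle_diff(a, b):
    """MIN(|A-B|, 180-|A-B|) for orientations in [0,180); 255 if either is invalid."""
    if a == INVALID or b == INVALID:
        return INVALID
    d = abs(a - b)
    return min(d, 180 - d)


def correlate_component(t, kernel):
    """Smallest difference between template value t and its 25 query kernel values."""
    return min(angle_diff(t, q) for q in kernel)


def accumulate_row(diffs):
    """Return (sum of n_i, sum of dphi_i, sum of dphi_i squared) over valid results."""
    valid = [d for d in diffs if d != INVALID]
    return len(valid), sum(valid), sum(d * d for d in valid)
```

Because valid differences never exceed 90, a result of 255 survives the minimum only when every comparison in the kernel involved an invalid block.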
Given a relative position of the template and query field orientation matrices with N overlapped rows, the correlation analysis time (in system clocks) can be computed as:

\[ t = (25 \times N) + 16 + t_0 \]

where t0 is the time (in system clocks) needed to compute the similarity score SFO from the partial results K, ΣOv ∆φi and ΣOv ∆φi² as follows:

\[ S_{FO} = \overline{\Delta\phi}_{Ov} + \sigma_{Ov} \]

where:

\[ K = \sum_{i \in Ov} n_i \qquad \overline{\Delta\phi}_{Ov} = \frac{\sum_{i \in Ov} \Delta\phi_i}{K} \]

\[ \sigma_{Ov}^2 = \frac{\sum_{i \in Ov} \left( \Delta\phi_i - \overline{\Delta\phi}_{Ov} \right)^2}{K} = \frac{\sum_{i \in Ov} \Delta\phi_i^2}{K} - \left( \overline{\Delta\phi}_{Ov} \right)^2 \]

[Figure 119: timing diagram with three hardware layers. Layer 1 performs the correlation analysis of rows 1 to N (25 CLKs per row), layer 2 accumulates Σni, Σ∆φi and Σ∆φi² for each row (16 CLKs per row), and layer 3 computes SFO (t0 CLKs) after row N.]

Figure 119. Partially-pipelined correlation process.

In the current application, t0 = 15 clocks, ThresholdOv = 400, and ThresholdFO = 180. Once the correlation analysis of one relative position between the template and query matrices is performed, the template is moved over the query in order to reach a new relative position to be correlated. A large number of relative positions between both matrices are evaluated. The hardware coprocessor keeps in its output registers the parameters of the best alignment found during the processing; therefore, at the end of one specific correlation stage, those registers hold the best alignment found. Since the system CPU knows which version of the query field orientation map is evaluated in each correlation analysis, by comparing the best score SFO of each of the five complete correlation processes, the system CPU can easily determine which version (rotated or non-rotated) of the query best aligns with the template and, from this, the alignment triplet (∆X, ∆Y, ∆Φ) considered as the final result.
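As a worked illustration of the two formulas above, the following sketch evaluates the correlation time and the score SFO from the three accumulated values. The function names are hypothetical, and the clamp against small negative variances is an added numerical safeguard rather than part of the described hardware.

```python
import math


def correlation_clocks(n_rows, t0=15):
    """Clocks for one relative position: t = 25*N + 16 + t0 (t0 = 15 in the text)."""
    return 25 * n_rows + 16 + t0


def sfo(k, sum_dphi, sum_dphi2):
    """S_FO = mean + standard deviation, from K, sum(dphi) and sum(dphi^2)."""
    if k <= 0:
        raise ValueError("at least one valid block is required")
    mean = sum_dphi / k
    variance = sum_dphi2 / k - mean * mean
    return mean + math.sqrt(max(variance, 0.0))


clocks = correlation_clocks(32)   # 25*32 + 16 + 15 = 831 clocks
score = sfo(2, 40, 1000)          # differences 10 and 30: mean 20, deviation 10
```

Using the second form of the variance means only three running sums have to be kept per overlapped region, which matches the accumulators named in the text.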
Concerning the application execution time, the alignment process has been identified as the most critical task, which to a large extent sets the execution time of the whole authentication process. Because of that, two levels of parallelism exist in the alignment stage:
- At system level, the CPU and the hardware coprocessor work in parallel. While the hardware coprocessor correlates one version of the query matrix against the template matrix, a new rotated version of the query matrix is computed by the CPU and saved in the internal memory for subsequent processing.
- At hardware level, the coprocessor is able to correlate one complete row of the overlapped region in parallel.

These features improve the execution time of the suggested alignment process when it is implemented by means of hardware-software codesign techniques, compared with a purely software implementation of the same alignment algorithm on the system CPU.

Minutiae sets alignment

The minutiae sets alignment stage has been kept as a software task, to be executed by the system CPU in each of the embedded system platforms tested in this work. The reason is the reduced workload exhibited by this operation when executed on mid- or low-performance CPUs. The execution time of this stage depends on the number of minutia points present in the query fingerprint, since they are aligned with the template fingerprint.

5.7. Fingerprints Matching Approach

The last stage of the suggested fingerprint recognition algorithm corresponds to the matching of the previously aligned features: the field orientation maps and the minutiae sets of the template and query fingerprints.
In case of successful alignment of the template and query fingerprints, the matching process aims at determining the degree of similarity between both images. For this purpose, the overlapped region between both prints becomes the region of interest, and the matching process consists of quantifying the level of correspondence between the feature sets of the template and query fingerprints available in the region of interest. The similarity score between both fingerprints is quantified in the range [0,1]. The higher the score, the more similar the fingerprints. Finally, the application needs to answer with a Boolean response: either the template and query prints belong to the same finger/user or they do not. The comparison of the similarity score with a certain threshold (which is set by the final application that makes use of the biometric recognition, and is normally influenced by factors like security, reliability, etc.) determines the authentication result.

5.7.1. Algorithm Description

The fingerprints matching process flow is depicted in Figure 120. If the previous stage, the alignment of the template and query fingerprints, is unsuccessful, this indicates that the template and query prints do not belong to the same user, so the matching of the extracted features is skipped and the recognition system automatically assesses both prints as dissimilar (null similarity). On the contrary, when some alignment is possible between the template and query fingerprints, the matching stage aims at quantifying how similar or dissimilar the fingerprints are. In this scenario, the matching process takes place.
It is split into five sequential steps: (i) RoI retrieval, (ii) field orientation matching in RoI, (iii) minutiae sets matching in RoI, (iv) similarity score computation, and finally, one additional step called (v) authentication result decision, which provides the answer to the question "Is the user (query) the person he or she claims to be (template)?" by comparing the computed similarity score with one threshold fixed by the application.

[Figure 120: flow chart. START; alignment found? If yes, the fingerprints matching phase runs: region of interest (RoI) retrieval, F.O. maps (RoI) matching, minutiae sets (RoI) matching and similarity score computation. If no, that phase is skipped. Both branches reach the authentication result decision, then END.]

Figure 120. Fingerprints matching algorithm: processing stages.

RoI retrieval

In case of successful alignment of the template and query fingerprints, a rectangular region of size NRoI×MRoI blocks, delimited by the overlapped region between both prints, is considered as the region of interest for matching purposes:

\[ \text{Region of Interest} \equiv RoI_{FO} \equiv T_{FO} \cap Q_{FO} \equiv RoI_{FO}(i,j) \quad \forall i \in 1 \ldots N_{RoI},\ j \in 1 \ldots M_{RoI} \]

The overlapped region is at least ThresholdOv blocks in size, and contains KRoI non-segmented blocks:

\[ N_{RoI} \times M_{RoI} \geq Threshold_{Ov} \qquad K_{RoI} \leq N_{RoI} \times M_{RoI} \]

The regions of the template and query fingerprints outside the overlapped region are not taken into consideration in the matching stage: the minutia points outside the RoI and the portions of the field orientation maps outside the RoI are disregarded in the matching process. Only the features within the RoI are used as valid features to determine the degree of similarity between both fingerprints. Therefore, once the RoI is identified, the next step consists of determining which features (field orientation submaps and minutiae subsets) have to be considered as valid information for each fingerprint.
Field orientation maps matching

The determination of the RoI and the computation of the similarity score SFO related to the field orientation maps in the RoI were already done in the alignment process, as described in section 5.6. A new level of similarity ScoreFO, limited to the range [0,1], is then computed by applying the following transformation to the original score SFO:

\[ Score_{FO} = \begin{cases} \dfrac{20 - S_{FO}}{20}, & \text{if } S_{FO} < 20 \\ 0, & \text{otherwise} \end{cases} \]

The higher the value of the new score, the more similar the template and query field orientation maps in the RoI.

Minutiae sets matching

Although the proposed fingerprint alignment algorithm does not rely on the existence of common landmarks such as singularities or minutia points for alignment purposes, the suggested matching process focuses to some extent on the spatial distribution of minutia points to determine the result of the fingerprints comparison stage. A similarity score ScoreM is deduced from the analysis of the template and query minutiae sets within the RoI.

Minutiae matching is generally handled as a point pattern matching problem. In the proposed algorithm, which is based on structural analysis, the similarity between minutiae sets is evaluated at two levels. The computation of ScoreM depends on two factors: (i) the similarity score of local structures, SLM, and (ii) the similarity score of global structures, SGM, within the RoI. Given the template minutiae set Tm composed of N minutia points, it is possible to identify the N′ (N′ ≤ N) minutia points within the RoI as the subset TRoI:

\[ T_{RoI} = \{ (x_1^T, y_1^T, \beta_1^T), (x_2^T, y_2^T, \beta_2^T), \ldots, (x_{N'}^T, y_{N'}^T, \beta_{N'}^T) \}, \quad T_{RoI} \subseteq T_m \]

The same operation can be done with the already aligned query minutiae set Q′m, composed of M minutia points, and its M′ (M′ ≤ M) minutia points available in the RoI:

\[ Q_{RoI} = \{ (x_1'^Q, y_1'^Q, \beta_1'^Q), (x_2'^Q, y_2'^Q, \beta_2'^Q), \ldots, (x_{M'}'^Q, y_{M'}'^Q, \beta_{M'}'^Q) \}, \quad Q_{RoI} \subseteq Q'_m \]

The minutia points within the RoI are identified in the template and query fingerprints, and for each potential template-query minutia pair, a minutia-related similarity score SM is computed as:

\[ S_M(i,j) = S_{LM}(i,j) \cdot S_{GM}(i,j) \quad \forall i \in T_{RoI},\ i \in [1 \ldots N'],\ \forall j \in Q_{RoI},\ j \in [1 \ldots M'] \]

where SM ∈ [0,1], SLM ∈ [0,1], SGM ∈ [0,1], SLM(i,j) denotes the similarity level between the local structures of minutia i in TRoI and minutia j in QRoI, and SGM(i,j) refers to the similarity level of the global structures of both minutia points.

▪ Minutiae sets local analysis

For the purpose of analysing the local structures of both minutiae sets, representations that are translation and rotation invariant are computed. Each minutia point in the RoI is characterized by the local structure of its neighbour minutiae. In the local descriptor of a minutia, the relative Euclidean distances d and angles φ between the specific minutia and its W closest minutia neighbours, as well as the relative ridge directions γ between the specific minutia and its W closest minutia neighbours, are computed, as depicted in Figure 121.

Figure 121. Fingerprint minutia descriptor (d,φ,γ), where (x0,y0,β0) is a ridge ending used as reference and (x1,y1,β1) is a ridge bifurcation used as neighbour of the reference minutia.

In the suggested application, the parameter W has been set to W = 8; therefore, each minutia point is characterized by eight triplets (d,φ,γ) linked to the eight closest neighbours located within a circular region of radius R = 100 pixels around the minutia, as indicated in Figure 122.

Figure 122. Characterization of minutia m0 from its 8 closest neighbours m1, … , m8.
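The neighbour selection behind Figure 122 can be sketched as below. This is an illustrative model only: the function name `nearest_neighbours` and the representation of a minutia as an `(x, y, beta)` tuple are assumptions, while W = 8 and R = 100 pixels come from the text.

```python
import math


def nearest_neighbours(minutiae, index, w=8, radius=100.0):
    """Indices of the (at most) w closest minutiae within `radius` of minutiae[index]."""
    x0, y0, _ = minutiae[index]
    candidates = []
    for k, (x, y, _) in enumerate(minutiae):
        if k == index:
            continue  # a minutia is not its own neighbour
        d = math.hypot(x - x0, y - y0)
        if d <= radius:
            candidates.append((d, k))
    candidates.sort()  # closest first
    return [k for _, k in candidates[:w]]


points = [(0, 0, 0), (10, 0, 0), (0, 50, 0), (200, 0, 0), (3, 4, 0)]
neighbours = nearest_neighbours(points, 0)  # the point at distance 200 is excluded
```

A minutia with fewer than W neighbours inside the circle simply gets a shorter list, so its remaining descriptor slots stay unused.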
Only the minutia points within the RoI are taken into account when performing the local analysis of both minutiae sets. Given a minutia point under study, with absolute coordinates (x0,y0,β0), and one of its neighbour minutiae, with absolute coordinates (x1,y1,β1), the relative parameters (d,φ,γ) are computed as follows:

\[ d = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2} \]

\[ \alpha = \tan^{-1}\left( \frac{y_1 - y_0}{x_1 - x_0} \right) \qquad \phi = \alpha - \beta_0 \qquad \gamma = \beta_1 - \beta_0 \]

Relative distances and angles between the reference minutia and its neighbours are established in order to unequivocally define each point of interest. Once all minutia points in TRoI and QRoI are characterized, the next step consists of matching those structural definitions in order to compute the level of correspondence between both minutiae sets.

\[ \text{Minutia } m_i \in T_{RoI}, \quad m_i \equiv \{ (d_{i1}, \phi_{i1}, \gamma_{i1}), (d_{i2}, \phi_{i2}, \gamma_{i2}), \ldots, (d_{iu}, \phi_{iu}, \gamma_{iu}), \ldots, (d_{i8}, \phi_{i8}, \gamma_{i8}) \} \]

\[ \text{Minutia } m_j \in Q_{RoI}, \quad m_j \equiv \{ (d_{j1}, \phi_{j1}, \gamma_{j1}), (d_{j2}, \phi_{j2}, \gamma_{j2}), \ldots, (d_{jv}, \phi_{jv}, \gamma_{jv}), \ldots, (d_{j8}, \phi_{j8}, \gamma_{j8}) \} \]

For this purpose, the minutia points in TRoI are taken as reference. Based on the local structure of each minutia point in TRoI, tolerance boxes are considered for the three parameters in the minutia descriptor (d, φ and γ) so that the matching algorithm tolerates the elastic deformations inherent to fingerprints.

\[ \forall i \in 1 \ldots N',\ \forall u \in 1 \ldots 8: \]

\[ Le\_d_{iu} = f_1(d_{iu}),\ 0 \leq Le\_d_{iu} \leq Th_d \qquad Le\_\phi_{iu} = f_2(d_{iu}),\ 0 \leq Le\_\phi_{iu} \leq Th_\phi \qquad Le\_\gamma_{iu} = f_3(d_{iu}),\ 0 \leq Le\_\gamma_{iu} \leq Th_\gamma \]

\[ d_{iu\_max} = \begin{cases} d_{iu} + Le\_d_{iu}, & \text{if } (d_{iu} + Le\_d_{iu}) \leq R \\ R, & \text{otherwise} \end{cases} \qquad d_{iu\_min} = \begin{cases} d_{iu} - Le\_d_{iu}, & \text{if } d_{iu} \geq Le\_d_{iu} \\ 0, & \text{otherwise} \end{cases} \]

\[ \phi_{iu\_max} = \begin{cases} \phi_{iu} + Le\_\phi_{iu}, & \text{if } (\phi_{iu} + Le\_\phi_{iu}) < 180 \\ \phi_{iu} + Le\_\phi_{iu} - 180, & \text{otherwise} \end{cases} \qquad \phi_{iu\_min} = \begin{cases} \phi_{iu} - Le\_\phi_{iu}, & \text{if } \phi_{iu} \geq Le\_\phi_{iu} \\ \phi_{iu} - Le\_\phi_{iu} + 180, & \text{otherwise} \end{cases} \]

\[ \gamma_{iu\_max} = \begin{cases} \gamma_{iu} + Le\_\gamma_{iu}, & \text{if } (\gamma_{iu} + Le\_\gamma_{iu}) < 180 \\ \gamma_{iu} + Le\_\gamma_{iu} - 180, & \text{otherwise} \end{cases} \qquad \gamma_{iu\_min} = \begin{cases} \gamma_{iu} - Le\_\gamma_{iu}, & \text{if } \gamma_{iu} \geq Le\_\gamma_{iu} \\ \gamma_{iu} - Le\_\gamma_{iu} + 180, & \text{otherwise} \end{cases} \]

\[ \text{Tolerance box } d_{iu} = [d_{iu\_min}, d_{iu\_max}] \]

\[ \text{Tolerance box } \phi_{iu} = \begin{cases} [\phi_{iu\_min}, \phi_{iu\_max}], & \text{if } \phi_{iu\_max} \geq \phi_{iu\_min} \\ [0, \phi_{iu\_max}] \cup [\phi_{iu\_min}, 180), & \text{otherwise} \end{cases} \]

\[ \text{Tolerance box } \gamma_{iu} = \begin{cases} [\gamma_{iu\_min}, \gamma_{iu\_max}], & \text{if } \gamma_{iu\_max} \geq \gamma_{iu\_min} \\ [0, \gamma_{iu\_max}] \cup [\gamma_{iu\_min}, 180), & \text{otherwise} \end{cases} \]

Each triplet (diu,φiu,γiu) of the template minutia mi is matched against all 8 triplets (djv,φjv,γjv) of the query minutia mj, looking for the best corresponding triplet in the query minutia. The similarity score Siu_jv between triplet (diu,φiu,γiu) and triplet (djv,φjv,γjv) is considered null if the descriptor (djv,φjv,γjv) of the query minutia mj is not inside the tolerance boxes of the template descriptor (diu,φiu,γiu). For all descriptors (djv,φjv,γjv) of the query minutia within the tolerance boxes of the template triplet (diu,φiu,γiu), the similarity Siu_jv is computed as follows:

∀i ∈ 1...N′, ∀u ∈ 1...8, the tolerance boxes of (diu,φiu,γiu) are computed.
∀j ∈ 1...M′, ∀v ∈ 1...8 within the tolerance boxes of (diu,φiu,γiu), Siu_jv is computed as:

\[ \Delta d_{iu\_jv} = \begin{cases} |d_{jv} - d_{iu}|, & \text{if } |d_{jv} - d_{iu}| < Th_{\Delta d} \\ Th_{\Delta d}, & \text{otherwise} \end{cases} \]

\[ \Delta\phi_{iu\_jv} = \begin{cases} \min(|\phi_{jv} - \phi_{iu}|,\ 180 - |\phi_{jv} - \phi_{iu}|), & \text{if } \min(|\phi_{jv} - \phi_{iu}|,\ 180 - |\phi_{jv} - \phi_{iu}|) < Th_{\Delta\phi} \\ Th_{\Delta\phi}, & \text{otherwise} \end{cases} \]

\[ \Delta\gamma_{iu\_jv} = \begin{cases} \min(|\gamma_{jv} - \gamma_{iu}|,\ 180 - |\gamma_{jv} - \gamma_{iu}|), & \text{if } \min(|\gamma_{jv} - \gamma_{iu}|,\ 180 - |\gamma_{jv} - \gamma_{iu}|) < Th_{\Delta\gamma} \\ Th_{\Delta\gamma}, & \text{otherwise} \end{cases} \]

\[ S_{iu\_jv} = \eta_{\Delta d} \cdot \frac{Th_{\Delta d} - \Delta d_{iu\_jv}}{Th_{\Delta d}} + \eta_{\Delta\phi} \cdot \frac{Th_{\Delta\phi} - \Delta\phi_{iu\_jv}}{Th_{\Delta\phi}} + \eta_{\Delta\gamma} \cdot \frac{Th_{\Delta\gamma} - \Delta\gamma_{iu\_jv}}{Th_{\Delta\gamma}} \]

\[ \text{with: } \eta_{\Delta d} + \eta_{\Delta\phi} + \eta_{\Delta\gamma} = 1 \]

Among all possible query descriptors (djv,φjv,γjv) of minutia mj within the tolerance boxes of the template triplet (diu,φiu,γiu), only the one with the best similarity level is selected as the one corresponding to the original template descriptor (diu,φiu,γiu), and the similarity level of the rest of the query descriptors is set to zero. In this way, each template triplet (diu,φiu,γiu) of minutia mi corresponds to no more than one query triplet (djv,φjv,γjv) of minutia mj. This procedure is repeated for all template and query triplets in order to compute the similarity between both local structures. The different thresholds and parameters used in the algorithm have been tuned by experimental tests. Finally, the local similarity level between each minutia pair (mi, mj), mi ∈ TRoI and mj ∈ QRoI, is computed as:

\[ S_{LM}(i,j) = \sum_{u=1}^{8} \sum_{v=1}^{8} S_{iu\_jv} \]

▪ Minutiae sets global analysis

Given the template and query minutiae subsets within the RoI, TRoI and QRoI, composed of N′ and M′ minutia points respectively, both subsets can also be defined by their global structures. TRoI and QRoI can be characterized by N′ and M′ triplets (x,y,β), respectively. Once the template and query minutiae subsets are aligned, the global descriptors are directly comparable because both minutiae subsets are referenced to the same axes system.
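The triplet similarity Siu_jv used in the local analysis can be sketched as follows. The sketch assumes the query triplet has already passed the tolerance-box test, which is not modelled here. The function names and the threshold and weight values in the usage lines are placeholders chosen for illustration, since the text states only that the real values were tuned experimentally.

```python
def wrapped_diff(a, b):
    """min(|a-b|, 180-|a-b|) for angles expressed in [0,180)."""
    d = abs(a - b)
    return min(d, 180 - d)


def triplet_similarity(t, q, th, eta):
    """S_iu_jv for template triplet t and query triplet q, both (d, phi, gamma).

    th  = (Th_dd, Th_dphi, Th_dgamma): clamping thresholds
    eta = (eta_dd, eta_dphi, eta_dgamma): weights that sum to 1
    """
    deltas = (
        min(abs(q[0] - t[0]), th[0]),          # distance difference, clamped
        min(wrapped_diff(q[1], t[1]), th[1]),  # relative angle difference, clamped
        min(wrapped_diff(q[2], t[2]), th[2]),  # ridge direction difference, clamped
    )
    return sum(e * (h - x) / h for e, h, x in zip(eta, th, deltas))


th = (10, 20, 20)           # hypothetical thresholds
eta = (0.5, 0.25, 0.25)     # hypothetical weights
same = triplet_similarity((50, 30, 40), (50, 30, 40), th, eta)
shifted = triplet_similarity((50, 30, 40), (55, 30, 40), th, eta)
```

Clamping each difference at its threshold keeps every term, and therefore the weighted sum, inside [0,1].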
Although minutia matching can be treated as a point pattern matching problem, it is important to remark that minutia points cannot be considered as rigid bodies, and certain flexibility must be allowed due to the inherent elasticity of the skin and the natural non-linear deformations affecting fingerprints in the image acquisition stage. Certain tolerances are allowed in the spatial distribution and orientation of minutia points (Th∆x, Th∆y and Th∆β) when aligning and matching the global structures of the template and query minutiae sets. Given a template minutia point mi, with absolute coordinates (xi,yi,βi), and a query minutia point mj, with absolute coordinates (xj,yj,βj), the level of correspondence SGM between both is computed as follows:

\[ \Delta x = |x_j - x_i| \qquad \Delta y = |y_j - y_i| \qquad \Delta\beta = \min(|\beta_j - \beta_i|,\ 180 - |\beta_j - \beta_i|) \]

\[ S_{GM}(i,j) = \begin{cases} \eta_{\Delta x} \cdot \dfrac{Th_{\Delta x} - \Delta x}{Th_{\Delta x}} + \eta_{\Delta y} \cdot \dfrac{Th_{\Delta y} - \Delta y}{Th_{\Delta y}} + \eta_{\Delta\beta} \cdot \dfrac{Th_{\Delta\beta} - \Delta\beta}{Th_{\Delta\beta}}, & \text{if } \Delta x \leq Th_{\Delta x} \text{ and } \Delta y \leq Th_{\Delta y} \text{ and } \Delta\beta \leq Th_{\Delta\beta} \\ 0, & \text{otherwise} \end{cases} \]

\[ \text{with: } \eta_{\Delta x} + \eta_{\Delta y} + \eta_{\Delta\beta} = 1 \]

The thresholds Thk (k = ∆x, ∆y or ∆β) define the allowed tolerance ranges between global structures, and the coefficients ηk (k = ∆x, ∆y or ∆β) are the weight factors for each of the parameters that take part in the computation of the global similarity score.

▪ Minutiae sets similarity score fusion

Once the structural analysis at local and global levels is performed for both minutiae sets in the RoI, an N′×M′ similarity matrix is built with the results of the score SM for each potential minutia pair:

\[ S_M(i,j) = S_{LM}(i,j) \cdot S_{GM}(i,j) \]

By inspection of the similarity level SM for each potential template-query minutia pair, it is possible to identify the corresponding minutia pairs.
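The global similarity SGM defined above can be sketched as a small function. The name `global_similarity`, the tuple layout `(x, y, beta)` and the threshold and weight values in the usage lines are illustrative assumptions; the text does not list the tuned values.

```python
def global_similarity(t, q, th, eta):
    """S_GM for aligned template minutia t and query minutia q, both (x, y, beta).

    th  = (Th_dx, Th_dy, Th_dbeta): allowed tolerances
    eta = (eta_dx, eta_dy, eta_dbeta): weights that sum to 1
    """
    dx = abs(q[0] - t[0])
    dy = abs(q[1] - t[1])
    db = abs(q[2] - t[2])
    db = min(db, 180 - db)  # orientations wrap around at 180 degrees
    deltas = (dx, dy, db)
    if any(d > h for d, h in zip(deltas, th)):
        return 0.0  # outside the tolerance range: no correspondence
    return sum(e * (h - d) / h for e, h, d in zip(eta, th, deltas))


th = (16, 16, 32)           # hypothetical tolerances
eta = (0.5, 0.25, 0.25)     # hypothetical weights
near = global_similarity((100, 100, 10), (108, 100, 10), th, eta)
far = global_similarity((100, 100, 10), (120, 100, 10), th, eta)
```

The fused score for a pair is then the product of this value and the local score, so a pair rejected globally contributes zero regardless of its local similarity.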
Defining MP as the number of corresponding template-query minutia pairs found within the RoI, and K as the minimum number of minutia points of one set (template or query) available in the RoI:

\[ K = \min(N', M') \]

the level of similarity between both fingerprints is computed as:

\[ Score_M = \begin{cases} \dfrac{MP}{K}, & \text{if } MP \geq Threshold_{MP} \text{ and } K \geq Threshold_K \\ 0, & \text{otherwise} \end{cases} \]

A minimum number of minutia points, ThresholdK, is needed in both minutiae subsets TRoI and QRoI, and a minimum number of corresponding minutia pairs, ThresholdMP, is required in order to find some similarity between both minutiae sets in the RoI. In this work, ThresholdK = 10 and ThresholdMP = 7. Figure 123 shows one example of template-query minutia pairing as a result of the fingerprint alignment and minutiae matching processes.

Figure 123. Corresponding template-query minutia pairs on the aligned images.

Similarity score computation

A global similarity score ScoreMatch between the template and query fingerprints is computed from the previously obtained partial scores ScoreFO and ScoreM as follows:

\[ Score_{Match} = \begin{cases} \eta_1 \cdot Score_{FO} + \eta_2 \cdot Score_M, & \text{if } Score_{FO} \neq 0 \text{ and } Score_M \neq 0 \\ 0, & \text{otherwise} \end{cases} \]

\[ \text{where: } \eta_1 + \eta_2 = 1, \quad Score_{FO} \in [0,1],\ Score_M \in [0,1],\ Score_{Match} \in [0,1] \]

Authentication result decision

The global similarity score ScoreMatch is compared against one matching threshold ThresholdMatch in order to determine the final authentication result.
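The chain of scores described in this section, from SFO and MP to the final decision, can be sketched as follows. The constants 20, ThresholdMP = 7 and ThresholdK = 10 come from the text; the weight `eta1 = 0.5`, the example threshold of 0.6 and all function names are hypothetical.

```python
def score_fo(s_fo):
    """Map S_FO to [0,1]: (20 - S_FO)/20 below 20, otherwise 0."""
    return (20 - s_fo) / 20 if s_fo < 20 else 0.0


def score_m(mp, n_template, n_query, threshold_mp=7, threshold_k=10):
    """MP/K when enough minutiae and enough pairs exist, otherwise 0."""
    k = min(n_template, n_query)
    if mp >= threshold_mp and k >= threshold_k:
        return mp / k
    return 0.0


def score_match(fo, m, eta1=0.5):
    """Weighted fusion; null if either partial score is null (eta2 = 1 - eta1)."""
    if fo == 0 or m == 0:
        return 0.0
    return eta1 * fo + (1 - eta1) * m


def authenticate(score, threshold_match):
    """True for a match, False for a non-match."""
    return score >= threshold_match


fused = score_match(score_fo(5), score_m(8, 16, 20))   # 0.5*0.75 + 0.5*0.5
accepted = authenticate(fused, 0.6)                    # 0.6 is a placeholder threshold
```

Requiring both partial scores to be non-zero means that neither the orientation maps nor the minutiae alone can produce a match.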
Only two answers are possible: either both fingerprints come from the same individual, which confirms that the user is the person he/she claims to be; or both fingerprints belong to different fingers/users, so the user is not recognized as the claimed person:

\[ \text{Authentication result} = \begin{cases} \text{match}, & \text{if } Score_{Match} \geq Threshold_{Match} \\ \text{non-match}, & \text{otherwise} \end{cases} \]

5.7.2. Hw/Sw Partitioning – Hardware Accelerators Topology – Hw Acceleration

The implementation of the matching stage in the proposed application is done by means of hardware-software codesign techniques. Table 56 shows the list of tasks that take part in the matching process.

Task | Description | Main parameters
1 | RoI retrieval | Size NRoI × MRoI blocks
2 | Field orientation maps matching (ScoreFO) | ScoreFO = f(SFO)
3 | Minutiae sets matching (ScoreM): local analysis (SLM), global analysis (SGM), similarity score fusion (SM) | SM = SLM ⋅ SGM; ScoreM = f(SM)
4 | Similarity score computation (ScoreMatch) | ScoreMatch = f(ScoreFO, ScoreM)
5 | Authentication result decision | Authentication result = f(ScoreMatch, ThresholdMatch)

Table 56. Tasks involved in the fingerprints matching stage.

The first two tasks are already done at this stage since the alignment process is in charge of both operations. It is only required to compute the new score ScoreFO from the previously computed score SFO. This specific task remains a software task to be executed by the system CPU on any of the embedded system platforms tested in this work. The next task deals with the matching of the minutiae sets present in the RoI. This task is split into three sub-tasks. A specific hardware coprocessor is in charge of the local similarity analysis of both minutiae sets because of the complexity of the mathematical operations to be carried out in this stage.
The remaining sub-tasks, namely the global analysis of the minutiae sets within the RoI and the fusion of both minutia-related similarity scores into one single similarity score ScoreM, are kept as software tasks since their workload is quite low when executed on mid- or low-performance CPUs. Finally, (i) the fusion of the similarity scores related to the field orientation maps and the minutiae sets, and (ii) the subsequent decision about the authentication result are simple tasks, both implemented in software and also executed by the system CPU. The profiling of the fingerprint matching stage on the different low-cost processors integrated into the embedded system platforms evaluated in this work confirms that, with the suggested partitioning, acceptable matching execution times can be reached on each of the platforms.

5.7.3. Physical Implementation – Design Development – Design Implementation

In this section, the physical implementation of the minutiae-related similarity score computation algorithm is covered. It is split into three sub-tasks: minutiae sets local analysis, minutiae sets global analysis, and minutiae sets score fusion.

Minutiae sets local analysis

A hardware coprocessor based on a finite state machine is developed to address the local similarity analysis of the template and query minutiae sets. The hardware coprocessor makes use of both FIFO-based and register-based interfaces, as depicted in Figure 124.

[Figure 124: system block diagram. The system CPU, non-volatile memory, volatile memory, interrupt controller, timer controller and other peripherals share the system bus with the hardware coprocessor(s) block, which is accessed through input and output FIFO interfaces and input and output register interfaces, and which raises interrupts to the CPU.]

Figure 124. Integration of the hardware matching coprocessor in the embedded system: basic interfaces.

The proposed system architecture allows the hardware coprocessor to generate interrupts to the system CPU to signal when the hardware processing is finished. In parallel with the interrupt event, some flags are set in the output registers to provide flexibility in the implementation of the algorithm in the embedded system (either flag polling or interrupt service strategies can be configured). The inputs of the minutiae sets local similarity matching stage are the local structures (d,φ,γ) corresponding to the template and query minutiae subsets TRoI and QRoI, and the output is the deduced local similarity matrix SLM for each template-query minutia pair within the RoI. Two specific dual-port memories are instantiated in the hardware coprocessor, as shown in Figure 125. The first one is used to save the structural definitions of the template and query fingerprints. Initially, the template and query fingerprints need to be processed in order to deduce the structural definition of each minutia point according to the spatial distribution of its neighbours. This task is carried out by the system CPU; no specific hardware coprocessor is in charge of it. However, in order to speed up the computation of the triplets (d,φ,γ), two specific computational functions, atan2() and sqrt(), have been implemented in hardware. These two functions are accessible to the system CPU through a register-based interface, as shown in Figure 125. The input data to be processed is written to the registers Y and X, and the result is stored in the registers atan2(Y,X) and sqrt(Y:X). Start/stop flags are used to control the processing flow of the computations. Once the system CPU computes the local structures (d,φ,γ) of the template and query fingerprints, this information, initially saved in the system memory, needs to be transferred to the hardware coprocessor.
This information is directly downloaded from the system memory to the input FIFO, and from the input FIFO it is transferred to the internal dual-port memory at the initial stage of the hardware processing. The hardware coprocessor is in charge of reading the local structures and computing the local similarity analysis between template and query minutia pairs. For this purpose, the hardware coprocessor makes use of some configurable parameters accessible through the register interface (parameters N′, M′, R, W, η∆d, η∆φ, η∆γ, etc. in Figure 125). The local similarity scores SLM are saved in a second dedicated dual-port memory. Once the processing is completed, the content of that dual-port memory is transferred to the output FIFO memory, and from that FIFO it is transferred to the external system memory at the end of the stage.

[Figure 125: block diagram of the matching coprocessor. Input registers (η∆d, η∆φ, η∆γ, Th∆d, Th∆φ, Th∆γ, N′, M′, R, W, Start SLM, Start atan2(), Start sqrt(), Y, X); a dual-port RAM holding the local structure sets (d,φ,γ)iu of TRoI (i = 1 … N′, u = 1 … W) and (d,φ,γ)jv of QRoI (j = 1 … M′, v = 1 … W); an FSM-based local similarity analysis core; atan2() and sqrt() cores; a dual-port RAM holding the local similarity SLM(i,j); and output registers (Done SLM, Done atan2(), Done sqrt(), atan2(Y,X), sqrt(Y:X)). Input and output data paths are 32 bits wide.]

Figure 125. Matching hardware coprocessor.

The processing flow is shown below.
BEGIN
1) System initialization {R, W, N’, M’, Th∆d, Th∆φ, Th∆γ, η∆d, η∆φ, η∆γ}
2) Initialization TRoI local structures
   FOR i = 1 to N’ DO
      FOR u = 1 to W DO
         diu = R+1; φiu = 0; γiu = 0;
      END FOR
   END FOR
3) TRoI local neighbourhood analysis
   FOR i = 1 to N’-1 DO
      Read TRoI[i] = (xi,yi,βi)
      FOR j = i+1 to N’ DO
         Read TRoI[j] = (xj,yj,βj)
         dmax_i = 0; dmax_j = 0;
         FOR u = 1 to W DO
            IF diu ≥ dmax_i
               dmax_i = diu; umax_i = u;
            END IF
            IF dju ≥ dmax_j
               dmax_j = dju; umax_j = u;
            END IF
         END FOR
         Computation $d_{ij} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}$
         IF dij < dmax_i
            d_{i,umax_i} = dij;
         END IF
         IF dij < dmax_j
            d_{j,umax_j} = dij;
         END IF
      END FOR
   END FOR
   FOR i = 1 to N’ DO
      Read TRoI[i] = (xi,yi,βi)
      FOR u = 1 to W DO
         IF diu ≤ R
            Read neighbour minutia u ≡ TRoI[j] = (xj,yj,βj)
            Computation $\varphi_{iu} = \tan^{-1}\left(\frac{y_j - y_i}{x_j - x_i}\right) - \beta_i$
            Computation $\gamma_{iu} = \beta_j - \beta_i$
         END IF
      END FOR
   END FOR
4) Initialization QRoI local structures
   FOR j = 1 to M’ DO
      FOR v = 1 to W DO
         djv = R+1; φjv = 0; γjv = 0;
      END FOR
   END FOR
5) QRoI local neighbourhood analysis
   FOR i = 1 to M’-1 DO
      Read QRoI[i] = (xi,yi,βi)
      FOR j = i+1 to M’ DO
         Read QRoI[j] = (xj,yj,βj)
         dmax_i = 0; dmax_j = 0;
         FOR v = 1 to W DO
            IF div ≥ dmax_i
               dmax_i = div; vmax_i = v;
            END IF
            IF djv ≥ dmax_j
               dmax_j = djv; vmax_j = v;
            END IF
         END FOR
         Computation $d_{ij} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}$
         IF dij < dmax_i
            d_{i,vmax_i} = dij;
         END IF
         IF dij < dmax_j
            d_{j,vmax_j} = dij;
         END IF
      END FOR
   END FOR
   FOR j = 1 to M’ DO
      Read QRoI[j] = (xj,yj,βj)
      FOR v = 1 to W DO
         IF djv ≤ R
            Read neighbour minutia v ≡ QRoI[i] = (xi,yi,βi)
            Computation $\varphi_{jv} = \tan^{-1}\left(\frac{y_i - y_j}{x_i - x_j}\right) - \beta_j$
            Computation $\gamma_{jv} = \beta_i - \beta_j$
         END IF
      END FOR
   END FOR
6) TRoI - QRoI local similarity analysis
   FOR i = 1 to N’ DO
      FOR j = 1 to M’ DO
         FOR u = 1 to W DO
            Read TRoI[i][u] = (diu,φiu,γiu)
            IF diu ≤ R
               Computation Tolerance Boxes = f(diu).
               FOR v = 1 to W DO
                  Read QRoI[j][v] = (djv,φjv,γjv)
                  IF djv ≤ R
                     Computation ∆diu_jv, ∆φiu_jv, ∆γiu_jv
                     Computation $S_{iu\_jv} = \eta_{\Delta d} \cdot \frac{Th_{\Delta d} - \Delta d_{iu\_jv}}{Th_{\Delta d}} + \eta_{\Delta\varphi} \cdot \frac{Th_{\Delta\varphi} - \Delta\varphi_{iu\_jv}}{Th_{\Delta\varphi}} + \eta_{\Delta\gamma} \cdot \frac{Th_{\Delta\gamma} - \Delta\gamma_{iu\_jv}}{Th_{\Delta\gamma}}$
                  ELSE
                     Siu_jv = 0
                  END IF
               END FOR
            ELSE
               Siu_jv = 0, ∀v ∈ [1,W]
            END IF
         END FOR
         Computation $SLM[i][j] = \sum_{u=1}^{W} \sum_{v=1}^{W} S_{iu\_jv}$
         Write SLM[i][j]
      END FOR
   END FOR
END

Minutiae sets global analysis

The minutiae sets global analysis of template and query fingerprints is kept as software task executed by the system CPU. The processing flow is shown below:

BEGIN
1) System initialization {R, W, N’, M’, Th∆x, Th∆y, Th∆β, η∆x, η∆y, η∆β}
2) TRoI - QRoI global similarity analysis
   FOR i = 1 to N’ DO
      Read TRoI[i] = (xi,yi,βi)
      FOR j = 1 to M’ DO
         Read QRoI[j] = (xj,yj,βj)
         Computation ∆x, ∆y, ∆β
         Computation $SGM[i][j] = \eta_{\Delta x} \cdot \frac{Th_{\Delta x} - \Delta x}{Th_{\Delta x}} + \eta_{\Delta y} \cdot \frac{Th_{\Delta y} - \Delta y}{Th_{\Delta y}} + \eta_{\Delta\beta} \cdot \frac{Th_{\Delta\beta} - \Delta\beta}{Th_{\Delta\beta}}$
      END FOR
   END FOR
END

Minutiae sets similarity score fusion

The minutiae sets similarity score SM fusion and the minutiae-related similarity score ScoreM computation are kept also as software tasks. The processing flow is shown below:

BEGIN
1) System initialization {N’, M’, Threshold MP, Threshold K}
2) TRoI - QRoI minutiae similarity analysis
   FOR i = 1 to N’ DO
      FOR j = 1 to M’ DO
         Computation SM[i][j] = SLM[i][j] ⋅ SGM[i][j]
      END FOR
   END FOR
3) Similarity score computation
   K = min(N’,M’);
   FOR i = 1 to N’ DO
      FOR j = 1 to M’ DO
         Evaluation of TRoI-QRoI minutia pairs based on SM[i][j] (MP)
      END FOR
   END FOR
   IF (MP ≥ Threshold MP) AND (K ≥ Threshold K)
      ScoreM = MP/K;
   ELSE
      ScoreM = 0;
   END IF
END

The same implementation is ported to the different embedded platforms evaluated in this work.
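The local structure construction and the score computations above can be illustrated with a short Python sketch. It is only a software illustration under stated assumptions, not the coprocessor implementation: the function names, the angle wrap-around in angle_diff, the use of fixed thresholds instead of distance-dependent tolerance boxes, and the rule that a neighbour pair outside any threshold scores zero are assumptions that the processing flow does not spell out.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles in radians (assumed wrap-around)."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def local_structure(minutiae, i, R, W):
    """Return W slots (d, phi, gamma) with the nearest neighbours of minutia i within radius R.

    minutiae is a list of (x, y, beta) tuples. Empty slots keep d = R + 1,
    mirroring the initialisation step of the processing flow.
    """
    xi, yi, bi = minutiae[i]
    found = []
    for j, (xj, yj, bj) in enumerate(minutiae):
        if j == i:
            continue
        d = math.hypot(xj - xi, yj - yi)
        if d <= R:
            phi = math.atan2(yj - yi, xj - xi) - bi  # neighbour direction relative to beta_i
            gamma = bj - bi                          # relative minutia orientation
            found.append((d, phi, gamma))
    found.sort(key=lambda s: s[0])                   # keep only the W nearest neighbours
    found = found[:W]
    return found + [(R + 1.0, 0.0, 0.0)] * (W - len(found))

def pair_similarity(t, q, th, eta):
    """S_iu_jv for one template neighbour t and one query neighbour q, each (d, phi, gamma)."""
    deltas = (abs(t[0] - q[0]), angle_diff(t[1], q[1]), angle_diff(t[2], q[2]))
    # Assumption: a pair that falls outside any threshold contributes nothing.
    if any(delta >= limit for delta, limit in zip(deltas, th)):
        return 0.0
    return sum(w * (limit - delta) / limit for w, delta, limit in zip(eta, deltas, th))

def local_similarity(t_struct, q_struct, R, th, eta):
    """SLM[i][j]: sum of S_iu_jv over all valid (u, v) neighbour combinations."""
    return sum(pair_similarity(t, q, th, eta)
               for t in t_struct if t[0] <= R
               for q in q_struct if q[0] <= R)

def score_m(mp, n, m, threshold_mp, threshold_k):
    """ScoreM = MP / K when both thresholds are met, otherwise 0, with K = min(N', M')."""
    k = min(n, m)
    return mp / k if mp >= threshold_mp and k >= threshold_k else 0.0
```

Here th and eta are 3-tuples ordered as (distance, φ, γ). Filtering by radius before keeping the W nearest neighbours gives the same slots as the replace-the-farthest strategy of the processing flow, which is why the sketch uses the simpler form.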
5.8. System Integration

SOPC/FPGA devices from three leading vendors, Atmel Corporation, Altera Corporation and Xilinx Inc., have been used in this work in order to evaluate the performance of the proposed embedded system architecture. One SOPC device from Atmel Corporation, based on a general-purpose CPU, on-chip memory and a multi-purpose programmable logic fabric, is evaluated in the development of the first stages of the suggested fingerprint recognition algorithm: the fingerprint acquisition process based on sweeping technology sensors, and the on-the-fly reconstruction of the acquired image. Two other SOPC devices, from Altera Corporation and Xilinx Inc., are used to evaluate the performance that is achieved when the complete automatic fingerprint authentication system is implemented by means of hardware-software codesign techniques.

All three devices have several important features in common: a) they all embed in one single device at least three of the main components of the suggested system architecture: the CPU (implemented either as a hard-core or as a soft-core), a programmable logic region in which to instantiate application-specific functional circuits (used as companion cores or dedicated hardware coprocessors that off-load the system CPU), and some memory resources (program and data memory for the CPU, or data memory for the hardware coprocessors); b) they all support reconfiguration of the programmable logic fabric, which means that all or some regions of the fabric can be modified at run time (technically known as full or partial reconfiguration, depending on whether all or only some regions of the fabric can be reconfigured on-the-fly, respectively); and c) in all three devices the design is done in such a way that any hardware coprocessor instantiated in the logic fabric can generate an interrupt to the system CPU in order to notify when any specific task is done, or
when any specific action on the part of the system CPU is required.

The implementation examples presented in the previous sections of this chapter, covering the tasks that take part in the proposed fingerprint recognition algorithm, are only representative. Depending on the available resources of each SOPC device, the final implementation has been done either as initially suggested, or the original design has been modified in order to fit the functionality to the logic resources of the physical devices used in each of the platforms. The modular design of the different coprocessors makes this task easy. For example, when the device features specific DSP blocks with dedicated hardware multipliers, the vector multiplier method suggested in section 5.4.3 to implement the product of two operands is replaced by those hardware multiplier blocks. Similarly, the width of the data buses and the level of parallelism of each of the coprocessors are tuned to the amount of resources available in each physical device. Where the available resources allow a high level of parallelism, high-bandwidth circuits are synthesized. In devices with fewer available resources, lower levels of parallelism, and consequently longer latency times, are achieved. In the next sections, the system integration results of the AFAS application in each of the development platforms are presented.

5.8.1. Recognition Application: Test Case

In today’s fast-moving world, people have to prove who they are several times a day.
Biometrics and electronics can help with this since (i) personal biometric traits like fingerprints are distinctive, unique and permanent to each individual, and (ii) automatic authentication systems deploy methods to validate the identity claims raised by human beings by comparing biometric samples captured online with genuine templates previously enrolled off-line in the system. The authentication system either accepts the claim or rejects it. Only in case of a positive match between sample and template does the system grant the user access to the privileges of the application. The deployment of accurate, reliable, secure and efficient methods and systems in charge of the personal authentication process has always been a challenge for application and system designers. In this section, the integration of two technologies, namely fingerprint biometrics and hardware-software codesign on programmable logic platforms, is proposed to meet some of the technical challenges posed in the development of AFAS applications.

In order to evaluate the performance of the proposed system architecture, the suggested recognition algorithm is ported to several computational platforms: a) the first platform is a commercial personal computer based on an Intel Core2Duo processor running at 1.83 GHz under the Windows operating system; b) the second type of platform covers several embedded systems, mainly based on low-cost and mid-performance microprocessors in charge of the execution of the recognition algorithm; and c) the third kind of platform refers to the aforementioned embedded systems, extended with programmable logic resources in which made-to-measure hardware coprocessors are synthesized to act as companion chips that accelerate the processing.
Two 8-bit greyscale fingerprint images of size 460×268 pixels, obtained from the same finger, are used as reference in order to compare the performance reached in each of the platforms. One of the images is used as the template fingerprint in the enrolment process, whereas the other emulates the query fingerprint in the authentication process. The set of tasks carried out in each of the stages is indicated in Figure 126, and some of the intermediate results generated along the recognition algorithm are shown in Figure 127.

[Figure 126: the ENROLMENT and AUTHENTICATION stages each list the following tasks: Task 0 Acquisition, Task 1 Segmentation, Task 2 Normalization, Task 3 Enhancement, Task 4 Field Orientation, Task 5 Filtered Orientation, Task 6 Binarization, Task 7 Smoothing, Task 8 Thinning, Task 9 Extraction, Task A Alignment, Task B Matching.]

Figure 126. Set of processing tasks applied to template and query fingerprints along the enrolment and authentication stages.

Test cases a) and b) correspond to purely software platforms in charge of the processing, whereas test case c) refers to hardware-software systems. The usage of FPGA or SOPC devices adds the flexibility of programmable logic, and allows the implementation of the application by means of hardware-software codesign techniques. In order to determine the best partitioning of the application into hardware and software tasks in scenario c), the original algorithm is first ported to the embedded system platform and executed purely by software on the embedded CPU, as in scenario b). The total execution time reached in scenario b) under each of the evaluated platforms is in the range of several tenths of seconds, which clearly does not meet the real-time performance required by the recognition application.
However, from the timing profile, it is possible to identify the time-consuming tasks that constrain the execution time, so they are proposed to be accelerated by porting them to hardware in the FPGA in scenario c). The remaining less expensive tasks, which do not compromise the real-time performance of the application, remain as software tasks to be executed by the system CPU. Moreover, the tasks related to the AFAS application flow control and the supervision of the FPGA reconfiguration process are implemented as software tasks, to be executed by the system CPU in scenario c).

Figure 127. Processing stages involved in the personal recognition algorithm: (1) & (2) fingerprint image acquisition of template (left side) and query (right side) based on sweeping technology sensors, fingerprint image reconstruction from acquired slices, image segmentation and normalization, (3) fingerprint image enhancement based on isotropic filtering, (4) & (5) computation of field orientation and filtered field orientation maps, (6) directional filtering and image binarization, (7) image smoothing, (8) & (9) image thinning, minutiae extraction and minutiae filtering, (A) & (B) template-query feature sets alignment and matching.

The suggested FPGA/SOPC devices make possible the development of multiprocessor systems, with interconnected cores of different nature, able to work together in order to execute the different tasks involved in the recognition application, as summarized in Figure 128.

[Figure 128 diagram: the recognition application tasks are shared among CPU(s) (general-purpose or application-specific processors; hard-cores or soft-cores), HW coprocessor(s) (application-specific processors; soft-cores) and other peripheral(s) (standard processors; hard-cores).]

Figure 128.
System processors in charge of the execution of the application tasks.

The main features of the SOPC devices evaluated in this work are shown in Table 57.

| SOPC Features | Atmel AT94K40AL | Altera EPXA10F1020C1 | Xilinx XC4VLX25 |
|---|---|---|---|
| Device technology | 350 nm | 180 nm | 90 nm |
| Processor | AVR RISC MCU | ARM922T | MicroBlaze |
| Type of CPU core | Hard-core | Hard-core | Soft-core |
| Processor data bus | 8 bits | 32 bits | 32 bits |
| CPU operating frequency | 12.5 MHz | 200 MHz | 100 MHz |
| On-chip single-port / dual-port SRAM | – / 36 Kbytes | 256 Kbytes / 128 Kbytes | – (3) |
| Program SRAM | 20 Kbytes – 32 Kbytes (1) | Fully configurable | Fully configurable |
| Data SRAM | 4 Kbytes – 16 Kbytes (1) | Fully configurable | Fully configurable |
| Cache | No | 8 kB Inst. Cache, [8 kB Data Cache] (2) | 8 kB Inst. Cache (4), 8 kB Data Cache (4) |
| Interrupt controller | Yes | Yes | Yes |
| UART | 2 | Yes | Yes |
| 2-wire Serial Interface | Yes | No | No |
| Timer/Counter | 3 | Yes | Yes |
| Watchdog timer | Yes | Yes | No |
| Real-time Clock | Yes | No | No |
| Hardware multiplier (8-bit × 8-bit) | Yes | No | No |
| DSP block (18-bit × 18-bit multiplier, 48-bit accumulator and integrated adder) | No | No | Yes |
| Floating point unit | No | No | [No] (2) |
| FPGA | Atmel AT40K40 | Apex EP20K1000E | Virtex-4 XC4VLX25 |
| Hardware coprocessors operating frequency | 25 MHz | 48 MHz / 24 MHz | 100 MHz / 50 MHz |
| Logic cells / Logic elements | 2304 | 38400 | 24192 |
| LUTs configuration | 3-input LUT | 4-input LUT | 4-input LUT |
| Registers configuration | 1-bit Register | 1-bit Register | 1-bit Register |
| FPGA LUTs | 2304 (2×3-input LUT) | 38400 (1×4-input LUT) | 21504 (1×4-input LUT) |
| FPGA registers | 2862 | 38400 | 21504 |
| FPGA SRAM bits (RAM blocks) | 18432 | 327680 | 1327104 |
| Device configuration memory bits | 815832 | 8995904 | 7819904 |

Note 1: Configurable on-chip memory. A maximum of 36 kB of Dual-Port SRAM available for program and data.
Note 2: Configurable and available resource that is not used in the design.
Note 3: Based on available FPGA RAM blocks (18 kbits/block).
Note 4: The amount of cache memory to be used is configurable.

Table 57. Technical description of SOPC devices used in this work.

All SOPCs under test are SRAM-based devices, so the functionality instantiated in the programmable logic fabric and the information stored in the on-chip program and data memories are lost once the power supply of the part is removed. Therefore, in addition to the on-chip volatile memory embedded in the SOPC devices, some off-chip memory is also needed in the system. The off-chip system memory can be of two different types: (i) non-volatile (e.g. FLASH), and (ii) volatile (e.g. RAM). The operating system, the application code, the initial content to be downloaded to the programmable logic fabric, and the reconfigurable contexts that are loaded along the processing need to be stored in off-chip non-volatile memory such as FLASH memory. Moreover, non-volatile memory is needed to store the user personalization information, such as the user’s templates, user’s name, bank accounts and other personal data used in a real application. Off-chip RAM memory, normally of type SRAM or SDRAM, is also used as complementary volatile memory for temporary storage in those scenarios where the on-chip memory resources are not enough for the application.

| FPGA Reconfiguration | Atmel AT94K40AL | Altera EPXA10F1020C1 | Xilinx XC4VLX25 |
|---|---|---|---|
| Reconfiguration controller | Cache Logic | Serial | SelectMap / ICAP |
| Reconfiguration mode | Dynamic | Static | Dynamic |
| Reconfiguration grain | Partial (1-bit) | Full (all-FPGA) | Partial (x16 CLB tall) |
| Reconfiguration data bus | 8-bit | 1-bit | 32-bit |
| Reconfiguration clock speed | 18 – 25 MHz | 16 – 50 MHz | 100 MHz |

Table 58. FPGA reconfiguration characteristics of each of the devices under evaluation.
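As a quick cross-check of Table 58, the peak reconfiguration bandwidth of each controller can be estimated as the data bus width multiplied by the maximum configuration clock. The Python sketch below does this arithmetic and, assuming that the whole configuration memory listed in Table 57 is rewritten with no protocol overhead (an idealisation, not a measured figure), derives a lower bound for a full reconfiguration. The dictionary layout and the function names are illustrative.

```python
# Values taken from Table 57 and Table 58:
# (reconfiguration bus width in bits, maximum configuration clock in Hz,
#  device configuration memory in bits)
DEVICES = {
    "Atmel AT94K40AL": (8, 25e6, 815832),
    "Altera EPXA10F1020C1": (1, 50e6, 8995904),
    "Xilinx XC4VLX25": (32, 100e6, 7819904),
}

def peak_bandwidth_bps(bus_bits, clock_hz):
    """Upper bound on configuration throughput: one bus word per clock cycle."""
    return bus_bits * clock_hz

def min_full_reconfig_time_s(bus_bits, clock_hz, config_bits):
    """Idealised lower bound for rewriting the whole configuration memory."""
    return config_bits / peak_bandwidth_bps(bus_bits, clock_hz)

if __name__ == "__main__":
    for name, (bus_bits, clock_hz, config_bits) in DEVICES.items():
        bw = peak_bandwidth_bps(bus_bits, clock_hz)
        t = min_full_reconfig_time_s(bus_bits, clock_hz, config_bits)
        print(f"{name}: {bw / 1e6:.0f} Mbps, full reconfiguration >= {t * 1e3:.2f} ms")
```

Under these assumptions a full Altera bitstream needs roughly 180 ms, whereas the partially reconfigurable devices would normally rewrite only a fraction of their configuration memory.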
All FPGA/SOPC devices under evaluation allow the reconfiguration of the programmable logic fabric. Table 58 shows the main features of the reconfiguration controllers in each of the chips. The Atmel and Xilinx devices allow dynamic partial reconfiguration of the logic fabric, whereas the Altera device only allows full and static reconfiguration. The reconfiguration data bandwidth differs considerably, ranging from up to 50 Mbps for the Altera device and 200 Mbps for the Atmel device to 3.2 Gbps for the Xilinx FPGA.

5.8.2. Atmel FPSLIC Development Platform and Physical Prototype

The FPSLIC device AT94K40AL (Field Programmable System Level Integrated Circuit) incorporates in one single chip one 8-bit hard-core RISC AVR microprocessor, on-chip memory (up to 36 kB of SRAM memory that can be partitioned into program and data memory blocks), some fixed peripherals (three programmable timers, two serial UARTs, one I2C controller, one interrupt controller, one 8-bit hardware multiplier module and two programmable I/O ports driven by the microprocessor), and one programmable logic device of type FPGA that provides up to 40K equivalent gates and dynamic reconfiguration capability. The device is built in monolithic 0.35 µm CMOS technology, and the layout of the chip is shown in Figure 129.

Figure 129. Physical layout of FPSLIC device.

The block diagram of the FPSLIC device is depicted in Figure 130, where the three main blocks (the CPU, the on-chip memory, and the FPGA) can be seen.

Figure 130. Block diagram of FPSLIC device.

One commercial development board from Atmel Corporation, based on the FPSLIC device, is used to design, by means of hardware-software codesign techniques, the first of the stages of any fingerprint-based personal recognition system.
The selected platform, shown in Figure 131, allows a proof-of-concept evaluation of the suggested system architecture in the development of the fingerprint acquisition and on-the-fly reconstruction processes.

Figure 131. Atmel FPSLIC-based development platform.

Apart from the FPSLIC device, one external EEPROM memory is used as non-volatile memory in which to store the design, that is, the hardware circuitry that needs to be instantiated in the FPGA region of the device as well as the AVR program code and the application data used by the application. The content of the EEPROM memory is automatically downloaded to the FPSLIC device after a system reset or a power-up. This non-volatile memory is additionally used to store any configuration parameter or any other application data, such as the user’s template deduced from the acquired fingerprints. Since the evaluation platform does not include any fingerprint sensor, an external fingerprint sensor is connected to the system through one dedicated interface with the FPGA portion of the device.

The block diagram of the fingerprint acquisition system is shown in Figure 132. One application-specific hardware coprocessor is instantiated in the FPGA to interface the system with the fingerprint sensor. This fingerprint slices acquisition controller transfers the acquired slices to two specific on-chip buffers. The fingerprint reconstruction controller, also instantiated in the FPGA, combines the acquired slices as they are captured by the acquisition controller and checks the overlap between them. The microcontroller makes use of this information and builds the reconstructed version of the fingerprint image by storing the non-overlapped slices (previously identified) in the system memory.
The fingerprint acquisition and reconstruction process is therefore implemented by means of hardware-software codesign techniques, with the CPU acting as the master controller and the hardware coprocessors instantiated in the FPGA portion of the device acting as two slave controllers driven by the master CPU. They are in charge of off-loading the master CPU in order to make it possible to reconstruct the acquired images on-the-fly.

[Figure 132 block diagram. Blocks: fingerprint sensor; reconfigurable FPGA; µcontroller; slices acquisition controller (acquisition); image reconstruction controller (reconstruction); slice A and slice B buffers; image storage controller; storage memory.]

Figure 132. Fingerprint acquisition system under FPSLIC-based development platform.

The resultant system is able to acquire and reconstruct any user’s fingerprint on-the-fly, operating at an acquisition rate of 200 slices per second, which is proven to provide reliable and good quality fingerprint impressions. The summary of resources used in the application is shown in Table 59.

| Resource (Task 0: Fingerprint acquisition and image reconstruction) | Used / Available |
|---|---|
| FPGA resources: 1-bit flip-flops | 422 / 2862 |
| FPGA resources: 3-input LUTs | 977 / 4608 |
| FPGA resources: 1-bit RAM | 16384 / 18432 |
| FPGA resources: logic cells | 912 / 2304 |
| Memory resources: data memory (bytes) | 8753 / 16384 |
| Memory resources: program memory (bytes) | 1907 / 20480 |
| Operating frequency: CPU (kHz) | 12500 |
| Operating frequency: FPGA (kHz) | 400 |

Table 59. FPSLIC resource usage in the development of the automatic fingerprint acquisition system.

After evaluation of the suggested architecture, one specific prototype that embeds all the required components is developed, as depicted in Figure 132. One of the UART controllers of the FPSLIC is used to build a communication link between the prototype board and the external world for debugging purposes.
In this way, it is possible to obtain the acquired slices and the reconstructed images along the processing. A photograph of the physical prototype is shown in Figure 133.

Figure 133. FPSLIC-based physical prototype developed in this work.

The block diagram of the developed system is shown in Figure 134.

[Figure 134 block diagram. Blocks: Atmel AT94K40AL system on chip with AVR MCU, interrupt controller, I/O ports, watchdog, timers 1 to 3, hardware multiplier, I2C unit, UART 1 and UART 2, dual-port SRAM (program memory and data memory), configuration controller and FPGA with interrupt lines; external EEPROM memory, RS-232 link, fingerprint sensor and user interface.]

Figure 134. Automatic fingerprint acquisition system developed with Atmel FPSLIC device.

5.8.3. Altera Excalibur EPXA10 Development Platform

The SOPC Excalibur device EPXA10F1020C1 from Altera Corporation embeds in one single chip the following functional blocks:
- one 32-bit ARM922T hard-core RISC processor operating at up to 200 MHz, with Harvard cache architecture and separate and configurable 8 kB instruction and 8 kB data caches;
- one APEX 20KE-like field programmable logic gate array with an equivalent capacity of 1 Mgates (equivalent to the commercial FPGA APEX 20K1000E), 38400 logic elements (each composed of one 4-input LUT, one programmable register, and carry and cascade chains), and 327680 RAM bits;
- a considerable amount of on-chip memory: up to 256 kB of single-port SRAM, and up to 128 kB of dual-port SRAM (accessible by the system CPU and the programmable logic device);
- additional on-chip memory controllers, namely one SDRAM controller and one External Bus Interface (EBI) controller, which enable the access to off-chip memories and other memory-mapped peripherals;
- one programmable logic device (PLD) configuration controller, aimed at reconfiguring the programmable logic device by means of the embedded ARM922T processor; and
- other key peripherals like one interrupt controller, general-purpose timers, universal asynchronous receivers/transmitters, PLLs, and one watchdog timer.

The SOPC device is split into the field programmable logic region and one stripe that holds the rest of the components. The layout of the chip is depicted in Figure 135.

[Figure 135: chip layout showing the stripe (MCU + memory + peripherals) and the FPGA region.]

Figure 135. Physical layout of EXCALIBUR device.

The commercial development board EPXA10 DDR from Altera Corporation is used to develop the AFAS application. Apart from the SOPC device, the development board integrates other components. Among them, the main ones used by the application are the following:
- some external memory: a) up to 128 Mbytes of DDR SDRAM, driven at 2×125 MHz and managed by the on-chip SDRAM controller, in which to store the application data and the fingerprint images handled by the recognition system; and b) up to 32 Mbytes of FLASH memory, connected to the on-chip EBI controller, in which to store the program code to be downloaded into the on-chip SRAM memory at system power up, the different bitstreams to be downloaded into the FPGA at system power up and along the application execution time, as well as the fingerprint templates of the users recorded by the system in the enrolment stages;
- one RS-232 transceiver directly connected to the UART controller of the SOPC device, used as a direct link with a personal computer platform for interfacing the AFAS with the external world; and
- some I/O ports allocated in the programmable logic region, which are used to interconnect the processing platform with an external fingerprint sweeping sensor FingerChip FCD4B14.

The system architecture of the AFAS application is presented in Figure 136.
A first implementation of the recognition algorithm, executed purely by software on the ARM922T microprocessor, is developed in order to identify the compute-intensive tasks that constrain the application execution time. Based on the resulting task timing profile, the partitioning of the application into hardware and software tasks is suggested in a second implementation of the recognition system that makes use of the programmable logic resources available in the FPGA portion of the device.

The SOPC device features a multi-bus architecture. The ARM922T processor is based on the Advanced Microcontroller Bus Architecture (AMBA) specification, specifically on the Advanced High-performance Bus (AHB). Two AHB buses support the efficient connection of the CPU, the programmable logic, on-chip and off-chip memories, and other on-chip and off-chip peripherals. AHB1 is the high-speed bus, considered as the main system bus, able to provide a high-bandwidth interface between the elements that are involved in the majority of transfers. AHB2 is the low-speed bus, with lower bandwidth, where most of the peripheral devices in the system are located. Both buses are 32 bits wide. The PLD is directly connected to AHB2, so specific AHB controllers, one master and one slave, are designed in the PLD in order to allow bidirectional communication between the system CPU and the different coprocessors instantiated in the FPGA. Apart from AHB1 and AHB2, there is one Avalon bus in charge of interfacing the DP-SRAM with the PLD. One Avalon master controller is synthesized in the FPGA for this purpose.
[Figure 136 block diagram. Stripe of the Excalibur EPXA10 system on chip: ARM922T core with instruction and data MMUs and caches, interrupt controller, watchdog timer, timer, UART (RS-232), SDRAM controller with DDR SDRAM (fingerprints), expansion bus interface with FLASH (bitstreams), SP SRAM (program), DP SRAM (data), configuration logic, AHB1 (200 MHz) and AHB2 (100 MHz) buses with the AHB1-AHB2 bridge, Avalon bus, PLD-stripe and stripe-PLD bridges. APEX 20K FPGA: Avalon master, AHB master with input and output FIFOs, AHB slave with control and status registers, AFAS and sensor interfaces to the fingerprint sensor, interrupt lines, and the application-specific hardware coprocessors, clocked at 48 MHz and 24 MHz.]

Figure 136. Automatic fingerprint authentication system developed under Altera EPXA10 evaluation platform.

The system is composed of three different clock domains:
- based on a reference system clock of 50 MHz, one PLL is configured in order to increase the working frequency of the AHB1 bus to 200 MHz and of the AHB2 bus to 100 MHz;
- a second PLL, whose clock source is also the 50 MHz reference system clock, is used to set the working frequency of the DDR SDRAM memory to 2×125 MHz; and
- the PLD runs from another specific clock generator embedded in the board, with a frequency of 48 MHz. This clock is used as the reference clock for the Avalon and AHB master and slave controllers instantiated in the PLD, as well as for the rest of the application-specific coprocessors in charge of the image processing tasks.

Several bridges are present in the SOPC device to decouple the clocks of the various domains, as shown in Figure 136. The application-specific hardware coprocessors instantiated in the FPGA are in charge of processing the fingerprint images, so read and write accesses from the FPGA to on-chip and off-chip memories are needed.
Those accesses, handled by the AHB master controller, consist of 32-bit burst transactions in order to speed up the transfers as much as possible. When transferring data (RD/WR) from the PLD through the AHB bus, transfer rates close to one 32-bit word every two clock cycles (at 48 MHz) have been measured experimentally for the embedded system. Therefore, in order to absorb this bus latency, it is possible to design specific hardware coprocessors in the PLD running in pipeline at half of the working frequency (24 MHz) without losing system speed performance. Some FIFO memories have been placed as interface buffers between the two different clock domains. In this way, it is possible to exploit the inherent latency of the AHB bus to reduce the power consumption of the system (by reducing the system clock frequency) without losing global performance.

The AHB slave controller is in charge of establishing a bidirectional communication link between the system CPU and the PLD through a set of 32 dedicated 32-bit registers instantiated in the PLD. This communication channel is used by the system CPU to configure each of the hardware coprocessors synthesized in the PLD, to send the start command for each of the processing tasks, to read the status flags that report the progress of each hardware task, and to read the results of the processing. Furthermore, several interrupt sources are established in the PLD region so that the different hardware coprocessors instantiated in the PLD can inform the system CPU when one specific task is finished, or when specific actions on the part of the system CPU are needed. In this way, the CPU and the coprocessors are able to run concurrently along the AFAS application. The Avalon master implements an alternative way to write/read data from the FPGA.
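The clock-halving argument for the AHB master can be checked with simple arithmetic. The sketch below takes the measured AHB rate of one 32-bit word every two 48 MHz cycles and additionally assumes a pipelined coprocessor that consumes one word per 24 MHz cycle; the function names are illustrative, and the image size is that of the 460×268, 8-bit test image of section 5.8.1.

```python
def words_per_second(clock_hz, clocks_per_word):
    """Sustained word rate of a link that moves one word every clocks_per_word cycles."""
    return clock_hz / clocks_per_word

AHB_RATE = words_per_second(48e6, 2)    # measured AHB burst rate: 1 word / 2 clocks at 48 MHz
COPRO_RATE = words_per_second(24e6, 1)  # assumed pipeline rate: 1 word / clock at 24 MHz

def image_words(width, height, bits_per_pixel=8, word_bits=32):
    """Number of bus words needed to move one image (rounded up)."""
    return -(-width * height * bits_per_pixel // word_bits)

def transfer_time_s(words, rate):
    """Bus time needed to move the given number of words at the given word rate."""
    return words / rate

if __name__ == "__main__":
    n = image_words(460, 268)
    print(f"AHB: {AHB_RATE / 1e6:.0f} Mwords/s, coprocessor: {COPRO_RATE / 1e6:.0f} Mwords/s")
    print(f"{n} words per image, {transfer_time_s(n, AHB_RATE) * 1e3:.2f} ms per pass")
```

Under these assumptions both rates are 24 Mwords/s (96 MB/s), so the FIFOs only have to decouple the two clock domains, and one pass over a test image takes about 1.3 ms of bus time in each direction.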
The on-chip DP-SRAM memory is accessible from the stripe (through the AHB2 bus) and from the Avalon master controller (through the Avalon bus). In summary, two different masters in the FPGA can access the shared resources of the system: the AHB master controller provides access to all shared resources on the AHB2 bus, and the Avalon master controller provides access to the DP-SRAM memory map only. While one of the masters reads the input images stored in the system memory, the input data can be internally processed in pipeline in the PLD and the output result can be written to memory by the second master controller, which speeds up the execution in some of the stages of the algorithm. DMA operations independent of the system CPU can be carried out along the processing through those master controllers synthesized in the FPGA. The ARM922T processor is the only master controller present on the AHB1 bus. On AHB2, however, three masters are connected: the AHB1-AHB2 bridge controller, the configuration logic controller (responsible for the reconfiguration of the FPGA), and the PLD-to-stripe controller.

The SOPC device is configured at system power-up with the data stored in a non-volatile FLASH memory. This configuration process initializes the system (hardware resources) as well as the application (software) that will be executed by the ARM922T CPU. Moreover, there is a specific interface that enables the system CPU to treat the FPGA as a memory block and configure it by writing to a virtual memory location, which makes it easy for the system CPU to reconfigure the FPGA at any time after power up. Thus, the embedded programmable logic device can be configured on-the-fly with the specific functionality required at any moment along the execution time. Once the FPGA is initialized after power up, it can be reconfigured in-circuit by resetting it and loading new data to the logic device from the system CPU.
Therefore, real-time changes can be made during normal operation. Although the FPGA is held in reset while it is being reconfigured, the rest of the system (CPU, memory controllers, and other peripherals) keeps running, so the AFAS application stays alive. The ARM922T CPU (hard-core) is the main core of the application, in charge of managing the application flow and the reconfiguration of the FPGA, whereas the coprocessors instantiated in the FPGA (soft-cores) play the role of companion or secondary cores in a run-time reconfigurable system.

The application can be split into several FPGA contexts, and each context corresponds to a full PLD bitstream that defines the functional content of the FPGA. Depending on the size of the FPGA and the resources needed by each hardware coprocessor, one, a few, or many hardware coprocessors can be instantiated in the FPGA in one context. The FPGA can be reconfigured as many times as the application requires, depending on the complexity of the recognition algorithm. In this way, the complexity of the recognition algorithm can be increased without penalty on the system resources, except for the size of the non-volatile memory that stores the different bitstreams to be downloaded during the execution of the application. Each time a new context is needed, a reconfiguration of the FPGA takes place. The penalty in execution time is limited to the FPGA reconfiguration latency plus the execution time of the added processing stages.

Since the accesses to the shared system resources (e.g. on-chip and off-chip memory) are proven to be the main bottlenecks that limit the execution time performance of the application, the hardware coprocessors in charge of the processing are designed to reduce the number of accesses (RD/WR) to the shared memory resources.
Each hardware coprocessor receives some input data from the system memory and generates new output data that is stored back in the system memory. Parallelism and pipeline techniques are used to improve the execution time performance.

Although the selected SOPC is an old-technology (0.18 µm) device, it serves as a proof of concept in this work, since it is a good example of a system-on-chip device that combines a general-purpose MPU with a run-time reconfigurable FPGA. The physical implementation of the personal recognition application on the proposed platform proves that the proposed system architecture is able to meet the requirements of high computational performance at low cost.

Table 60 and Table 61 provide the execution time performance of the enrolment and authentication stages in two different scenarios: (i) when the application is executed purely in software by the system CPU, and (ii) when the application is implemented by means of hardware-software codesign techniques with dynamic reconfigurability. The final partitioning of the application into hardware and software tasks is also detailed.
| Task ID | Processing stage | Software-only implementation | Hardware-software implementation |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | 500.000 ms (1) | 500.000 ms |
| Task 1 | Image segmentation | 1083.219 ms | 1.288 ms |
| Task 2 | Image normalization | 178.940 ms | 2.267 ms |
| Task 3 | Image isotropic filtering | 5304.010 ms | 2.179 ms |
| Task 4 | Field orientation | 834.062 ms | 1.288 ms |
| Task 5 | Filtered field orientation | 97.061 ms | 90.234 ms (Sw-only task) |
| – | Reconfiguration (0–5) → 6 | – | 179.918 ms |
| Task 6 | Image directional filtering and binarization | 3792.712 ms | 1.337 ms |
| – | Reconfiguration 6 → (7–9) | – | 179.918 ms |
| Task 7 | Image smoothing | 1536.114 ms | 1.407 ms |
| Task 8 | Image thinning | 1695.930 ms | 1.621 ms |
| Task 9 | Minutiae extraction and minutiae filtering | 76.626 ms | 81.216 ms (Sw-only task) |
| | Total execution time (2) | 14598.674 ms | 542.673 ms |

(1): One application-specific hardware coprocessor is synthesized in the FPGA to interface the processor with the fingerprint sensor.
(2): Task 0 is not included in the computation of the total execution time.

Table 60. Execution time performance reached in the enrolment stage: SW-only versus HW-SW implementations.
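As a consistency check on Table 60, the per-task times can be summed and compared with the reported totals. The short sketch below transcribes the table values (Task 0 excluded, as in the table); the variable names are hypothetical, and the enrolment speed-up it computes is derived here from the table rather than quoted from the text.

```python
import math

# Per-task times in ms transcribed from Table 60 (Task 0 is excluded).
SW_ONLY_MS = {
    "Task 1": 1083.219, "Task 2": 178.940, "Task 3": 5304.010,
    "Task 4": 834.062, "Task 5": 97.061, "Task 6": 3792.712,
    "Task 7": 1536.114, "Task 8": 1695.930, "Task 9": 76.626,
}
HW_SW_MS = {
    "Task 1": 1.288, "Task 2": 2.267, "Task 3": 2.179, "Task 4": 1.288,
    "Task 5": 90.234,                        # remains a software task
    "Reconfiguration (0-5) -> 6": 179.918,
    "Task 6": 1.337,
    "Reconfiguration 6 -> (7-9)": 179.918,
    "Task 7": 1.407, "Task 8": 1.621,
    "Task 9": 81.216,                        # remains a software task
}

total_sw_ms = sum(SW_ONLY_MS.values())
total_hw_sw_ms = sum(HW_SW_MS.values())

# Time spent only on downloading new FPGA contexts.
reconfig_ms = sum(t for name, t in HW_SW_MS.items()
                  if name.startswith("Reconfiguration"))

# Derived figure (not quoted in the text): enrolment speed-up.
speed_up = total_sw_ms / total_hw_sw_ms

print(f"SW-only total: {total_sw_ms:.3f} ms")
print(f"HW-SW total:   {total_hw_sw_ms:.3f} ms (reconfiguration: {reconfig_ms:.3f} ms)")
print(f"speed-up:      x{speed_up:.1f}")
```

The sums reproduce the totals of the table, and they show that the two reconfigurations account for most of the remaining hardware-software execution time.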
| Task ID | Processing stage | Software-only implementation | Hardware-software implementation |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | 500.000 ms (1) | 500.000 ms |
| Task 1 | Image segmentation | 1083.219 ms | 1.288 ms |
| Task 2 | Image normalization | 178.940 ms | 2.267 ms |
| Task 3 | Image isotropic filtering | 5304.010 ms | 2.179 ms |
| Task 4 | Field orientation | 987.089 ms | 1.288 ms |
| Task 5 | Filtered field orientation | 113.959 ms | 105.479 ms (Sw-only task) |
| – | Reconfiguration (0–5) → 6 | – | 179.918 ms |
| Task 6 | Image directional filtering and binarization | 4460.569 ms | 1.337 ms |
| – | Reconfiguration 6 → (7–B) | – | 179.918 ms |
| Task 7 | Image smoothing | 1752.322 ms | 1.407 ms |
| Task 8 | Image thinning | 1767.383 ms | 1.441 ms |
| Task 9 | Minutiae extraction and minutiae filtering | 93.783 ms | 95.395 ms (Sw-only task) |
| Task A | Field orientation maps alignment | 279636.069 ms | 312.074 ms |
| Task B | Minutiae alignment, feature sets matching and authentication decision | 370.712 ms | 71.851 ms |
| | Total execution time (2) | 295748.055 ms | 955.842 ms |

(1): One application-specific hardware coprocessor is synthesized in the FPGA to interface the processor with the fingerprint sensor.
(2): Task 0 is not included in the computation of the total execution time.

Table 61. Execution time performance reached in the authentication stage: SW-only versus HW-SW implementations.

All tasks are ported to hardware with the exception of the filtered field orientation map computation stage and the minutiae extraction and filtering steps, which remain as software tasks in the final partitioning because of their low latency when executed by the system CPU. For the rest of the tasks, fully or partially pipelined hardware coprocessors are implemented. The improvement in execution time achieved in the second scenario with regard to the first is notable.

The complete bitstream that defines the AFAS application and that resides in the non-volatile memory (FLASH) of the system is composed of the following information:
- the definition of Excalibur resources: system CPU, memory map, and other peripherals;
- the initialization of the system's configuration registers;
- the PLD configuration data that describes the hardware content instantiated in the FPGA in each of the contexts required by the application;
- the AFAS application program code to be executed by the system CPU; and
- a bootloader used by the CPU to start up the system (making the necessary memory mappings and the initialization of on-chip peripherals and system registers; transferring the first hardware context into the FPGA; copying the user's program code, i.e. the AFAS application, into the internal SP-SRAM memory block, as well as the program data into the SDRAM memory block) before passing control to the AFAS application program.

The size of the portion of the bitstream that refers to the PLD configuration data (FPGA functional content) of one context is 1124488 bytes. This size is constant and depends only on the size of the selected FPGA.

In the FPGA, two levels of functionality exist:
a) Level 1, which corresponds to the Avalon master, AHB master and AHB slave controllers, used to link the FPGA with the rest of the system resources (CPU, on-chip and off-chip memories, and other peripherals). They operate at a working frequency of 48 MHz in the present application.
b) Level 2, which consists of the specific hardware coprocessors oriented to the execution of the computationally expensive tasks involved in the personal recognition algorithm. All those hardware accelerators are connected to the AHB and Avalon interfaces. They consist of pipelined designs running at clock rates of 24 or 48 MHz.
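The 179.918 ms reconfiguration entries in Table 60 and Table 61 are consistent with the per-context bitstream size given above if the configuration data is shifted into the PLD serially, 1 bit per clock, at 50 MHz (the rate reported for the Excalibur configuration logic in this section). The following sketch reproduces that figure; the function name is hypothetical and the model ignores any protocol overhead.

```python
CONTEXT_BITSTREAM_BYTES = 1_124_488   # PLD configuration data of one context (from the text)
CONFIG_CLOCK_HZ = 50_000_000          # maximum serial configuration rate (from the text)
BITS_PER_CLOCK = 1                    # serial transfer to the PLD configuration logic

def full_reconfiguration_ms(bitstream_bytes: int,
                            clock_hz: int = CONFIG_CLOCK_HZ,
                            bits_per_clock: int = BITS_PER_CLOCK) -> float:
    """Idealized download time of a complete bitstream, in milliseconds."""
    total_bits = bitstream_bytes * 8
    clocks = total_bits / bits_per_clock
    return clocks / clock_hz * 1e3

latency_ms = full_reconfiguration_ms(CONTEXT_BITSTREAM_BYTES)
print(f"full reconfiguration: {latency_ms:.3f} ms per context")
```

Because the bitstream size depends only on the device, this latency is the same for every context regardless of how many coprocessors the context contains.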
| Task ID | Processing stage | 1-bit flip-flop | 4-input LUT | 1-bit RAM | Logic cell |
|---|---|---|---|---|---|
| Tasks 0–4 | Fingerprint acquisition and image reconstruction; image segmentation; image normalization; image isotropic filtering; field orientation | 12166 | 33136 | 171456 | 37889 |
| Task 5 | Filtered field orientation | – | – | – | – |
| Task 6 | Image directional filtering and binarization | 11541 | 37236 | 301824 | 38301 |
| Tasks 7–8 and A–B | Image smoothing; image thinning; field orientation maps alignment; minutiae alignment, feature sets matching and authentication decision | 6659 | 27853 | 253440 | 29327 |
| Task 9 | Minutiae extraction and minutiae filtering | – | – | – | – |
| | Total design resources | 30366 | 98225 | 726720 | 105517 |
| | Total device resources | 38400 | 38400 | 327680 | 38400 |

Table 62. FPGA resources usage in each of the contexts in which the application is partitioned.

The FPGA embedded in the Excalibur device does not allow partial reconfiguration; only reconfiguration of the complete FPGA is possible. Therefore, to reconfigure the FPGA with a new context, the system CPU sequentially transfers the complete configuration data of the new context to a specific register of the on-chip slave configuration controller. The CPU accesses to the configuration register are 32 bits wide; however, the transfer from the configuration register to the PLD configuration logic is serial, 1 bit per clock, at a maximum rate of 50 MHz, which results in a configuration latency of 179.918 ms per context. This is the time penalty (reconfiguration overhead) added to the application every time a new context with a new set of hardware coprocessors is downloaded into the FPGA.

After synthesis and verification of each hardware coprocessor individually, all of them are grouped into FPGA contexts. As many coprocessors as possible have been embedded in each context in order to reduce the number of contexts and therefore the number of FPGA reconfiguration cycles needed in the application. A total of three contexts are required to instantiate the different coprocessors demanded by the application. In the first context (downloaded at system power-up), the FPGA accommodates the hardware accelerators needed in the first stage of the authentication process, corresponding to the fingerprint acquisition and the image enhancement tasks. The second context is downloaded into the FPGA once those stages are completed; it contains the hardware accelerators in charge of the directional filtering and the image binarization processes. Finally, in the third context, the remaining hardware coprocessors, responsible for the feature extraction, alignment and matching stages, are instantiated in the FPGA. Table 62 summarizes the FPGA resources usage in each application context.

The FPGA-based hardware coprocessors are used in conjunction with the rest of the system resources in order to deploy the fingerprint-based personal recognition application. By making use of hardware-software codesign techniques, and taking advantage of the parallelism and pipeline strategies available in programmable logic design, it has been possible to speed up those computational tasks that were constraining the real-time performance of the original algorithm when implemented purely in software on the ARM core of the embedded system platform.

Figure 137. Altera EXCALIBUR-based development platform.

The physical AFAS platform developed in this work is shown in Figure 137. As a summary of the implementation, it can be concluded that it is possible to deploy an application that demands 105517 logic cells and 726720 memory bits in a device that features only 38400 logic cells and 327680 memory bits, at the expense of the reconfiguration overhead.
When proceeding in this way, a speed-up of ×309.4 is reached in the authentication process compared with the execution on the low-cost embedded processor alone, and a speed-up of ×3.4 is achieved compared with the execution of the same algorithm on a personal computer platform (refer to Table 52). The usage of run-time reconfigurable hardware and the deployment of hardware-software codesign techniques are the key elements that make this approach successful.

5.8.4. Xilinx Virtex-4 ML401 Evaluation Platform

The FPGA Virtex-4 XC4VLX25 from Xilinx Inc. is selected as the core device on which to build the AFAS application in this scenario. It is manufactured in 90 nm CMOS technology, and is provided with one ICAP reconfiguration controller that allows dynamic partial reconfiguration.

The selected device provides the following features in one single chip:
- 1 CLB array of 96 rows and 28 columns embedding up to 10752 slices (24192 logic cells);
- 1 ICAP controller that makes it possible to reconfigure part of the physical resources of the device while the rest of the FPGA remains active;
- 48 dedicated DSP slices provided with one 18×18 multiplier, one integrated adder, and one 48-bit accumulator;
- 11 Input/Output Blocks (IOB) that provide the interface between the package pins and the internal configurable logic;
- 72 block RAM modules (BRAM) of dual-port RAM memory with a size of 18 kbits per block that are cascadable to form larger memory blocks; and
- 8 Digital Clock Managers (DCM) and 4 Phase Matched Clock Dividers (PMCD) to manage the clock sources of any design.
Those hardware resources are the basis on which to instantiate any circuit, including descriptions of microprocessor units such as the 32-bit soft-core processor MicroBlaze (based on a RISC architecture with Harvard-style separate 32-bit instruction and data buses), data and instruction cache memories, on-chip program and data memories, standard peripherals such as timers, universal asynchronous receivers/transmitters, floating point units, etc., and other application-specific hardware accelerators. The layout of the FPGA is presented in Figure 138.

Figure 138. Physical layout of VIRTEX-4 XC4VLX25 device.

The commercial development platform ML401 from Xilinx Inc. embeds the suggested FPGA and other physical resources such as additional FLASH and SDRAM memory blocks (which store the bitstreams that define the physical content of the FPGA, as well as the program code and application data handled by the recognition system), one serial RS-232 transceiver (used to establish a communication link with an external host in the application), and some I/O ports to interface the platform with an external fingerprint sensor. The same sweeping-technology fingerprint sensor used in the previous test scenarios is used in this embedded system.

This platform is used to implement the AFAS application. The block diagram of the embedded system is depicted in Figure 139. In the biometric application, the FPGA is partitioned into two regions: one static region and one reconfigurable region. The physical resources available in each region are shown in Table 63.

| Resources | Xilinx XC4VLX25 | Static region | Reconfigurable region |
|---|---|---|---|
| 1-bit flip-flop | 21504 | 10240 | 11264 |
| 4-input LUT | 21504 | 10240 | 11264 |
| 1-bit RAM | 1327104 | 921600 | 405504 |
| DSP block | 48 | 4 | 44 |

Table 63. Spatial partitioning of the programmable logic device into one static and one reconfigurable region.
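The figures in Table 63 can be cross-checked against the device features listed for the XC4VLX25. The sketch below verifies that the static and reconfigurable regions add up to the device totals, and that the totals agree with the quoted slice and block RAM counts. It assumes two 4-input LUTs and two flip-flops per Virtex-4 slice and 1 kbit = 1024 bits; neither is stated in the text, and the dictionary names are hypothetical.

```python
# Resource counts transcribed from Table 63.
DEVICE = {"flip_flops": 21504, "luts": 21504, "ram_bits": 1327104, "dsp": 48}
STATIC = {"flip_flops": 10240, "luts": 10240, "ram_bits": 921600, "dsp": 4}
RECONFIGURABLE = {"flip_flops": 11264, "luts": 11264, "ram_bits": 405504, "dsp": 44}

# The two regions should partition the device with nothing left over.
partition_is_exact = all(
    STATIC[key] + RECONFIGURABLE[key] == DEVICE[key] for key in DEVICE
)

# Device features quoted in the text.
SLICES = 10752
BRAM_BLOCKS = 72
BRAM_KBITS_PER_BLOCK = 18

# Assumptions: 2 LUTs and 2 flip-flops per slice, 1 kbit = 1024 bits.
luts_from_slices = SLICES * 2
flip_flops_from_slices = SLICES * 2
ram_bits_from_bram = BRAM_BLOCKS * BRAM_KBITS_PER_BLOCK * 1024

# Share of each resource that is left available for run-time reuse.
reconfigurable_share = {key: RECONFIGURABLE[key] / DEVICE[key] for key in DEVICE}

print("partition exact:", partition_is_exact)
for key, share in reconfigurable_share.items():
    print(f"{key}: {share:.1%} of the device is reconfigurable")
```

Under these assumptions the partition is exact: roughly half of the logic, less than a third of the block RAM and almost all DSP blocks are assigned to the reconfigurable region.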
Figure 139. Automatic fingerprint authentication system developed under Xilinx ML401 evaluation platform.

The static region holds the components that are permanently present during the execution of the application: one 32-bit MicroBlaze soft-core processor acting as master CPU, data and instruction caches, local memory, memory controllers to access external memory devices, one dedicated reconfiguration controller acting as master of the ICAP block (in charge of the dynamic reconfiguration of the partial reconfigurable region), some made-to-measure hardware coprocessors in charge of establishing a bidirectional link between the coprocessors instantiated in the partial reconfigurable region and the rest of the system, and other standard peripherals such as one interrupt controller, one timer unit, one UART unit, general-purpose input/output ports, etc.

In the partial reconfigurable region (PRR) of the FPGA, application-specific coprocessors are instantiated as soon as they are required during the execution of the application. Those dynamic hardware coprocessors are present only when they are needed; thus, the same hardware resources available in the PRR are reused to instantiate different circuits in the application.
The layout of the physical partitioning is depicted in Figure 140. In this new approach, the master CPU takes care of the fingerprint acquisition and reconstruction processes in both enrolment and authentication stages. The master CPU operates at a maximum frequency of 100 MHz, and the hardware coprocessors operate at either 100 MHz or 50 MHz.

As shown in Figure 139, the suggested platform is provided with two types of non-volatile memory:
a) The Platform FLASH memory block (4 Mbytes) stores the initial bitstream that defines the configuration of the FPGA upon power-up. This initial configuration is composed of the hardware content of the static region (master CPU, memory controllers and other peripherals), and one bootloader that is executed by the master CPU and is in charge of initializing the system. The reconfigurable region of the FPGA remains blank after power-up. The transfer of the initial bitstream from the Platform FLASH to the internal configuration memory of the FPGA is done automatically during power-up through a dedicated SelectMAP interface present in the FPGA.
b) The Linear FLASH memory block (8 Mbytes) contains the definition of the reconfigurable hardware coprocessors to be instantiated in the reconfigurable region of the FPGA during the execution of the application, as well as the AFAS program code to be executed by the master CPU. Moreover, the Linear FLASH is used in the AFAS application to store the templates of the genuine users registered in the system in the enrolment stage.

The reconfiguration of the PRR is carried out by one dedicated reconfiguration controller instantiated in the static region together with the hard-core ICAP controller inherent to the FPGA.

Figure 140. VIRTEX-4 XC4VLX25 FPGA floorplan. Partitioning between static (black area) and partially reconfigurable (grey area) regions suggested in this work.

During the power-up sequence, the bootloader is in charge of initializing the different controllers instantiated in the static region of the FPGA and transferring to the SDRAM memory block (64 Mbytes) the content of the Linear FLASH, that is, the AFAS program code and the partial bitstreams to be downloaded into the PRR during the AFAS application. In this way, the SDRAM memory acts as program and data memory in the application and can be accessed either by the CPU through the PLB bus or by the MMU master controller through a dedicated NPI bus. Once all the information is properly transferred to the SDRAM memory, the bootloader hands control to the AFAS application, which starts under the control of the system CPU.

A multi-bus system architecture interconnects the different processing blocks. Two made-to-measure memory management units (MMU master and MMU slave) are instantiated in the static region to interface the master CPU and the rest of the controllers in the static region with the reconfigurable coprocessors instantiated in the PRR. The interface between the static and reconfigurable regions is built through specific Bus Macros (BM) and some bidirectional FIFO memories intended for the fast exchange of large amounts of data. Moreover, some registers are instantiated in the static region so that the master CPU can configure the static and reconfigurable hardware coprocessors, and control and monitor the processing flow. The interface between the MMU master and the PRR reconfiguration controller present in the static region is also implemented through a dedicated FIFO memory, as depicted in Figure 139.
The reconfiguration controller is in charge of reading the partial bitstreams previously saved in the SDRAM memory block and transferring them to the ICAP, which subsequently configures the reconfigurable region of the FPGA with the new functional content defined by each bitstream. Another FIFO memory block is instantiated in the static region; it acts as a temporary buffer for the information that needs to be shared between different contexts of the PRR. Before a new context is configured in the PRR, the parameters that have to be used in the next contexts are saved in that FIFO. After the reconfiguration process, the content of the dedicated FIFO is transferred back to the PRR so that the new coprocessors can make use of that information.

An RS-232 transceiver is also connected to the UART present in the embedded system in order to build a serial communication link between the recognition module and the host or high-level application that makes use of the personal recognition result.

Following the same procedure as in the previous test scenarios, the application is partitioned into hardware and software tasks, and only those tasks demanding high computational power are ported to hardware. The interface between the master CPU and the application-specific hardware coprocessors instantiated in the FPGA, either in the static or in the reconfigurable region, is provided with some interrupt lines so that any of those hardware coprocessors can notify the master CPU of the end of the processing task executed in hardware.

In order to reduce the reconfiguration time of the PRR, the size of the reconfigurable region has been minimized as much as possible. A specific reconfiguration controller is instantiated in the static region of the FPGA in order to allow fast reconfiguration without adding to the CPU load. The CPU is only responsible for indicating to the reconfiguration controller the specific partial bitstream that has to be downloaded; once this is defined, the reconfiguration controller carries out the reconfiguration process without any further action on the part of the master CPU. Once the reconfiguration is done, the reconfiguration controller notifies the CPU of the end of the task, and the master CPU continues driving the AFAS application program flow.

| Task ID | Processing stage | Software-only implementation (1) | Hardware-software implementation |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | 500.000 ms | 500.000 ms (Sw-only task) |
| Task 1 | Image segmentation | 232.046 ms | 0.672 ms |
| – | Reconfiguration 1 → 2 | – | 0.841 ms |
| Task 2 | Image normalization | 33.087 ms | 0.850 ms |
| – | Reconfiguration 2 → 3 | – | 1.045 ms |
| Task 3 | Image isotropic filtering | 512.171 ms | 2.563 ms |
| – | Reconfiguration 3 → 4 | – | 1.025 ms |
| Task 4 | Field orientation | 285.485 ms | 0.669 ms |
| – | Reconfiguration 4 → 5 | – | 1.046 ms |
| Task 5 | Filtered field orientation | 19.143 ms | 0.419 ms |
| – | Reconfiguration 5 → 6 | – | 1.107 ms |
| Task 6 | Image directional filtering and binarization | 656.043 ms | 2.465 ms |
| – | Reconfiguration 6 → 7 | – | 1.045 ms |
| Task 7 | Image smoothing | 253.553 ms | 0.447 ms |
| – | Reconfiguration 7 → 8 | – | 0.974 ms |
| Task 8 | Image thinning | 416.316 ms | 0.902 ms |
| – | Reconfiguration 8 → 9 | – | 0.943 ms |
| Task 9 | Minutiae extraction and minutiae filtering | 25.699 ms | 4.919 ms |
| | Total execution time (2) | 2433.543 ms | 21.932 ms |

(1): The execution times are slightly higher than in the scenario shown in the Embedded System 3 of Table 51 because of the reduction of the size of the cache memories in this new scenario in order to allocate additional memories in the hardware coprocessors (now only 8 kB of instruction and data caches are instantiated in the MicroBlaze interface, instead of the initial 32 kB instruction cache + 64 kB data cache).
(2): Task 0 is not included in the computation of the total execution time.

Table 64. Execution time performance reached in the enrolment stage: SW-only versus HW-SW implementations.

| Task ID | Processing stage | Software-only implementation (1) | Hardware-software implementation |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | 500.000 ms | 500.000 ms (Sw-only task) |
| Task 1 | Image segmentation | 232.046 ms | 0.672 ms |
| – | Reconfiguration 1 → 2 | – | 0.841 ms |
| Task 2 | Image normalization | 33.087 ms | 0.850 ms |
| – | Reconfiguration 2 → 3 | – | 1.045 ms |
| Task 3 | Image isotropic filtering | 512.171 ms | 2.563 ms |
| – | Reconfiguration 3 → 4 | – | 1.025 ms |
| Task 4 | Field orientation | 337.419 ms | 0.669 ms |
| – | Reconfiguration 4 → 5 | – | 1.046 ms |
| Task 5 | Filtered field orientation | 22.178 ms | 0.419 ms |
| – | Reconfiguration 5 → 6 | – | 1.107 ms |
| Task 6 | Image directional filtering and binarization | 774.750 ms | 2.465 ms |
| – | Reconfiguration 6 → 7 | – | 1.045 ms |
| Task 7 | Image smoothing | 287.507 ms | 0.447 ms |
| – | Reconfiguration 7 → 8 | – | 0.974 ms |
| Task 8 | Image thinning | 417.350 ms | 0.820 ms |
| – | Reconfiguration 8 → 9 | – | 0.943 ms |
| Task 9 | Minutiae extraction and minutiae filtering | 32.497 ms | 7.606 ms |
| – | Reconfiguration 9 → A | – | 1.045 ms |
| Task A | Field orientation maps alignment | 139935.838 ms | 157.671 ms |
| – | Reconfiguration A → B | – | 1.035 ms |
| Task B | Minutiae alignment, feature sets matching and authentication decision | 108.608 ms | 20.737 ms |
| | Total execution time (2) | 142693.451 ms | 205.025 ms |

(1): The execution times are slightly higher than in the scenario shown in the Embedded System 3 of Table 52 because of the reduction of the size of the cache memories in this new scenario in order to allocate additional memories in the hardware coprocessors (now only 8 kB of instruction and data caches are instantiated in the MicroBlaze interface, instead of the initial 32 kB instruction cache + 64 kB data cache).
(2): Task 0 is not included in the computation of the total execution time.

Table 65. Execution time performance reached in the authentication stage: SW-only versus HW-SW implementations.
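One way to read the reconfiguration rows of Table 65 is as a share of the total authentication time, set against the same share for the Excalibur platform (Table 61). The sketch below computes both from the table values; the function and variable names are hypothetical, and the percentages are derived here rather than quoted from the text.

```python
# Reconfiguration times (ms) and totals transcribed from Table 65 (Virtex-4)
# and Table 61 (Excalibur), authentication stage, Task 0 excluded.
VIRTEX4_RECONFIG_MS = [0.841, 1.045, 1.025, 1.046, 1.107,
                       1.045, 0.974, 0.943, 1.045, 1.035]
VIRTEX4_TOTAL_MS = 205.025

EXCALIBUR_RECONFIG_MS = [179.918, 179.918]
EXCALIBUR_TOTAL_MS = 955.842

def overhead_share(reconfig_ms, total_ms):
    """Fraction of the total execution time spent on reconfiguration."""
    return sum(reconfig_ms) / total_ms

virtex4_overhead_ms = sum(VIRTEX4_RECONFIG_MS)
virtex4_mean_ms = virtex4_overhead_ms / len(VIRTEX4_RECONFIG_MS)
virtex4_share = overhead_share(VIRTEX4_RECONFIG_MS, VIRTEX4_TOTAL_MS)
excalibur_share = overhead_share(EXCALIBUR_RECONFIG_MS, EXCALIBUR_TOTAL_MS)

print(f"Virtex-4:  {virtex4_overhead_ms:.3f} ms in {len(VIRTEX4_RECONFIG_MS)} "
      f"reconfigurations (mean {virtex4_mean_ms:.3f} ms), {virtex4_share:.1%} of total")
print(f"Excalibur: {sum(EXCALIBUR_RECONFIG_MS):.3f} ms in "
      f"{len(EXCALIBUR_RECONFIG_MS)} reconfigurations, {excalibur_share:.1%} of total")
```

Although the Virtex-4 design reconfigures five times as often, the overhead drops from more than a third of the authentication time to about five percent, which is what makes a one-task-per-context partitioning viable.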
Table 64 and Table 65 provide the execution time performance of the application in both enrolment and authentication stages in two different scenarios: (i) when the application is executed purely in software by the system CPU, and (ii) when the application is implemented by means of hardware-software codesign techniques making use of the dynamic reconfigurability of the suggested FPGA. The final partitioning of the application into hardware and software tasks is also detailed. In the latter scenario, all tasks are ported to hardware except the fingerprint acquisition and reconstruction process, which is kept as a software task executed by the CPU. A total of 9 and 11 contexts are used to instantiate all the hardware accelerators needed to execute the different tasks involved in the enrolment and authentication processes, respectively.

In the suggested embedded system platform, the reconfiguration process, although monitored by the master CPU, is handled by a dedicated reconfiguration coprocessor specifically designed to drive the ICAP module of the device. This results in a notable improvement of the reconfiguration bandwidth compared with the reconfiguration process in the previous embedded system based on the Excalibur SOPC device. The reconfiguration time of the partial reconfigurable region is not constant; it depends on the size of the reconfigurable area and on the complexity or amount of resources used by the hardware coprocessors to be instantiated in each context. Moreover, the reconfiguration process is done at the maximum throughput allowed by Virtex-4 technology, which is 3.2 Gbps, since the reconfiguration datapath is parallel (32 bits) and the maximum allowable device reconfiguration frequency is 100 MHz.
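The 3.2 Gbps figure follows directly from the ICAP datapath width and clock, and the same relation gives an idealized reconfiguration time for a partial bitstream of a given size. The sketch below is illustrative only: the 400000-byte bitstream is a hypothetical size chosen for the example (the partial bitstream sizes are not given in the text), and the model assumes the controller sustains one 32-bit word per clock with no stalls.

```python
ICAP_WIDTH_BITS = 32          # parallel reconfiguration datapath (from the text)
ICAP_CLOCK_HZ = 100_000_000   # maximum reconfiguration frequency (from the text)

# Peak reconfiguration throughput in bits per second.
throughput_bps = ICAP_WIDTH_BITS * ICAP_CLOCK_HZ

def reconfiguration_time_ms(bitstream_bytes: int) -> float:
    """Idealized time to stream a partial bitstream through the ICAP."""
    return bitstream_bytes * 8 / throughput_bps * 1e3

def bitstream_bytes_for(time_ms: float) -> float:
    """Bitstream size that the ICAP can load in the given time at peak rate."""
    return time_ms * 1e-3 * throughput_bps / 8

# Hypothetical partial bitstream size, used only to illustrate the relation.
EXAMPLE_BITSTREAM_BYTES = 400_000
example_time_ms = reconfiguration_time_ms(EXAMPLE_BITSTREAM_BYTES)

print(f"peak throughput: {throughput_bps / 1e9:.1f} Gbps")
print(f"{EXAMPLE_BITSTREAM_BYTES} bytes -> {example_time_ms:.3f} ms")
```

Read in the other direction, `bitstream_bytes_for` gives the largest partial bitstream that fits a given time budget at the peak rate, which is why minimizing the reconfigurable region directly shortens each context switch.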
By means of this dedicated hardware reconfiguration controller, developed from scratch and described in VHDL, it has been possible to reach the maximum reconfiguration throughput in the proposed AFAS application. The time needed to reconfigure the content of the reconfigurable region in the presented platform is about 1 ms on average. The total reconfiguration overhead is therefore much lower than in the previous scenario (8.026 ms in the enrolment stage and 10.106 ms in the authentication stage, versus 359.836 ms in both enrolment and authentication stages in the previous embedded system).

| Task ID | Processing stage | 1-bit flip-flop | 4-input LUT | 1-bit RAM | DSP block |
|---|---|---|---|---|---|
| – | Application flow (static design) | 7005 | 8888 | 755712 | 4 |
| Task 0 | Fingerprint acquisition and image reconstruction | – | – | – | – |
| Task 1 | Image segmentation | 4978 | 4612 | 147456 | 20 |
| Task 2 | Image normalization | 371 | 334 | 0 | 8 |
| Task 3 | Image isotropic filtering | 5275 | 5831 | 92160 | 28 |
| Task 4 | Field orientation | 3339 | 3166 | 92160 | 8 |
| Task 5 | Filtered field orientation | 2857 | 2983 | 129024 | 0 |
| Task 6 | Image directional filtering and binarization | 5462 | 4166 | 313344 | 29 |
| Task 7 | Image smoothing | 4892 | 3265 | 147456 | 0 |
| Task 8 | Image thinning | 1013 | 2821 | 239616 | 0 |
| Task 9 | Minutiae extraction and minutiae filtering | 487 | 3379 | 55296 | 0 |
| Task A | Field orientation maps alignment | 2632 | 8943 | 387072 | 0 |
| Task B | Minutiae alignment, feature sets matching and authentication decision | 642 | 4379 | 258048 | 5 |
| | Total design resources | 38953 | 52767 | 2617344 | 102 |
| | Total device resources | 21504 | 21504 | 1327104 | 48 |

Table 66. FPGA resources usage in each of the contexts in which the application is partitioned.

The FPGA resource usage in both static and reconfigurable regions is shown in Table 66. The physical resources needed to implement each of the hardware coprocessors are also detailed.
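The resource reuse shown in Table 66 can be made explicit by checking each coprocessor against the capacity of the reconfigurable region given in Table 63. The sketch below does this with the table values; the data layout and function names are hypothetical, and the check compares raw resource counts only, so it says nothing about placement or routing.

```python
RESOURCES = ("flip_flops", "luts", "ram_bits", "dsp")

# Region capacities from Table 63 and device totals.
STATIC_REGION = (10240, 10240, 921600, 4)
RECONFIGURABLE_REGION = (11264, 11264, 405504, 44)
DEVICE = (21504, 21504, 1327104, 48)

# Resource usage from Table 66 (Task 0 runs in software and uses none).
STATIC_DESIGN = (7005, 8888, 755712, 4)
COPROCESSORS = {
    "Task 1": (4978, 4612, 147456, 20),
    "Task 2": (371, 334, 0, 8),
    "Task 3": (5275, 5831, 92160, 28),
    "Task 4": (3339, 3166, 92160, 8),
    "Task 5": (2857, 2983, 129024, 0),
    "Task 6": (5462, 4166, 313344, 29),
    "Task 7": (4892, 3265, 147456, 0),
    "Task 8": (1013, 2821, 239616, 0),
    "Task 9": (487, 3379, 55296, 0),
    "Task A": (2632, 8943, 387072, 0),
    "Task B": (642, 4379, 258048, 5),
}

def fits(usage, capacity):
    """True when every resource count is within the available capacity."""
    return all(u <= c for u, c in zip(usage, capacity))

# Each context must fit in the reconfigurable region on its own.
every_context_fits = all(fits(u, RECONFIGURABLE_REGION) for u in COPROCESSORS.values())
static_fits = fits(STATIC_DESIGN, STATIC_REGION)

# Resources a non-reconfigurable FPGA would need to hold everything at once.
total_design = tuple(
    STATIC_DESIGN[i] + sum(u[i] for u in COPROCESSORS.values())
    for i in range(len(RESOURCES))
)
exceeds_device = tuple(t > d for t, d in zip(total_design, DEVICE))

print("every context fits in the PRR:", every_context_fits)
print("static design fits in the static region:", static_fits)
for name, total, device in zip(RESOURCES, total_design, DEVICE):
    print(f"{name}: design needs {total}, device offers {device}")
```

With these figures every single context fits in the reconfigurable region, while the sum of all contexts exceeds the device in every resource class, so the full design could not be instantiated statically in this device.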
The reconfigurability of the FPGA makes it possible to reduce the amount of resources needed in the programmable logic device compared with a non-reconfigurable FPGA, where all coprocessors would have to be instantiated permanently in a static way. The reuse of hardware resources lowers the cost of the system at the expense of the reconfiguration overhead. Thanks to the fact that (i) the reconfiguration bandwidth is higher than in the Excalibur embedded system scenario, (ii) most of the tasks are ported to hardware, and (iii) the hardware coprocessors run at higher operating frequencies (100 MHz and 50 MHz instead of 48 MHz and 24 MHz, respectively), the total authentication execution time is 205.025 ms in this scenario. This leads to a speed-up of ×695.98 compared with the purely software implementation of the recognition algorithm on the embedded system platform, and a speed-up of ×15.97 with regard to the execution time performance of the HPC platform presented in section 5.2.3. A picture of the physical ML401 platform used in this work is shown in Figure 141.

Figure 141. Xilinx VIRTEX4-based development platform.

5.9. Conclusions

The development of automatic biometrics-based personal recognition systems is a reality in the current technological age. The reliability of biometrics is clearly surpassing that of other personal recognition systems that rely on physical tokens (keys, ID cards, etc.) or on the user's knowledge of specific information (PINs, passwords, answers to security questions, etc.).
Not only applications demanding stringent security levels but also many everyday consumer applications call for high-performance computational platforms able to recognize the identity of an individual from the analysis of his/her physiological or behavioural characteristics. Market trends show a clear spread of embedded biometric security systems in everyday applications such as access control, e-commerce, e-passport, e-health, smart ID cards, border control, mobile phones, ATM transactions, consumer electronics, etc. Among the different biometric features (iris, retina, face, hand geometry, fingerprints, voice, gait, signature, etc.) and the different biometric applications (identification or authentication based on either multimodal or monomodal features), this work focuses on fingerprint-based embedded authentication systems.

Despite the continuous advances in the field of biometrics and the maturity of recognition technologies, the state of the art points out two main open problems in the implementation of such automatic applications: on the one hand, the need to improve the reliability of existing recognition systems in terms of accuracy, security and real-time performance; and on the other hand, the need to reduce the cost of the physical platforms in charge of the processing. This work addresses some of the limitations of current systems and aims at finding a suitable system architecture to develop this kind of high-performance application at low cost. Given a fingerprint-based human recognition algorithm, its execution time is evaluated on different processing platforms frequently used in DSP applications.
The evaluation shows that classical systems based on purely software platforms running on powerful microprocessing units, such as personal computers, are able to provide the required real-time performance at the expense of high system cost. Because of that, existing solutions based on expensive multiprocessor systems like HPC, GPU, or PC platforms are discarded in this work. Other alternatives based on embedded systems with mid- or low-performance CPUs meet the cost requirements, but they present huge limitations in terms of real-time performance and are therefore also discarded.

The usage of made-to-measure hardware accelerators as companion chips in the aforementioned embedded systems, either as ASIC devices or as programmable VLSI instances on FPGAs, has already been approached in the scientific literature [Schaumont et al., 2005]. Although this solution is proven to accelerate the processing by exploiting the parallelism inherent to hardware design, some limitations exist because of the usage of static hardware. Once the ASIC device is built, or the hardware processors are instantiated in the FPGA, the functionality of those devices does not change during the execution of the application. Therefore, when the complexity or the number of hardware coprocessors needed in the application increases, so do the required ASIC/FPGA hardware resources and the size of the chips, which impacts the cost of the system.
For this reason, the static implementation of multiple hardware accelerators in ASIC/FPGA devices, acting as companion chips of the system microprocessors, has also been discarded in this work. Instead, a system architecture based on flexible and run-time reconfigurable hardware is proposed, using programmable logic in the form of FPGA or SOPC devices. Although dynamic partial reconfiguration of FPGAs has already been used in other technical fields such as cryptography or software-defined radio [Delahaye et al., 2004], to the best of the author’s knowledge, this is the first work that exploits hardware-software co-design techniques and dynamic reconfiguration of FPGA devices in the field of biometrics to implement an AFAS application.

The programmability and run-time reconfigurability of FPGA devices, together with the inherent parallelism of hardware design, provide the flexibility needed to develop made-to-measure coprocessors in charge of accelerating the time-critical computational tasks. Scheduling the recognition application as a series of mutually exclusive tasks, and reusing the functional resources available in the FPGA by multiplexing different coprocessors in the same area during the execution of the application, allows reducing the size of the device, and therefore its cost, at the expense of the reconfiguration overhead. The penalty of this approach is split into (i) the reconfiguration latency, defined as the time that elapses between the request for new circuitry to be loaded onto the FPGA and the point at which this circuitry is ready for use, and (ii) the extra non-volatile memory needed to store the full or partial reconfiguration bitstreams corresponding to the different contexts that are time-multiplexed in the FPGA.
The hardware-software co-design of an automatic fingerprint-based authentication system under different run-time reconfigurable platforms is presented as proof of concept of the suggested architecture. Two different implementations of the same application have been carried out in this work. In the first embedded system platform, the reconfiguration process is managed by the system CPU (as a software task), whereas in the second one, a dedicated hardware coprocessor takes care of the reconfiguration process (hardware task).

In the first embedded system platform, the SOPC device only allows full reconfiguration of the programmable logic fabric, and the reconfiguration overhead is constrained by the limited configuration bandwidth and by the fact that the reconfiguration process can only be performed by the system CPU. Therefore, in order to minimize the execution time, as many coprocessors as possible are instantiated in each of the FPGA contexts so that the number of reconfigurations needed in the application is kept to a minimum.

In the second embedded system platform, however, the SOPC device allows partial reconfiguration, and the reconfiguration process can be managed by hardware through a made-to-measure reconfiguration controller. Once the size of the reconfigurable region is fixed in the design and the reconfiguration controller is built, the reconfiguration latency is limited by the complexity of the circuit instantiated in the PRR in each of the contexts. As many reconfigurations as needed are performed in this scenario, since the reconfiguration penalty is much lower than in the previous one. A smaller FPGA is used to implement the whole application, so the system cost is reduced with regard to the previous scenario.
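The trade-off between both partitioning strategies can be illustrated with a very simple model, in which every context after the first one costs one reconfiguration. The following Python sketch is an assumption-laden illustration rather than the scheduler used in this work: the function `reconfiguration_overhead` and the context lists are hypothetical, the 179.918 ms latency is the per-reconfiguration value reported for the full reconfiguration platform, and 1 ms is the approximate average reported for the partial reconfiguration platform.

```python
def reconfiguration_overhead(contexts, latency_ms):
    """Overhead of loading every context after the first one exactly once."""
    return (len(contexts) - 1) * latency_ms


# Full reconfiguration managed by the CPU: few, densely packed contexts.
full_contexts = [
    ["0", "1", "2", "3", "4", "5"],
    ["6"],
    ["7", "8", "9", "A", "B"],
]
full_overhead_ms = reconfiguration_overhead(full_contexts, 179.918)

# Partial reconfiguration managed by hardware: one coprocessor per context
# for Tasks 1..B, at roughly 1 ms per reconfiguration (approximate average).
partial_contexts = [[task] for task in "123456789AB"]
partial_overhead_ms = reconfiguration_overhead(partial_contexts, 1.0)

print(f"full reconfiguration   : {len(full_contexts) - 1} loads, "
      f"{full_overhead_ms:.3f} ms")
print(f"partial reconfiguration: {len(partial_contexts) - 1} loads, "
      f"about {partial_overhead_ms:.0f} ms")
```

With a high latency per reconfiguration, grouping tasks into few contexts is the only way to bound the overhead; with a latency of about 1 ms, ten reconfigurations still cost far less than a single full reconfiguration.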
Although both implementations feature different reconfiguration overheads, and different levels of parallelism are implemented in some of the hardware coprocessors depending on the hardware resources available in each device, the proposed system architecture reaches in both scenarios better execution times than purely software implementations under high-bandwidth processors. The results achieved in this work, summarized in Table 67, pave the way for the implementation of biometric applications by means of hardware-software codesign techniques under run-time reconfigurable FPGAs.

Personal Computer Platform (Software-only Implementation)

| Task ID | Processing Stage | Imp. | Latency |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | SW | 500.000 ms |
| Task 1 | Image segmentation | SW | 2.810 ms |
| Task 2 | Image normalization | SW | 0.470 ms |
| Task 3 | Image isotropic filtering | SW | 7.030 ms |
| Task 4 | Field orientation | SW | 2.500 ms |
| Task 5 | Filtered field orientation | SW | 0.620 ms |
| Task 6 | Image directional filtering and binarization | SW | 15.940 ms |
| Task 7 | Image smoothing | SW | 14.220 ms |
| Task 8 | Image thinning | SW | 1.410 ms |
| Task 9 | Minutiae extraction and minutiae filtering | SW | 0.630 ms |
| Task A | Field orientation maps alignment | SW | 3224.530 ms |
| Task B | Minutiae alignment, feature sets matching and authentication decision | SW | 4.220 ms |
| Tasks 0-B | Total Execution Time (1) | | 3274.380 ms |

Embedded System Platform 1 (Dynamic Full Reconfiguration)

| Task ID | Processing Stage | Imp. | Latency |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | HW-SW | 500.000 ms |
| Task 1 | Image segmentation | HW-SW | 1.288 ms |
| Task 2 | Image normalization | HW-SW | 2.267 ms |
| Task 3 | Image isotropic filtering | HW-SW | 2.179 ms |
| Task 4 | Field orientation | HW-SW | 1.288 ms |
| Task 5 | Filtered field orientation | SW | 105.479 ms |
| – | Reconfiguration (0–5) → 6 | SW | 179.918 ms |
| Task 6 | Image directional filtering and binarization | HW-SW | 1.337 ms |
| – | Reconfiguration 6 → (7–B) | SW | 179.918 ms |
| Task 7 | Image smoothing | HW-SW | 1.407 ms |
| Task 8 | Image thinning | HW-SW | 1.441 ms |
| Task 9 | Minutiae extraction and minutiae filtering | SW | 95.395 ms |
| Task A | Field orientation maps alignment | HW-SW | 312.074 ms |
| Task B | Minutiae alignment, feature sets matching and authentication decision | HW-SW | 71.851 ms |
| Tasks 0-B | Total Execution Time (1) | | 955.842 ms |

Embedded System Platform 3 (Dynamic Partial Reconfiguration)

| Task ID | Processing Stage | Imp. | Latency |
|---|---|---|---|
| Task 0 | Fingerprint acquisition and image reconstruction | SW | 500.000 ms |
| Task 1 | Image segmentation | HW-SW | 0.672 ms |
| – | Reconfiguration 1 → 2 | HW-SW | 0.841 ms |
| Task 2 | Image normalization | HW-SW | 0.850 ms |
| – | Reconfiguration 2 → 3 | HW-SW | 1.045 ms |
| Task 3 | Image isotropic filtering | HW-SW | 2.563 ms |
| – | Reconfiguration 3 → 4 | HW-SW | 1.025 ms |
| Task 4 | Field orientation | HW-SW | 0.669 ms |
| – | Reconfiguration 4 → 5 | HW-SW | 1.046 ms |
| Task 5 | Filtered field orientation | HW-SW | 0.419 ms |
| – | Reconfiguration 5 → 6 | HW-SW | 1.107 ms |
| Task 6 | Image directional filtering and binarization | HW-SW | 2.465 ms |
| – | Reconfiguration 6 → 7 | HW-SW | 1.045 ms |
| Task 7 | Image smoothing | HW-SW | 0.447 ms |
| – | Reconfiguration 7 → 8 | HW-SW | 0.974 ms |
| Task 8 | Image thinning | HW-SW | 0.820 ms |
| – | Reconfiguration 8 → 9 | HW-SW | 0.943 ms |
| Task 9 | Minutiae extraction and minutiae filtering | HW-SW | 7.606 ms |
| – | Reconfiguration 9 → A | HW-SW | 1.045 ms |
| Task A | Field orientation maps alignment | HW-SW | 157.671 ms |
| – | Reconfiguration A → B | HW-SW | 1.035 ms |
| Task B | Minutiae alignment, feature sets matching and authentication decision | HW-SW | 20.737 ms |
| Tasks 0-B | Total Execution Time (1) | | 205.025 ms |

(1): Task 0 is not included in the computation of the total execution time.

Table 67. Execution time performance comparison table.

The state of the art in personal recognition algorithms points to a major effort in the coming years to develop more accurate algorithms. Continuous improvements in the accuracy rates of personal recognition applications are therefore expected, and it is necessary to develop physical platforms with open architectures able to embed those future and more complex algorithms without impacting negatively on the system cost. The combination of a general-purpose CPU and a run-time reconfigurable FPGA provides the required system flexibility at two levels: at software level, with the program code executed by the CPU; and at hardware level, with the reconfigurability of the hardware resources available in the FPGA. This system architecture merges sequential processing (CPU-related) with the efficiency of parallel hardware (FPGA-related).

Since the recognition algorithms can be split into a sequence of mutually exclusive tasks, it is reasonable to make use of the run-time reconfiguration of FPGAs. In this way, it is possible to reuse the hardware resources available in the FPGA and to deploy the biometric application in a smaller FPGA, which reduces the total cost of the system in comparison with the static implementation of all the needed coprocessors in a larger FPGA. Hardware and software flexibility, real-time performance, and low cost are some of the advantages of the suggested system architecture. With the relentless advances in fingerprint biometric algorithms and system-on-chip technology, the declining cost of solid-state fingerprint sensors, and the continuous shrinkage of semiconductor ICs and programmable logic technologies, the day is not far off when every person will have his/her customized embedded ID token system with a built-in fingerprint sensor as an everyday authentication tool.
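As a cross-check of Table 67, the totals and speed-up factors quoted in this chapter can be recomputed from the individual latencies of the dynamic partial reconfiguration platform. The Python sketch below is a hedged illustration: the variable names are hypothetical, and it assumes that the enrolment stage ends after Task 9, so that only the first eight reconfigurations contribute to its overhead.

```python
from math import fsum

# Totals for Tasks 1-B transcribed from Table 67 (Task 0 is excluded).
PC_TOTAL_MS = 3274.380           # personal computer, software only
FULL_RECONF_TOTAL_MS = 955.842   # embedded platform, full reconfiguration

# Dynamic partial reconfiguration platform, latencies in ms.
# Task latencies for Tasks 1, 2, ..., 9, A, B.
TASK_MS = [0.672, 0.850, 2.563, 0.669, 0.419, 2.465,
           0.447, 0.820, 7.606, 157.671, 20.737]
# Reconfiguration latencies for 1->2, 2->3, ..., 9->A, A->B.
RECONF_MS = [0.841, 1.045, 1.025, 1.046, 1.107,
             1.045, 0.974, 0.943, 1.045, 1.035]

total_ms = fsum(TASK_MS + RECONF_MS)
reconf_auth_ms = fsum(RECONF_MS)        # authentication runs Tasks 1..B
reconf_enrol_ms = fsum(RECONF_MS[:8])   # assumption: enrolment ends at Task 9

print(f"total execution time       : {total_ms:.3f} ms")
print(f"reconfiguration (enrolment): {reconf_enrol_ms:.3f} ms")
print(f"reconfiguration (authent.) : {reconf_auth_ms:.3f} ms")
print(f"speed-up vs. PC platform   : x{PC_TOTAL_MS / total_ms:.2f}")
print(f"speed-up vs. full reconfig.: x{FULL_RECONF_TOTAL_MS / total_ms:.2f}")
```

The computed values agree with the 205.025 ms total, the 8.026 ms and 10.106 ms reconfiguration overheads, and the ×15.97 speed-up reported for this platform.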
6. Research Contribution

The main conclusions derived from this thesis, and the present and future scientific and technological challenges to be faced in the cited areas of research, are covered in the next sections.

6.1. Conclusions of the Thesis

Although great efforts have been made by the biometrics research community in the last forty years, the development of reliable automatic personal authentication systems based on biometrics is still a technical challenge. As in most biometric systems studied today, state-of-the-art fingerprint-based personal recognition algorithms present some limitations in accuracy. In order to improve it, further processing stages and computational complexity need to be introduced into the recognition algorithms, which has some impact on the architecture and the computational power of the physical systems in charge of the processing. Moreover, the implementation of more secure fingerprint matchers, able to withstand fraudulent attacks while performing real-time recognition, remains a challenge that implies the development of more efficient processing platforms. Besides, the ever-present demand to reduce the cost of the final product limits the amount of resources available in the physical systems. All the aforementioned requirements must be met in order to make feasible the exploitation and the worldwide spread of embedded biometric security in everyday consumer applications.
Because of the stringent list of features demanded of the next generation of recognition systems, and the fact that continuous improvements in the recognition algorithms are foreseen in order to progressively reach better accuracy levels, a solution based on hardware-software codesign techniques under run-time reconfigurable embedded systems is evaluated in this work. Two of the main features of the suggested system architecture are its flexibility at both hardware and software levels and its parallel computational power. Even though other solutions may exist, the one disclosed in this work is proven to be an alternative to the existing and more expensive solutions based on personal computer platforms, which run the recognition algorithms purely in software on one or multiple microprocessing units at high clock frequencies.

The model based on purely software applications running on fixed hardware is prone to various limitations whenever future evolutions of the recognition algorithms demand new features from the hardware platform that affect the bill of materials, the design of the PCB, etc. In contrast to this model, the introduction of programmable and run-time reconfigurable hardware improves performance and adds flexibility at both software and hardware levels, thanks to FPGA technologies and the HDLs, IPs and EDA tools already available today.
The main conclusions of the suggested approach are as follows:
(i) it is feasible to develop biometrics-based user authentication systems by means of hardware-software co-design techniques under system-on-chip architectures;
(ii) acceleration factors in the range from one to two orders of magnitude can be achieved when comparing the physical implementation of hardware-software embedded systems against purely software-based solutions;
(iii) with the advent of hardware-software oriented platforms, more complex algorithms can be implemented and executed in real time, which leads to benefits in terms of recognition accuracy and system reliability; and
(iv) concerning development costs, it is proven that application-specific hardware accelerators implemented in highly-integrated FPGAs or SOPCs are a good alternative to solutions based on ASICs, thanks to the short design and verification cycles linked to EDA tools, as well as the high flexibility of programmable logic technologies in terms of design maintenance (future application evolutions/upgrades).

The development of a physical embedded system from which to build the suggested AFAS application is split into two main goals: (a) the partitioning of the recognition application into a set of mutually exclusive tasks and the hardware-software codesign of each of the tasks, and (b) the introduction of run-time reconfiguration of some of the hardware resources available in the system in order to improve the functional density of the design and to reduce its cost without constraining the final system performance. Both aspects must be addressed to succeed in the implementation of the proposed application.
The main focus of this thesis is the first of the goals, whereas, in parallel, a second thesis has been developed to overcome the constraints linked to the functional density of the system, the dynamic reconfiguration of its physical resources, and the minimization of the reconfiguration overhead. The fusion of both theses results in the development of several prototypes that prove the suitability of the suggested run-time reconfigurable embedded system architecture for the implementation of the biometric application. Although this thesis addresses a biometric application, chosen for its inherently demanding processing requirements, its benefits can be extended to any other application that requires flexibility at both hardware and software levels, high processing power through parallelism, and real-time performance at low cost.

6.2. Research Projects

During the development of the thesis, the author has been a member of the DES (Development of Embedded Systems) Research Group of Universitat Rovira i Virgili. Within this group, the author has had the opportunity to take part in several national and international research projects that are detailed in the next sections.

6.2.1. TRUST-eS

TRUST-eS: Technology Responses To Ubiquitous Security Threats for e-Security (2004 – 2006).

Trust-eS is an international project developed by companies and universities of up to seven different European countries. This project aims at improving the security and overcoming the technological limitations present in card-based transaction platforms in the field of e-Security and e-Government applications. The Trust-eS project targets authentication technologies addressing multimodal identification with smart-card-based biometrics, cryptography and system-on-chip technologies.
It is mainly focused on five research areas:
a) system architecture, in terms of performance, portability, attack detection and resistance of smart card solutions based on SOC designs;
b) client/server applications, in which the card is integrated into distributed architectures so it can function as a server or securely run applications that do not reside on the card itself;
c) reconfigurable blocks for secure SOCs, making the card more flexible, customisable and secure by integrating a configurable hardware block and a software embedded driver;
d) authentication techniques and technologies, essentially addressing fingerprint sensor technology, new multimode fused biometric identification algorithms, and biometric system architectures based on smart cards; and
e) integration of smart-card interfaces in system environments, combining highly secured embedded technology with connectivity to secure smart card interface platforms.

The DES Research Group of Universitat Rovira i Virgili has been one of the partners involved in the Trust-eS project in the following research areas:
(i) WP1. SOC Architecture and technologies:
· WP1-T2. Reconfigurable blocks design
· WP1-T4. Dedicated co-processors for biometry
(ii) WP2. Smart card architecture, security and integration:
· WP2-T2. Secure downloading
(iii) WP6. Integration of SOC in secure environment:
· WP6-T2. Biometry processing and security
under grant MEDEA+ A-306, MCyT PROFIT, FIT-070000-2003-930 and FIT-360000-2005-13.

6.2.2. DELFIN

DELFIN: Development of a Fingerprint Co-processing System (2004).

DELFIN is a Spanish project developed by DES Research Group under grant MCyT, SEG-200405592.
The main purpose of DELFIN is the development of a low-power and low-cost system-on-chip including, besides one embedded processor and its corresponding data and program memories, specific coprocessors that support the execution of the primitive operations (i.e. enrolment and matching) of any access control system based on fingerprint biometrics. The fingerprint coprocessing system is connected to the external world through two interfaces: on the one hand, the fingerprint image is stored in an external memory-like circuit (graphical interface); and on the other hand, it receives commands and data from one external host processor (application interface). The main goals of the project are (i) the development of the processing algorithms involved in the recognition process, (ii) the definition of the communication protocols with the external chips, and (iii) the physical targeting of the application by means of hardware-software codesign techniques.

6.2.3. PIBES

PIBES-COBI: Biometrics Improvement and the Evaluation of its Security - Biometric Coprocessors onto Reconfigurable Hardware (2006 – 2008).

PIBES is a Spanish research project led by the Universidad Carlos III of Madrid and the Universitat Politècnica de Catalunya of Barcelona under grant MCyT, TEC2006-12365-C02-02. PIBES aims to contribute to the improvement and the more accurate validation of biometric systems. The major research lines of this project are (i) the enhancement of the performance and the usability of biometric techniques, (ii) progress in the development of multimodal biometric systems, (iii) the physical implementation of more efficient biometric ID devices (Biometric ID Tokens) at lower cost, and (iv) the development of a methodology and specific tools for the evaluation of the security of biometric systems.
The project is split into two research areas:
- PIBES-ABME: Biometric Algorithms and Methodology of Evaluation, carried out by researchers of the Universidad Carlos III and the Universidad Politécnica de Madrid.
- PIBES-COBI: Biometric Co-processors onto Reconfigurable Hardware, carried out by researchers of the Universitat Politècnica de Catalunya and the Universitat Rovira i Virgili.

The DES Research Group of Universitat Rovira i Virgili has participated in the PIBES-COBI subproject, which aims at the development of better hardware devices for biometric processing to be integrated in portable systems. The researchers involved in the PIBES-COBI subproject have been in charge of the hardware-software codesign of image and signal processing algorithms oriented to personal biometric identification under run-time reconfigurable embedded system architectures based on field programmable gate arrays and/or system-on-programmable-chips.

6.3. Future Work

The physical implementation of personal recognition systems based on biometric features (e.g. iris, retina, face, hand geometry, fingerprints, voice, gait, signature, etc.) demands a multidisciplinary approach. The development of those applications involves skills in several technical fields such as image processing, pattern recognition, statistics, cryptography, signal processing, embedded systems architecture, electronic design, power consumption reduction techniques, etc. Although the main focus of this thesis is the selection and the evaluation of one embedded system architecture from which to build highly flexible, efficient, and accurate personal recognition systems at low cost, other research areas need to be further analysed in order to be, in a not-too-distant future, closer to the ideal personal recognition system able to unequivocally assess the identity of any user from his/her genuine biometric features.
Apart from the experimental proof of concept carried out in this work, in which the application is developed by means of hardware-software codesign techniques for fingerprint biometrics, additional research is needed in the following fields to optimize and to extend the usage of biometric systems in real-world applications:

a) In the field of fingerprint biometrics, although automated recognition applications have been successfully developed for law-enforcement, civilian and consumer applications in the last decades, the accuracy and/or reliability of state-of-the-art automated fingerprint matching systems is still not comparable to that of manual or semi-automatic authentication systems based on human experts. The improvement of the accuracy exhibited by the recognition algorithms is one of the key factors in making possible the exploitation of biometrics-based recognition applications worldwide.

b) Another important aspect to take into consideration is security and privacy protection when interacting with the authentication system. As noted in this work, the input to the AFAS consists of the digitized fingerprint of the user and his/her identity claim, whereas the output generated by the system is the authentication response (the user is or is not the person he/she claims to be). The authentication system handles sensitive information such as the individuals’ biometric templates, and those templates are used to protect the access to restricted areas, confidential information, or protected resources. Biometric templates are clearly more distinctive than other identifiers such as PINs, passwords or tokens in proving the identity of the user.
Nonetheless, a certain level of distrust in biometric systems exists nowadays because of known open problems that still need to be tackled, such as the protection of the authentication system against external attacks (loss of privacy in case of disclosure of the biometric traits): user’s template retrieval, fake finger detection, finger liveness detection, etc. In this direction, the development of system-on-chip architectures where the main components of the system (fingerprint sensor, processing units, volatile and non-volatile memory blocks, communication peripherals with an external host, etc.) are embedded in a single chip can help in improving the security of the whole system, as can the development of cryptographic protocols for the communication link of the system with the external world.

c) Another point of interest is the search for ways to reduce the power consumption of the system. The application of power reduction techniques at both device and system levels is encouraged, especially in the case of autonomous and/or portable authentication embedded systems like laptops, mobile phones or PDAs. For this purpose, the development of a physical prototype tailored to the requirements of the recognition application is suggested as an alternative to using commercial evaluation boards that embed general functionality.

d) Other interests focus on the continuous reduction of the system cost. In this direction, potential research fields include the usage of state-of-the-art SOPC devices based on the latest silicon technologies with smaller feature sizes, able to embed more hardware resources in less silicon area and provided with on-chip FPGA functional blocks featuring dynamic reconfiguration. Devices that allow higher configuration data bandwidths (i.e. higher reconfiguration frequencies and wider configuration buses) minimize the reconfiguration overhead and thus improve the reconfiguration throughput.
Therefore, it is possible to partition the application into multiple contexts with a lower penalty in reconfiguration latency, so that smaller and less expensive devices become feasible, making these systems accessible to anyone, anywhere, at any time.

Once the previous goals are achieved in the short term, more ambitious targets can be planned in the mid or long term:

e) Although the proposed solution is proven to achieve real-time performance when dealing with personal authentication processes (one-to-one or one-to-few matching), the next step consists of extending the advantages of the suggested system architecture to the personal identification scenario, where one-to-many matching processes need to be carried out while meeting real-time performance as well.

f) In order to further improve the security, accuracy and reliability of personal recognition systems, in either authentication or identification applications, different monomodal biometrics can be combined with PINs, passwords or ID tokens to build multimodal biometric systems able to provide greater confidence to the users. The fusion of traditional and modern technologies can help in the successful deployment of biometric systems around the world.

References/Bibliography

[Aboalsamh, 2010] Hatim A. Aboalsamh. A potable biometric access device using dedicated fingerprint processor. WSEAS (World Scientific and Engineering Academy and Society) Transactions on Computers, Vol. 9, No. 8, pp. 878-887, 2010.
[Alibeigi et al., 2007] E. Alibeigi, S. Samavi, Z. Rahmani, and S. Shirani. Pipelined orientation estimation in fingerprint images. IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PacRim 2007), pp. 276-279, 2007.
[Alibeigi et al., 2009] Eman Alibeigi, Majid Toghiani Rizi, and Parisa Behnamfar. Pipelined minutiae extraction from fingerprint images. Canadian Conference on Electrical and Computer Engineering (CCECE 2009), pp. 239-242, 2009.
[Allah, 2005] Mohamed Mostafa Abd Allah. A fast and memory efficient approach for fingerprint authentication system. IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2005), pp. 259-263, 2005.
[Alonso-Fernandez et al., 2005] Fernando Alonso-Fernandez, Julian Fierrez-Aguilar, and Javier Ortega-Garcia. An enhanced Gabor filter-based segmentation algorithm for fingerprint recognition systems. International Symposium on Image and Signal Processing and Analysis (ISPA 2005), pp. 239-244, 2005.
[Anderson et al., 1991] S. Anderson, W. H. Bruce, P. B. Denyer, D. Renshaw, and G. Wang. A single chip sensor & image processor for fingerprint verification. IEEE Custom Integrated Circuits Conference (CICC 1991), pp. 12.1.1-12.1.4, 1991.
[Areekul et al., 2004] Vutipong Areekul, Ukrit Watchareeruetai, and Sawasd Tantaratana. Fast separable Gabor filter for fingerprint enhancement. International Conference on Biometric Authentication (ICBA 2004), Lecture Notes in Computer Science, Vol. 3072, pp. 403-409, 2004.
[Arjona et al., 2010] Rosario Arjona, Iluminada Baturone, and Santiago Sánchez-Solano. Microelectronics implementation of directional image-based fuzzy templates for fingerprints. IEEE International Conference on Microelectronics (ICM 2010), pp. 323-326, 2010.
[Bakhteri and Hani, 2009] Rabia Bakhteri, and Mohamed Khalil Hani. Biometric encryption using fingerprint fuzzy vault for FPGA-based embedded systems. IEEE Region 10 Conference (TENCON 2009), pp. 1-5, 2009.
[Barrenechea et al., 2007] Maitane Barrenechea, Jon Altuna, and Miguel San Miguel. A low-cost FPGA-based embedded fingerprint verification and matching system. Workshop on Intelligent Solutions in Embedded Systems (WISES, 2007), pp. 250-261, 2007.
[Barrenechea et al., 2009] Maitane Barrenechea, Jon Altuna, Mikel Mendicute, and Javier Del Ser. A low-cost FPGA-based embedded fingerprint verification and matching system. Intelligent Technical Systems, Lecture Notes in Electrical Engineering, Vol. 38, No. 5, pp. 247-260, 2009. [Bartunek et al., 2006] Josef Strom Bartunek, Mikael Nilsson, Jorgen Nordberg, and Ingvar Claesson. Adaptive fingerprint binarization by frequency domain analysis. Asilomar Conference on Signals, Systems and Computers (ACSSC 2006), pp. 598-602, 2006. [Bazen and Gerez, 2000] Asker M. Bazen, and Sabih H. Gerez. Directional field computation for fingerprints based on the principal component analysis of local gradients. Workshop on Circuits, Systems and Signal Processing (ProRISC 2000), pp. 1-7, 2000. [Bazen and Gerez, 2001] Asker M. Bazen, and Sabih H. Gerez. Segmentation of fingerprint images. Workshop on Circuits, Systems and Signal Processing (ProRISC 2001), pp. 276-280, 2001. [Bazen and Gerez, 2002] Asker M. Bazen, and Sabih H. Gerez. Systematic methods for the computation of the directional fields and singular points of fingerprints. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 7, pp. 905-919, 2002. 297 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Bazen et al., 2000] Asker M. Bazen, Gerben T. B. Verwaaijen, Sabih H. Gerez, Leo P. J. Veelenturf, and Berend Jan van der Zwaag. A correlation-based fingerprint verification system. Workshop on Circuits, Systems and Signal Processing (ProRISC 2000), pp. 205-213, 2000. [Becker et al., 2007] Jürgen Becker, Michael Hübner, Gerhard Hettich, Rainer Constapel, Joachim Eisenmann, and Jürgen Luka. Dynamic and partial FPGA exploitation. Proceedings of the IEEE, Vol. 95, No. 2, pp. 438-452, 2007. [Bernard et al., 2002] Sylvain Bernard, Nozha Boujemaa, David Vitale, and Claude Bricot. 
Fingerprint segmentation using the phase of multiscale Gabor wavelets. Asian Conference on Computer Vision (ACCV 2002), pp. 27-32, 2002. [Binbin et al., 2010] Shi Binbin, Xu Fengliang, and Dai Minli. Design and implementation of wireless fingerprint identity authentication based on GPRS. IEEE International Conference on Communication Systems, Networks and Applications (ICCSNA 2010), pp. 286-289, 2010. [Bonato et al., 2003] Vanderlei Bonato, Rolf Fredi Molz, João Carlos Furtado, Marcos Flôres Ferrão, and Fernando G. Moraes. Propose of a hardware implementation for fingerprint systems. International Conference on Field-Programmable Logic and Applications (FPL 2003), Lecture Notes in Computer Science, Vol. 2778, pp. 1158-1161, 2003. [Callaly et al., 2007] Frank Callaly, Catalin Cucu, Alex Cucos, Mark Leyden, and Peter Corcoran. Real-time fingerprint analysis & authentication for embedded appliances. IEEE International Conference on Consumer Electronics (ICCE 2007), pp. 1-2, 2007. [Cappelli et al., 2010] Raffaele Cappelli, Matteo Ferrara, and Davide Maltoni. Minutia cylinder-code: a new representation and matching technique for fingerprint recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 12, pp. 2128-2141, 2010. [Carvalho and Yehia, 2004] Cristiano Carvalho, and Hani Yehia. Fingerprint alignment using line segments. International Conference on Biometric Authentication (ICBA 2004), Lecture Notes in Computer Science, Vol. 3072, pp. 380-387, 2004. [Challita et al., 2010] Khalil Challita, Hikmat Farhat, and Khaldoun Khaldi. Biometric authentication for intrusion detection systems. IEEE International Conference on Integrated Intelligent Computing (ICIIC 2010), pp. 195-199, 2010. [Chao et al., 2005] Gwo-Cheng Chao, Shung-Shing Lee, Hung-Chuan Lai, and Shi-Jinn Horng. Embedded fingerprint verification system. IEEE International Conference on Parallel and Distributed Systems (ICPADS 2005), Vol. 2, pp. 52-57, 2005. 
[Chen and Chiu, 2006] Ching-Han Chen, and Kuo-En Chiu. 1-D Gabor directional filtering for low-quality fingerprint image enhancement. IEEE Conference on Industrial Electronics (IECON 2006), pp. 3466-3470, 2006. [Chen and Dai, 2005] Ching-Han Chen, and Jia-Hong Dai. An embedded fingerprint authentication system with reduced hardware resources requirement. IEEE International Symposium on Consumer Electronics (ISCE 2005), pp. 145-150, 2005. [Chen and Jain, 2007] Yi Chen, and Anil K. Jain. Dots and incipients: extended features for partial fingerprint matching. IEEE Biometrics Symposium, pp. 1-6, 2007. [Chen and Jain, 2009] Yi Chen, and Anil K. Jain. Beyond minutiae: a fingerprint individuality model with pattern, ridge and pore features. Advances in Biometrics, International Conference on Biometrics (ICB 2009), pp. 523-533, 2009. [Chen and Wang, 2003] Chaur-Chin Chen, and Yaw-Yi Wang. An AFIS using fingerprint classification. Image and Vision Computing New Zealand Conference (ICVNZ 2003), pp. 233-238, 2003. [Chen et al., 2004a] Xinjian Chen, Jie Tian, Jiangang Cheng, and Xin Yang. Segmentation of fingerprint images using linear classifier. EURASIP Journal on Applied Signal Processing, Vol. 2004, No. 4, pp. 480-494, 2004. 298 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Chen et al., 2004b] J. S. Chen, Y. S. Moon, and K. F. Fong. Efficient fingerprint image enhancement for mobile embedded systems. European Conference on Computer Vision, Workshop on Biometric Authentication (ECCV 2004, BioAW 2004), Lecture Notes in Computer Science, Vol. 3087, pp. 146-157, 2004. [Chen et al., 2005] Xinjian Chen, Jie Tian, Qi Su, Xin Yang, and Fei Yue Wang. A secured mobile phone based on embedded fingerprint recognition systems. Intelligence and Security Informatics (ISI 2005), Lecture Notes in Computer Science, Vol. 3495, pp. 549-553, 2005. 
[Cheng and Tian, 2004] Jiangang Cheng, and Jie Tian. Fingerprint enhancement with dyadic scale-space. Pattern Recognition Letters, Vol. 25, No. 11, pp.1273-1284, 2004. [Cheng et al., 2002] Jiangang Cheng, Jie Tian, and Tanghui Zhang. Fingerprint enhancement with dyadic scale-space. IEEE International Conference on Pattern Recognition (ICPR 2002), Vol. 1, pp. 200-203, 2002. [Cheng et al., 2004] Jiangang Cheng, Jie Tian, and Hong Chen. Fingerprint minutiae matching with orientation and ridge. International Conference on Biometric Authentication (ICBA 2004), Lecture Notes in Computer Science, Vol. 3072, pp. 351-358, 2004. [Cheng et al., 2011] Octavian Cheng, Waleed Abdulla, and Zoran Salcic. Hardware-software codesign of automatic speech recognition system for embedded real-time applications. IEEE Transactions on Industrial Electronics, Vol. 58, No. 3, pp. 850-859, 2011. [Chung et al., 2004] Yongwha Chung, Daesung Moon, Sung Bum Pan, Min Kim, and Kichul Kim. A hardware implementation of fingerprint verification for secure biometric authentication systems. Internacional Conference on Image Analysis and Recognition (ICIAR 2004), Lecture Notes in Computer Science, Vol. 3212, pp. 770-777, 2004. [Chung et al., 2005] Yongwha Chung, Kichul Kim, Min Kim, Sungbum Pan, and Neungsoo Park. A hardware implementation for fingerprint retrieval. Knowledge-Based Intelligent Information and Engineering Systems (KES 2005), Lecture Notes in Computer Science, Vol. 3683, pp. 374-380, 2005. [Danese et al., 2007] G. Danese, M. Giachero, F. Leporati, G. Matrone and N. Nazzicari. A dedicated hardware for fingerprint authentication. Knowledge-Based Intelligent Information and Engineering Systems (KES 2007), Lecture Notes in Computer Science, Vol. 4693, pp. 117-124, 2007. [Danese et al., 2009] G. Danese, M. Giachero, F. Leporati, G. Matrone, and N. Nazzicari. An FPGA-based embedded system for fingerprint matching using phase-only correlation algorithm. 
IEEE Euromicro Conference on Digital System Design, Architectures, Methods and Tools (DSD 2009), pp. 672-679, 2009. [Danese et al., 2010] G. Danese, M. Giachero, F. Leporati, and N. Nazzicari. A multicore embedded processor for fingerprint recognition. IEEE Euromicro Conference on Digital System Design, Architectures, Methods and Tools (DSD 2010), pp. 779-784, 2010. [Danese et al., 2011] G. Danese, M. Giachero, F. Leporati, and N. Nazzicari. An embedded multi-core biometric identification system. Microprocessors and Microsystems, Vol. 35, No. 5, pp. 510-521, 2011. [Delahaye et al., 2004] Jean-Philippe Delahaye, Guy Gogniat, Christian Roland, and Pierre Bomel. Software radio and dynamic reconfiguration on a DSP/FPGA platform. Frequenz, Journal of Telecommunications, Vol. 58, No. 5-6, pp. 152-159, 2004. [Derawi et al., 2010] Mohammad O. Derawi, Davrondzhon Gafurov, Rasmus Larsen, Christoph Busch, and Patrick Bours. Fusion of gait and fingerprint for user authentication on mobile devices. IEEE International Workshop on Security and Communication Networks (IWSCN 2010), pp. 1-6, 2010. 299 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Erat et al., 2007] Murat Erat, Kenan Danisman, Salih Ergün, Alper Kanak, and Mehmet Kayaoglu. An embedded fingerprint authentication system integrated with a hardware-based truly random number generator. International Conference on Computer Analysis of Images and Patterns (CAIP 2007), Lecture Notes in Computer Science, Vol. 4673, pp. 366-373, 2007. [Fang et al., 2007] Gang Fang, Sargur N. Srihari, Harish Srinivasan, and Prasad Phatak. Use of ridge points in partial fingerprint matching. Proceedings of SPIE: Biometric Technology for Human Identification, Vol. IV, pp. 65390D1–65390D9, 2007. [Faundez-Zanuy, 2004] Marcos Faundez-Zanuy. A door-opening system using a low-cost fingerprint scanner and a PC. 
IEEE Aerospace and Electronic Systems Magazine, Vol. 19, No. 8, pp. 23-26, 2004. [Faundez-Zanuy and Fabregas, 2005] Marcos Faundez-Zanuy, and Joan Fabregas. Testing report of a fingerprint-based door-opening system. IEEE Aerospace and Electronic Systems Magazine, Vol. 20, No. 6, pp 18-20, 2005. [Feng and Cai, 2006] Jianjiang Feng, and Anni Cai. Fingerprint representation and matching in ridge coordinate system. IEEE International Conference on Pattern Recognition (ICPR 2006), Vol. 4, pp. 485-488, 2006. [Feng et al., 2005] Jianjiang Feng, Zhengyu Ouyang, Fei Su, and Anni Cai. An exact ridge matching algorithm for fingerprint verification. Advances in Biometric Person Authentication (IWBRS 2005), Lecture Notes in Computer Science, Vol. 3781, pp. 103110, 2005. [Feng et al., 2006] Jianjiang Feng, Zhengyu Ouyang, and Anni Cai. Fingerprint matching using ridges. Pattern Recognition, Vol. 39, No. 11, pp. 2131-2140, 2006. [Fujii et al., 2002] Koji Fujii, Mamoru Nakanishi, Satoshi Shigematsu, Hiroki Morimura, Takahiro Hatano, Namiko Ikeda, Toshishige Shimamura, Yukio Okazaki, and Hakaru Kyuragi. A 500-dpi cellular-logic processing array for fingerprint-image enhancement and verification. IEEE Custom Integrated Circuits Conference (CICC 2002), pp. 261-264, 2002. [Galy et al., 2007] Nicolas Galy, Benoît Charlot, and Bernard Courtois. A full fingerprint verification system for a single-line sweep sensor. IEEE Sensors Journal, Vol. 7, No. 7, pp. 1054-1065, 2007. [Gao and Hall, 1989] Z. Gao, and W. R. Hall. Parallel thinning with two-subiteration algorithm. Communications of the ACM, Vol. 32, No. 3, pp. 359-373, 1989. [Gao and Xie, 2006] Jing-jing Gao, and Mei Xie. The layered segmentation, Gabor filtering and binarization based on orientation of fingerprint preprocessing. IEEE International Conference on Signal Processing (ICSP 2006), pp. 1-4, 2006. [Gil et al., 2003] Younhee Gil, Daesung Moon, Sungbum Pan, and Yongwha Chung. 
Fingerprint verification system involving smart card. International Conference on Information Security and Cryptology (ICISC 2002), Lecture Notes in Computer Science, Vol. 2587, pp. 510-524, 2003. [Gil et al., 2010] Charo Gil, Manuel Castro, and Mudasser Wyne. Identification in web evaluation in learning management system by fingerprint identification system. ASEE/IEEE Frontiers in Education Conference (FIE 2010), pp. T4D/1-T4D/6, 2010. [Greenberg et al., 2002] Shlomo Greenberg, Mayer Aladjem, and Daniel Kogan. Fingerprint image enhancement using filtering techniques. Real-Time Imaging, Vol. 8, No. 3, pp. 227–236, 2002. [Gu et al., 2004] Jinwei Gu, Jie Zhou, and David Zhang. A combination model for orientation field of fingerprints. Pattern Recognition, Vol. 37, No. 3, pp. 543-553, 2004. [Gu et al., 2006] Jinwei Gu, Jie Zhou, and Chunyu Yang. Fingerprint recognition by combining global structure and local cues. IEEE Transactions on Image Processing, Vol. 15, No. 7, pp. 1952-1964, 2006. 300 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Gupta et al., 2005] Pallav Gupta, Srivaths Ravi, Anand Raghunathan, and Niraj K. Jha. Efficient fingerprint-based user authentication for embedded systems. Design Automation Conference (DAC 2005), pp. 244-247, 2005. [Hadhoud et al., 2006] Mohiy M. Hadhoud, Wael S. El-kelany, and Mina Ibrahim Samaan. An adaptive algorithm for improved enhancement of fingerprints. National Radio Science Conference (NRSC 2006), pp. 1-7, 2006. [Hatami et al., 2005] Safar Hatami, Reshad Hosseini, Mahmoud Kamarei, and Hossein Ahmadi. Wavelet based fingerprint image enhancement. IEEE International Symposium on Circuits and Systems (ISCAS 2005), Vol. 5, pp. 4610-4613, 2005. [He et al., 2003] Yuliang He, Jie Tian, Xiping Luo, and Tanghui Zhang. Image enhancement and minutiae matching in fingerprint verification. Pattern Recognition Letters, Vol. 24, No. 
9-10, pp. 1349-1360, 2003. [Helfroush and Ghassemian, 2007] Sadegh Helfroush, and Hassan Ghassemian. Nonminutiae-based decision-level fusion for fingerprint verification. EURASIP Journal on Advances in Signal Processing, Vol. 2007, No. 60590, pp. 1-11, 2007. [Hepp et al., 2008] S. Hepp, G. Klima, A. Kadlec, L. Krammer, W. Luckner, D. Prokesch, S. Resch, S. Tauner, A. Wasicek, M. Wenzl, J. Wilhelm, P. Tummeltshammer, and M. Delvai. Exploring hardware software partitioning on the example of a fingerprint verification system. Austrian Workshop on Microelectronics, pp. 1-6, 2008. [Hermanto et al., 2010] Lingga Hermanto, Sunny Arief Sudiro, and Eri Prasetyo Wibowo. Hardware implementation of fingerprint image thinning algorithm in FPGA device. IEEE International Conference on Networking and Information Technology (ICNIT 2010), pp. 187-191, 2010. [Hiew et al., 2007] B. Y. Hiew, Andrew B. J. Teoh, and Y. H. Pang. Touch-less fingerprint recognition system. IEEE Workshop on Automatic Identification Advanced Technologies (AutoID 2007), pp. 24-29, 2007. [Hong, 2011] Sun Hong. Design of embedded automated fingerprint identification system based on DSP. Advanced Engineering Forum, Vol. 1, pp. 97-101, 2011. [Hong et al., 1996] Lin Hong, Anil Jain, Sharathcha Pankanti, and Ruud Bolle. Fingerprint enhancement. IEEE Workshop on Applications of Computer Vision (WACV 1996), pp. 202-207, 1996. [Hong et al., 1998] Lin Hong, Yifei Wan, and Anil Jain. Fingerprint image enhancement: algorithm and performance evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 8, pp. 777-789, 1998. [Hongbin et al., 2007] Pu Hongbin, Chen Junali, and Zhang Yashe. Fingerprint thinning algorithm based on mathematical morphology. IEEE International Conference on Electronic Measurement and Instruments (ICEMI 2007), pp. 2/618-2/621, 2007. [Houari et al., 2010] Kobzili El Houari, Benbouchama Cherrad, and Irki Zohir. 
A software-hardware mixed design for the FPGA implementation of the real-time edge detection. IEEE International Conference on Systems, Man and Cybernetics (SMC 2010), pp. 4091-4095, 2010. [Hsiao et al., 2004] Pei-Yung Hsiao, Chun-Ho Hua, and Chien-Chen Lin. A novel FPGA architectural implementation of pipelined thinning algorithm. IEEE International Symposium on Circuits and Systems (ISCAS 2004), Vol. 2, pp. 593-596, 2004. [Hsiao et al., 2006] P. Y. Hsiao, X. Z. Chen, C. C. Lin, C. H. Hua, and C. C. Chang. Employing pipelined thinning architecture for realtime fingerprint verifier. IEE Proceedings Computers and Digital Techniques, Vol. 153, No. 5, pp. 348-354, 2006. [Hu et al., 2008] Chunfeng Hu, Jianping Yin, En Zhu, Hui Chen, and Yong Li. Fingerprint alignment using special ridges. IEEE International Conference on Pattern Recognition (ICPR 2008), pp. 1-4, 2008. 301 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Huang et al., 2007] Peihao Huang, Chia-Yung Chang, and Chaur-Chin Chen. Implementation of an automatic fingerprint identification system. IEEE International Conference on Electro/Information Technology (EIT 2007, pp. 412-415, 2007. [Hwang and Verbauwhede, 2004] David D. Hwang, and Ingrid Verbauwhede. Design of portable biometric authenticators -energy, performance, and security tradeoffs. IEEE Transactions on Consumer Electronics, Vol. 50, No. 4, pp. 1222-1231, 2004. [Hwang et al., 2003] David Hwang, Bo-Cheng Lai, Patrick Schaumont, Kazuo Sakiyama, Yi Fan, Shenglin Yang, Alireza Hodjat, and Ingrid Verbauwhede. Design flow for HW/SW acceleration transparency in the ThumbPod secure embedded system. ACM/IEEE Design Automation Conference (DAC 2003), pp. 60-65, 2003. [Idros et al., 2010] M. F. M. Idros, S. A. Mohamed, A. H. A. Razak, A. S. Zoolfakar, and S. A. M. Al-Junid. Improvisation of Gabor filter design using Verilog HDL. 
IEEE International Conference on Electronic Devices, Systems and Applications (ICEDSA 2010), pp. 183-186, 2010. [Im et al., 2000] Sang-Kyun Im, Hyung-Man Park, Soo-Won Kim, Chang-Kyung Chung, and Hwan-Soo Choi. Improved vein pattern extracting algorithm and its implementation. IEEE International Conference on Consumer Electronics (ICCE 2000), pp. 2-3, 2000. [Ito et al., 2005] Koichi Ito, Ayumi Morita, Takafumi Aoki, Hiroshi Nakajima, Koji Kobayashi, and Tatsuo Higuchi. A fingerprint recognition algorithm combining phase-based image matching and feature-based matching. Advances in Biometrics, International Conference on Biometrics (ICB 2006), Lecture Notes in Computer Science, Vol. 3832, pp. 316-325, 2005. [Jain et al., 1997a] Anil Jain, Lin Hong, and Ruud Bolle. On-line fingerprint verification. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 4, pp. 302-314, 1997. [Jain et al., 1997b] Anil K. Jain, Lin Hong, Sharath Pankanti, and Ruud Bolle. An identity-authentication system using fingerprints. Proceedings of the IEEE, Vol. 85, No. 9, pp. 1365-1388, 1997. [Jain et al., 1999a] Anil K. Jain, Ruud Bolle, and Sarath Pankanti. Biometrics: personal identification in networked society, Kluwer Academic Publishers, ISBN:0792383451, 1999. [Jain et al., 1999b] Anil K. Jain, Salil Prabhakar, and Lin Hong. A multichannel approach to fingerprint classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 4, pp. 348-359, 1999. [Jain et al., 1999c] L. C. Jain, U. Halici, I. Hayashi, S. B. Lee, and S. Tsutsui. Intelligent biometric techniques in fingerprint and face recognition, CRC Press, ISBN: 0-8493-2055-0, 1999. [Jain et al., 2000] Anil K. Jain, Salil Prabhakar, Lin Hong, and Sharath Pankanti. Filterbank-based fingerprint matching. IEEE Transactions on Image Processing, Vol. 9, No. 5, pp. 846-859, 2000. [Jain et al., 2001] Anil Jain, Arun Ross, and Salil Prabhakar. 
Fingerprint matching using minutiae and texture features. IEEE International Conference on Image Processing (ICIP 2001), Vol. 3, pp. 282-285, 2001. [Jain et al., 2006] Anil Jain, Yi Chen, and Meltem Demirkus. Pores and ridges: fingerprint matching using level 3 features. IEEE International Conference on Pattern Recognition (ICPR 2006), pp. 477-480, 2006. [Jain et al., 2007] Anil K. Jain, Yi Chen, and Meltem Demirkus. Pores and ridges: high-resolution fingerprint matching using level 3 features. IEEE Transactions on Pattern Analysis and Machine Intelligence,Vol. 29, No. 1, pp. 15-27, 2007. [Jain et al., 2010] Anil K. Jain, Jianjiang Feng, Karthik Nandakumar. Fingerprint matching. IEEE Computer, Vol 43, No. 2, pp. 36-44, 2010. 302 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Jang et al., 2005] Wonchurl Jang, Deoksoo Park, Dongjae Lee, and Sung-jae Kim. Fingerprint image enhancement based on a half Gabor filter. Advances in Biometrics, International Conference on Biometrics (ICB 2006), Lecture Notes in Computer Science, Vol. 3832, pp. 258-264, 2005. [Jea and Govindaraju, 2005] Tsai-yang Jea, and Venu Govindaraju. A minutia-based partial fingerprint recognition system. Pattern Recognition, Vol 38, pp. 1672-1684, 2005. [Ji et al., 2007] Luping Ji, Zhang Yi, Lifeng Shang, and Xiaorong Pu. Binary fingerprint image thinning using template-based PCNNs. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, Vol. 37, No. 5, pp. 1407-1413, 2007. [Jiang, 2000] Xudong Jiang. Fingerprint image ridge frequency estimation by higher order spectrum. IEEE International Conference on Image Processing (ICIP 2000), Vol. 1, pp. 462-465, 2000. [Jiang and Crookes, 2008] Richard M. Jiang, and Danny Crookes. FPGA-based minutia matching for biometric fingerprint image database retrieval. Journal of Real-Time Image Processing, Vol. 3, No. 3, pp. 177-182, 2008. 
[Jiang and Yau, 2000] Xudong Jiang, and Wei-Yun Yau. Fingerprint minutiae matching based on the local and global structures. IEEE International Conference on Pattern Recognition (ICPR 2000), Vol. 2, pp.1038-1041, 2000. [Jiang et al., 2001] Xudong Jiang, Wei-Yun Yau, and Wee Ser. Detecting the fingerprint minutiae by adaptive tracing the gray-level ridge. Pattern Recognition, Vol. 34, No. 5, pp. 999-1013, 2001. [Jinhai, 2011] Zhang Jinhai. Hardware design of embedded fingerprint identification system. IEEE International Conference on Consumer Electronics, Communications and Networks (CECNet 2011), pp. 4995-4998, 2011. [Kannavara and Bourbakis, 2009] Raghudeep Kannavara, and Nikolaos G. Bourbakis. Fingerprint biometric authentication based on local global graphs. IEEE National Aerospace & Electronics Conference (NAECON 2009), pp. 200-204, 2009. [Kannavara et al., 2009] Raghudeep Kannavara, George Bebis, and Nikolaos Bourbakis. An FPGA implementation of the local global graphbased voice biometric authentication scheme. IEEE International Conference on Digital Signal Processing (DSP 2009), pp. 1-7, 2009. [Kertész, 2008] Cs. Z. Kertész. Speed-optimized fingerprint image enhancement for embedded systems. IEEE International Conference on Optimization of Electrical and Electronic Equipment (OPTIM 2008), pp. 75-79, 2008. [Khalil-Hani and Eng, 2010] M. Khalil-Hani, and P. C. Eng. FPGA-based embedded system implementation of finger vein biometrics. IEEE Symposium on Industrial Electronics and Applications (ISIEA 2010), pp. 700-705, 2010. [Kheiri et al., 2005] Farshad Kheiri, Shadrokh Samavi, and Nader Karimi. A new pipeline design for binarization and thinning of fingerprint images. Canadian Conference on Electrical and Computer Engineering (CCECE 2005), pp. 2013-2016, 2005. [Kim and Park, 2003] Doo-Hyun Kim, and Rae-Hong Park. Fingerprint binarization using convex threshold. IASTED International Conference on Computer Graphics and Imaging, pp. 224-227, 2003. 
[Kim et al., 2005] Seong-Jin Kim, Kwang-Hyun Lee, Sang-Wook Han, and Euisik Yoon. A 200×160 pixel CMOS fingerprint recognition SoC with adaptable column-parallel processors. IEEE International Solid-State Circuits Conference (ISSCC 2005), Vol. 1, pp. 250-596, 2005. [Kim et al., 2007] Ki-Hoon Kim, Pham Dai Xuan, Pham Cong Thien, and Jae-Wook Jeon. Real-time skeletonization using FPGA. IEEE International Conference on Control, Automation and Systems (ICCAS 2007), pp. 1182-1186, 2007. 303 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Kim et al., 2008] Ki Hoon Kim, Pham Cong Thien, Seung Hun Jin, Dong Kyun Kim, and Jae Wook Jeon. Dedicated parallel thinning architecture based on FPGA. IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2008), pp. 208-213, 2008. [Klimanee and Nguyen, 2004] C. Klimanee, and D. Nguyen. On the design of 2D Gabor filtering of fingerprint images. IEEE Consumer Communications and Networking Conference (CCNC 2004), pp. 430-435, 2004. [Kovács-Vajna, 2000] Zsolt Miklós Kovács-Vajna. A fingerprint verification system based on triangular matching and dynamic time warping. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 11, pp. 1266-1276, 2000. [Kovács-Vajna et al., 2000] Zs. M. Kovács-Vajna, R. Rovatti, and M. Frazzoni. Fingerprint ridge distance computation methodologies. Pattern Recognition, Vol. 33, No. 1, pp. 69-80, 2000. [Krivec et al., 2003] Vuk Krivec, Josef Alois Birchbauer, Wolfgang Marius, and Horst Bischof. A hybrid fingerprint matcher in memory constrained environments. International Symposium on Image and Signal Processing and Analysis (ISPA 2003), pp. 617-620, 2003. [Kulkarni et al., 2006] Jayant V. Kulkarni, Bhushan D. Patil, and Raghunath S. Holambe. Orientation feature for fingerprint matching. Pattern Recognition, Vol. 39, No. 8, pp. 1551-1554, 2006. 
[Kumar et al., 2007] A. Pavan Kumar, V. Kamakoti, and Sukhendu Das. System-on-programmable-chip implementation for on-line face recognition. Pattern Recognition Letters, Vol. 28, No.3, pp. 342-349, 2007. [Lam et al., 1992] Louisa Lam, Seong-Whan Lee, and Ching Y. Suen. Thinning methodologies – A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 9, pp. 869-885, 1992. [Lee and Wang, 1999] Chih-Jen Lee, and Sheng-De Wang. A Gabor filter-based approach to fingerprint recognition. IEEE Workshop on Signal Processing Systems (SiPS 1999), pp. 371-378, 1999. [Lee et al., 2002] Dongjae Lee, Kyoungtaek Choi, and Jaihie Kim. A robust fingerprint matching algorithm using local alignment. IEEE International Conference on Pattern Recognition (ICPR 2002), Vol. 3, pp. 803-806, 2002. [Lee et al., 2003] Dongjae Lee, Kyoungtaek Choi, Sanghoon Lee, and Jaihie Kim. Fingerprint fusion based on minutiae and ridge for enrollment. Audio- and Video-Based Biometric Person Authentication (AVBPA 2003), Lecture Notes in Computer Science, Vol. 2688, pp. 478–485, 2003. [Lei et al., 2010] Feng Lei, Zhang Xin, Liang Chunhui, and Li Tianyu. Research of fingerprint module detector system. IEEE Internation Conference on Computer Application and System Modeling (ICCASM 2010), Vol. 7, pp. 497-500, 2010. [Li and Chen, 2010] Guodong Li, and Hu Chen. A new high-level security portable system based on USB key with fingerprint. IEEE International Conference on Computer Design and Applications (ICCDA 2010), Vol. 1, pp. 159- 162, 2010. [Li and Qi, 2010] Chao Li, and Jin Qi. A two-factor authentication design of fingerprint recognition system based on DSP and RF card. IEEE International Conference on Computer and Automation Engineering (ICCAE 2010), Vol. 2, pp. 441-445, 2010. [Li et al., 2005] Fang Li, Maylor K.H. Leung, and Chuan Liu. Fingerprint alignment using ring model. 
IEEE International Conference on Information Technology and Applications (ICITA 2005), Vol. 1, pp. 738-743, 2005. [Li et al., 2007] Linchuan Li, Yao Zhang, Chengdong Ge. Fingerprint identification system based on the Nios II processor. Altera Corporation: Nios II Embedded Processor Design Contest - Outstanding Designs 2007. 304 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Lim et al., 2010] Sung Jin Lim, Seung-Hoon Chae, and Sung Bum Pan. VLSI architecture of the fuzzy fingerprint vault system. Artificial Neural Networks in Pattern Recognition (ANNPR 2010), Lecture Notes in Computer Science, Vol. 5998, pp. 252-258, 2010. [Lindoso and Entrena, 2007] Almudena Lindoso, and Luis Entrena. High performance FPGA-based image correlation. Journal of Real-Time Image Processing, Vol. 2, No. 4, pp. 223-233, 2007. [Lindoso et al., 2005] Almudena Lindoso, Luis Entrena, Celia López-Ongil, and Judith Liu. Correlation-based fingerprint matching using FPGAs. IEEE International Conference on Field-Programmable Technology (FPT 2005), pp. 87-94, 2005. [Lindoso et al., 2007a] Almudena Lindoso, Luis Entrena, Judith Liu-Jimenez, and Enrique San Millan. Correlation-based fingerprint matching with orientation field alignment. Advances in Biometrics, Internacional Conference on Biometrics (ICB 2007), Lecture Notes in Computer Science, Vol. 4642, pp. 713-721, 2007. [Lindoso et al., 2007b] Almudena Lindoso, Luis Entrena, and Juan Izquierdo. FPGA-based acceleration of fingerprint minutiae matching. IEEE Southern Conference on Programmable Logic (SPL 2007), pp. 81-86, 2007. [Lindoso et al., 2007c] Almudena Lindoso, Luis Entrena, Judith Liu-Jimenez, and Enrique San Millan. Increasing security with correlationbased fingerprint matching. IEEE International Carnahan Conference on Security Technology (CCST 2007), pp. 37-43, 2007. 
[Lindoso et al., 2008] Almudena Lindoso, Luis Entrena, Juan Izquierdo, and Judith Liu-Jimenez. Coarse-grain dynamically reconfigurable coprocessor for image processing in SoPC. IEEE International Conference on Field Programmable Logic and Applications (FPL 2008), pp. 539-542, 2008. [Liu et al., 2000] Jinxiang Liu, Zhongyang Huang, and Kap Luk Chan. Direct minutiae extraction from gray-level fingerprint image by relationship examination. International Conference on Image Processing (ICIP 2000), Vol. 2, pp. 427-430, 2000. [Liu et al., 2005] Manhua Liu, Xudong Jiang, and Alex Chichung Kot. Fingeprint reference-point detection. EURASIP Journal on Applied Signal Processing, Vol. 2005, No. 4, pp. 498-509, 2005. [Liu et al., 2010a] Jun-bao Liu, Shuai Wang, Yi Li, Jun Han, Xiao-yang Zeng. Configurable pipelined Gabor filter implementation for fingerprint image enhancement. IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT 2010), pp. 584-586, 2010. [Liu et al., 2010b] Chunjiang Liu, and Fang Lv. Design of highly reliable fingerprint access control system based on C8051F020 single chip. IEEE International Conference on Computer and Automation Engineering (ICCAE 2010), pp. 271-274, 2010. [Liu et al., 2010c] Wenzhou Liu, ChangQing Cai, Zhuo Zhang, and Li Zhang. Study and design of deposit box based on fingerprint recognition. IEEE International Conference on Computational Intelligence and Software Engineering (CiSE 2010), pp. 1-4, 2010. [Liu-Jimenez et al., 2005] Judith Liu-Jimenez, Raul Sanchez-Reillo, and Carmen Sanchez-Avila. Full hardware solution for processing iris biometrics. IEEE International Carnahan Conference on Security Technology (CCST 2005), pp. 157-163, 2005. [Liu-Jimenez et al., 2006] Judith Liu-Jimenez, Raul Sanchez-Reillo, Almudena Lindoso, and Oscar Miguel-Hurtado. FPGA implementation for an iris biometric processor. IEEE International Conference on Field-Programmable Technology (FPT 2006), pp. 265-268, 2006. 
[Liu-Jimenez et al., 2007] Judith Liu-Jimenez, Oscar Miguel-Hurtado, Almudena Lindoso, and Belen Fernandez-Saavedra. Hardware/software codesign for an iris biometric search engine. EUROCON, The International Conference on "Computer as a Tool", pp. 642-648, 2007. 305 UNIVERSITAT ROVIRA I VIRGILI HARDWARE ACCELERATORS FOR EMBEDDED FINGERPRINT-BASED PERSONAL RECOGNITION SYSTEMS Mariano Fons Lluís DL: T. 876-2012 [Liu-Jimenez et al., 2011] Judith Liu-Jimenez, Raul Sanchez-Reillo, and Belen Fernandez-Saavedra. Iris biometrics for embedded systems. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 19, No. 2, pp. 274-282, 2011. [López and Cantó, 2006] Mariano López García, and Enrique F. Cantó Navarro. FPGA implementation of a ridge extraction fingerprint algorithm based on Microblaze and hardware coprocessor. IEEE International Conference on Field Programmable Logic and Applications (FPL 2006), pp. 1-5, 2006. [López and Cantó, 2008] Mariano López, and Enrique Cantó. FPGA implementation of a minutiae extraction fingerprint algorithm. IEEE International Symposium on Industrial Electronics (ISIE 2008), pp. 1920-1925, 2008. [López et al., 2005] Víctor López Lorenzo, Pablo Huerta Pellitero, José Ignacio Martínez Torre, and Javier Castillo Villar. Fingerprint minutiae extraction based on FPGA and Matlab. XX Conference on Design of Circuits and Integrated Systems (DCIS 2005), pp. 1-6, 2005. [López et al., 2011] M. López, J. Daugman, and E. Cantó. Hardware-software co-design of an iris recognition algorithm. IET Information Security, Vol. 5, No. 1, pp. 60-68, 2011. [López-Ongil et al., 2004] Celia López-Ongil, Raul Sanchez-Reillo, Judith Liu-Jimenez, Fernando Casado, Leslie Sánchez, and Luis Entrena. FPGA implementation of biometric authentication system based on hand geometry. International Conference on Field Programmable Logic and Applications (FPL 2004), Lecture Notes in Computer Science, Vol. 3203, pp. 43-53, 2004. [Lorrentz et al., 2009] P. 
Lorrentz, W. G. J. Howells, and K. D. McDonald-Maier. A fingerprint identification system using adaptive FPGA-based enhanced probabilistic convergent network. NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2009), pp. 204-211, 2009. [Luo et al., 2000] Xiping Luo, Jie Tian, and Yan Wu. A minutia matching algorithm in fingerprint verification. IEEE International Conference on Pattern Recognition (ICPR 2000), Vol. 3, pp. 833-836, 2000. [Ma et al., 2005] Rui Ma, Yaxuan Qi, Changshui Zhang, and Jiaxin Wang. A novel approach to fingerprint ridge line extraction. IEEE International Symposium on Communications and Information Technology (ISCIT 2005), pp. 2-5, 2005. [Mainguet et al., 2000] Jean-François Mainguet, Marc Pégulu, and John B. Harris. Fingerprint recognition based on silicon chips. Future Generation Computer Systems, Vol. 16, No. 4, pp. 403-415, 2000. [Maio and Maltoni, 1997] Dario Maio, and Davide Maltoni. Direct gray-scale minutiae detection in fingerprints. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 1, pp. 27-40, 1997. [Maio and Maltoni, 1998] Dario Maio, and Davide Maltoni. Ridge-line density estimation in digital images. IEEE International Conference on Pattern Recognition (ICPR 1998), Vol. 1, pp. 534-538, 1998. [Malki et al., 2006] Suleyman Malki, Yu Fuqiang, and Lambert Spaanenburg. Vein feature extraction using DT-CNNs. IEEE International Workshop on Cellular Neural Networks and Their Applications (CNNA 2006), pp. 1-6, 2006. [Maltoni et al., 2003] Davide Maltoni, Dario Maio, Anil K. Jain, and Salil Prabhakar. Handbook of fingerprint recognition, Springer, New York, ISBN: 0-387-95431-7, 2003. [Maltoni et al., 2009] Davide Maltoni, Dario Maio, Anil K. Jain, and Salil Prabhakar. Handbook of fingerprint recognition – Second Edition, Springer, London, ISBN: 978-1-84882-253-5, 2009. [Marana and Jain, 2005] Aparecido Nilceu Marana, and Anil K. Jain. Ridge-based fingerprint matching using Hough transform. 
Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2005), pp. 112-119, 2005. [Martell and Abe, 2009] Mario Alberto Chapa Martell, and Koki Abe. Fingerprint image enhancement algorithm implemented on an FPGA. The Proceedings of UEC International Mini-Conference for Exchange Students XXII (JUSST Program), pp. 1-6, 2009. [Marupudi et al., 2006] Naveena Marupudi, Eugene John, and Fred Hudson. Fingerprint verification in multimodal biometrics. IEEE Region 5 Conference, pp. 130-136, 2006. [Matai et al., 2011] Janarbek Matai, Ali Irturk, and Ryan Kastner. Design and implementation of an FPGA-based real-time face recognition system. IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2011), pp. 97-100, 2011. [Mateos and Pizarro, 2005] Marino Tapiador Mateos, and Juan A. Sigüenza Pizarro. Tecnologías biométricas aplicadas a la seguridad. Edición Ra-ma, ISBN: 84-7897-636-1, 2005. [Meenen and Adhami, 2005] Peter Meenen, and Reza Adhami. Approaches to image binarization in current automated fingerprint identification systems. IEEE Southeastern Symposium on System Theory (SSST 2005), pp. 276-281, 2005. [Milici et al., 2005] G. Milici, G. Raia, S. Vitabile, and F. Sorbello. Fingerprint image enhancement using directional morphological filter. IEEE International Conference on Computer as a Tool (EUROCON 2005), pp. 967-970, 2005. [Militello et al., 2008] C. Militello, V. Conti, F. Sorbello, and S. Vitabile. A novel embedded fingerprints authentication system based on singularity points. IEEE International Conference on Complex, Intelligent and Software Intensive Systems (CISIS 2008), pp. 72-78, 2008. [Militello et al., 2009] C. Militello, V. Conti, F. Sorbello, and S. Vitabile. An embedded module for iris micro-characteristics extraction. 
IEEE International Conference on Complex, Intelligent and Software Intensive Systems (CISIS 2009), pp. 223-230, 2009. [Militello et al., 2011] C. Militello, V. Conti, S. Vitabile, and F. Sorbello. Embedded access points for trusted data and resources access in HPC systems. The Journal of Supercomputing, Vol. 55, No. 1, pp. 4-27, 2011. [Mital and Teoh, 1996] Dinesh P. Mital, and Eam Khwang Teoh. An automated matching technique for fingerprint identification. IEEE International Conference on Industrial Electronics, Control, and Instrumentation (IECON 1996), Vol. 2, pp. 806-811, 1996. [Mohd-Yasin et al., 2004] F. Mohd-Yasin, A. L. Tan, and M. L. Reaz. The FPGA prototyping of iris recognition for biometric identification employing neural network. IEEE International Conference on Microelectronics (ICM 2004), pp. 458-461, 2004. [Moon et al., 2003] Daesung Moon, Youn Hee Gil, Sung Bum Pan, and Yongwha Chung. Implementation of the USB token system for fingerprint verification. Scandinavian Conference on Image Analysis (SCIA 2003), Lecture Notes in Computer Science, Vol. 2749, pp. 998-1005, 2003. [Moon et al., 2004] Daesung Moon, Youn Hee Gil, Dosung Ahn, Sung Bum Pan, Yongwha Chung, and Chee Hang Park. Fingerprint-based authentication for USB token systems. International Workshop on Information Security Applications (WISA 2003), Lecture Notes in Computer Science, Vol. 2908, pp. 355-364, 2004. [Moon et al., 2005] Y. S. Moon, K. F. Fong, and K. C. Chan. Secure and fast fingerprint authentication on smart card. IEEE International Conference on Sciences of Electronic, Technologies of Information and Telecommunications (SETIT 2005), pp. 1-5, 2005. [Moon et al., 2009] Daesung Moon, Yongwha Chung, Sung Bum Pan, and Jin-Won Park. Integrating fingerprint verification into the smart card-based healthcare information system. EURASIP Journal on Advances in Signal Processing, Vol. 2009, No. 845893, pp. 1-12, 2009. 
[Morimura et al., 2002] H. Morimura, S. Shigematsu, T. Shimamura, K. Fujii, C. Yamaguchi, H. Suto, Y. Okazaki, K. Machida, and H. Kyuragi. An advanced fingerprint sensor LSI and its application to a fingerprint identification system. IEEE Symposium on VLSI Circuits (VLSIC 2002), pp. 272-275, 2002. [Moskovitch et al., 2009] Robert Moskovitch, Clint Feher, Arik Messerman, Niklas Kirschnick, Tarik Mustafic, Ahmet Camtepe, Bernhard Lohlein, Ulrich Heister, Sebastian Moller, Lior Rokach, and Yuval Elovici. Identity theft, computers and behavioral biometrics. IEEE International Conference on Intelligence and Security Informatics (ISI 2009), pp. 155-160, 2009. [Mueller and Sanchez-Reillo, 2009] Robert Mueller, and Raul Sanchez-Reillo. An approach to biometric identity management using low-cost equipment. IEEE International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), pp. 1096-1100, 2009. [Munir and Javed, 2005] Muhammad Umer Munir, and Muhammad Younus Javed. Fingerprint matching using ridge patterns. IEEE International Conference on Information and Communication Technologies (ICICT 2005), pp. 116-120, 2005. [Nakajima et al., 2006] Hiroshi Nakajima, Koji Kobayashi, Makoto Morikawa, Atsushi Katsumata, Koichi Ito, Takafumi Aoki, and Tatsuo Higuchi. Fast and robust fingerprint identification algorithm and its application to residential access controller. Advances in Biometrics, International Conference on Biometrics (ICB 2006), Lecture Notes in Computer Science, Vol. 3832, pp. 326-333, 2006. [Nandakumar and Jain, 2004] Karthik Nandakumar, and Anil K. Jain. Local correlation-based fingerprint matching. Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2004), pp. 503-508, 2004. [Nanni and Lumini, 2007] Loris Nanni, and Alessandra Lumini. 
A hybrid wavelet-based fingerprint matcher. Pattern Recognition, Vol. 40, No. 11, pp. 3146-3151, 2007. [Nanni and Lumini, 2008] Loris Nanni, and Alessandra Lumini. Local binary patterns for a hybrid fingerprint matcher. Pattern Recognition, Vol. 41, No. 11, pp. 3461-3466, 2008. [Neji et al., 2011] Nihel Neji, Anis Boudabous, Wajdi Kharrat, and Nouri Masmoudi. Architecture and FPGA implementation of the CORDIC algorithm for fingerprints recognition systems. IEEE International Multi-Conference on Systems, Signals & Devices (SSD 2011), pp. 1-5, 2011. [Nie et al., 2005] Dongdong Nie, Lizhuang Ma, XueZhong Xiao, and Shuangjiu Xiao. Optimization based fingerprint direction field estimation. IEEE International Conference of the Engineering in Medicine and Biology Society (EMBS 2005), pp. 6265-6268, 2005. [Nilsson and Bigun, 2002] Kenneth Nilsson, and Josef Bigun. Prominent symmetry points as landmarks in fingerprint images for alignment. IEEE International Conference on Pattern Recognition (ICPR 2002), Vol. 3, pp. 395-398, 2002. [Ning, 2010] Chang Ning. The implementation of fingerprint identification preprocessing algorithm on DSP. IEEE International Conference on Intelligent Computation Technology and Automation (ICICTA 2010), pp. 836-839, 2010. [Onnia and Tico, 2002] Vesa Onnia, and Marius Tico. Adaptive binarization method for fingerprint images. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2002), Vol. 4, pp. 3692-3695, 2002. [Pan et al., 2003] Sung Bum Pan, Daesung Moon, Younhee Gil, Dosung Ahn, and Yongwha Chung. An ultra-low memory fingerprint matching algorithm and its implementation on a 32-bit smart card. IEEE Transactions on Consumer Electronics, Vol. 49, No. 2, pp. 453-459, 2003. [Pan et al., 2006] Sung Bum Pan, Daesung Moon, Kichul Kim, and Yongwha Chung. 
A VLSI implementation of minutiae extraction for secure fingerprint authentication. IEEE International Conference on Computational Intelligence and Security (ICCIAS 2006), pp. 1217-1220, 2006. [Pan et al., 2007] Sung Bum Pan, Daesung Moon, Kichul Kim, and Yongwha Chung. A VLSI implementation of minutiae extraction for secure fingerprint authentication. Computational Intelligence and Security (CIS 2006), Lecture Notes in Computer Science, Vol. 4456, pp. 605-615, 2007. [Pan et al., 2008] Sung Bum Pan, Daesung Moon, Kichul Kim, and Yongwha Chung. A fingerprint matching hardware for smart cards. The Institute of Electronics, Information and Communication Engineers, IEICE Electronics Express, Vol. 5, No. 4, pp. 136-144, 2008. [Park et al., 2006] Chul-Hyun Park, Joon-Jae Lee, Mark J. T. Smith, and Kil-Houm Park. Singular point detection by shape analysis of directional fields in fingerprints. Pattern Recognition, Vol. 39, No. 5, pp. 839-855, 2006. [Park et al., 2007] Byungkwan Park, Daesung Moon, Yongwha Chung, and Jin-Won Park. Impact of embedding scenarios on the smart card-based fingerprint verification. International Workshop on Information Security Applications (WISA 2006), Lecture Notes in Computer Science, Vol. 4298, pp. 110-120, 2007. [Park et al., 2008] Unsang Park, Sharath Pankanti, and A. K. Jain. Fingerprint verification using SIFT features. Proceedings of SPIE 6944, 69440K-69440K-9, 2008. [Paul and Lourde, 2006] Anto Melvin Paul, and R. Mary Lourde. A study on image enhancement techniques for fingerprint identification. IEEE International Conference on Video and Signal Based Surveillance (AVSS 2006), pp. 16-21, 2006. [Peng et al., 2008] Jian Peng, Min Wu, and Yadong Liu. Design and implementation of an embedded fingerprint identification system for the bank staff identity authentication. IEEE International Conference on Embedded Software and Systems Symposia (ICESS 2008), pp. 69-72, 2008. 
[Ping et al., 2010] Wu Ping, Wu Guichu, Xie Wenbin, Lu Jianguo, and Li Peng. Remote monitoring intelligent system based on fingerprint door lock. IEEE International Conference on Intelligent Computation Technology and Automation (ICICTA 2010), Vol. 2, pp. 1012-1014, 2010. [Porwik and Wieclaw, 2004] Piotr Porwik, and Lukasz Wieclaw. A new approach to reference point location in fingerprint recognition. IEICE Electronic Express, Vol. 1, No. 18, pp. 575-581, 2004. [Qi et al., 2005] Jin Qi, Suzhen Yang, and Yangsheng Wang. Fingerprint matching combining the global orientation field with minutia. Pattern Recognition Letters, Vol. 26, No. 15, pp. 2424-2430, 2005. [Rakvic et al., 2009] Ryan N. Rakvic, Bradley J. Ulis, Randy P. Broussard, Robert W. Ives, and Neil Steiner. Parallelizing iris recognition. IEEE Transactions on Information Forensics and Security, Vol. 4, No. 4, pp. 812-823, 2009. [Ramos-Lara et al., 2009] Rafael Ramos-Lara, Mariano López-García, Enrique Cantó-Navarro, and Luís Puente-Rodriguez. SVM speaker verification system based on a low-cost FPGA. IEEE International Conference on Field Programmable Logic and Applications (FPL 2009), pp. 582-586, 2009. [Ranganathan and Venugopal, 1994] N. Ranganathan, and Satish Venugopal. An efficient VLSI architecture for template matching based on moment preserving pattern matching. IEEE International Conference on Pattern Recognition (ICPR 1994), Vol. 3, pp. 388-390, 1994. [Rao and Schunck, 1989] A. Ravishankar Rao, and Brian G. Schunck. Computing oriented texture fields. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 1989), pp. 61-68, 1989. [Ratha et al., 1995a] Nalini K. Ratha, Anil K. Jain, and Diane T. Rover. An FPGA-based point pattern matching processor with application to fingerprint matching. 
Computer Architectures for Machine Perception (CAMP 1995), pp. 394-401, 1995. [Ratha et al., 1995b] Nalini K. Ratha, Shaoyun Chen, and Anil K. Jain. Adaptive flow orientation based feature extraction in fingerprint images. Pattern Recognition, Vol. 28, No. 11, pp. 1657-1672, 1995. [Ratha et al., 1996] Nalini K. Ratha, Kalle Karu, Shaoyun Chen, and Anil K. Jain. A real-time matching system for large fingerprint databases. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 8, pp. 799-813, 1996. [Razak and Taharim, 2009] A. H. A. Razak, and R. H. Taharim. Implementing Gabor filter for fingerprint recognition using Verilog HDL. IEEE International Colloquium on Signal Processing & Its Applications (CSPA 2009), pp. 423-427, 2009. [Ribalda et al., 2010] Ricardo Ribalda, Guillermo González de Rivera, Ángel de Castro, and Javier Garrido. A mobile biometric system-on-token system for signing digital transactions. IEEE Security and Privacy, Vol. 8, No. 2, pp. 13-19, 2010. [Rikin et al., 2002] Andy Surya Rikin, Wang Yiwen, Taro Nakada, Li Dongju, Tsuyoshi Isshiki, and Hiroaki Kunieda. Realization of fingerprint identification module on DSP board. Asia-Pacific Conference on Circuits and Systems (APCCAS 2002), Vol. 1, pp. 509-512, 2002. [Rodríguez et al., 2007] David Rodríguez, Juan M. Sánchez, and Arturo Duran. Mobile fingerprint identification using a hardware accelerated biometric service provider. Reconfigurable Computing: Architectures and Applications (ARC 2006), Lecture Notes in Computer Science, Vol. 3985, pp. 383-388, 2006. [Ross et al., 2002] Arun Ross, James Reisman, and Anil Jain. Fingerprint matching using feature space correlation. Post-ECCV Workshop on Biometric Authentication (ECCV 2002), Lecture Notes in Computer Science, Vol. 2359, pp. 48-57, 2002. [Ross et al., 2003] Arun Ross, Anil Jain, and James Reisman. A hybrid fingerprint matcher. Pattern Recognition, Vol. 36, No. 7, pp. 1661-1673, 2003. [Rosshidi and Hadi, 2009] H. T. 
Rosshidi, and A. R. Hadi. Reconfigurable Gabor filter for fingerprint recognition using FPGA Verilog. International Conference on Nanoscience and Nanotechnology-2008, American Institute of Physics Conference Proceedings, Vol. 1136, pp. 796-800, 2009. [Ruili and Jing, 2008] Jiao Ruili, and Fan Jing. VC5509A based fingerprint identification preprocessing system. IEEE International Conference on Signal Processing (ICSP 2008), pp. 2859-2863, 2008. [Sagar et al., 1995] V. K. Sagar, C. Greening, W. Y. Tan, and C. S. A. Leung. Hardware/software co-design of a fingerprint recognition system. IEEE Colloquium on Partitioning in Hardware-Software Codesigns, pp. 10/1-10/5, 1995. [Schaumont et al., 2005] Patrick Schaumont, David Hwang, and Ingrid Verbauwhede. Platform-based design for an embedded fingerprint authentication device. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, No. 12, pp. 1929-1936, 2005. [Shafer et al., 2010] Jennifer L. Shafer, Hau Ngo, and Robert W. Ives. Using an FPGA to accelerate pupil isolation in iris recognition. Asilomar Conference on Signals, Systems and Computers (ASILOMAR 2010), pp. 1774-1777, 2010. [Shen et al., 2001] LinLin Shen, Alex Kot, and WaiMun Koo. Quality measures of fingerprint images. Audio- and Video-Based Biometric Person Authentication (AVBPA 2001), Lecture Notes in Computer Science, Vol. 2091, pp. 266-271, 2001. [Sherlock et al., 1994] B. G. Sherlock, D. M. Monro, and K. Millard. Fingerprint enhancement by directional Fourier filtering. IEE Proceedings on Vision, Image and Signal Processing (IP-VIS 1994), Vol. 141, No. 2, pp. 87-94, 1994. [Shi and Xie, 2009] Huaibin Shi, and Mei Xie. Realization of fingerprint identification on DSP. 
International Symposium on Advances in Computation and Intelligence (ISICA 2009), Lecture Notes in Computer Science, Vol. 5821, pp. 525-532, 2009. [Shi et al., 2004] Zhongchao Shi, Jin Qi, Xuying Zhao, and Yangsheng Wang. A study of minutiae matching algorithm based on orientation validation. Advances in Biometric Person Authentication, Sinobiometrics 2004, Lecture Notes in Computer Science, Vol. 3338, pp. 481-489, 2004. [Shi et al., 2006] Peng Shi, Jie Tian, Weihua Xie, and Xin Yang. Fast fingerprint matching based on the novel structure combining the singular point with its neighborhood minutiae. Progress in Pattern Recognition, Image Analysis and Applications (CIARP 2006), Lecture Notes in Computer Science, Vol. 4225, pp. 804-813, 2006. [Shigematsu and Morimura, 1999] Satoshi Shigematsu, and Hiroki Morimura. A high-speed pixel-parallel fingerprint identifier for fingerprint identification system on a single chip. IEEE International ASIC/SOC Conference (ASIC 1999), pp. 310-314, 1999. [Shigematsu et al., 1999] Satoshi Shigematsu, Hiroki Morimura, Yasuyuki Tanabe, Takuya Adachi, and Katsuyuki Machida. A single-chip fingerprint sensor and identifier. IEEE Journal of Solid-State Circuits, Vol. 34, No. 12, pp. 1852-1858, 1999. [Su et al., 2005a] Qi Su, Jie Tian, Xinjian Chen, and Xin Yang. A fingerprint authentication mobile phone based on sweep sensor. Pattern Recognition and Image Analysis, International Conference on Advances in Pattern Recognition (ICAPR 2005), Lecture Notes in Computer Science, Vol. 3687, pp. 295-301, 2005. [Su et al., 2005b] Qi Su, Jie Tian, Xinjian Chen, and Xin Yang. A fingerprint authentication system based on mobile phone. Audio- and Video-Based Biometric Person Authentication (AVBPA 2005), Lecture Notes in Computer Science, Vol. 3546, pp. 151-159, 2005. [Su et al., 2010] Fei Su, Peng Sun, Long Wang, and Xiaohui Xie. An efficient minutiae-based fingerprint matching algorithm for resource constrained implementation. 
IEEE International Conference on Network Infrastructure and Digital Content (ICNIDC 2010), pp. 214-218, 2010. [Sudiro et al., 2008] Sunny Arief Sudiro, Michel Paindavoine, and Tubagus Maulana Kusuma. Improvement of fingerprint sensor reading using FPGA devices. IEEE International Conference on Computer and Electrical Engineering (ICCEE 2008), pp. 829-833, 2008. [Survenika et al., 2009] S. S. Survenika, V. Karthick, K. Sreenath, and D. Selva. Fingerprint identification system with BLACKFIN processor and ATMEL’s FingerChip sensor. MASAUM Journal of Basic and Applied Sciences, Vol. 1, No. 1, pp. 31-34, 2009. [Suto et al., 2004] Hiroki Suto, Satoshi Shigematsu, Takahiro Hatano, Chikara Yamaguchi, Yukio Okazaki, and Katsuyuki Machida. Compact fingerprint verification device: FingerToken. NTT Technical Review Journal, Vol. 2, No. 2, pp. 65-69, 2004. [Tang et al., 2004] T. Y. Tang, Y. S. Moon, and K. C. Chan. Efficient implementation of fingerprint verification for mobile embedded systems using fixed-point arithmetic. ACM Symposium on Applied Computing (SAC 2004), pp. 821-825, 2004. [Thomas et al., 2010] Jithin P. Thomas, K. R. S. N. Kumar, Vamsidhar Addanki, Anu Gupta, and Nitin Chaturvedi. Hardware implementation of a biometric fingerprint identification system with embedded Matlab. IEEE International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2010), pp. 155-157, 2010. [Tico and Kuosmanen, 2003] Marius Tico, and Pauli Kuosmanen. Fingerprint matching using an orientation-based minutia descriptor. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, No. 8, pp. 1009-1014, 2003. [Tiri et al., 2005] Kris Tiri, David D. Hwang, Alireza Hodjat, Bo-Cheng Lai, Shenglin Yang, Patrick Schaumont, and Ingrid Verbauwhede. AES-based cryptographic and biometric security coprocessor IC in 0.18-µm CMOS resistant to side-channel power analysis attacks. IEEE Symposium on VLSI Circuits (VLSIC 2005), pp. 216-219, 2005. 
[Tong, 2010] Zeng Tong. Design and implement of fingerprint identification system based on embedded microprocessor. Electronic Design Engineering, Vol. 18, No. 4, pp. 119-122, 2010. [Tselios et al., 2008] K. Tselios, E. N. Zois, N. A. Livanos, and A. Nassiopoulos. A real time fingerprint identification system based on AFS8600 sensor and the C6713 DSP processor. Physica Status Solidi (c), Current Topics in Solid State Physics, Vol. 5, No. 12, pp. 3846-3849, 2008. [Tulabandhula et al., 2009] Theja Tulabandhula, Samuel Antão, and Leonel Sousa. A class of software-hardware processors for fingerprint matching on the Fourier domain. HiPEAC (European Network of Excellence on High Performance and Embedded Architecture and Compilation) Workshop on Reconfigurable Computing, pp. 1-10, 2009. [Tumeo et al., 2010] Antonio Tumeo, Francesco Regazzoni, Gianluca Palermo, Fabrizio Ferrandi, and Donatella Sciuto. A reconfigurable multiprocessor architecture for a reliable face recognition implementation. Design, Automation & Test in Europe Conference & Exhibition (DATE 2010), pp. 319-322, 2010. [Turroni et al., 2011] F. Turroni, D. Maltoni, R. Cappelli, and D. Maio. Improving fingerprint orientation extraction. IEEE Transactions on Information Forensics and Security, Vol. 6, No. 3, pp. 1002-1013, 2011. [Udupa et al., 2001] Raghavendra Udupa U., Gaurav Garg, and Pramod Sharma. Fast and accurate fingerprint verification. Audio- and Video-Based Biometric Person Authentication (AVBPA 2001), Lecture Notes in Computer Science, Vol. 2091, pp. 192-197, 2001. [Venkatesan and Rao, 2001] Muthukumar Venkatesan, and Daggu Venkateshwar Rao. Hardware acceleration of edge detection algorithm on FPGAs. Celoxica Inc. research papers, 2001. [Vitabile et al., 2005] Salvatore Vitabile, Vincenzo Conti, Giuseppe Lentini, and Filippo Sorbello. 
An intelligent sensor for fingerprint recognition. International Conference on Embedded and Ubiquitous Computing (EUC 2005), Lecture Notes in Computer Science, Vol. 3824, pp. 27-36, 2005. [Vitabile et al., 2007] S. Vitabile, V. Conti, C. Militello, and F. Sorbello. A self-contained biometric sensor for ubiquitous authentication. IEEE International Conference on Intelligent Pervasive Computing (IPC 2007), pp. 289-294, 2007. [Wakahara et al., 2007] Toru Wakahara, Yoshimasa Kimura, Akira Suzuki, Akio Shio, and Mutsuo Sano. Fingerprint verification using ridge direction distribution and minutiae correspondence. Systems and Computers in Japan, Vol. 38, No. 3, pp. 72-82, 2007. [Wan and Zhang, 2011] JiHua Wan, and Li Zhang. Design of embedded fingerprint identification system based on TMS320C5515. IEEE International Conference on Computer Science and Service System (CSSS 2011), pp. 3160-3163, 2011. [Wang and Gao, 2011] Fuqiang Wang, and Guohong Gao. Embedded fingerprint identification system based on DSP chip. Advances in Intelligent and Soft Computing, Advances in Computer Science, Intelligent System and Environment, Vol. 104, pp. 595-599, 2011. [Wang et al., 2004] Yiwen Wang, Dongju Li, Tsuyoshi Isshiki, and Hiroaki Kunieda. A novel SOC architecture embedded bit serial FPGA. IEEE Asia-Pacific Conference on Circuits and Systems (APCCAS 2004), Vol. 1, pp. 133-136, 2004. [Wang et al., 2005] Yiwen Wang, Dongju Li, Tsuyoshi Isshiki, and Hiroaki Kunieda. A novel fingerprint SOC with bit serial FPGA engine. Transactions of Information Processing Society of Japan, Vol. 46, No. 6, pp. 1366-1373, 2005. [Wang et al., 2007a] Xuchu Wang, Jianwei Li, and Yanmin Niu. Fingerprint matching using OrientationCodes and PolyLines. Pattern Recognition, Vol. 40, No. 11, pp. 3164-3177, 2007. [Wang et al., 2007b] Ji Wang, Liang Wu, and Yong Liu. 
Nios II processor-based fingerprint identification system. Altera Corporation: Nios II Embedded Processor Design Contest - Outstanding Designs 2007. [Wang et al., 2008] Chongwen Wang, Gangyi Ding, and Zhiwei Zheng. Fingerprint matching combining the adjacent feature with curvature of ridges. IEEE World Congress on Intelligent Control and Automation (WCICA 2008), pp. 6811-6816, 2008. [Wang et al., 2009] Jingyan Wang, Yongping Li, Ping Liang, Guohui Zhang, and Xinyu Ao. An effective multi-biometrics solution for embedded device. IEEE International Conference on Systems, Man, and Cybernetics (SMC 2009), pp. 917-922, 2009. [Wang et al., 2010] Yan Wang, Hongli Liu, and Jun Feng. The design of an intelligent security access control system based on fingerprint sensor FPC1011C. Circuits and Systems, Vol. 1, pp. 30-33, 2010. [Wang et al., 2011a] Yanpeng Wang, Qing Li, and Li Zhang. Design of embedded fingerprint identification system based on DSP. IEEE International Conference on Anti-Counterfeiting, Security and Identification (ASID 2011), pp. 47-50, 2011. [Wang et al., 2011b] Jingyan Wang, Yongping Li, Ying Zhang, and Yuefeng Huang. Implementing multimodal biometric solutions in embedded systems. Biometrics – Unique and Diverse Applications in Nature, Science, and Technology, ISBN 978-953-307-187-9, Intech, 2011. [Wei et al., 2004] Liu Wei, Yan Pu-liu, Xia De-ling, and Zhou Cong. An approach to dynamic fingerprint image enhancement. IEEE International Conference on Image and Graphics (ICIG 2004), pp. 294-297, 2004. [Wen et al., 2005] Miao-li Wen, Yan Liang, Quan Pan, and Hong-cai Zhang. A Gabor filter based fingerprint enhancement algorithm in wavelet domain. IEEE International Symposium on Communications and Information Technology (ISCIT 2005), pp. 1421-1424, 2005. [Whittington et al., 2009] Jim Whittington, Kapeel Deo, Tristan Kleinschmidt, and Michael Mason. FPGA implementation of spectral subtraction for automotive speech recognition. 
IEEE Workshop on Computational Intelligence in Vehicles and Vehicular Systems (CIVVS 2009), pp. 72-79, 2009. [Wu et al., 2004] Chaohong Wu, Zhixin Shi, and Venu Govindaraju. Fingerprint image enhancement method using directional median filter. Biometric Technology for Human Identification, Proceedings of the SPIE 2004, Vol. 5404, pp. 66-75, 2004. [Xia and O’Gorman, 2003] Xiongwu Xia, and Lawrence O’Gorman. Innovations in fingerprint capture devices. Pattern Recognition, Vol. 36, No. 2, pp. 361-369, 2003. [Xian-chun et al., 2011] Wu Xian-chun, Gao Shi-bin, and Huang Feng. Design of low-power micro fingerprint lock based on STM32F103ZE. IEEE International Conference on Electric Information and Control Engineering (ICEICE 2011), pp. 1299-1301, 2011. [Xie et al., 2005] Xiaohui Xie, Fei Su, and Anni Cai. Ridge-based fingerprint recognition. Advances in Biometrics, International Conference on Biometrics (ICB 2006), Lecture Notes in Computer Science, Vol. 3832, pp. 273-279, 2005. [Xie et al., 2008] Mei Xie, Chengpu Yu, and Jin Qi. A novel fingerprint matching method combining geometric and texture features. Intelligent Information Processing IV, IFIP Advances in Information and Communication Technology, Vol. 288, pp. 155-164, 2008. [Xie et al., 2010] Shan Juan Xie, Sook Yoon, Hui Gong, Jinwook Shin, and Dong Sun Park. Fingerprint reference point determination based on orientation features. IEEE International Conference on Network and System Security (NSS 2010), pp. 216-222, 2010. [Xu et al., 2009] Hui Xu, Yifan Qu, Yan Zhang, and Feng Zhao. FPGA based parallel thinning for binary fingerprint image. Chinese Conference on Pattern Recognition (CCPR 2009), pp. 1-4, 2009. [Yager and Amin, 2004] Neil Yager, and Adnan Amin. Fingerprint verification based on minutiae features: a review. Pattern Analysis and Applications, Vol. 7, No. 
1, pp. 94-113, 2004. [Yager and Amin, 2005] Neil Yager, and Adnan Amin. Coarse fingerprint registration using orientation fields. EURASIP Journal on Applied Signal Processing, Vol. 2005, No. 13, pp. 2043-2053, 2005. [Yager and Amin, 2006] Neil Yager, and Adnan Amin. Fingerprint alignment using a two stage optimization. Pattern Recognition Letters, Vol. 27, No. 5, pp. 317-324, 2006. [Yang and Verbauwhede, 2003] Shenglin Yang, and Ingrid M. Verbauwhede. A secure fingerprint matching technique. ACM SIGMM Workshop on Biometrics Methods and Applications (WBMA 2003), pp. 89-94, 2003. [Yang and Verbauwhede, 2004] Shenglin Yang, and Ingrid Verbauwhede. A real-time, memory efficient fingerprint verification system. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2004), Vol. 5, pp. 189-192, 2004. [Yang et al., 2003] Shenglin Yang, Kazuo Sakiyama, and Ingrid M. Verbauwhede. A compact and efficient fingerprint verification system for secure embedded devices. 37th Asilomar Conference on Signals, Systems, and Computers (ACSSC 2003), Vol. 2, pp. 2058-2062, 2003. [Yang et al., 2005] Shenglin Yang, Patrick Schaumont, and Ingrid Verbauwhede. Microcoded coprocessor for embedded secure biometric authentication systems. IEEE/ACM/IFIP International Conference on Hardware - Software Codesign and System Synthesis (CODES+ISSS 2005), pp. 130-135, 2005. [Yang et al., 2006] Shenglin Yang, Kazuo Sakiyama, and Ingrid Verbauwhede. Efficient and secure fingerprint verification for embedded devices. EURASIP Journal on Applied Signal Processing, Vol. 2006, No. 58263, pp. 1-11, 2006. [Yau et al., 2004] Wei Yun Yau, Tai Pang Chen, and Peter Morguet. Benchmarking of fingerprint sensors. Biometric Authentication Workshop (BIOAW 2004), Lecture Notes in Computer Science, Vol. 3087, pp. 89-99, 2004. [Yeung et al., 2005] Hoi Wo Yeung, Yiu Sang Moon, Jiansheng Chen, Fai Chan, Yuk Man Ng, Hin Shun Chung, and Kwok Ho Pun. 
A comprehensive and real-time fingerprint verification system for embedded devices. Biometric Technology for Human Identification II. Proceedings of SPIE, Vol. 5779, pp. 438-446, 2005. [Yoo et al., 2007] Jang-Hee Yoo, Jong-Gook Ko, Yun-Su Chung, Sung-Uk Jung, Ki-Hyun Kim, Ki-Young Moon, and Kyoil Chung. Design of embedded multimodal biometric systems. IEEE International Conference on Signal-Image Technologies and Internet-Based System (SITIS 2007), pp. 1058-1062, 2007. [You et al., 2005] Xinge You, Bin Fang, Yuan Yan Tang, and Jian Huang. Multiscale approach for thinning ridges of fingerprint. Pattern Recognition and Image Analysis, Lecture Notes in Computer Science, Vol. 3523, pp. 505-512, 2005. [Youssif et al., 2007] Aliaa A. A. Youssif, Morshed U. Chowdhury, Sid Ray, and Howida Youssry Nafaa. Fingerprint recognition system using hybrid matching techniques. IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), pp. 234-240, 2007. [Zhan et al., 2005] Xiaosi Zhan, Yilong Yin, Zhaocai Sun, and Yun Chen. A method based on continuous spectrum analysis and artificial immune network optimization algorithm for fingerprint image ridge distance estimation. IEEE International Conference on Computer and Information Technology (CIT 2005), pp. 728-733, 2005. [Zhang, 1997] You Yi Zhang. Redundancy of parallel thinning. Pattern Recognition Letters, Vol. 18, No. 1, pp. 27-35, 1997. [Zhang, 2011] Jinhai Zhang. Research on embedded fingerprint identification system. IEEE International Conference on Mechanic Automation and Control Engineering (MACE 2011), pp. 594-597, 2011. [Zhang and Wang, 1996] Y. Y. Zhang, and P. S. P. Wang. A parallel thinning algorithm with two-subiteration that generates one-pixel-wide skeletons. IEEE International Conference on Pattern Recognition (ICPR 1996), Vol. 4, pp. 
457-461, 1996.
[Zhang and Xiao, 2006] Yuheng Zhang, and Qinghan Xiao. An optimized approach for fingerprint binarization. IEEE International Joint Conference on Neural Networks (IJCNN 2006), pp. 391-395, 2006.
[Zhang and Xie, 2008] Lei Zhang, and Mei Xie. Realization of a new-style fingerprint recognition system based on DSP. IEEE International Symposium on IT in Medicine and Education (ITME 2008), pp. 1107-1111, 2008.
[Zhang and Suen, 1984] T. Y. Zhang, and C. Y. Suen. A fast parallel algorithm for thinning digital patterns. Communications of the ACM, Vol. 27, No. 3, pp. 236-239, 1984.
[Zhang et al., 2002] Wei-Peng Zhang, Qing-Ren Wang, and Yuan Y. Tang. A wavelet-based method for fingerprint image enhancement. IEEE International Conference on Machine Learning and Cybernetics (ICMLC 2002), pp. 1973-1977, 2002.
[Zhang et al., 2003] Tanghui Zhang, Jie Tian, Yuliang He, Jiangang Cheng, and Xin Yang. Fingerprint alignment using similarity histogram. Audio- and Video-Based Biometric Person Authentication (AVBPA 2003), Lecture Notes in Computer Science, Vol. 2688, pp. 854-861, 2003.
[Zhang et al., 2007] Yangyang Zhang, Xin Yang, Qi Su, and Jie Tian. Fingerprint recognition based on combined features. Advances in Biometrics, International Conference on Biometrics (ICB 2007), Lecture Notes in Computer Science, Vol. 4642, pp. 281-289, 2007.
[Zhang et al., 2010] Kaisheng Zhang, Jiao She, Mingxing Gao, and Wenbo Ma. Study on the embedded fingerprint image recognition system. IEEE International Conference of Information Science and Management Engineering (ISME 2010), Vol. 2, pp. 169-172, 2010.
[Zhang et al., 2011] Xiujuan Zhang, Chao Chen, Lina Ni, and Jinquan Zhang. Fingerprint minutia extraction algorithm based on DSP. IEEE International Conference on Electronic and Mechanical Engineering and Information Technology (EMEIT 2011), Vol. 9, pp. 4805-4808, 2011.
[Zhao and Jain, 2010] Qijun Zhao, and Anil K. Jain. On the utility of extended fingerprint features: a study on pores.
IEEE International Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2010), pp. 9-16, 2010.
[Zhao et al., 2010] Qijun Zhao, David Zhang, Lei Zhang, and Nan Luo. High resolution partial fingerprint alignment using pore–valley descriptors. Pattern Recognition, Vol. 43, No. 3, pp. 1050-1061, 2010.
[Zheng et al., 2006] Xiaolong Zheng, Yangsheng Wang, and Xuying Zhao. A detection algorithm of singular points in fingerprint images combining curvature and orientation field. Intelligent Computing in Signal Processing and Pattern Recognition, Lecture Notes in Control and Information Sciences, Vol. 345, pp. 593-599, 2006.
[Zhou and Gu, 2004] Jie Zhou, and Jinwei Gu. A model-based method for the computation of fingerprints’ orientation field. IEEE Transactions on Image Processing, Vol. 13, No. 6, pp. 821-835, 2004.
[Zhou and Lu, 2009] K. L. Zhou, and Z. X. Lu. Design of vehicle locks based on DSP and fingerprint identification system. Journal of Shanxi University of Science & Technology, Vol. 27, No. 5, pp. 103-105, 2009.
[Zhu and Xie, 2007] XingJia Zhu, and Mei Xie. Multiple biometric recognition system with the function of real-time display. IEEE International Conference on Communications, Circuits and Systems (ICCCAS 2007), pp. 990-994, 2007.
[Zhu et al., 2005] En Zhu, Jianping Yin, and Guomin Zhang. Fingerprint matching based on global alignment of multiple reference minutiae. Pattern Recognition, Vol. 38, No. 10, pp. 1685-1694, 2005.