Development of face recognition timekeeping system for duyen viet companies using adaboost algorithm

DEVELOPMENT OF FACE RECOGNITION TIMEKEEPING SYSTEM FOR DUYEN VIET COMPANIES USING ADABOOST ALGORITHM

A Special Project Presented to the Faculty of Institute of Graduate Studies and Research

Manuel S. Enverga University Foundation

Lucena City

In Partial Fulfillment

of the Requirements for the Degree

Master in Information Technology

by

DUYEN THI NGUYEN

March 2014

Title : Development of Face Recognition Timekeeping System for Duyen Viet Companies Using AdaBoost Algorithm

Author : Duyen Thi Nguyen

Degree : Master in Information Technology

School : Manuel S. Enverga University Foundation, Lucena City, Philippines

Adviser : Jose B. Tan, Jr.

Date : March 2014

Abstract

This study aims to develop a face recognition timekeeping system using the AdaBoost algorithm. The main objective of the project is to develop a face recognition timekeeping system that companies can use for accurate, automated timekeeping. With it, employees' working hours are recorded correctly, which helps improve work effectiveness. The timekeeping system records each employee's log-in and log-out whenever they use the company's biometric machine. The Rapid Application Development (RAD) methodology was used to develop the system. The project used the Unified Modeling Language (UML) to model the system through use case, activity, sequence, and class diagrams and to design the database schema. The proponent built the system in the C# programming language, with Microsoft SQL Server as the database for storing the system's data.

Questionnaires based on the ISO 9126 standard were used to measure the efficiency of the proposed system as evaluated by the end users, according to the ratings given by the evaluators.

The system offers users a timekeeping system with a friendly interface and, in particular, was built around current conditions in Vietnam. The proponent is confident that this project study has satisfied its objectives and purpose.

Key words: timekeeping system, face recognition, AdaBoost

Acknowledgments

I would like to express heartfelt appreciation to all the people who offered help and support to this study. This study would not have been possible without their assistance and suggestions.

To my beloved parents, whose boundless love inspires me every day of my life, and who always pray and wish all the best things for me even though they are not with me all the time;

To my thesis adviser, Jose B. Tan, Jr., for his patience in reviewing and editing the contents of this study and for his valuable suggestions;

To the oral defense panel, Dr. Benilda N. Villenas, Dean Rodrigo C. Belleza Jr., and Prof. Raymond S. Bermudez, for the suggestions and comments they have given for the further improvement of this study;

To Dr. Benilda N. Villenas, for her suggestions in improving the documentation format and style and for helping me focus my findings, conclusions and recommendations;

and,

To all my friends who were always there to lend a helping hand, my sincere thanks to all of you.

D. T. N.

Dedication

To my beloved parents,

brothers, cousins and friends,

the Information Technology and Communications University,

and my country, Vietnam

D. T. N.

Table of Contents

Chapter

I BACKGROUND OF THE INFORMATION SYSTEM PROJECT

II REVIEW OF RELATED LITERATURE, STUDIES AND SYSTEM

III SYSTEM/SOFTWARE DEVELOPMENT METHODOLOGY, MODELS, TECHNOLOGIES AND TOOLS

IV SYSTEM/SOFTWARE DEVELOPMENT PROCESS

Architectural Design of the Real Estate Management System

V SUMMARY OF FINDINGS, CONCLUSION, AND RECOMMENDATIONS

List of Figures

Figure

2 Paradigm Showing the Need to Develop Timekeeping System

3 Rapid Application Development (RAD) Methodology

8 The Features Expanded of the Features Haar-Like Base

10 Example of the Calculation of the Value the Gray Level of D on Image

12 The System Detect Face Using AdaBoost Algorithm

19 View List of Employee Present Activity Diagram

20 View List of Employee Absent Activity Diagram

26 Sequence Diagram View List Employee Present

36 Respondents’ Evaluation on the Portability of the Proposed System

37 Summary of the Results Regarding the Software Criteria

CHAPTER I

BACKGROUND OF THE INFORMATION SYSTEM PROJECT

Overview of the Current State of Technology

We are living in a new era: the era of flourishing information technology. Information technology has advanced to the point of digitizing all kinds of information, circulating it widely, and connecting us all together. All kinds of information (audio data, photographs, etc.) can be captured digitally so that any computer can store, process and forward them to many people. The tools and connectivity of the digital age allow us to easily collect and share information and to act on it in entirely new ways, which may lead to a series of changes in concepts, practices, traditions and habits, and even in how we look for value in life. Information technology is one of the most important drivers of development.

Nowadays, with the rapid advancement of science and technology in general, image processing has achieved a great deal and has proved indispensable, with broad applications in science and technology as well as in social life. One branch of image processing that is attracting a lot of attention aims to build visual systems that can perceive the world as humans do. The dream of a computer system that can be integrated into the human world of visual senses has motivated the contributions of researchers all over the world.

At the same time, the development of hardware in terms of image acquisition, display and processing speed has opened up many new directions for image processing technology. It has addressed problems such as automated monitoring in banks and treasuries, automatic traffic monitoring, parking services, automated toll booths, and face detection and recognition in military and security applications.

The system helps companies manage and supervise staff and workers efficiently, reducing the cost of hiring guards and surveillance workers. It also ensures accurate timekeeping, which reduces tardiness in offices. The system helps system administrators, as well as companies, extract and query data faster, resulting in more stable and highly effective work processes.

Furthermore, it helps companies manage and monitor workers' performance and accuracy. Thus, this research project entitled “Development of Face Recognition Timekeeping System and Intrusion Detection for Duyen Viet Companies Using AdaBoost Algorithm” is very timely, especially in an emerging and growing country like Vietnam.

Project Context

With the global population explosion and the resurgence of tourism, security has become a stricter requirement in every country in the world. Therefore, the application and development of human identification is increasingly valued. One identification problem that attracts much attention today is face recognition.

Face recognition is an important area with many practical applications, such as security systems, identity verification, and human-machine communication in the field of entertainment. A prerequisite step for recognizing faces is detecting them in images; this is the subject of research on face detection.

The past years have marked major advances in both the speed and the accuracy of face detection methods and techniques. Face detection systems have a huge range of applications, so the detection of human faces is extremely important.

The system was built for a company to manage and supervise personnel, with key functions such as face detection, face recognition, timekeeping and door unlocking.

Figure 1 The block diagram of the system

Class of Face Detection

Detecting faces in images is an interesting problem that has challenged many researchers because of its wide practical applications. Fast detection and high accuracy are the two important factors in the success of this technology. Choosing a detection method that uses the AdaBoost algorithm with Haar features addresses both of these factors.

Classification of Facial Recognition and Timekeeping

When a face is detected in the image, it is compared with the faces stored in the database. If a matching face is found, the employee's information from the database is displayed on a rectangular frame and forwarded to the timekeeper.
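
As a rough illustration of the matching step described above, the following Python sketch compares a probe feature vector with the vectors stored for each employee and logs a time entry when the nearest one is close enough. The feature vectors, the `threshold` value, the employee IDs and the function names are all hypothetical; the actual system is written in C# with Microsoft SQL Server and its own recognition method.

```python
import math
from datetime import datetime

def match_face(probe, database, threshold=0.5):
    """Return the ID of the nearest stored face, or None if none is close enough."""
    best_id, best_dist = None, float("inf")
    for emp_id, stored in database.items():
        dist = math.dist(probe, stored)  # Euclidean distance between feature vectors
        if dist < best_dist:
            best_id, best_dist = emp_id, dist
    return best_id if best_dist <= threshold else None

def record_attendance(probe, database, log, now=None):
    """Append (employee ID, timestamp) to the log when the face is recognized."""
    emp_id = match_face(probe, database)
    if emp_id is not None:
        log.append((emp_id, now or datetime.now()))
    return emp_id

# Hypothetical stored feature vectors for two employees
database = {"E001": [0.1, 0.9, 0.3], "E002": [0.8, 0.2, 0.5]}
log = []
known = record_attendance([0.12, 0.88, 0.31], database, log)  # close to E001
unknown = record_attendance([5.0, 5.0, 5.0], database, log)   # far from everyone
```

In this sketch an unrecognized face simply produces no log entry; a real deployment would decide separately how to handle such cases.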

Classification of Additional Employee

When new employees are added to the system, the system administrator (admin) checks the faces of the employees to be added. The system performs face detection and recognition as needed. If an employee's face is not yet in the database, images of the face are stored as a new database entry.

The Need to Develop the Proposed System

A timekeeping control system should be put in place because it is a necessary task. Companies, schools and businesses can increase their efficiency and effectiveness if employees keep good working hours.

Specific Examples of Business

Consider a typical company with 300 employees in which each person loses an average of only 5 minutes per day (at the start of working hours and between shifts).

The calculation is as follows:

5 minutes x 300 persons x 26 working days = 39,000 minutes of work (1 month). This does not include the side effects of tardiness, which disrupts the work of other employees. Also, with manual timekeeping, companies spend a lot of time and effort gathering data for the monthly payroll. A face recognition timekeeping system improves job performance in a company, leading to savings in personnel costs for businesses.
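
The arithmetic above can be checked with a few lines of Python. This is only a worked example of the figures quoted in the text; the function name is hypothetical.

```python
def lost_minutes_per_month(minutes_per_person_per_day, employees, working_days):
    """Total working minutes lost in one month across the whole company."""
    return minutes_per_person_per_day * employees * working_days

lost = lost_minutes_per_month(5, 300, 26)  # 39000 minutes
lost_hours = lost / 60                     # 650.0 hours
```

Expressed in hours, the same loss is 650 working hours per month.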

Based on the problems of the current manual system described above, the proposed system addresses these inconveniences and brings order to timekeeping. The system was written in C#. The purpose of this study is to build a face recognition timekeeping system for employees based on face images captured from a camera in real time, using the AdaBoost method and Principal Component Analysis (PCA). In this study, the Rapid Application Development (RAD) methodology was used.

Figure 2 Paradigm showing the need to develop the timekeeping system

This project was developed for Vietnamese companies to avoid problems in timekeeping. For managers, this timekeeping method offers a better interface that is easier to use and manage.

Contents of Figure 2:

Problems in the Existing System: tedious work in monitoring attendance; manual method takes time; unreliable timekeeping information; manual report generation

Technology/Methodology: RAD; Microsoft SQL Server

Proposed System: less work in monitoring attendance; faster and automatic; reliable and accurate attendance information; automatic report generation

The system was applied in Vietnam to create favorable working conditions for company managers and to help them manage, in particular, the large number of employees in their companies. For employee management, company managers have the authority to manage their company's timekeeping system.

Once the system has been deployed, the manager provides all the necessary information about each employee, such as the employee's image and ID. All of this is kept up to date to ensure that the information is correct. The system also supports the system administrator in managing employees as part of an effective business strategy.

Information System Development Methodology

The research had to be completed in a short time, so the proponent applied the Rapid Application Development (RAD) methodology.

Martin (1991) first coined the term Rapid Application Development (RAD). It is a development lifecycle designed to give much faster development and higher-quality results than those achieved with the traditional lifecycle. It is designed to take maximum advantage of powerful development software that has evolved recently.

Below is a brief overview of the RAD process, which consists of four lifecycle stages: requirements planning, user design, construction and implementation. Also described are the typical pre-project and post-project activities.

Figure 3 Rapid Application Development (RAD) methodology

Goals and Objectives

The main objective of the project study was to design and develop a face recognition timekeeping and intrusion detection system that facilitates the timekeeping process, using the RAD software development methodology, C#, Microsoft SQL Server, and the AdaBoost method.

The specific objectives are the following:

1. To analyze the timekeeping system in order to determine the possible opportunities as the basis of the functional requirements and design specifications of the proposed system;

2. To design a timekeeping system using the Unified Modeling Language (UML) as an object-oriented notation for system specifications, based on the features and functionalities identified with the users, employing use case, activity, sequence, collaboration, class, and component diagrams, etc.;

3. To develop the proposed Timekeeping System using C# as the programming language, Microsoft SQL Server as the database, and the AdaBoost algorithm;

4. To evaluate the system using the ISO 9126 software quality attributes of functionality, reliability, usability, efficiency, maintainability, and portability, as assessed by potential users selected through sampling.

Scope and Constraints

The system works on the basis of biometric facial recognition control. Moreover, removing traditional security practices, such as posting security guards or maintaining paper records of access details, can minimize the cost of security personnel. The electronic security system captures the facial details of each individual and stores the electronic data in the computer's database.

This project focused on building a face recognition timekeeping system for the actual situation in Vietnam; the system is intended for companies with a large number of employees.

The main interface displays all the functions of the timekeeping system, such as decentralization, categories, adding new employees, and displaying the list of employees' timekeeping records.

For employees, the interface makes timekeeping simple, easy and convenient.

For administrators, the included features are decentralization, categories, and adding new employees. The system also provides management functions such as employee management and displaying the list of employees' timekeeping records.

The proponent used the Rapid Application Development (RAD) approach. RAD (Margaret Rouse, 2007) refers to a type of software development methodology that uses minimal planning in favor of rapid prototyping to achieve timely completion of information system project development. The Unified Modeling Language (UML) was also used to document the system's design specifications, while C# and Microsoft SQL Server were used as the development tool and the database of the proposed system, respectively.

The system focuses on face detection and face recognition directly from the camera. Because face recognition is difficult, the system works under certain constraints: the camera image must have adequate resolution; the face must be captured frontally or at only a slight oblique angle; the picture must be taken under normal conditions, with the background clearly distinguishable from the face; and the face must not be obscured.

Benefits and Impacts

Along with the development of the world economy in general and of Vietnam in particular, the demand for managing and monitoring business activities in the country is increasing, especially the management of human resources. People are the key factor in the success or failure of a company or organization.

The proposed real-time face recognition timekeeping system would be useful to a company for improved management and control. A company that adopts the proposed system will have an advantage in managing employees, with an easy and fast way to manage employee information.

The system was specifically developed to support an actual company, the Duyen Viet Company in Vietnam.

In particular, the following stakeholders are expected to benefit from this system development project once it is completed and implemented:

1. The employee. The system makes employees aware of professional work habits. Keeping the prescribed working hours is the number one mandatory requirement for all employees. Through face recognition timekeeping, every log-in shows whether a worker missed work or was late, which encourages conscientiousness. In addition, the system increases the company's professionalism in the eyes of its business partners and customers.

2. The company. The system manages and supervises staff and workers efficiently, reducing the cost of hiring guards and surveillance workers. It helps system administrators, as well as companies, extract and query data quickly and work effectively.

3. Future researchers. This study serves as a reference for researchers who will conduct related timekeeping system projects in the future. They might be interested in the proposed system, which can also be an avenue for them to enhance it and develop new features that are not available in it. This study explores and provides insights into what future timekeeping system projects will need.

Definition of Terms

The following terms are used by the researcher in the course of this project.

Conceptual Definitions:

C# (“C Sharp”) is a general-purpose object-oriented programming (OOP) language for networking and Web development, developed in 1999 by the team of the Danish software engineer Anders Hejlsberg to complement Microsoft’s .NET framework. It is specified as a Common Language Infrastructure (CLI) language.

(http://www.techopedia.com/definition/26272/c-sharp)

Database is a collection of information that is organized so that it can easily be accessed, managed and updated. It can be classified according to types of content: bibliographic, full-text, numeric and images.

(http://searchsqlserver.techtarget.com/definition/database)

Microsoft SQL Server is a cloud-ready information platform that helps organizations unlock breakthrough insights across the organization and quickly build solutions to extend data across on-premises and public cloud.

(https://www.microsoft.com/en-us/sqlserver/product-info.aspx)

Operational Definitions:

Efficiency refers to the characteristic concerned with the system resources used when providing the required functionality. The amount of disk space, memory, network capacity, etc. used provides a good indication of this characteristic.

Functionality refers to the sum or any aspect of what a product, such as a software application or computing device, can do for a user. Factors include suitability, security, compliance, accuracy and interoperability.

Maintainability is the ability to identify and fix a fault within a software component. In other software quality models this characteristic is referred to as supportability. Anything that helps with identifying the cause of a fault and then fixing it is the concern of maintainability. The ability to verify (or test) a system, i.e. testability, is also one of the sub-characteristics of maintainability (SearchSOA.com).

Portability refers to how well the software can adapt to changes in its environment or in its requirements. The sub-characteristics of portability include adaptability. Object-oriented design and implementation practices can contribute to the extent to which this characteristic is present in a given system.

Rapid Application Development (RAD) is the concept that products can be developed faster and with higher quality through gathering requirements using workshops or focus groups, prototyping and early, reiterative user testing of designs, the re-use of software components, a rigidly paced schedule that defers design improvements to the next product version, and less formality in reviews and other team communication.

Reliability refers to the capability of the system to maintain its service provision under defined conditions for defined periods of time.

Unified Modeling Language (UML) is a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system. It offers a standard way to write a system's blueprints, including conceptual things such as business processes and system functions as well as concrete things such as programming language statements, database schemas, and reusable software components.

Usability refers to the ease of use for a given function.

CHAPTER II

REVIEW OF RELATED LITERATURE, STUDIES AND SYSTEM

This chapter presents the review of related literature and studies, as well as related systems and software, which greatly informed this study. It also discusses the conceptual framework used in the conceptualization of the project.

Models

The Unified Modeling Language (UML) is a family of graphical notations, backed by a single metamodel, that help in describing and designing software systems, particularly software systems built using the object-oriented (OO) style (Fowler, 2003).

UML is only a language and not a way of designing a system. It is a way to model a system and can be broken into two main pieces: structural diagrams and behavioral diagrams (Roff, 2003).

The goals of UML are to provide users with a ready-to-use, expressive visual modeling language so they can develop and exchange meaningful models; provide extensibility and specialization mechanisms to extend the core concepts; be independent of particular programming languages and development processes; provide a formal basis for understanding the modeling language; encourage the growth of the OO tools market; and support higher-level development concepts such as collaborations, frameworks, patterns and components. UML made it easy for the proponent to understand what to convey in the proposed system.

An activity diagram provides a much-needed description of a system and is the next step in analyzing the system after the use case diagrams. According to Roff (2003), an activity diagram allows the reader to see the system’s execution and how it changes direction based upon different conditions and stimuli. Although activity diagrams can also be used to model complex object behavior in the system design portion of modeling, this study uses them primarily in the analysis phase as a means of taking use cases to the next level. They are particularly helpful for use cases because they give the reader an obvious start and end state. Activity diagrams can explain to the reader what conditions need to be met for a use case to be valid, as well as the condition or state in which the system is left once the use case has been completed.

A sequence diagram is one of two types of interaction diagrams used to model object interactions arranged in time sequence and to distribute use case behavior to classes (Roff, 2003).

There are a number of specific reasons for modeling sequence diagrams, which arguably have the same role as activity diagrams. One of the reasons, shared by activity diagrams, is to realize a use case. In fact, sequence diagrams, just like activity diagrams, are used to provide the missing explanation of the generalized function that is specified by a particular use case.

Class diagrams are static diagrams consisting of the pieces that make up a system or subsystem. They are modeled throughout the analysis and design stages of a project, starting with classes that business people may understand but most certainly ending up with classes that are comprehensible only to the development team, and they are essential to any object-oriented project (Roff, 2003).

Implementation diagrams show where the physical components of a system are going to be placed in relation to each other, the hardware, or the internet. They can be drafted early in the UML process to get an idea of what is needed for rolling out the finished product, but they cannot be formalized until the software has been completely modeled with class diagrams.

The two types of implementation diagrams are component diagrams and deployment diagrams. Component diagrams are modeled to illustrate relationships between pieces of software, while deployment diagrams are modeled to illustrate relationships between pieces of hardware.

Face Recognition

Recognition (Minh Viet, 2011) can be understood as a method of building computational systems capable of perceiving physical objects in a way similar to human capabilities. Recognition is closely related to the processing of signals in multi-dimensional space, models, graphs, language, databases, decision methods, and others.

A recognition system must be able to reproduce the process of human perception through the following levels:

• Level 1 (Level of Feel): This level provides the data collected by the sensors in the recognition system. For example, in a speech recognition system, the object to be recognized is the voice (speech), and input is received via a microphone or audio files.

• Level 2 (Level of Perception): This represents the model pattern of forming

A face recognition timekeeper is a device that uses facial recognition to record employee hours objectively and accurately once the employee has been successfully verified by the timekeeper.

Timekeeping System

Timekeeping systems came into use and flourished during the 18th century, when the Industrial Revolution broke out and overtime, dangerous working conditions and child labor forced governments, especially in some industrialized countries, to regulate working hours through administrative measures.

A timekeeping system lets you see, plan, and manage employees’ time, allowing you to control labor costs with a consistent application of work and pay rules; minimize compliance risk by enforcing and tracking complex compliance requirements, such as union rules; and improve workforce productivity by reducing manual and time-consuming administrative tasks and freeing your staff for more value-added activities.

In fact, the biggest effect of a timekeeping system is to create a fair working timetable, so that each of the company’s employees works or trains conscientiously. A machine that tracks time cannot be ignored, so every employee becomes time-conscious once the company has a timekeeping device. The introduction of timekeeping systems has helped businesses save time and manpower. With the current trend of development, almost all companies have chosen some type of timekeeping device, such as a fingerprint timekeeper or a magnetic card timekeeper. Such a device also plays an important role in creating a professional image for the company and its business.

The primary methods to identify face

Based on their properties, methods for detecting faces in images are divided into four main categories, which correspond to four different approaches. Many studies also use methods that are not based on a single approach but combine several.

• The knowledge-based approach: knowledge of what makes up a face is encoded as rules describing the facial features and the relationships between them. This is a top-down approach.

• The feature-invariant approach: the algorithms look for facial features that remain unchanged when the pose of the face or the camera position changes.

• The template-matching approach: standard patterns of the face (selected and stored in advance) are used to describe the whole face or individual facial features (the patterns are chosen separately according to criteria set by the designer). This method can be used to locate or detect faces in images.

• The appearance-based (face-based) approach: in contrast to template matching, the model (or models) is learned from a set of training images that show typical characteristics of the appearance of faces in photos. The learned model is then used to detect faces. This approach is also known as the machine learning approach.

Some specific methods or approaches based on face recognition

Many methods are used in the machine learning approach based on the face. This section highlights some of the methods and approaches related to face recognition.

AdaBoost

AdaBoost (Tran Vu Minh, 2009) is rated as one of the fastest approaches among machine learning algorithms. It is often combined with a cascade of classifiers to speed up face detection in images. The idea of the AdaBoost algorithm is to combine weak classifiers into a strong classifier. During construction, each new weak classifier is built based on the evaluation of the previous weak classifiers; finally, the weak classifiers are combined into a strong classifier.
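
To show why a cascade speeds up detection, the sketch below passes an image window through a sequence of stages and stops at the first stage that rejects it, so most non-face windows cost only one cheap test. This is a minimal Python illustration under stated assumptions: the two stage tests, their thresholds and the sample windows are hypothetical stand-ins, not the boosted classifiers used in the actual system.

```python
def cascade_detect(window, stages):
    """Return (is_face, stages_evaluated); reject as soon as one stage fails."""
    for index, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, index
    return True, len(stages)

# Hypothetical stage tests on a window given as a flat list of gray levels
def has_contrast(window):
    return max(window) - min(window) >= 50

def plausible_brightness(window):
    mean = sum(window) / len(window)
    return 60 <= mean <= 200

stages = [has_contrast, plausible_brightness]
flat_wall = [120] * 16      # uniform region, no contrast
face_like = [30, 200] * 8   # strongly contrasted region

wall_result = cascade_detect(flat_wall, stages)  # (False, 1): rejected by stage 1
face_result = cascade_detect(face_like, stages)  # (True, 2): passed both stages
```

In a real cascade, each stage would itself be a strong classifier trained by boosting, with later stages more complex than earlier ones.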

Support Vector Machine

A Support Vector Machine (SVM) (Mai Phuong, 2009) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.
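
The role of the separating hyperplane can be illustrated with the decision rule alone: a new example is assigned to a class according to the side of the hyperplane on which it falls. In the Python sketch below, the weight vector, the bias and the sample points are hypothetical values chosen by hand; finding the optimal hyperplane from training data is the part an SVM training algorithm performs and is not shown.

```python
def svm_decision(weights, bias, x):
    """Classify x by the sign of w . x + b (the side of the hyperplane)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1

# Hypothetical hyperplane x1 - x2 = 0 in two dimensions
weights, bias = (1.0, -1.0), 0.0
label_a = svm_decision(weights, bias, (3.0, 1.0))  # score 2.0, class +1
label_b = svm_decision(weights, bias, (1.0, 3.0))  # score -2.0, class -1
```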

Hidden Markov Model

A Hidden Markov Model (Tran Vu Minh, 2009) is a statistical model in which the system is modeled as a Markov process with unknown parameters, and the task is to determine the hidden parameters from the observable parameters based on this assumption. The extracted model parameters can then be used for further analysis, for example in pattern recognition applications.
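
One basic building block for working with such a model is the forward algorithm, which computes the probability of an observation sequence once the model parameters are given. The Python sketch below is illustrative only: the two hidden states, the two observation symbols and all probabilities are hypothetical, and estimating the parameters themselves is not shown.

```python
def forward_probability(observations, start, trans, emit):
    """Probability of an observation sequence under an HMM (forward algorithm)."""
    n_states = len(start)
    # alpha[s]: probability of the observations so far, ending in state s
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical model with two hidden states and two observation symbols (0 and 1)
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

p_single = forward_probability([0], start, trans, emit)
p_total = sum(forward_probability([a, b], start, trans, emit)
              for a in (0, 1) for b in (0, 1))
```

As a sanity check, the probabilities of all possible sequences of a given length add up to one.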

2) The AdaBoost method uses a combination of features that is very fast to compute and is suitable for real-time detection.

3) The AdaBoost classifier can be built from stages of low complexity that still distinguish and reject bad candidates that are not face images.

Overview about AdaBoost

Based on studies, AdaBoost belongs to the appearance-based (face-based) approach. Viola and Jones (2004) used AdaBoost with Haar wavelet-like features to detect human faces. Their method has a relatively fast processing speed and an accuracy rate of more than 80% on grayscale images.
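
A Haar-like feature is the difference between the sums of gray levels in adjacent rectangles, and the Viola and Jones detector evaluates such sums quickly by means of an integral image. The Python sketch below shows a two-rectangle feature on a tiny grayscale image; the function names, the feature layout and the sample image are assumptions made for illustration, not code from the system.

```python
def integral_image(img):
    """ii[y][x] holds the sum of all pixels above and to the left of (y, x)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Right half minus left half: responds to a vertical dark-to-bright edge."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return right_sum - left_sum

# Hypothetical 4 x 4 image: dark left half, bright right half
img = [[10, 10, 200, 200] for _ in range(4)]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 4, 4)             # 1680
feature = two_rect_feature(ii, 0, 0, 4, 4)   # 1600 - 80 = 1520
```

Because each rectangle sum needs only four lookups regardless of its size, many features can be evaluated per window at low cost.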

Boosting approach

Boosting, the machine-learning method as cited by Schapire (2001) is based on the observation that finding many rough rules of thumb can be a lot easier than finding a single, highly accurate prediction rule To apply the boosting approach, we start with a method or algorithm for finding the rough rules of thumb The boosting algorithm calls this

“weak” or “base” learning algorithm repeatedly, each time feeding it a different subset of the training examples Each time it is called, the base learning algorithm generates a new weak prediction rule, and after many rounds, the boosting algorithm must combine these weak rules into a single prediction rule that, hopefully, will be much more accurate than any one of the weak rules

In 1990, Robert Schapire gave the first boosting algorithm. In 1993, Drucker, Schapire and Simard tested it in recognition programs (an OCR application). Freund continued the study of Schapire, and in 1995 he, along with Schapire, developed boosting into AdaBoost. Thus, the basic principle of boosting is the combination of weak classifiers into a strong classifier.

To understand how it works, the proponent reviewed the classification problem with 2 classes (each sample belongs to exactly 1 of the 2 classes), where the training set D consists of n samples. First, n1 samples (n1 < n) were selected at random from the set D to create D1. After that, the first weak classifier C1 was built from D1. Next, the set D2 was built for training classifier C2. D2 was built so that half of its samples were correctly classified by C1 and the other half were misclassified by C1. In this way, C2 contains information additional to C1. C2 was then trained on D2. Next, the set D3 was built from the samples that C1 and C2 together do not classify well: the remaining samples in D on which C1 and C2 gave different results. Thus, D3 includes the samples on which C1 and C2 do not operate effectively. Finally, the classifier C3 was trained on D3.

Now, we have a strong classifier from the combination of C1, C2 and C3. When identifying a sample X, the result is determined by agreement among the 3 classifiers C1, C2 and C3. If both C1 and C2 assign X to the same class, then this class is the classification result of X; if C1 and C2 assign X to two different classes, C3 determines which class X belongs to.

Figure 4 Boosting
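The decision rule of the three combined classifiers can be sketched in a few lines of Python. This is only an illustration of the rule described above: the function name `boosted_vote` and the three toy threshold classifiers are hypothetical and are not taken from the project.

```python
from typing import Callable

# A classifier maps a sample to a class label (+1 or -1).
Classifier = Callable[[float], int]


def boosted_vote(c1: Classifier, c2: Classifier, c3: Classifier, x: float) -> int:
    """Classify x with the C1/C2/C3 rule: C3 decides only when C1 and C2 disagree."""
    label1 = c1(x)
    label2 = c2(x)
    if label1 == label2:
        return label1
    return c3(x)


# Toy weak classifiers with made-up thresholds, for illustration only.
c1 = lambda x: 1 if x > 0.3 else -1
c2 = lambda x: 1 if x > 0.6 else -1
c3 = lambda x: 1 if x > 0.5 else -1

print(boosted_vote(c1, c2, c3, 0.9))  # C1 and C2 agree on +1
print(boosted_vote(c1, c2, c3, 0.4))  # C1 and C2 disagree, so C3 decides
```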

AdaBoost

AdaBoost (Adaptive Boosting) is a powerful nonlinear classifier based on the boosting approach, introduced by Freund and Schapire in 1995. AdaBoost works on the principle of a linear combination of weak classifiers, here based on Haar-like features, to form a strong classifier.

Intuitively, it works as follows: to decide whether an image contains a human face or not, we ask T persons (equivalent to T classifiers built from T loops of boosting). The assessment of each person (equivalent to a weak classifier) only needs to be better than a random guess (error rate below 50%). Then the opinion of each person is weighted (represented by the coefficient α): a person who is better at judging the difficult samples has a higher level of importance in the final conclusion. The update of the weights of the samples after each boosting loop assesses the difficulty of the samples (the more people misjudge a sample, the more difficult it is).


Figure 5 Strong classifier built with AdaBoost

The weak classifier hk(x) is represented as follows:

$$h_k(x) = \begin{cases} 1 & \text{if } p_k f_k(x) < p_k \theta_k \\ -1 & \text{otherwise} \end{cases}$$

in which:

x: the sample, or sub-window, under consideration; x = (x1, x2, ..., xn) is the feature vector of the sample

θk: the threshold

fk: the value of the Haar-like feature

pk: the coefficient that decides the direction of the inequality

The formula above can be interpreted as follows: the value of the Haar-like feature of the sample is compared with a given threshold. If it passes the threshold, the sample is classified as a face (called the object: the object to be recognized); otherwise the sample is classified as background (not the object).
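A minimal Python sketch of such a weak classifier is shown here. It assumes the common Viola-Jones form in which a sample is accepted when pk·fk(x) < pk·θk, and it uses the labels +1 (object) and -1 (background) of the AdaBoost algorithm in this chapter; the function name `weak_classify` is hypothetical.

```python
def weak_classify(feature_value: float, threshold: float, polarity: int) -> int:
    """Return +1 (face) if polarity * feature_value < polarity * threshold, else -1.

    polarity is +1 or -1 and only flips the direction of the comparison.
    """
    return 1 if polarity * feature_value < polarity * threshold else -1


# With polarity +1, small feature values are accepted as faces.
print(weak_classify(5.0, threshold=10.0, polarity=1))
# With polarity -1, large feature values are accepted instead.
print(weak_classify(15.0, threshold=10.0, polarity=-1))
```

Using a polarity coefficient lets one comparison cover both cases (feature below or above the threshold) without writing two separate classifiers.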


Figure 6 Combining the weak classifiers into a strong classifier

AdaBoost combines the weak classifiers into a strong classifier:

$$H(x) = \operatorname{sign}\big(\alpha_1 h_1(x) + \alpha_2 h_2(x) + \alpha_3 h_3(x)\big)$$

The training of the classifier is done in a loop. In each iteration t, the algorithm selects the weak classifier ht that classifies with the smallest error εt (because it is the best classifier in that iteration) and adds it to the strong classifier.

Once ht has been selected, AdaBoost calculates the value αt. αt is chosen on the principle of reducing the error εt.

The coefficient αt expresses the level of importance of ht in the formula of the strong classifier H(x):

$$H(x) = \operatorname{sign}\Big(\sum_{t=1}^{T} \alpha_t h_t(x)\Big)$$

Algorithm AdaBoost

1. Given a set of n labeled samples (x1, y1), (x2, y2), …, (xn, yn), where xk = (xk1, xk2, …, xkm) is the feature vector and yk ∈ {-1, 1} is the label of the sample (1 corresponding to the object, -1 corresponding to the background)

2. Create the initial weights for all samples, where m is the number of positive samples (corresponding to the object, y = 1) and l is the number of negative samples (corresponding to the background, y = -1):

$$w_{1,k} = \begin{cases} \dfrac{1}{2m} & \text{if } y_k = 1 \\ \dfrac{1}{2l} & \text{if } y_k = -1 \end{cases}$$


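3. Build T weak classifiers. Loop t = 1, …, T:

For each feature in the feature vector, build a weak classifier hj with threshold θj and error εj:

$$\varepsilon_j = \sum_{k=1}^{n} w_{t,k} \, [\, h_j(x_k) \neq y_k \,]$$

where [·] equals 1 when the condition holds and 0 otherwise.

Pick out the hj with the minimum εj to get ht, then update the weights:

$$w_{t+1,k} = \frac{w_{t,k}}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } h_t(x_k) = y_k \\ e^{\alpha_t} & \text{if } h_t(x_k) \neq y_k \end{cases}$$

Zt: coefficient used to bring Wt+1 into the interval [0, 1] (normalization factor)

4. The strong classifier is built:

$$H(x) = \operatorname{sign}\Big(\sum_{t=1}^{T} \alpha_t h_t(x)\Big)$$

In the classification, each ht contributes to the result of the strong classifier H(x), and the level of its contribution depends on the corresponding value αt: the greater the αt of ht, the more important its role in H(x).

• The formula to calculate αt:

$$\alpha_t = \frac{1}{2} \ln\left(\frac{1 - \varepsilon_t}{\varepsilon_t}\right)$$

Clearly, the value αt decreases as εt increases. Because ht is chosen by the criterion of minimum εt, it is guaranteed the greatest value αt.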
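As a small worked example, assume the standard AdaBoost coefficient $\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}$ with the natural logarithm (the error values below are made up for illustration). Two weak classifiers with different errors then receive the following coefficients:

$$\varepsilon_t = 0.2: \quad \alpha_t = \tfrac{1}{2}\ln\frac{0.8}{0.2} = \tfrac{1}{2}\ln 4 \approx 0.693$$

$$\varepsilon_t = 0.4: \quad \alpha_t = \tfrac{1}{2}\ln\frac{0.6}{0.4} = \tfrac{1}{2}\ln 1.5 \approx 0.203$$

The classifier with the smaller error gets the larger coefficient, and a classifier with εt = 0.5 (no better than a random guess) would get αt = 0.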


After calculating the value of αt, AdaBoost updates the weights of the samples: it increases the weight of the samples that ht misclassifies and reduces the weight of the samples that ht classifies correctly. In this way, the weight of a sample reflects how difficult it is to identify, and ht+1 gives priority to learning to classify these difficult samples.
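The whole training loop can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the project: it assumes threshold weak classifiers of the form "+1 if p·f < p·θ, else -1", initial weights 1/(2m) and 1/(2l), the coefficient α = ½·ln((1-ε)/ε) and an exponential weight update. The names `stump_predict`, `train_adaboost` and `strong_classify` are hypothetical.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Weak classifier: +1 if polarity * x[feature] < polarity * threshold, else -1."""
    return 1 if polarity * x[feature] < polarity * threshold else -1


def train_adaboost(samples, labels, rounds):
    """Return a list of (alpha, feature, threshold, polarity) tuples.

    Assumes both classes (+1 and -1) are present in labels.
    """
    m = sum(1 for y in labels if y == 1)   # number of object samples
    l = len(labels) - m                    # number of background samples
    weights = [1.0 / (2 * m) if y == 1 else 1.0 / (2 * l) for y in labels]
    strong = []
    for _ in range(rounds):
        # Try every feature, candidate threshold and polarity; keep the smallest weighted error.
        best = None
        for feature in range(len(samples[0])):
            for threshold in sorted({x[feature] for x in samples}):
                for polarity in (1, -1):
                    error = sum(
                        w for w, x, y in zip(weights, samples, labels)
                        if stump_predict(x, feature, threshold, polarity) != y
                    )
                    if best is None or error < best[0]:
                        best = (error, feature, threshold, polarity)
        error, feature, threshold, polarity = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        strong.append((alpha, feature, threshold, polarity))
        # Raise the weight of misclassified samples, lower the others, then normalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, feature, threshold, polarity))
            for w, x, y in zip(weights, samples, labels)
        ]
        z = sum(weights)  # normalization factor Z_t
        weights = [w / z for w in weights]
    return strong


def strong_classify(strong, x):
    """Sign of the alpha-weighted vote of all selected weak classifiers."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in strong)
    return 1 if score >= 0 else -1


samples = [[1], [2], [3], [4]]
labels = [1, 1, -1, -1]
model = train_adaboost(samples, labels, rounds=1)
print([strong_classify(model, x) for x in samples])  # prints [1, 1, -1, -1]
```

The exhaustive search over thresholds is chosen for readability; it is far slower than what a real-time detector would use.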

The construction loop of the strong classifier stops after T iterations. In actual implementations (the OpenCV library of Intel), the value T is rarely used because there is no formula that ensures an optimal T value for the training process. Instead, they use the value max false positive or max false alarm (the maximum ratio of misidentified background samples). The false alarm rate of the classifier must not be allowed to exceed this value. Through the iterations, the false alarm of the strong classifier Ht(x) built at loop t decreases, and the loop ends when this ratio is lower than max false alarm.

The Haar-Like Features

Viola and Jones (2004) used four basic features to identify the human face. Each Haar-like feature is a combination of two or three "white" or "black" rectangles.

Figure 7 Four basic Haar-like features

Haar-like features are preferred for two reasons:

1. They give more powerful classification when determining the human face

2. They can be computed efficiently with a summed-area table, that is, the integral image technique

To use these features in the determination of human faces, the four (4) basic Haar-like features are extended and divided into 3 sets of features as follows:


Edge features

Line features

Center-surround features

Figure 8 Extended features derived from the basic Haar-like features

The benefit of Haar-like features is that they express knowledge about the objects in the image (because they represent the relationship between the parts of the object), which individual pixels themselves cannot express. To calculate the value of a Haar-like feature, we compute the difference between the sums of the pixels of the black and white areas as in the following formula:

f(x) = Sum black (pixel) – Sum white (pixel)
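The formula can be illustrated with a short Python sketch for a two-rectangle (edge) feature. The layout used here, a black left half next to a white right half, and the function name `haar_edge_feature` are assumptions made for the example; the image is a plain list of rows of gray values.

```python
def haar_edge_feature(image, top, left, height, width):
    """Sum of the black (left) half minus sum of the white (right) half of a window.

    If width is odd, the last column of the window is ignored so both halves are equal.
    """
    half = width // 2
    rows = range(top, top + height)
    black = sum(image[r][c] for r in rows for c in range(left, left + half))
    white = sum(image[r][c] for r in rows for c in range(left + half, left + 2 * half))
    return black - white


# A 2 x 4 patch that is bright on the left and dark on the right.
patch = [
    [10, 10, 1, 1],
    [10, 10, 1, 1],
]
print(haar_edge_feature(patch, top=0, left=0, height=2, width=4))  # prints 36
```

A uniform patch gives the value 0, while a strong left-to-right change in brightness gives a large positive or negative value.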


Compared with raw pixel values, this value lets Haar-like features increase or decrease the variation within a class and between classes, so that classification becomes much easier.

Thus we can see that, to calculate the value of a Haar-like feature, we have to calculate sums of pixels in the image. But calculating the Haar-like feature values for all positions on the image requires a large computational cost, which does not meet the requirements of real-time applications. Therefore, Viola and Jones (2004) introduced a concept called the integral image, which is a two-dimensional array of the same size as the image on which the Haar-like features are computed. Each element of the array is calculated as the sum of the pixels above it and to the left of it. Starting from the top-left position and moving to the bottom-right position of the image, the calculation relies solely on simple integer additions, therefore its execution speed is very fast.

$$P(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')$$

where i(x', y') is the pixel value at position (x', y').

Figure 9 Calculation of the integral image

After the integral image has been calculated, the sum of the gray values in any region of the image can be calculated very simply as follows:

D = A + B + C + D – (A+B) – (A+C) + A


Figure 10 Example of calculating the gray-level sum of region D on an image

With A + B + C + D as the value at point P4 on the integral image, similarly A + B is the value at point P2, A + C is the value at point P3 and A is the value at point P1 So we can rewrite the above expression for D as follows:

D = (X4, Y4) – (X2, Y2) – (X3, Y3) + (X1, Y1)
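The integral image and the four-lookup region sum can be sketched in Python as follows. As an implementation convenience that is an assumption of this sketch, the table has one extra row and column of zeros, so regions touching the image border need no special case; the function names are hypothetical.

```python
def integral_image(image):
    """Build a summed-area table with an extra zero row and column at the top and left."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            # Sum of everything above (previous table row) plus the current row so far.
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def region_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1: P4 - P2 - P3 + P1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
print(region_sum(table, 1, 1, 3, 3))  # 5 + 6 + 8 + 9, prints 28
```

Once the table is built, every rectangle sum costs four lookups regardless of the rectangle size, which is what makes evaluating many Haar-like features per window affordable.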

Cascade of Classifiers

In a conventional classifier, every sample is passed through all the features selected in the training process, even though this takes a lot of time. Some samples are easy: only one or a few simple features are needed to identify them. But for conventional classifiers, whether the sample is easy or difficult, all the features are still examined.

The cascade of classifiers was built to shorten the processing time and to reduce the false alarm of the classifier. The cascade tree includes several stages (also called layers), and each stage of the tree is a stage classifier. For a sample to be classified as an object, it needs to pass through all the stages of the tree. The stage classifier at each stage is trained on the negative samples that the stages before it misclassified. That is, it focuses on learning from the difficult background samples. With this structure, the background samples that are easy to identify are eliminated right from the first stages. It helps to meet the best


Cascade training Algorithm

1. Call:

• F the false alarm value and d the precision of the weak classifier at each stage

• P, N the numbers of positive and negative samples

• Pi, Ni the sets of positive and negative samples for the classifier at the i-th stage

• Fi, Di the false alarm value and the precision of the cascade before the i-th stage

• N :=

• If Fi > Ftarget

N = {the number of samples that the current stage misclassifies}

P = {the number of positive samples that the current stage classifies correctly}

Illustration of the cascade training algorithm:


Figure 11 Cascade of classifiers

Figure 11 illustrates the training of a cascade which includes N stages. At each stage, the corresponding classifier is trained so that its accuracy is h and its false alarm is f.
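How a trained cascade treats one sample can be sketched in Python as follows. The stage classifiers here are toy threshold functions invented for the example, and `cascade_classify` and `cascade_rates` are hypothetical names. The second function relies on the simplifying assumption that the stages behave independently, in which case a cascade of N stages with per-stage accuracy h and false alarm f has an overall accuracy of about h^N and an overall false alarm of about f^N.

```python
def cascade_classify(stages, x):
    """Return (is_object, stages_evaluated); stop at the first stage that rejects x."""
    for index, stage in enumerate(stages, start=1):
        if stage(x) != 1:
            return False, index
    return True, len(stages)


def cascade_rates(h, f, n_stages):
    """Overall accuracy and false alarm of n_stages stages, assuming independent stages."""
    return h ** n_stages, f ** n_stages


# Toy stages: each later stage is stricter than the one before it.
stages = [
    lambda x: 1 if x > 0.2 else -1,
    lambda x: 1 if x > 0.5 else -1,
    lambda x: 1 if x > 0.8 else -1,
]

print(cascade_classify(stages, 0.1))  # rejected by the first stage: (False, 1)
print(cascade_classify(stages, 0.9))  # passes every stage: (True, 3)
print(cascade_rates(0.99, 0.3, 10))
```

Most background windows are rejected after one or two cheap stages, so only the few face-like windows pay for the full cascade.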

Figure 12 Face detection system using the AdaBoost algorithm


In the figure above, the integral image is first calculated from the original image and is then put through the basic Haar functions to estimate the features. The estimates are put through the AdaBoost parameter adjustment to quickly remove the features that are unlikely to belong to a human face. Only a small set of features from the AdaBoost parameter adjustment is transferred to the decision stage. The decision stage combines the weak classifiers to return the result for the human face.

A timekeeping system gives the company convenience in managing staff and enables it to save cost and time. Combining face recognition with door unlocking also helps companies manage the entry and exit of employees.

Several research and project studies that were found to be technically related to the design and development of the proposed method provided the researcher with additional knowledge.

The study of Hoang Phuong Anh (2009) focuses on the AdaBoost face detection method. Object detection is a fundamental and important problem in the field of computer vision. Fast face detection in images is important because object recognition is inaccurate if it lacks the step of detecting and locating objects. Fast face detection also has important implications for detecting and tracking moving objects in video or camera feeds.

Detection methods such as AdaBoost are based on the idea of building weak detectors whose accuracy is not high but whose processing time is very fast. Because the AdaBoost method uses a combination of features that are inherently very fast to compute, it is suitable for real-time detection.

Krishna, Srinivasulu and Basak (2012) present an architecture for a face detection system based on the AdaBoost algorithm using Haar features. They accelerated the processing speed of the face detection system by using techniques including image scaling, integral image generation, pipelined processing of classifiers and parallel processing of multiple classifiers. They also discussed the optimization of the proposed architecture, which can be scaled for configurable devices with variable resources. The face detection system was designed using Verilog HDL and deployed in ModelSim. Its performance was measured and compared with an equivalent hardware implementation. They report about a 35-fold increase in system performance over the equivalent hardware implementation.

Kumar, Prasad, Semwal and Tripathi (2011) presented an automated system for human face recognition against a real-time background for a large homemade dataset of persons' faces. There is a huge variation in human face images in terms of size, pose and expression. The proposed system collapses most of this variance. They used AdaBoost with a Haar cascade and simple, fast PCA and LDA to detect and recognize the human face in real time. The matched face is then used to mark attendance in the laboratory, in their case. This biometric system is a real-time attendance system based on human face recognition with simple and fast algorithms that achieves a high accuracy rate.

Zhang and Zhang (2010) surveyed the recent advances in face detection over the past decade. They first reviewed the seminal Viola-Jones face detector, hoping that the review would point toward better algorithms for this fundamental computer vision problem. They then surveyed the various techniques according to how they extract features and what learning algorithms they adopt.

In a study conducted by Yeom and Lee (2013), they discussed a method for face detection at long distance with AdaBoost filtering and a false alarm reduction scheme. The false alarm reduction scheme was based on skin-color testing and variable edge mask filtering. The skin-color test involves the average RGB components of the window, followed by the generation of a binary cluster image. The binary cluster was composed of alternative and null pixels according to color. The size of the edge mask was determined by the ellipse covering the binary cluster. The edge mask filters out false alarms by evaluating the contour shape of the object in the window. In the experiments, the false alarm reduction scheme was shown to be effective for face detection in images captured at a distance.

The above-mentioned related studies were applied by the proponent in the timekeeping system study. These concepts are familiar to many programmers, but not everyone can apply them well. As analyzed above, the proposed system was developed for Vietnamese users for timekeeping, management efficiency and convenience.

This section relates our work to other timekeeping system products that currently use face recognition.

Timekeeping Software Wise Eye 2010: Wise Eye 2010 is built with the .Net programming tools. Wise Eye TAS2010 is used along with fingerprint and proximity card timekeepers. In particular, TAS2010 can connect to and manage next-generation models such as iClock660 (Finger Technology 10.0) and iFace302 (Face Recognition Technology and Finger 10.0). It manages the card data and fingerprint data of employees. You can download the files to your computer and upload this data to the timekeeping system. Wise Eye 2010 analyses administrative work, shift work schedules, overtime work schedules, Sunday work schedules, public holidays, night shift schedules, night allowances, working hours, late arrivals, and early departures of employees. It also exports data to Excel files for overtime shifts, working hours, and others. Wise Eye 2010 lets users declare calculation parameters depending on the requirements of each company. It also lets users declare general rules (attendance coefficients for Sundays, holidays, or night shifts) and set up schedules (a schedule contains from 1 to 20 shifts). The work shift schedules are assigned to each employee for each day of the weekly cycle.

Timekeeping Software Vietnamese Mita 2012: Mita 2012 is software for fingerprint and sensor timekeepers and door control. Mita correctly calculates the total working hours, the effective working hours as prescribed, and the number of overtime hours of each employee, depending on the wage policies of each company. It can export a detailed report on the hours of every employee, for easy review of employees' working hours.
