CONFERENCE PROCEEDINGS FULL TEXT


UNIVERSITY OF SCIENCE

ISBN: 978-604-82-1375-6


Ho Chi Minh City – 21/11/2014 www.hcmus.edu.vn

FULL TEXT OF ORAL PRESENTATIONS

Session: ELECTRONICS - TELECOMMUNICATIONS

VIII-O-1

BUILDING A BIOMETRIC PALMPRINT FEATURE DETECTION ALGORITHM ON A MOBILE PLATFORM

Nguyễn Duy Thiên, Trần Hoàng Đạt, Bùi Trọng Tú

University of Science, VNU-HCM

ABSTRACT

Personal authentication is an important part of everyday life. To overcome the difficulties of conventional authentication methods, biometric authentication has been developed, relying on basic human characteristics. In this paper, the authors develop a palmprint authentication software application on a mobile platform. The authentication process has four steps: data acquisition, signal preprocessing, feature extraction, and matching. A 2-D Gabor filter is used to obtain texture information, and two palmprint images are then matched using the Hamming distance. Authentication was tested on 14 people, each providing 4 palmprint images. The software application also achieves good performance, with an FAR of 0% and an FRR of 2.66% at a threshold of 0.7.

Keywords: FRR, FAR, ROI

INTRODUCTION

Mobile phones have developed very rapidly in recent years and have become irreplaceable in people's daily activities, including tasks such as handling email, storing important data, and paying online. Given this importance, security on mobile phones needs attention and research. Apart from entertainment features, one of the issues of greatest concern on mobile phones is personal authentication. Since the 1970s [1], many devices and systems have been produced that use biometric technology for personal authentication. In 1970, a system named Identimat, which recorded hand dimensions, became the first such system to be commercialized. In September 2013, Apple introduced Touch ID, integrated into the iPhone 5S, which uses fingerprint authentication. These devices and systems have clear advantages over conventional security methods. However, some drawbacks remain: fingerprints are easily altered by physical effects from the living environment, and some people are born with fingerprints that are not clear, which makes authentication error-prone. Because of these limitations of fingerprint authentication, researchers have shifted their focus to palmprint authentication. In this report, the authors build a palmprint biometric feature recognition algorithm on a smartphone platform, based on the algorithm for palmprint texture analysis on low-resolution images for personal authentication developed by Wai Kin Kong and David Zhang [2]. Although the original algorithm meets the basic requirements of personal authentication, it has only been used on images captured by fixed CCD camera systems, which makes the algorithm less flexible than desired and narrows its range of application. These shortcomings motivated the authors to improve the original algorithm and, based on these improvements, to build a new algorithm for use on smartphones.

Figure 1. Layers of a biometric system

SYSTEM DESIGN

The general block diagram of the system is shown in Figure 2. The system enrolls new users by assigning them an ID and identifies users whose data samples already exist in the system database. The system acquires samples with the mobile phone's camera and processes the sample data on the phone's own processor. The authors implemented the identification system in two programming languages: Java and Matlab. Matlab was used on a computer to evaluate the algorithm and to find its optimal threshold. The optimal threshold was then used to implement the application in Java on the HTC JOne.

Figure 2. Block diagram of the identification and enrollment system

PREPROCESSING ALGORITHM

Before feature extraction, the input image must go through preprocessing, which consists of five main steps.

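Step 1: The input image is converted to a grayscale image with gray levels ranging from 0 to 255. The grayscale image is then blurred with a Gaussian low-pass filter. Based on the gray-level histogram of the image and Otsu's thresholding method, the threshold $T_p$ is computed to convert the grayscale image into a binary image. This conversion is expressed as follows [3]:

$$B(x, y) = \begin{cases} 1, & O(x, y) * L(x, y) \ge T_p \\ 0, & O(x, y) * L(x, y) < T_p \end{cases}$$

where $O(x, y)$ is the grayscale image, $L(x, y)$ is the Gaussian low-pass filter, $*$ denotes convolution, and $B(x, y)$ is the resulting binary image.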

Figure 3. (a) Input image after Gaussian filtering; (b) grayscale image after binarization
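The thresholding in Step 1 can be illustrated with a minimal Python sketch. It is not the authors' Matlab or Java code: the function names `otsu_threshold` and `binarize` are hypothetical, the Gaussian blur is omitted, and the threshold is defined so that pixels greater than or equal to it map to 1.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold t for integer gray levels in 0..255.

    Class 0 holds levels < t and class 1 holds levels >= t; t maximizes
    the between-class variance of the two classes.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with level < t
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(256):
        w1 = total - w0
        if w0 > 0 and w1 > 0:
            mean0 = sum0 / w0
            mean1 = (sum_all - sum0) / w1
            between_var = w0 * w1 * (mean0 - mean1) ** 2
            if between_var > best_var:
                best_var, best_t = between_var, t
        w0 += hist[t]
        sum0 += t * hist[t]
    return best_t


def binarize(image, threshold):
    """Map a 2-D list of gray levels to 0/1: 1 where the pixel is >= threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


# Illustrative data only: a dark background with a bright region.
image = [[50, 50, 200, 200], [50, 50, 200, 200]]
t_p = otsu_threshold([p for row in image for p in row])
binary = binarize(image, t_p)
```

On a real photograph the blur would be applied before the histogram is built, so that sensor noise does not distort the two gray-level modes.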

Step 2: The boundaries of the gaps between the fingers are extracted using a boundary tracing algorithm, which yields the set of boundary points $(F_i x_j, F_i y_j)$ of the gaps, with $i \in [1, 3]$ and $j \in [2, n]$. The start point $S_k$ and end point $E_k$ of each gap are marked as follows [4]:

Step 3: The center point $C_k(x, y)$ of each gap is computed using the following equations:



[...] and $b_1 = K_1 y - m_1 K_1 x$ [...]

Figure 8. (a) Image after rotation; (b) locating and extracting the region of interest (ROI)

FEATURE EXTRACTION AND FEATURE MATCHING

Feature extraction

Principal-line and wrinkle features can be observed in captured palmprint images. Some techniques, such as the stack filter, can extract principal-line features. However, principal-line features alone do not achieve a high recognition rate because different palms can have similar principal lines. Figure 9 shows six palmprint images with fairly similar principal-line features. Wrinkle features, on the other hand, are highly discriminative for palmprint authentication, but they are difficult to extract. For these reasons, texture analysis is needed for palmprint authentication.

The Gabor filter, configured with the parameters in Table 1, is convolved with the 310 x 140 pixel region-of-interest (ROI) image. The image obtained after Gabor filtering is encoded as follows:

Figure 10. Region of interest (ROI) extracted from the palm

Figure 11. Image $P_r$ after encoding, with the Gabor filter parameters set to $G(x, y, 0, 0.01145, 44.9432)$
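As a rough illustration of Gabor filtering followed by binary encoding, the Python sketch below builds a complex 2-D Gabor kernel and encodes each pixel by the sign of the real and imaginary parts of the response. Several points are assumptions rather than details given in this paper: the kernel formula and the sign-based encoding follow the general approach of the cited Gabor palmprint work [3], [5]; the three parameters are read as (orientation in degrees, frequency, standard deviation); the kernel size is arbitrary; and the names `gabor_kernel` and `encode` are hypothetical. DC removal of the filter is omitted for brevity.

```python
import cmath
import math


def gabor_kernel(size, theta_deg, u, sigma):
    """Complex 2-D Gabor kernel with odd side length `size` (size is an assumption)."""
    half = size // 2
    theta = math.radians(theta_deg)
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            phase = 2.0 * math.pi * u * (x * math.cos(theta) + y * math.sin(theta))
            row.append(norm * envelope * cmath.exp(1j * phase))
        kernel.append(row)
    return kernel


def encode(roi, kernel):
    """Convolve `roi` (2-D list) with `kernel`; return (real_bits, imag_bits).

    A bit is 1 when the corresponding part of the response is >= 0.
    Pixels outside the ROI are treated as zero.
    """
    h, w = len(roi), len(roi[0])
    half = len(kernel) // 2
    real_bits = [[0] * w for _ in range(h)]
    imag_bits = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0j
            for ky, krow in enumerate(kernel):
                for kx, kval in enumerate(krow):
                    yy, xx = i - (ky - half), j - (kx - half)
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += roi[yy][xx] * kval
            real_bits[i][j] = 1 if acc.real >= 0 else 0
            imag_bits[i][j] = 1 if acc.imag >= 0 else 0
    return real_bits, imag_bits


# Tiny illustrative ROI; the paper's ROI is 310 x 140 pixels.
kernel = gabor_kernel(5, 0, 0.01145, 44.9432)
roi = [[1, 1, 1, 1] for _ in range(4)]
real_bits, imag_bits = encode(roi, kernel)
```

The direct double loop is only practical for small inputs; a full-size ROI would normally be filtered with an optimized convolution routine.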

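Feature vector matching

The matching process uses the Hamming distance to compute the matching score. Let $M$ and $V$ be two feature matrices of size $w \times h$. The normalized Hamming distance $D_o$ between the two matrices is defined as follows:

$$D_o = \frac{\sum_{i=1}^{h}\sum_{j=1}^{w}\left[M_R(i,j) \oplus V_R(i,j) + M_I(i,j) \oplus V_I(i,j)\right]}{2\,w\,h}$$

where $\oplus$ is the bitwise XOR and the subscripts $R$ and $I$ denote the bits encoded from the real and imaginary parts of the filtered image. $D_o$ takes values in $[0, 1]$; the closer $D_o$ is to 0, the better the match.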
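The normalized Hamming distance can be sketched in a few lines of Python. The sketch assumes that each feature consists of two bit planes (taken here to come from the real and imaginary parts of the filter response), which is why the count of mismatching bits is divided by 2·w·h; the function name `hamming_distance` is hypothetical.

```python
def hamming_distance(m_real, m_imag, v_real, v_imag):
    """Normalized Hamming distance between two binary features.

    Each argument is a 2-D list of 0/1 values of identical size h x w.
    The result lies in [0, 1]; 0 means the two features are identical.
    """
    h, w = len(m_real), len(m_real[0])
    mismatches = 0
    for i in range(h):
        for j in range(w):
            mismatches += m_real[i][j] ^ v_real[i][j]
            mismatches += m_imag[i][j] ^ v_imag[i][j]
    return mismatches / (2 * w * h)


# Illustrative 2 x 2 features.
a_real = [[1, 0], [0, 1]]
a_imag = [[1, 1], [0, 0]]
b_real = [[0, 1], [1, 0]]  # bitwise complement of a_real
b_imag = [[0, 0], [1, 1]]  # bitwise complement of a_imag

same = hamming_distance(a_real, a_imag, a_real, a_imag)
opposite = hamming_distance(a_real, a_imag, b_real, b_imag)
```

Because the distance is normalized by the number of compared bits, scores from features of different sizes remain comparable.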

ALGORITHM IMPLEMENTATION

The implementation has two stages: building the database and implementing the algorithm on the mobile phone. The authors built a database for testing the algorithm consisting of 52 palm images of size 1520 x 2688, collected from 13 people aged 21 to 24. After preprocessing, the resulting region-of-interest (ROI) image has a size of 310 x 140.

Figure 12. Palm image stored in the database with ID 3

Sample acquisition follows these conditions: the palm is placed under laboratory lighting, noise from hand shake is allowed, the phone's flash shines directly on the palm, and the subject wears no jewelry on the fingers or palm. Each person's sample set is assigned an ID for later matching, and a sample submitted for authentication is matched only against samples with the same ID.

The algorithm is implemented on the Android 4.2.2 operating system but remains backward compatible with Android 4.0 and later [6,7]. The algorithm was tested on an HTC JOne mobile phone with the following configuration: 4-megapixel camera with a maximum image size of 2688 x 1520, autofocus, optical stabilization, LED flash, quad-core 1.7 GHz Krait 300 processor, Adreno 320 graphics chip, Android 4.4.2 operating system, and 2 GB of RAM. The algorithm was also implemented on a computer with the following configuration: Dell Inspiron N5110, Intel Core i5-2410M 2.3 GHz (4 CPUs), 8 GB RAM, 1333 MHz bus, NVIDIA GeForce GT525M 1024 MB, and 64-bit Windows 7.

Figure 13. (a) Interface of the Palmprint Authentication software running on the HTC JOne Android phone; (b) binarized input image

Figure 15. Distribution of matching scores of the sample sets with Gabor filter parameters (0, 0.01145, 44.9432)

Based on the experimental data from Matlab, the authors obtained the following table:

Table 2. Experimental results for selecting the threshold $T_{ar}$
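How FAR and FRR depend on a threshold can be sketched as follows. This is an illustration only: the score lists are made-up numbers, not the paper's data, the name `far_frr` is hypothetical, and the accept rule (accept when the distance is at or below the threshold) is an assumption, since the decision convention is not spelled out in the surviving text.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) as fractions for distance scores.

    genuine:  distances from comparisons of the same palm
    impostor: distances from comparisons of different palms
    A comparison is accepted when its distance is <= threshold (assumed rule).
    """
    false_accepts = sum(1 for d in impostor if d <= threshold)
    false_rejects = sum(1 for d in genuine if d > threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


# Made-up scores for illustration only.
genuine_scores = [0.10, 0.20, 0.60]
impostor_scores = [0.45, 0.50, 0.55]

far, frr = far_frr(genuine_scores, impostor_scores, 0.40)
```

Sweeping the threshold over a range and recording both rates gives the trade-off from which an operating threshold can be chosen.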


After measuring the processing time of feature extraction according to the above criteria, the authors obtained the following table:

Table 3. Measured processing time of feature extraction

Gabor filter parameters (θ, u, σ)

0, 0.01145, 44.9432

45, 0.01145, 44.9432

90, 0.01145, 44.9432

120, 0.01145, 44.9432

135, 0.01145, 44.9432

Table 3 shows that processing on the phone takes about 3.5 times longer than on the computer. However, with a processing time of only 1.86 to 2.068 seconds, running the feature extraction algorithm on a mobile phone is feasible.

CONCLUSION

In this paper, the authors presented an overview of a palmprint biometric feature recognition algorithm on a mobile platform. The authors successfully implemented this algorithm on both a computer and a mobile phone. The results above show that the algorithm takes little time to run, but it runs faster on the computer than on the mobile phone, so the responsiveness of the two implementations differs. With the continuous development of integrated circuit design, this gap is expected to narrow in the future.

Alongside the work completed, several tasks remain for the future. First, increase the processing speed of the algorithm on both the mobile phone and the computer. Second, capture and process images of both palms. Third, capture palm images without requiring the flash to shine directly on the palm. Fourth, use the Java programming language to build an open-source library so that the algorithm can be reused and improved. Fifth, expand the pool of subjects to increase the number of sample sets and find a better threshold $T_{ar}$.

BUILDING A BIOMETRIC PALMPRINT FEATURE DETECTION ALGORITHM ON A MOBILE PLATFORM

Nguyen Duy Thien, Tran Hoang Dat, Bui Trong Tu

University of Sciences, VNU-HCM

ABSTRACT

Personal authentication plays an important role in our society. To overcome the disadvantages of conventional authentication methods, biometric authentication has been developed to use the characteristics of human nature. In this paper, we develop a software application for palmprint authentication on a mobile platform. Authentication has four steps: data acquisition, preprocessing, feature extraction, and matching. A 2-D Gabor filter is used to obtain the texture information, and two palmprint images are then matched by Hamming distance. The authentication has been tested on 14 persons with 4 palmprint images per person. In addition, the software application provides good performance, with an FAR of 0% and an FRR of 2.66% at a threshold value of 0.7.

Keywords: FRR, FAR, ROI

REFERENCES

[1] D. Zhang, Palmprint Authentication, Springer Science & Business Media, 2004.

[2] D. Zhang, W.K. Kong, J. You, M. Wong, "Online palmprint identification", IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (9) (2003) 1041-1050.

[3] W.K. Kong, D. Zhang, "Palmprint Texture Analysis based on Low Resolution Images for Personal Authentication", Proceedings of 16th International Conference on Pattern Recognition 3 (2002) 807-810.

[4] Z. Khan, A. Mian, Y. Hu, "Contour Code: Robust and Efficient multispectral palmprint encoding for human recognition", IEEE International Conference (2011).

[5] W.K. Kong, D. Zhang, W. Li, "Palmprint feature extraction using 2D Gabor filters", Pattern Recognition 36 (2003) 2339-2347.

[6] G. Allen, M. Murphy, Beginning Android 4, Apress, 2011.

[7] S. Komatineni, D. MacLean, Pro Android 4, Apress, 2012.

VIII-O-2

AN EFFICIENT HARDWARE ARCHITECTURE FOR HMM-BASED TTS SYSTEM

Su Hong Kiet 1 , Huynh Huu Thuan 1 , Bui Trong Tu 1

1 University of Natural Sciences, VNU-HCM

ABSTRACT

This work proposes a hardware architecture for an HMM-based text-to-speech synthesis system (HTS). On high-speed platforms, HTS with a software core-engine can satisfy the requirement of real-time processing. However, on low-speed platforms, the software core-engine takes a long time to complete the synthesis process. A co-processor was designed and integrated into HTS to accelerate the performance of the system.

Keywords: text-to-speech synthesis, HMM, HTS, SoPC, FPGA

Figure 1 Scheme of HTS

In the synthesis part, the given text is analyzed and converted into a label sequence. According to the label sequence, an HMM sentence is constructed by concatenating HMMs taken from the trained HMM database. Excitation and spectral parameters are then extracted from the HMM sentence and fed to the synthesis filter to synthesize the speech waveform. Depending on whether the spectral parameters are represented as mel-cepstral coefficients or mel-generalized cepstral coefficients, the synthesis filter is constructed as an MLSA filter or an MGLSA filter, respectively.

In recent research, HTS has been applied to many languages such as Japanese [1], English [1], Korean [13], Arabic [14] and so on. Moreover, thanks to the small size of the core-engine, HTS can be implemented on various devices such as personal computers, servers and so on. On high-speed platforms such as a PC, HTS with a software core-engine can satisfy the requirement of real-time processing. In contrast, on low-speed platforms, the software core-engine takes a long time to convert text to speech, i.e., the system does not meet real-time processing. To implement an efficient HTS on low-speed platforms, the performance of the core-engine must be sped up. This work uses a co-processor to accelerate the performance of HTS built on an FPGA-based platform. The rest of this paper is organized as follows: Section 2 presents the co-processor for HTS. Section 3 proposes a hardware architecture for HTS built on an FPGA-based platform. Section 4 presents an experiment for evaluating the performance of the proposed system.

CO-PROCESSOR FOR HTS

The HTS Working Group has been developing a software core-engine for HTS (HTS-engine) [10]. HTS-engine provides functions to generate a speech waveform from a label sequence by using a trained context-dependent HMM database. The process of generating a speech waveform from a label sequence can be split into three steps as follows:

•Step 1: parsing the label sequence and creating the HMM sentence

•Step 2: generating speech parameters from the HMM sentence

•Step 3: generating the speech waveform (synthesized speech) from the speech parameters

The evaluation of the performance of HTS-engine on various platforms shows that the time-cost for Step 1 is small, while Step 2 and Step 3 consume about 10% and 90% of the total time-cost, respectively [15]. The performance of HTS-engine on an FPGA-based platform is shown in Table 1.

Table 1. Performance of HTS-engine on FPGA-based platform

System configuration
  FPGA device: Altera Cyclone® IV 4CE115 FPGA chip
  CPU: Nios-II with floating point hardware; instruction cache: 4 KB; data cache: 2 KB
  Instruction storage: SDRAM
  Data storage: SDRAM
  Flash memory: for storing the trained HMM database
Synthesized speech: 144,240 samples, which correspond to 3.005 s of speech (Note: sampling rate is set to 48 kHz)
Time-cost (s):

Figure 2. Architecture of co-processor

The speech parameter generator (SPG) generates speech parameters from the means and variances of the states in the constructed HMM sentence. The detailed architecture of SPG is shown in Figure 3-a. SPG consists of an arbiter and five sub-modules. The arbiter communicates with the main CPU via the Avalon bus and controls the operation of the sub-modules via an internal bus. Each sub-module carries out its own specified task and is activated by the arbiter. After a sub-module completes its task, it informs the arbiter, which then deactivates the sub-module.

Figure 3. Architecture of SPG (a) and SSG (b)

The synthesized speech generator (SSG) generates synthesized speech from the speech parameters. Similar to SPG, SSG consists of an arbiter and several sub-modules. The arbiter communicates with the main CPU via the Avalon bus and controls the operation of the sub-modules via an internal bus. Each sub-module carries out its own specified task and is activated by the arbiter. After a sub-module completes its task, it informs the arbiter, which then deactivates the sub-module. The detailed architecture of SSG is shown in Figure 3-b.

A floating point unit (FPU) is integrated into the co-processor to support SPG and SSG in carrying out operations on floating point numbers. The FPU supports addition, subtraction, multiplication, division, modulo, comparison, exponential, natural logarithm and cosine. The FPU is shared by the arbiters and sub-modules of SPG and SSG. To avoid conflicts, at most one arbiter or one sub-module can use the FPU at any time, i.e., the other arbiters and sub-modules must release the FPU interface bus.

Internal memory stores data that are used or created by SPG or SSG. Like the FPU, the internal memory is a shared resource. At most one arbiter or one sub-module can access the internal memory at any time, i.e., the other arbiters and sub-modules must release the internal memory interface bus.

HARDWARE ARCHITECTURE FOR HTS

Figure 4 shows the hardware architecture for HTS built on an FPGA-based platform, in which a co-processor is integrated into the system to accelerate system performance. The Nios-II CPU is the main CPU of the system. SDRAM is the instruction storage and data storage of the system. PLLs are used for setting the frequency of clocks in the system. The UART port is used for debug mode. This architecture consists of the synthesis part of HTS only, i.e., it does not include the training part, so the proposed system needs a trained context-dependent HMM database. Since the HMM database is saved in files, a flash memory is used to store the HMM database so that the read-only zip file system (which is supported by Altera) can be used to load data from the HMM database.

Figure 4. Hardware architecture for HTS

EXPERIMENT

The proposed system shown in Figure 4 was built on a Stratix IV FPGA development board, in which the text input device is a touch-screen and the audio output device is a DAC card connected to a speaker. The performance of the system is shown in Table 2.

Table 2. Performance of HTS on FPGA-based platform with a co-processor

Input text | Synthesized speech (sampling rate = 38 kHz): number of samples | Synthesized speech: length (s) | Time-cost (s)
đại học khoa học tự nhiên | 95040 | 2.501 | 2.428

Table 2 shows that the time-cost is smaller than the length of the synthesized speech, i.e., the requirement of real-time processing is met. Compared to the system without a co-processor, the time-cost is reduced significantly. When the co-processor is not used, the time-cost is more than ten times larger than the length of the synthesized speech. After integrating the co-processor into the system and setting the system configuration appropriately, the time-cost decreases to a value smaller than the length of the synthesized speech.
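The real-time check can be reproduced from the figures in Table 2 with a short Python sketch. The term "real-time factor" for the ratio of time-cost to speech length and the function names are introduced here for illustration and are not used in the paper.

```python
def speech_length_seconds(num_samples, sampling_rate_hz):
    """Length of a waveform in seconds."""
    return num_samples / sampling_rate_hz


def real_time_factor(time_cost_s, speech_length_s):
    """Ratio of processing time to speech length; below 1 means real time is met."""
    return time_cost_s / speech_length_s


# Values from Table 2: 95040 samples at 38 kHz, time-cost 2.428 s.
length_s = speech_length_seconds(95040, 38000)
rtf = real_time_factor(2.428, length_s)
meets_real_time = rtf < 1.0
```

The computed length agrees with the 2.501 s reported in the table, and the ratio is slightly below 1, which matches the statement that real-time processing is met.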

Moreover, the synthesized speech is intelligible and has the same quality as the speech synthesized by HTS built on a PC platform. Let $X_1$ and $X_2$ denote the waveforms generated from the same input text by the proposed HTS and by HTS built on a PC platform, respectively:

$$X_1 = [x_{11}, x_{12}, \ldots, x_{1N}]$$

$$X_2 = [x_{21}, x_{22}, \ldots, x_{2N}]$$

where $x_{1i}$ and $x_{2i}$, with $i = 1, 2, \ldots, N$, are the samples of $X_1$ and $X_2$, respectively.

The mean square error (MSE) between the two vectors $X_1$ and $X_2$ is calculated by the following equation:

$$MSE = \frac{1}{N}\sum_{i=1}^{N}(x_{1i} - x_{2i})^2 \qquad (1)$$
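Eq. (1) translates directly into a few lines of Python. The function name `mean_square_error` and the sample values are illustrative assumptions; the paper's actual waveforms are not reproduced here.

```python
def mean_square_error(x1, x2):
    """Mean square error between two equal-length sample sequences, as in Eq. (1)."""
    if len(x1) != len(x2) or len(x1) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x1)
    return sum((a - b) ** 2 for a, b in zip(x1, x2)) / n


# Made-up samples for illustration only.
identical = mean_square_error([0.1, -0.2, 0.3], [0.1, -0.2, 0.3])
different = mean_square_error([0.0, 0.0], [1.0, 3.0])
```

The length check matters in practice because two synthesis runs must produce the same number of samples N for the comparison to be meaningful.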

Figure 5. Waveform generated from the input text "bộ giáo dục và đào tạo" by the proposed HTS (a) and by HTS built on a PC platform (b)

Applying Eq. (1) to waveforms generated from different input texts, we obtain the results in Table 3.

Table 3. Mean square error between waveforms generated by proposed HTS and HTS built on PC-platform

Input text | MSE
bộ giáo dục và đào tạo | 0.034
đại học khoa học tự nhiên | 0.020

REFERENCES

[1] Tokuda K., Zen H., & Black A. W. (2002, September). An HMM-based speech synthesis system applied to English. In Speech Synthesis, 2002. Proceedings of 2002 IEEE Workshop on (pp. 227-230). IEEE.

[2] Tokuda K., Masuko T., Miyazaki N., & Kobayashi T. (2002). Multi-space probability distribution HMM. IEICE Transactions on Information and Systems, 85(3), 455-464.

[3] Tokuda K., Masuko T., Miyazaki N., & Kobayashi T. (1999, March). Hidden Markov models based on multi-space probability distribution for pitch pattern modeling. In Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on (Vol. 1, pp. 229-232). IEEE.

[4] Yoshimura T., Tokuda K., Masuko T., Kobayashi T., & Kitamura T. (1998, December). Duration modeling for HMM-based speech synthesis. In ICSLP (Vol. 98, pp. 29-31).

[5] Yoshimura T., Tokuda K., Masuko T., Kobayashi T., & Kitamura T. (1999). Simultaneous Modeling of Spectrum, Pitch and Duration in HMM-Based Speech Synthesis. In Sixth European Conference on Speech Communication and Technology.

[6] Tokuda K., Yoshimura T., Masuko T., Kobayashi T., & Kitamura T. (2000, June). Speech parameter generation algorithms for HMM-based speech synthesis. In Acoustics, Speech, and Signal Processing, 2000. ICASSP'00. Proceedings. 2000 IEEE International Conference on (Vol. 3, pp. 1315-1318). IEEE.

[7] Fukada T., Tokuda K., Kobayashi T., & Imai S. (1992, March). An adaptive algorithm for mel-cepstral analysis of speech. In Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on (Vol. 1, pp. 137-140). IEEE.

[8] Tokuda K., Kobayashi T., Masuko T., & Imai S. (1994, September). Mel-generalized cepstral analysis - a unified approach to speech spectral estimation. In ICSLP.

[9] SPTK Working Group (2013, December). Reference Manual for Speech Signal Processing Toolkit Ver. 3.7. http://sp-tk.sourceforge.net/

[10] HTS Working Group. HMM-based Speech Synthesis Engine (hts_engine API) Ver. 1.06. http://htsengine.sourceforge.net/

[11] Pham N. M., Dau D. N., & Vu Q. H. (2013). Distributed Web Service Architecture Towards Robotic Speech Communication: A Vietnamese Case Study. Int. J. Adv. Robotic Sy., 10(130).

[12] Taylor P. (2009). Text-to-speech synthesis. Cambridge University Press.

[13] Kim S. J., Kim J. J., & Hahn M. (2006). HMM-based Korean speech synthesis system for hand-held devices. Consumer Electronics, IEEE Transactions on, 52(4), 1384-1390.

[14] Khalil K. M., & Adnan C. (2013, March). Arabic HMM-based speech synthesis. In Electrical Engineering and Software Applications (ICEESA), 2013 International Conference on (pp. 1-5). IEEE.

[15] Nguyen H. B., Cao T. B. T., Bui T. T., & Huynh H. T. (2013, November). A Performance Evaluation of HMM Based Text-to-Speech System on Various Platforms. Proceedings of ICDV-2013, pp. 265-267.

VIII-O-3

DESIGNING A HIGH PERFORMANCE CRYPTOSYSTEM FOR VIDEO STREAMING APPLICATION

Nguyen Van Toan 1, Do Quoc Minh Dang 1, Nguyen Duc Phuc 1, Nguyen Dinh Thuc 2, Huynh Huu Thuan 1

1 Faculty of Electronics and Telecommunications, HCMC University of Science

2 Faculty of Information Technology, HCMC University of Science

ABSTRACT

This paper presents the hardware design of a high-performance cryptosystem for video streaming applications. The proposed system combines two cryptographic algorithms, a symmetric key algorithm and an asymmetric key algorithm (also called a public key algorithm), to benefit from both. The symmetric key algorithm (ZUC) is used to encrypt/decrypt video, and the public key algorithm (RSA) performs encryption/decryption of the secret key. This architecture has high performance, including high security and a high processing bit rate. High security is achieved thanks to the ease of key distribution of the asymmetric key cryptosystem and because the secret key can be easily changed. The high processing bit rate of video encryption/decryption results from the high encryption/decryption speed of the symmetric key algorithm. An H.264 video decoder is also integrated into the system to test the functionality of the proposed cryptosystem. The system is implemented in Verilog-HDL, simulated using the ModelSim simulator, and evaluated using an Altera Stratix IV-based development kit. The speed of video decryption reaches up to 4.0 Gbps at an operating frequency of 125 MHz, which satisfies applications with high bandwidth requirements such as video streaming.

Keywords: cryptosystem, encryption, decryption, RSA, ZUC, FPGA

INTRODUCTION

Information security is now a subject of high interest. The development of computer networks, particularly the Internet, means that more and more applications and services are carried out electronically, for example PayTV, video streaming, internet banking, and so on. Since the information of these applications and services may be transmitted over insecure channels, information security becomes essential, and this growing demand makes cryptography important.

Symmetric key cryptography uses the same key for both encryption and decryption. The advantage of symmetric key algorithms is that their execution is fast [1]. However, the critical issue of a symmetric key cryptosystem is secret key distribution. A public key algorithm, on the other hand, uses a pair of keys (public key and private key) to perform data encryption and decryption. The advantage of a public key cryptosystem is that providing public keys is easier than distributing secret keys securely [2]. However, public key algorithms execute much more slowly than symmetric key algorithms. The hybrid cryptographic system in [2] was implemented by combining the Advanced Encryption Standard (AES), the Data Encryption Standard (DES) and a public key algorithm (RSA), which brings benefits in key distribution and high security [2]. Each data block is encrypted using AES or DES, while the secret keys are encrypted using the RSA algorithm. The encrypted secret key is then concatenated with the encrypted data to form the packets sent to the destination. This implementation does not need a separate key exchange [2]. However, every data block contains the encrypted key, and each data block is encrypted using a different session key, which does not save transmission bandwidth. In addition, the system must decrypt the secret key completely before data decryption, which is not appropriate for video streaming applications. The system proposed in [3] included the 1024-bit RSA algorithm, 163-bit Elliptic Curve Cryptography (ECC), and 128-bit AES. In this system, AES was used to encrypt the transferred document to produce cipher-text, and RSA (or ECC) provided encryption/decryption for the secret key. This system also achieves high security. However, it does not allow the secret key to be changed during data transfer. In both works [2], [3], the AES cryptosystem (a block cipher) was used to encrypt data. The drawbacks of a block cipher are: (1) a data block needs to be padded if its size is less than the block size, (2) it suffers from error propagation, and (3) its encryption/decryption speed is lower than that of a stream cipher.

Our proposed cryptosystem combines the ZUC stream cipher [4] and the public key cipher RSA with a 1024-bit key length. RSA is a widely used public key algorithm [1]. The ZUC cipher is a new stream cipher that will be commonly used in many countries [5]. It is simple and faster than a block cipher [1]. The video content is encrypted/decrypted using the ZUC algorithm, and the secret key is encrypted/decrypted using the RSA algorithm. The encrypted symmetric key is then concatenated with the encrypted video to form the transmitted packets. In addition, our system allows the secret key to be changed. When the key does not change, the encrypted key is not present in the transmitted packets, which saves transmission bandwidth. Additionally, the system can decrypt a new secret key and video in parallel: while the RSA core is decrypting a new secret key, the ZUC core still uses the current secret key for data decryption. This feature was not implemented in the existing systems [2-3], and it is difficult to implement in software. Our proposed system achieves high security and speed, which makes it very suitable for real-time applications. In this paper, we focus on the implementation of the hardware architecture of a cryptosystem for video streaming applications.

SYSTEM ARCHITECTURE

The overall block diagram of the proposed embedded system

Figure 1. The overall block diagram of the proposed embedded system (blocks: Avalon switch fabric, Ethernet, DMA, cryptosystem (RSA, ZUC), decoder, FIFO, DDR3 (A), Nios II, DDR3 (B), display controller)

The block diagram of the proposed embedded system is shown in Figure 1. The encrypted data (the encrypted secret key and the encrypted video stored on the server) is streamed to the evaluation board via the Ethernet interface and stored in DDR3 (A). The DMA module reads the encrypted data from DDR3 (A) and pushes it into a FIFO. The cryptosystem reads the encrypted data from the FIFO to decrypt the video content. First, the RSA coprocessor decrypts the secret key. Then the ZUC coprocessor uses that secret key to generate a keystream to decrypt the video content (video in compressed H.264 format), and the video content is pushed into another FIFO. When the video content is available in the FIFO, the H.264 video decoder decodes it and writes it to DDR3 (B). Finally, the display controller reads the video from DDR3 (B) and sends it to the display device. The H.264 decoder module can decode H.264/AVC baseline profile video at VGA resolution (640x480) with 25 frames per second at a clock frequency of 25 MHz. The output frame format is the 4:2:0 YCbCr sampling format.

The block diagram of the proposed cryptosystem

Our proposed cryptosystem is the combination of the ZUC algorithm and the RSA algorithm. The RSA algorithm is used to encrypt/decrypt the secret key (the key of the ZUC algorithm). The ZUC algorithm provides encryption/decryption of the video content. Figure 2 illustrates our proposed cryptosystem.

The DECRYPT CONTROLLER reads the encrypted secret key from the FIFO into its registers. The RSA coprocessor then decrypts the secret key. When the RSA coprocessor completes its decryption, it notifies the ZUC coprocessor by asserting the zuc_key_valid signal. The ZUC coprocessor then loads the secret key into its LFSR and produces a keystream. The video content is recovered by XORing the encrypted video with the generated keystream. The decrypted video is stored in the FIFO. Whenever the secret key needs to be changed (indicated by the signaling in the header of the received packets), RSA decrypts the new secret key while ZUC still uses the current key to produce the keystream for decrypting video content. As soon as the RSA coprocessor completes its operation and the signaling in the received packet indicates that the new secret key should be applied, the ZUC coprocessor uses the new secret key to generate a keystream for the next decryption. Figure 3 shows the frame format of each transmitted packet. It is made of the encrypted video, the encrypted secret key and the signaling. The signaling indicates: (1) when a new encrypted secret key is coming, and (2) when the new secret key is applied.
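The XOR step of stream-cipher decryption can be sketched in Python as follows. This is not an implementation of ZUC: the keystream words below are placeholder values, the big-endian byte order is an assumption, and the function name `xor_with_keystream` is hypothetical. The sketch only shows why the same operation both encrypts and decrypts.

```python
def xor_with_keystream(data: bytes, keystream_words) -> bytes:
    """XOR `data` with a keystream given as a list of 32-bit words.

    Each word is expanded to 4 bytes in big-endian order (an assumption).
    """
    keystream = b"".join(w.to_bytes(4, "big") for w in keystream_words)
    if len(keystream) < len(data):
        raise ValueError("keystream is shorter than the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


# Placeholder keystream words; a real system would take them from the cipher.
keystream_words = [0x01234567, 0x89ABCDEF]
plain_video = b"H264data"

encrypted = xor_with_keystream(plain_video, keystream_words)
recovered = xor_with_keystream(encrypted, keystream_words)
```

Because XOR is its own inverse, the receiver only needs to regenerate the same keystream from the same secret key to recover the video content.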


Figure 2 The proposed cryptographic system (blocks: DECRYPT CONTROLLER, RSA, ZUC; signals: reset_n, clk, enable, fifo_rd_req, fifo_almost_empty, data_fr_fifo, fifo_wr_req, fifo_almost_full, zuc_key_valid, zuc_key, ctrl_sig_zuc, ctrl_sig_rsa)

Figure 3 Encrypted packet

The advantages of our system are as follows:

High security is achieved because the secret key is encrypted by the RSA algorithm, and no separate key establishment is needed before data transfer.

The secret key can be changed at any time without the key re-establishment required in a traditional cryptosystem.

The system saves transmission bandwidth by eliminating the encrypted secret key from the packets sent when the key does not change.

The system decrypts a new secret key and the encrypted video in parallel, which improves the quality of service, e.g., video decryption is performed continuously and smoothly.

Design of ZUC

ZUC is a word-oriented stream cipher [4]. It takes a 128-bit initial key and a 128-bit initial vector as input, and outputs a keystream of 32-bit words. The architecture of the ZUC stream cipher is shown in Figure 4. The top layer is a linear feedback shift register (LFSR) that consists of sixteen 31-bit registers. The middle layer is the bit reorganization (BR), which extracts 128 bits from the LFSR registers to form four 32-bit words. The first three words are the inputs of the nonlinear function F, and the last word is used in keystream generation. The bottom layer is the nonlinear function F, which takes three words X0, X1, X2 as inputs and outputs a 32-bit word W. The output keystream is shifted into a 32-bit register.

The LFSR has two operation modes: initialization mode and working mode. In initialization mode, the LFSR receives 31 bits of W (bits 31 to 1) as its input. In working mode, the LFSR does not receive any input, and produces a 32-bit word per clock cycle. In the hardware implementation, we use a multiplexer to select the input for these modes. We found that the critical path in the ZUC architecture is the circuit used to update the LFSR in the initialization stage and the working stage. There is a chain of six modulo $(2^{31} - 1)$ additions to compute the value of $S_{16}$. Therefore, optimizing the timing of this critical path improves the operating frequency of the ZUC core. The expression of $S_{16}$ is given in equation (4).

$v = 2^{15} S_{15} + 2^{17} S_{13} + 2^{21} S_{10} + 2^{20} S_{4} + (1 + 2^{8}) S_{0} \bmod (2^{31} - 1)$ (3)

$S_{16} = [v + (W \gg 1)] \bmod (2^{31} - 1)$ (4)
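As a software reference for equations (3) and (4), the following minimal sketch computes the LFSR feedback with plain integers. It assumes the usual properties of arithmetic modulo $2^{31} - 1$: multiplying by $2^k$ is a 31-bit cyclic left rotation, and an addition can be reduced with an end-around carry. The function names are illustrative and are not taken from the authors' design; the final replacement of 0 by $2^{31} - 1$ follows the ZUC specification [4].

```python
M31 = (1 << 31) - 1  # the modulus 2^31 - 1


def rotl31(x: int, k: int) -> int:
    """Multiply x by 2^k modulo 2^31 - 1 (31-bit cyclic left rotation)."""
    return ((x << k) | (x >> (31 - k))) & M31


def add_m31(a: int, b: int) -> int:
    """Add two 31-bit values modulo 2^31 - 1 using an end-around carry."""
    c = a + b
    return (c & M31) + (c >> 31)


def lfsr_feedback(s, w=None):
    """Return S16 from the 16 cells s[0..15].

    w is the 32-bit word W in initialization mode, or None in working mode.
    """
    v = s[0]  # the "1 * S0" part of (1 + 2^8) * S0
    for cell, k in ((s[0], 8), (s[4], 20), (s[10], 21), (s[13], 17), (s[15], 15)):
        v = add_m31(v, rotl31(cell, k))  # five additions for equation (3)
    if w is not None:
        v = add_m31(v, w >> 1)  # sixth addition, equation (4)
    return v if v != 0 else M31  # the value 0 is represented as 2^31 - 1
```

The loop performs the additions one after another, which is exactly the serial chain the text identifies as the critical path.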

We propose to use carry save adders (CSA) to calculate the intermediate values and a ripple carry adder to calculate the final result. The hierarchical CSA tree is shown in Figure 5. In this architecture, one multiplexer selects the mode of the LFSR: initialization mode or working mode. To perform modulo $(2^{31} - 1)$ addition, the carry of each CSA addition is cyclically left-shifted by one bit. This implementation improves timing significantly because the delay of a CSA is exactly equal to the delay of a 1-bit full adder.
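The cyclic shift of the carry can be checked with a small bit-level model. The sketch below is an illustration rather than the authors' circuit: `csa_m31` models one 3-to-2 carry-save stage whose carry vector is rotated left by one bit (because $2 \cdot 2^{30} \equiv 1 \pmod{2^{31}-1}$), and `final_add_m31` models the final carry-propagating addition.

```python
M31 = (1 << 31) - 1  # the modulus 2^31 - 1


def csa_m31(a: int, b: int, c: int):
    """One 3-to-2 carry-save stage for 31-bit operands modulo 2^31 - 1."""
    s = a ^ b ^ c                            # bitwise sum, no carry propagation
    carry = (a & b) | (a & c) | (b & c)      # bitwise majority = carry bits
    carry = ((carry << 1) | (carry >> 30)) & M31  # cyclic left shift by one bit
    return s, carry


def final_add_m31(s: int, carry: int) -> int:
    """Final addition of the sum and carry vectors with an end-around carry."""
    t = s + carry
    return (t & M31) + (t >> 31)
```

Each stage costs one full-adder delay regardless of word length, so several operands can be compressed before a single carry-propagating addition is paid for.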

Figure 4 Architecture of ZUC

Algorithm 1 implements the Montgomery multiplication. The addition of long operands in the loop is performed by a 3-to-2 carry save adder (CSA). To get the final result, we need to add the carry output and the sum output of the CSA. In this paper, we use a 32-bit RCA and a shift register to implement this final addition because of its simplicity and area saving. It takes (k+3+k/32) clock cycles to complete the Montgomery multiplication, where k is the size of the operands and k/32 is the number of clock cycles needed to complete the final addition. Figure 6 shows the CSA-based Montgomery multiplier.
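For reference, a bit-serial Montgomery multiplication with $r = 2^{k+2}$ can be sketched as below. This is a plain-integer model written under the assumptions that n is odd, n < 2^k, and both operands are below 2n; it is not the authors' Algorithm 1 (whose body is not reproduced here) and it does not model the CSA or the 32-bit RCA. `clock_cycles` simply evaluates the cycle count quoted in the text.

```python
def montgomery_multiply(x: int, y: int, n: int, k: int) -> int:
    """Return a value congruent to x*y*r^-1 mod n, with r = 2^(k+2).

    Assumes n is odd, n < 2^k and x, y < 2n; the result is then below 2n,
    so no subtraction is needed between chained multiplications.
    """
    p = 0
    for i in range(k + 2):
        p += ((x >> i) & 1) * y   # add y when bit i of x is set
        if p & 1:                 # make p even by adding the odd modulus
            p += n
        p >>= 1                   # exact division by 2
    return p


def clock_cycles(k: int) -> int:
    """Cycle count (k + 3 + k/32) quoted in the text for one multiplication."""
    return k + 3 + k // 32
```

For k = 1024, the quoted formula gives 1024 + 3 + 32 = 1059 clock cycles per multiplication.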


Algorithm 1 – Montgomery multiplication by using

//Output: $p = x y r^{-1} \bmod n$ with $r = 2^{k+2}$

Algorithm 2 – Modular exponentiation, L-R method

Figure 6 Montgomery multiplier (diagram labels: register, ss, sc, Control Unit, ss(0), load, shift, x, n, y, x(i))

Figure 7 Modular exponentiation using MP (diagram labels: Montgomery multiplier, register with initial value b, start, done, i, x)

Algorithm 2 implements the modular exponentiation by using the Montgomery multiplier. In this algorithm, C is the operand, with a length of 1024 bits, and $d_i$ denotes the bits of the exponent, which also has a length of 1024 bits. The block diagram of the modular exponentiation is shown in Figure 7. This architecture uses only one Montgomery multiplier. Two multiplexers are used to select the inputs of the Montgomery multiplier. Based on the input value $d_i$, the control block determines the values of sel_1 and sel_2.
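The left-to-right (L-R) method with a single Montgomery multiplier can be sketched as follows. This is a software model under assumptions, not the authors' Algorithm 2: `mont` is a plain-integer Montgomery multiplication with $r = 2^{k+2}$, and the choice between squaring and multiplying by the operand plays the role that the multiplexer selects sel_1 and sel_2 play in the hardware.

```python
def mont(x: int, y: int, n: int, k: int) -> int:
    """Montgomery product x*y*r^-1 mod n (result < 2n), r = 2^(k+2), n odd."""
    p = 0
    for i in range(k + 2):
        p += ((x >> i) & 1) * y
        if p & 1:
            p += n
        p >>= 1
    return p


def mod_exp_lr(c: int, d: int, n: int, k: int) -> int:
    """Compute c^d mod n by scanning the exponent bits from left to right."""
    r2 = pow(1 << (k + 2), 2, n)   # r^2 mod n, assumed to be precomputed
    c_bar = mont(c, r2, n, k)      # operand in the Montgomery domain: c*r
    a = mont(1, r2, n, k)          # 1 in the Montgomery domain: r
    for i in range(d.bit_length() - 1, -1, -1):
        a = mont(a, a, n, k)       # always square
        if (d >> i) & 1:
            a = mont(a, c_bar, n, k)  # multiply only when bit d_i is 1
    return mont(a, 1, n, k) % n    # leave the Montgomery domain
```

Each exponent bit costs one squaring plus at most one multiplication, which is where the factor of about 2·kd in the cycle count comes from.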

RESULTS AND DISCUSSION

Experimental results of ZUC and RSA

The ZUC implementation passed all test sets provided in the ZUC Implementor’s Test Data [8]. All stages of the ZUC core have been implemented in hardware. To make a fair comparison, the implementation is synthesized with both Quartus II (Altera) and ISE (Xilinx). The authors of [5] implemented a pipelined architecture that achieves a maximum operating frequency of 222 MHz. However, it costs more hardware resources and has a higher latency (4 extra clock cycles), and its initialization stage was implemented in software to reduce hardware resources. The proposal in [6] used ripple carry adders in series, which limits the operating frequency of the circuit. Our proposal uses a hierarchical CSA tree and RCAs, and achieves a throughput of up to 4.45 Gbps on Virtex 5 and 4.0 Gbps on the Stratix IV EP4SGX230KF40C2 FPGA.

Table 1 ZUC results and comparison

Architecture | Technology | Slices/ALUTs | Frequency (MHz) | Bit rate (Gbps)
Our proposal | EP4SGX230KF40C2 | 1166 ALUTs | 125 | 4.0
Our proposal | XC5VLX50-3FF324 | 384 slices | 139 | 4.45


In the RSA implementation, we use a 3-to-2 CSA and a 32-bit RCA to implement the Montgomery multiplier, which is technology independent. It takes 2(k+3+k/32)*kd clock cycles to complete the modular exponentiation, where k is the bit length of the modulus, k/32 is the number of clock cycles needed to complete the final addition (sum and carry) in the Montgomery multiplication, and kd is the bit length of the key. Compared with the systolic architecture [3], our implementation has a higher operating frequency. The architecture in [9] used a 4-to-2 CSA to implement the Montgomery multiplication. However, this costs some extra registers to store the intermediate results of the CSA.

Table 2 1024-bit RSA results and comparison

Architecture | Technology | Resources | Frequency (MHz) | Number of clock cycles
Our proposal | EP4SGX230KF40C2 | 16964 | 214.10 | (k+3+k/32)(2kd+1)

Experimental results of the proposed cryptosystem

The design is synthesized with the Quartus II tool for the Stratix IV FPGA EP4SGX230KF40C2. The results show that our proposed system allows the secret key to be changed. At the operating frequency of 125 MHz, the total processing bit rate is 4.0 Gbps, which satisfies the bandwidth required by the video streaming application. Figures 8 and 9 show the decryption process for the video content. The original video content is recovered by XORing the generated keystream and the encrypted video. Figure 9 shows the new secret key being applied when the signaling value is 0x2.

Figure 8 The result captured by SignalTap Logic Analyzer (using the first key)

Figure 9 The result captured by SignalTap Logic Analyzer (using the second key)

To test the operation of our cryptosystem, we integrated an H.264 decoder into our system (Figure 1) to decode the video content. Figure 10 shows the video content in memory, captured by the In-system Memory Content Editor tool that is integrated into Quartus II. Figure 11 shows one video frame displayed on the display device.


Figure 10 Video captured by In-system Memory Editor

Figure 11 Video content displayed on display device

CONCLUSION

This paper presents a high-performance cryptosystem that has been implemented and prototyped on the Stratix IV FPGA EP4SGX230KF40C2. The experimental results show that key exchange does not need to be performed on a dedicated channel as in a traditional cryptosystem. In addition, the key can be changed during a session, which strengthens the security of the cryptosystem. The decryption bit rate of this architecture is up to 4.0 Gbps at an operating frequency of 125 MHz, which is high enough for real-time applications such as video streaming. In this implementation, we focus not only on improving the operating frequency but also on optimizing the hardware resources.

Acknowledgement. The authors would like to thank CESLab for technical support and for providing the FPGA evaluation board. The Department of Science and Technology of Ho Chi Minh City funded this research.

HARDWARE DESIGN OF A HIGH-PERFORMANCE CRYPTOSYSTEM FOR VIDEO STREAMING APPLICATIONS

Nguyễn Văn Toàn 1, Đỗ Quốc Minh Đăng 1, Nguyễn Đức Phúc 1, Nguyễn Đình Thúc 2, Huỳnh Hữu Thuận 1

1 Faculty of Electronics and Telecommunications, University of Science, VNU-HCM

2 Faculty of Information Technology, University of Science, VNU-HCM

ABSTRACT

This paper presents the hardware design of a high-performance cryptosystem for video streaming applications. The proposed system combines two encryption algorithms in order to exploit the advantages of both. The symmetric cipher ZUC is used to encrypt/decrypt the video, while the public-key algorithm RSA encrypts/decrypts the secret key. This architecture achieves high performance in terms of both security and processing (encryption/decryption) speed. The system achieves high security thanks to the easy secret-key exchange of the public-key cryptosystem. Thanks to the high encryption/decryption speed of the symmetric-key algorithm, the encryption/decryption speed of the whole system is very high.

An H.264 video decoder is also integrated into the system to test the functionality of the cryptosystem. The system is implemented in hardware using the Verilog-HDL hardware description language, simulated with the ModelSim simulator, and tested and evaluated on an Altera kit with a Stratix IV FPGA. The decryption rate achieved by the system is up to 4.0 Gbps at an operating frequency of 125 MHz, which satisfies video streaming applications.

Keywords: cryptosystem, encryption, decryption, RSA, ZUC, FPGA

REFERENCES

[1] A. Menezes, P. Oorschot, S. Vanstone, “Handbook of Applied Cryptography”, CRC Press, 1997

[2] Adnan Abdul-Aziz Gutub, Farhan Abdul-Aziz Khan, “Hybrid Crypto Hardware Utilizing Symmetric-Key & Public-Key Cryptosystems”, 2012 International Conference on Advanced Computer Science Applications and Technologies, IEEE

[3] Mohamed Khalil Hani, Hau Yuan Wen, Arul Paniandi, “Design and Implementation of a Private and Public Key Crypto Processor for Next-Generation IT Security Applications”, Malaysian Journal of Computer Science, Vol. 19 (1), 2006, pp. 29-45

[4] ETSI/SAGE Specification. Specification of the 3GPP Confidentiality and Integrity Algorithms 128-EEA3 & 128-EIA3. Document 2: ZUC Specification; Version: 1.6; Date: 28th June 2011

[5] Lei Wang, et al., “Evaluating Optimized Implementations of Stream Cipher ZUC Algorithm on FPGA”, Springer, 2011, pp. 202-215

[6] Paris Kitsos, Nicolas Sklavos, Athanassios N. Skodras, “An FPGA Implementation of the ZUC Stream Cipher”, 14th Euromicro Conference on Digital System Design, 2011, IEEE

[7] C. McIvor, M. McLoone, J.V. McCanny, “Fast Montgomery Modular Multiplication and RSA Cryptographic Processor Architectures”, Conference Record of the Thirty-Seventh Asilomar Conference, pp. 379-384, 2003

[8] ETSI/SAGE Specification. Specification of the 3GPP Confidentiality and Integrity Algorithms 128-EEA3 & 128-EIA3. Document 3: Implementor’s Test Data; Version: 1.1; Date: 4th Jan 2011

[9] Wen nuan, Dai Zi bin, Zhang Yong Fu, “FPGA Implementation of Alterable Parameters RSA Public-Key Cryptographic Coprocessor”, IEEE, 2005


Key words: Graphene, graphene nanoribbon FET, non-equilibrium Green’s function, current-voltage characteristics

INTRODUCTION

Graphene [1-8] has been one of the most vigorously studied research materials since it was first isolated in 2004. Graphene has attracted considerable attention from the scientific community due to its excellent electronic properties, such as high electron and hole mobilities even at room temperature and at high doping concentration [9], its high thermal conductivity [10], and its interesting optical properties [11]. 2D graphene is a gapless material, which makes it unsuitable for digital IC applications. However, an energy bandgap can be induced by tailoring a graphene sheet into graphene nanoribbons (GNRs), also called 1D graphene [12]. Depending on the orientation of the ribbon edges, a GNR can have zigzag edges, armchair edges, or a combination of the two [13]. In order to obtain a suitable bandgap for transistor applications, the width of the GNR must be scaled to extremely small values. The bandgap energy of a narrow GNR is inversely proportional to its width. In narrow GNRs, line-edge roughness plays an important role in the device characteristics [14-20]. The effect of line-edge roughness on the device performance of the GNR field-effect transistor (GNR-FET) has been studied numerically in [14-15, 21].

In this paper, device performance is investigated using a top-gate GNR-FET model. We study electronic transport in a GNR-FET that uses a narrow GNR channel at the sub-10 nm scale. The device characteristics are explored using the non-equilibrium Green’s function method. Based on the results obtained, the on-off current ratio of the GNR-FET for digital IC applications is calculated. This work is organized as follows: Section 2 describes the channel materials used for the GNR-FET, the simulation method, and the simulation results. Concluding remarks are drawn in Section 3.

MATERIAL AND SIMULATION METHOD

Graphene channel materials

Bandgap engineering. In modern electronics, bandgap formation is the key concept for switching current and, thus, for processing electric signals.

Although graphene has great advantages for use in electronics applications, including atomically thin channels, high mobility, and large electric field effects, its semi-metallic electronic band structure makes the creation of a graphene transistor quite challenging.

So far, several methods have been proposed for introducing a bandgap in graphene. Among them, the most promising are graphene nanoribbons. In this section, we briefly review theoretical predictions, experimental results, and the major challenges of bandgap formation in graphene.

Graphene nanoribbons. In quantum mechanical systems, the confinement of carriers leads to discrete energy levels. This is also the case in graphene; however, some differences are seen because of its peculiar lattice structure.

Thin graphene wires are called graphene nanoribbons. Two common structures, armchair and zigzag nanoribbons (Figure 1), have been intensely studied theoretically.

Theoretical predictions. In the following theoretical treatment of graphene nanoribbons, the graphene edges are assumed to be passivated by hydrogen, as illustrated in Figure 1.


In the tight binding (TB) approximation for π-electrons in graphene, armchair graphene nanoribbons are metallic when the number of carbon atoms across the ribbon width, $N_a$, satisfies the relation $N_a = 3p+2$ (where p is a positive integer), and are semiconducting otherwise. The energy gap $\Delta_{N_a}$ is inversely proportional to the width within each group, $N_a = 3p$ or $N_a = 3p+1$.
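The TB selection rule for armchair ribbons is easy to state in code. The helper below is a hypothetical illustration of the rule quoted above (the function name is not from the paper); it applies only to the TB picture and not to the first-principles results.

```python
def armchair_gnr_type_tb(n_a: int) -> str:
    """Classify an armchair GNR in the tight-binding approximation.

    n_a is the number of carbon atoms across the ribbon width.
    """
    if n_a < 3:
        raise ValueError("the rule is stated for N_a = 3p, 3p+1, 3p+2 with p >= 1")
    return "metallic" if n_a % 3 == 2 else "semiconducting"
```

For example, ribbons with 5, 8 or 11 atoms across the width are metallic in this approximation, while those with 6 or 7 are semiconducting.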

Zigzag nanoribbons in the TB approximation are metallic and have flat bands at energy $E = 0$.

In first-principles calculations using the local spin density approximation (LSDA), the result is significantly different from that discussed above. Specifically, all of the armchair and zigzag nanoribbons are semiconducting, with gaps that depend on the ribbon width.

The energy gap of zigzag nanoribbons in the LSDA calculation, Δ, is well fitted by

(1) for the ribbons width w > 1 nm

Figure 1 Two kinds of graphene nanoribbons: a) armchair and b) zigzag. $N_a$ and $N_z$ denote the number of carbon atoms across the ribbon width in armchair and zigzag nanoribbons, respectively. White circles indicate hydrogen atoms passivating the graphene edges.

The magnitude of the gaps is presented in Figure 2

Figure 2 Energy gaps in graphene nanoribbons

Experiments. Graphene nanoribbons have been made by various methods, including electron beam lithography followed by oxygen plasma etching [22-25], and chemical derivation [26-29]. The main challenge in gap formation in graphene nanoribbons is the suppression of structural disorder. Structural disorder causes weak localization and the Coulomb blockade effect, and suppresses the mobility.

Lithographically defined graphene nanoribbons were first reported by Han et al. in 2007 [22]. After contacting a graphene flake with Cr/Au (3/50 nm) electrodes, they produced a graphene nanoribbon from the flake by oxygen plasma etching. They estimated the magnitude of the energy gap, and found that the energy gap $E_g$ is well fitted by

$E_g = \frac{a}{w - w^*}$ (2)

where w is the ribbon width, a = 0.2 eV nm, and w* = 16 nm. Han et al. attributed the inactive width w* to the contribution from localized edge states near the ribbon edge, caused by structural disorder from the etching process. Graphene nanoribbons have also been made by chemical exfoliation. Li et al. [26] obtained graphene nanoribbons with edges that appeared smoother than those obtained lithographically.
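To get a feel for the numbers, the fit can be evaluated directly. The sketch below assumes the fit has the form $E_g = a/(w - w^*)$, which is the form implied by the units of the two parameters given above (a in eV nm, w* in nm); the function name is illustrative.

```python
A_EV_NM = 0.2      # fit parameter a, in eV*nm (from the text)
W_STAR_NM = 16.0   # inactive width w*, in nm (from the text)


def gap_lithographic_ev(w_nm: float) -> float:
    """Energy gap in eV for a ribbon of width w_nm, assuming E_g = a / (w - w*)."""
    if w_nm <= W_STAR_NM:
        raise ValueError("the fit is only meaningful for widths above w*")
    return A_EV_NM / (w_nm - W_STAR_NM)
```

Under this assumption, a 26 nm wide ribbon has a gap of about 0.02 eV, and the gap grows quickly as the width approaches the inactive width.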

Graphene nanoribbons with widths ranging from 50 nm down to the sub-10 nm scale were obtained by this method. The room-temperature on-off current ratio $I_{on}/I_{off}$ induced by the back-gate voltage increased exponentially with decreasing ribbon width; it reached $10^7$ in sub-10 nm ribbons. Here, the on (off) current $I_{on}$ ($I_{off}$) is defined as the maximum (minimum) value of the source-drain current I for a fixed bias (source-drain) voltage V within a measured gate voltage range. The energy gap $E_g$ estimated from the relationship

(3)

was converted into an empirical form

(4) and falls between the limits of theoretical results (Figure 2)

Wang et al. [28] reported that even in smooth, chemically derived graphene nanoribbons with sub-10 nm widths, the mobility was limited to 200 cm$^2$/Vs and the mean free path was limited. These values are significantly smaller than those for wider graphene devices, and were attributed to scattering at the edges caused by edge roughness.

Top-gate graphene nanoribbons FET

In this sub-section, the effect of the geometrical parameters on the transfer characteristics and performance of the GNR-FET is investigated. A top-gate GNR-FET with an Al2O3 gate oxide of relative dielectric constant $\varepsilon_r = 9.8$ is assumed [30]. The graphene monolayer flake is exfoliated from bulk natural graphite crystals by micromechanical cleavage. The substrate consists of a highly doped, n-type Si(100) wafer with an arsenic doping concentration of $N_D > 10^{20}$ cm$^{-3}$, on which a 300 nm-thick SiO2 layer is grown by thermal oxidation. Metal contacts on the sample are defined by electron beam lithography (EBL) followed by evaporation of a 50 nm-thick metal (Ni) layer and a lift-off process. A graphene FET with its source-drain separation and top-gate length is shown in Figure 3 [30].

Figure 3 Structure of the top-gate graphene field-effect transistor [30] used in our simulations

For all simulations, source and drain contact widths of 1 nm, a channel length of 10 nm, and room temperature are assumed. The channel of the top-gate GNR-FET is also assumed to be highly n-type doped (NH3 doping) in order to suppress the Schottky effect at the source-semiconductor-drain contacts of the device.

The flow of current is due to the difference in potentials between the source and the drain, each of which is in a state of local equilibrium but maintained at a different electrochemical potential $\mu_{1,2}$, and hence with two distinct Fermi functions [31]:

$f_1(E) = \frac{1}{1 + \exp[(E - \mu_1)/k_B T]}$

$f_2(E) = \frac{1}{1 + \exp[(E - \mu_2)/k_B T]}$

The two potentials are separated by the applied bias V: $|\mu_2 - \mu_1| = qV$. Here, E is the energy, $k_B$ is the Boltzmann constant, T is the temperature, and q is the elementary charge.
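The two contact Fermi functions can be evaluated with a few lines of code. This is a minimal sketch, assuming energies expressed in eV (so that a bias of V volts shifts the electrochemical potential by V eV) and a moderate ratio of energy to thermal energy so that the exponential does not overflow; the names are illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi(e_ev: float, mu_ev: float, t_k: float = 300.0) -> float:
    """Fermi function 1 / (1 + exp((E - mu) / (k_B * T))), energies in eV."""
    return 1.0 / (1.0 + math.exp((e_ev - mu_ev) / (K_B_EV * t_k)))


def occupation_difference(e_ev: float, mu1_ev: float, v_bias: float,
                          t_k: float = 300.0) -> float:
    """f1(E) - f2(E) when the second contact is lowered by the bias (mu2 = mu1 - V)."""
    return fermi(e_ev, mu1_ev, t_k) - fermi(e_ev, mu1_ev - v_bias, t_k)
```

The difference f1 - f2 is non-zero only in an energy window of width about qV (broadened by a few k_B T) between the two potentials, which is the window that carries the net current.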

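The density matrix is given by

$\rho = \int_{-\infty}^{+\infty} \frac{dE}{2\pi} \left[ f_1(E)\,A_1(E) + f_2(E)\,A_2(E) \right]$

Once it is known, we can calculate the current from (8). For coherent transport, one can calculate the transmission from the Green’s function method, using the relations

$A = A_1 + A_2 = i\,[G - G^{\dagger}]$

$A_{1,2}(E) = G\,\Gamma_{1,2}\,G^{\dagger}$

$\Gamma_{1,2} = i\,[\Sigma_{1,2} - \Sigma_{1,2}^{\dagger}]$

$G = [EI - H - \Sigma_1 - \Sigma_2]^{-1}$

where H is the device Hamiltonian and $\Sigma_{1,2}$ are the self-energies of the source and drain contacts.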
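The smallest possible instance of the Green’s function method is a single-site device between two identical one-dimensional leads, for which all matrices reduce to scalars. The sketch below is an illustration under assumptions and not the authors' simulator: the leads are semi-infinite tight-binding chains with zero on-site energy and hopping -t, whose retarded self-energy is $\Sigma = -t\,e^{ika}$ with $E = -2t\cos ka$, and the transmission is taken as the standard coherent-transport expression $T = \mathrm{Tr}[\Gamma_1 G \Gamma_2 G^{\dagger}]$.

```python
import cmath
import math


def lead_self_energy(e: float, t: float) -> complex:
    """Retarded self-energy of a semi-infinite 1D chain (on-site 0, hopping -t).

    Valid inside the band, |e| < 2t, where E = -2t*cos(ka).
    """
    ka = math.acos(-e / (2.0 * t))
    return -t * cmath.exp(1j * ka)


def transmission(e: float, eps: float, t: float) -> float:
    """Transmission through one site of energy eps between two identical leads."""
    sigma1 = lead_self_energy(e, t)
    sigma2 = lead_self_energy(e, t)
    g = 1.0 / (e - eps - sigma1 - sigma2)               # G = [E - H - S1 - S2]^-1
    gamma1 = (1j * (sigma1 - sigma1.conjugate())).real  # Gamma = i[S - S^dagger]
    gamma2 = (1j * (sigma2 - sigma2.conjugate())).real
    return gamma1 * gamma2 * abs(g) ** 2                # scalar form of Tr[G1 G G2 G+]
```

When the site energy matches the leads (eps = 0) the chain is perfect and the transmission is 1 throughout the band; any mismatch acts as a scatterer and lowers it.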

Results and discussion

The main goal of the project was to make a user-friendly simulation program that provides as much control as possible over every aspect of the simulation. Flexibility and ease of use are difficult to achieve simultaneously, but given the complexity of quantum device simulations it became clear that both criteria were vital to the program’s success. Consequently, graphical user interface development was a major part of the program.

We start by simulating the ID-VD characteristics of the top-gate GNR-FET. Figure 3 shows the schematic of the device used in our simulations. A top-gate GNR-FET with one-dimensional graphene as the channel is simulated. The device is simulated with Al2O3 as the dielectric, which a recent experiment suggests is one of the promising dielectrics for GNR-FETs [30]. All simulations have been done for a GNR-FET channel length of L = 10 nm.

Figure 4 shows the ID-VD characteristics of the GNR-FET with a length of 10 nm for different gate voltages. When the gate voltage is increased, the saturated drain current increases exponentially. This behavior is in agreement with experimental results [31].


Figure 4 The ID-VD characteristics of the top-gate GNR-FET at different gate voltages, VG = 0.1 V, 0.4 V, 0.6 V, 0.8 V (bottom to top)

Figure 5 shows the ID-VD characteristics of the top-gate GNR-FET with a length of 10 nm under ballistic transport and with phonon scattering. Scattering has an appreciable effect on the on-current: at VGS = 0.8 V, the on-current is reduced by 9% due to phonon scattering.

Figure 5 The ID-VD characteristics of the top-gate GNR-FET at VG = 0.8 V for ballistic transport and with scattering, where the length of the gate is LG = 10 nm

Figure 6 shows the ID-VD characteristics of the GNR-FET versus the gate voltage, VG. When the gate voltage is small, the drain current increases gradually. When the gate voltage is greater than VG = 0.3 V, the drain current increases exponentially. The modeling results agree well with experimental data [31].

Figure 6 The 3D plot of the ID-VD characteristics of the top-gate GNR-FET versus VG, where the length of the gate is LG = 10 nm

Figure 7 shows the 3D plot of the ID-VD characteristics of the GNR-FET versus the temperature, T. As the temperature increases, the saturated drain current gradually increases. We also observe that the off-current is about $1\times10^{-9}$ nA at very low temperature and low gate voltage, VG = 0.1 V. From Figures 4 and 7 we can calculate the on/off-current ratio, $I_{on}/I_{off} = 1\times10^{-5}$ nA / $1\times10^{-9}$ nA $= 10^{4}$.


Figure 7 The 3D plot of the ID-VD characteristics of the top-gate GNR-FET versus temperature. The GNR-FET parameters are: gate material Al2O3, gate length LG = 10 nm, gate thickness tox = 2 nm, and gate voltage VG = 0.1 V

The effect of channel length scaling on the device characteristics is also investigated. The ID-VD characteristics of the GNR-FET versus the length of the gate layer at room temperature are shown in Figure 8. As the length of the GNR-FET decreases, the saturated drain current gradually increases.

Figure 8 The 3D plot of the ID-VD characteristics versus the gate length of the top-gate GNR-FET at room temperature, T = 300 K. The parameters of the GNR-FET: gate material Al2O3, gate thickness tox = 2 nm

Figure 9 shows the ID-VD characteristics of the top-gate GNR-FET versus the gate thickness at room temperature. As the gate thickness tox of the GNR-FET is increased, the saturated drain current gradually decreases.

Figure 9 The 3D plot of the ID-VD characteristics of the top-gate GNR-FET versus the gate thickness, tox, at room temperature, T = 300 K. The parameters of the GNR-FET: gate material Al2O3, gate length LG = 10 nm


CONCLUSION

A model for the top-gate GNR-FET using NEGF, implemented with a Matlab GUI, has been reported. The top-gate GNR-FET has been simulated, and typical simulations have been performed successfully for various parameters of the GNR-FET to investigate its electronic transport. The model is able to describe not only the ID-VG and ID-VD characteristics of the GNR-FET accurately, but also the effects of channel material, gate material, device size, and temperature on these characteristics. The obtained results indicate that the performance of the GNR-FET in terms of on/off-current ratio is improved in narrow ribbons, while the conductance is degraded in longer channels. We also observe that the on/off-current ratio of the GNR-FET is $10^{4}$ for a width of 1 nm and a GNR length of 10 nm.

SIMULATION OF GRAPHENE NANORIBBON FIELD-EFFECT TRANSISTORS (GNR-FET)

Keywords: Graphene, GNR-FET, non-equilibrium Green’s function, current-voltage characteristics

REFERENCES

[1] K.S. Novoselov, A.K. Geim, S.V. Morozov, D. Jiang, Y. Zhang, S.V. Dubonos, I.V. Grigorieva, and A.A. Firsov, Electric field effect in atomically thin carbon films, Science, vol. 306, No. 5696, p. 666-669, 2004

[2] L. Jiao, L. Zhang, X. Wang, G. Diankov, and H. Dai, Narrow graphene nanoribbons from carbon nanotubes, Nature, vol. 458, p. 877-880, 2009

[3] X. Li, X. Wang, L. Zhang, S. Lee, H. Dai, Chemically derived, ultrasmooth graphene nanoribbon semiconductors, Science, vol. 319, No. 5867, p. 1229-1232, 2008

[4] K.I. Bolotin, K.J. Sikes, Z. Jiang, G. Fudenberg, J. Hone, P. Kim, and H.L. Stormer, Ultrahigh electron mobility in suspended graphene, Solid State Comm., vol. 146, p. 351-355, 2008

[5] M.S. Purewal, Y. Zhang, and P. Kim, Unusual transport properties in carbon based nanoscaled materials: nanotubes and graphene, Phys. Status Solidi (b), vol. 243, No. 13, p. 3418-3422, 2006

[6] J.S. Moon, D. Curtis, M. Hu, D. Wong, P.M. Campbell, G. Jernigan, J.L. Tedesco, B. Vanmil, R. Myers-Ward, C. Eddy, and D.K. Gaskill, Epitaxial graphene RF field-effect transistors on Si-face 6H-SiC substrates, IEEE Electron Device Lett., vol. 30, No. 6, p. 650-652, 2009

[7] Y.M. Lin, C. Dimitrakopoulos, K.A. Jenkins, D.B. Farmer, H.Y. Chiu, A. Grill, Ph. Avouris, 100-GHz transistors from wafer-scale epitaxial graphene, Science, vol. 327, No. 5966, p. 662, 2010

[8] Y.Q. Wu, P.D. Ye, M.A. Capano, Y. Xuan, Y. Sui, M. Qi, J.A. Cooper, T. Shen, D. Pandey, G. Prakash, and R. Reifenberger, Top-gate graphene field effect transistors formed by decomposition of SiC, Appl. Phys. Lett., vol. 92, No. 9, p. 092102, 2008

[9] F. Schedin, A.K. Geim, S.V. Morozov, E.W. Hill, P. Blake, M.I. Katsnelson, and K.S. Novoselov, Detection of individual gas molecules adsorbed on graphene, Nature Materials, vol. 6, No. 9, p. 652-655, 2007

[10] A.A. Balandin, S. Ghosh, W. Bao, I. Calizo, D. Teweldebrhan, F. Miao, and C.N. Lau, Superior thermal conductivity of single-layer graphene, Nano Lett., vol. 8, No. 3, p. 902-907, 2008

[11] T. Mueller, F. Xia and P. Avouris, Graphene photodetectors for high-speed optical communications, Nature Photonics, vol. 4, No. 5, p. 297-301, 2010

[12] Z. Chen, Y. Lin, M. Rooks, and P. Avouris, Graphene nanoribbon electronics, Physica E: Low-dimensional Systems and Nanostructures, vol. 40, No. 2, p. 228-232, 2007

[13] K. Nakada, M. Fujita, G. Dresselhaus, and M.S. Dresselhaus, Edge state in graphene ribbons: nanometer size effect and edge shape dependence, Phys. Rev. B: Condens. Matter, vol. 54, No. 24, p. 17954-17961, 1996

[14] Y. Yoon and J. Guo, Effect of edge roughness in graphene nanoribbon transistors, Appl. Phys. Lett., vol. 91, No. 7, p. 073103/1-7, 2007

[15] D. Basu, M.J. Gilbert, L.F. Register, S.K. Banerjee, and A.H. MacDonald, Effect of edge roughness on electronic transport in graphene nanoribbon channel metal-oxide-semiconductor field-effect transistors, Appl. Phys. Lett., vol. 92, No. 4, p. 042114/1-3, 2008

[16] E.R. Mucciolo, A.H. Castro Neto, and C.H. Lewenkopf, Conductance quantization and transport gaps in disordered graphene nanoribbons, Phys. Rev. B: Condens. Matter Mater. Phys., vol. 79, No. 7, p. 075407/1-3, 2009

[17] D.A. Areshkin, D. Gunlycke, and C.T. White, Ballistic transport in graphene nanostrips in the presence of disorder: Importance of edge effects, Nano Lett., vol. 7, No. 1, p. 204-210, 2007

[18] D. Gunlycke, D.A. Areshkin, and C.T. White, Semiconducting graphene nanostrips with edge disorder, Appl. Phys. Lett., vol. 90, No. 14, p. 142104/1-3, 2007

[19] M. Evaldsson, I.V. Zozoulenko, H. Xu, and T. Heinzel, Edge disorder induced Anderson localization and conduction gap in graphene nanoribbons, Phys. Rev. B: Condens. Matter Mater. Phys., vol. 78, No. 16, p. 161407/1-4, 2008

[20] Y. Yang and R. Murali, Impact of size effect on graphene nanoribbon transport, IEEE Electron Device Lett., vol. 31, No. 3, p. 237-239, 2010

[21] A. Yazdanpanah, M. Pourfath, M. Fathipour, and H. Kosina, Device performance of graphene nanoribbon field effect transistors in the presence of line-edge roughness, IEEE Transactions on Electron Devices, vol. 59, No. 12, p. 3527-3532, 2012

[22] M.Y. Han, B. Ozyilmaz, Y. Zhang, and P. Kim, Energy band-gap engineering of graphene nanoribbons, Phys. Rev. Lett., vol. 98, p. 206805/1-4, 2007

[23] Z. Chen, Y.M. Lin, M.J. Rooks and Ph. Avouris, Graphene nanoribbon electronics, Physica E, vol. 40, p. 228-232, 2007

[24] K. Todd, H.T. Chou, S. Amasha, and D. Goldhaber-Gordon, Quantum dot behavior in graphene nanoconstrictions, Nano Lett., vol. 9, p. 416-421, 2009

[25] M.Y. Han, J.C. Brant, and P. Kim, Electron transport in disordered graphene nanoribbons, Phys. Rev. Lett., vol. 104, p. 056801/1-4, 2010

[26] X. Li, X. Wang, L. Zhang, S. Lee, H. Dai, Chemically derived, ultrasmooth graphene nanoribbon semiconductors, Science, vol. 319, p. 1229-1232, 2008

[27] Y. Ouyang, X. Wang, H. Dai, and J. Guo, Carrier scattering in graphene nanoribbon field-effect transistors, Appl. Phys. Lett., vol. 92, p. 243124/1-4, 2008

[28] X. Wang, Y. Ouyang, X. Li, H. Wang, J. Guo, and H. Dai, Room temperature all-semiconducting, sub-10 nm graphene nanoribbon field-effect transistors, Phys. Rev. Lett., vol. 100, p. 206803/1-4, 2008

[29] J.M. Poumirol, A. Cresti, S. Roche, W. Escoffier, M. Goiran, X. Wang, X. Li, H. Dai, and B. Raquet, Edge magnetotransport fingerprints in disordered graphene nanoribbons, Phys. Rev. B, vol. 82, p. 041413/1-4, 2010

[30] S. Datta, Quantum Transport: Atom to Transistor, Cambridge University Press, 2005

[31] S. Kim, J. Nah, I. Jo, D. Shahrjerdi, L. Colombo, Z. Yao, E. Tutuc, and S.K. Banerjee, Realization of a high mobility dual-gated graphene FET with Al2O3 dielectric, Appl. Phys. Lett., vol. 94, p. 062107/1-3, 2009


VIII-O-5

SEPARATION AND REMOVAL OF NOISE FROM ELECTROCARDIOGRAM (ECG) SIGNALS USING AN IMPROVED FASTICA INDEPENDENT COMPONENT ANALYSIS METHOD

Nguyễn Ngọc Hùng, Bùi Trọng Tú, Hồ Anh Vũ, Dương Văn Tuấn

University of Science, VNU-HCM

Email: hoanhvu2511@gmail.com, {nnhung, bttu}@fetel.hcmus.edu.vn

ABSTRACT

Today, Independent Component Analysis (ICA) is very widely used, especially in biomedical signal processing, which demands both high accuracy and high processing speed. Because real biomedical signals have low amplitude and are easily affected by noise and signal overlap, conventional filtering methods cannot be applied to them. In this paper, we propose a FastICA method that uses an algorithm with an improved number of iterations, developed from the classical Newton iteration, to speed up convergence and reduce computational complexity. With this goal, we carry out simulation experiments on separating and removing noise from ECG signals in many different cases. The results show that the ECG signals are fully recovered. The algorithm is evaluated very favorably through the mean square error (MSE) and the evaluation coefficient (E).

Keywords: ICA, FASTICA, ECG, Deflation, Symmetric

INTRODUCTION

The electrocardiogram (ECG) is one of the most widely studied biomedical signals and is used for diagnosing disease. ECG continues to attract attention because both the measuring devices and the measurement process still face many problems: the acquired ECG signal is easily affected by many kinds of noise and by signal overlap during measurement and data acquisition. The noise sources include muscle noise caused by patient movement, power-line noise, environmental noise, computational errors, and noise from the electronic devices used during data acquisition.

This paper is organized as follows: the theoretical basis of the ICA algorithm is presented in detail in Section II. Section III presents the FastICA algorithm and its practical application model, from which an improvement to FastICA based on the classical Newton iteration is proposed in Section IV. Section V presents the simulation results and discussion. Finally, conclusions and an evaluation of the results are given.

The original source signals are assumed to be statistically independent of one another.

The mixing matrix A is square (the number of source signals $s_i$ equals the number of mixed signals $x_i$) and invertible.

At most one source signal has a Gaussian distribution.

ICA solves the problem $x = As$ by finding $s = A^{-1}x$. In practice s cannot be found directly; it must be estimated statistically through the linear transformation $y = Wx$ (with $W = A^{-1}$), where y corresponds to s. The vector y is estimated by measuring non-Gaussianity, using nonlinear functions in the negentropy approximation.
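As a minimal sketch of the model $x = As$ and the unmixing transform $y = Wx$, the example below mixes two synthetic sources and recovers them with the ideal $W = A^{-1}$. The sources, the matrix A and all variable names are assumptions chosen for illustration; in a real ICA problem A is unknown and W has to be estimated.

```python
import numpy as np

# Two hypothetical independent, non-Gaussian sources (illustrative only).
t = np.linspace(0.0, 1.0, 500)
s = np.vstack([np.sin(2 * np.pi * 5 * t),            # sinusoid
               np.sign(np.sin(2 * np.pi * 3 * t))])  # square wave

# Assumed square, invertible mixing matrix A (not taken from the paper).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

x = A @ s              # observed mixtures, x = A s
W = np.linalg.inv(A)   # ideal unmixing matrix, W = A^{-1}
y = W @ x              # recovered sources, y = W x

# Largest absolute difference between recovered and original sources.
max_error = float(np.max(np.abs(y - s)))
```

Because W is the exact inverse here, the recovery error is at the level of floating-point rounding; ICA aims to approach this result without knowing A.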

FastICA uses non-Gaussianity as a measure of mutual independence. The proposed algorithm consists of three main steps:

Centering

Whitening

Negentropy approximation


Centering

$$x_i \leftarrow x_i - E\{x_i\}, \quad i = 1, \dots, m$$

Whitening

$$z = Vx$$

where V is the whitening matrix, computed from the eigenvalue decomposition (EVD) of the covariance matrix.
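The centering and whitening steps can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `center_and_whiten`, the synthetic uniform sources and the mixing matrix are hypothetical, and the whitening matrix is taken in the common form $V = D^{-1/2}E^T$, where $E D E^T$ is the EVD of the covariance matrix.

```python
import numpy as np

def center_and_whiten(x):
    """Center each row of x (signals x samples) and whiten it via EVD."""
    x_centered = x - x.mean(axis=1, keepdims=True)
    cov = np.cov(x_centered)                   # covariance matrix of the mixtures
    eigvals, eigvecs = np.linalg.eigh(cov)     # EVD: cov = E diag(d) E^T
    V = np.diag(eigvals ** -0.5) @ eigvecs.T   # assumed form V = D^{-1/2} E^T
    z = V @ x_centered                         # whitened data
    return z, V

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
s = rng.uniform(-1.0, 1.0, size=(2, 2000))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
z, V = center_and_whiten(A @ s)
cov_z = np.cov(z)   # should be close to the identity matrix
```

After whitening, the components of z are uncorrelated with unit variance, so the remaining unmixing matrix can be restricted to an orthogonal one.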

Negentropy approximation

Negentropy J is defined as follows:

$$J(y) = H(y_{gauss}) - H(y)$$

where H denotes entropy and $y_{gauss}$ is a Gaussian random variable with the same correlation matrix as y. Owing to the properties mentioned above, negentropy is always non-negative, and it is zero if and only if y has a Gaussian distribution. Estimating negentropy directly is very difficult, so in practice it is approximated using objective functions $G_i$:

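$$J(y) \approx \sum_{i=1}^{p} k_i \left[ E\{G_i(y)\} - E\{G_i(\nu)\} \right]^2$$

where, in the usual form of this approximation, $k_i$ are positive constants and $\nu$ is a Gaussian variable with zero mean and unit variance.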
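To illustrate how such an approximation separates Gaussian from non-Gaussian data, the sketch below evaluates a one-term version, $J(y) \approx [E\{G(y)\} - E\{G(\nu)\}]^2$, with the commonly used contrast $G(u) = \log\cosh(u)$. The choice of G, the constant $k_1 = 1$, the Monte Carlo estimate of $E\{G(\nu)\}$ and the function name are all assumptions made for illustration.

```python
import numpy as np

def negentropy_approx(y, nu):
    """One-term approximation J(y) ~ [E{G(y)} - E{G(nu)}]^2 with G = log cosh."""
    y = (y - y.mean()) / y.std()            # standardize: zero mean, unit variance
    G = lambda u: np.log(np.cosh(u))        # assumed contrast function
    return float((G(y).mean() - G(nu).mean()) ** 2)

rng = np.random.default_rng(0)
nu = rng.standard_normal(200_000)           # reference standard Gaussian sample
gauss_sample = rng.standard_normal(200_000)
laplace_sample = rng.laplace(size=200_000)  # a non-Gaussian (super-Gaussian) sample

j_gauss = negentropy_approx(gauss_sample, nu)
j_laplace = negentropy_approx(laplace_sample, nu)
```

The Gaussian sample gives a value near zero, while the Laplacian sample gives a clearly larger one, which is the property FastICA exploits when it maximizes non-Gaussianity.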

THE FASTICA ALGORITHM

FastICA is based on a fixed-point iteration, which converges faster than conventional ICA based on the classical Newton iteration. Two orthogonalization schemes make the problem quick to solve: deflationary orthogonalization (abbreviated Deflation) and symmetric orthogonalization (abbreviated Symmetric).

Deflationary orthogonalization (Deflation)

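A simple way to orthogonalize is to process the vectors one at a time using the Gram-Schmidt method [1 – 2], which means estimating the independent components one by one. Suppose p independent components, that is, p vectors $w_1, \dots, w_p$, have already been estimated; the algorithm is then run to find one more component, $w_{p+1}$. After each iteration, however, the projections $(w_{p+1}^T w_j)\, w_j$, $j = 1, \dots, p$, must be subtracted from $w_{p+1}$, which is then normalized:

$$w_{p+1} \leftarrow w_{p+1} - \sum_{j=1}^{p} (w_{p+1}^T w_j)\, w_j, \qquad w_{p+1} \leftarrow \frac{w_{p+1}}{\|w_{p+1}\|}$$

Here g is the derivative of $G_i$. The detailed steps are shown in the flowchart in Figure 1 (note: ICs is the number of independent components).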
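The Gram-Schmidt deflation step described above can be sketched as follows. The function name `deflate` and the example vectors are hypothetical, and the sketch assumes that the previously estimated vectors are already orthonormal.

```python
import numpy as np

def deflate(w_new, W_found):
    """Remove from w_new its projections onto the rows of W_found, then normalize."""
    for w_j in W_found:
        w_new = w_new - (w_new @ w_j) * w_j   # subtract (w^T w_j) w_j
    return w_new / np.linalg.norm(w_new)      # restore unit norm

# Two already-estimated orthonormal vectors (illustrative values).
W_found = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
w = deflate(np.array([0.5, -0.2, 0.8]), W_found)
```

In this example the only direction left after removing the two projections is the third axis, so w is orthogonal to every row of `W_found`.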


Figure 1. Flowchart of the FastICA algorithm using deflationary orthogonalization

Symmetric orthogonalization (Symmetric)

Deflationary orthogonalization has the drawback that estimation errors in the first vectors accumulate in the subsequent ones, so symmetric orthogonalization appears to be more effective. This method treats all vectors equally and gives priority to none: the vectors $w_i$ are not estimated one by one but all in parallel. The steps are shown in Figure 2.
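A common way to realize symmetric orthogonalization is the update $W \leftarrow (WW^T)^{-1/2}W$, which adjusts all rows of W at once. The sketch below computes the inverse square root through the EVD of $WW^T$; the function name and the example matrix are assumptions for illustration.

```python
import numpy as np

def symmetric_orthogonalize(W):
    """Return (W W^T)^{-1/2} W, using the EVD of the symmetric matrix W W^T."""
    eigvals, eigvecs = np.linalg.eigh(W @ W.T)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return inv_sqrt @ W

# An arbitrary invertible matrix whose rows are not orthogonal (illustrative).
W = np.array([[2.0, 1.0],
              [0.5, 1.5]])
W_orth = symmetric_orthogonalize(W)
```

The rows of `W_orth` are orthonormal, and no row is treated as more reliable than the others, in contrast to the deflation scheme.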

Figure 2. Flowchart of the FastICA algorithm using symmetric orthogonalization

ALGORITHM IMPROVEMENT

The classical Newton iteration converges only quadratically. Starting from an initial guess $x_0$, the equation of the tangent line at the current point is used to approximate the successive values $x_1, \dots, x_n$:

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \qquad (11)$$
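As a small worked example of the classical Newton iteration, the sketch below solves $f(x) = x^2 - 2 = 0$ starting from $x_0 = 1$. The function name `newton`, the tolerance and the test function are illustrative choices, not taken from the paper.

```python
def newton(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n) / f'(x_n) until the step is below tol."""
    x = x0
    for n in range(1, max_iter + 1):
        step = f(x) / f_prime(x)   # tangent-line correction at the current point
        x = x - step
        if abs(step) < tol:
            return x, n
    return x, max_iter

# Find the positive root of x^2 - 2, i.e. the square root of 2.
root, iterations = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

The number of correct digits roughly doubles at each step once the iterate is close to the root, which is what quadratic convergence means in practice.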

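[Flowchart content, Figure 1 (deflation): initialize $w_p$ with $\|w_p\| = 1$; update $w_p \leftarrow E\{z\, g(w_p^T z)\} - E\{g'(w_p^T z)\}\, w_p$; orthogonalize $w_p \leftarrow w_p - \sum_{i=1}^{p-1} (w_p^T w_i)\, w_i$; normalize $w_p \leftarrow w_p / \|w_p\|$; repeat until convergence.]

[Flowchart content, Figure 2 (symmetric): initialize $w_i$ with $\|w_i\| = 1$, $i = 1, \dots, m$; update $w_i \leftarrow E\{z\, g(w_i^T z)\} - E\{g'(w_i^T z)\}\, w_i$; orthogonalize $W \leftarrow (W W^T)^{-1/2}\, W$; repeat until convergence.]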

Step 1: Set n ← 0, initialize a random vector $w_0$ with unit norm, and evaluate $E\{w_0^T x\, g(w_0^T x)\}$.

Step 2: Update

$$w_{n+1} \leftarrow E\{z\, g(w_n^T z)\} - E\{g'(w_n^T z)\}\, w_n$$

Step 3: Normalize the vector: $w_{n+1} \leftarrow w_{n+1} / \|w_{n+1}\|$

If the algorithm has not converged, evaluate $E\{w_{n+1}^T x\, g(w_{n+1}^T x)\}$, set n ← n+1, and return to Step 2.
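For orientation, the loop of update, normalization and convergence test can be sketched with the standard one-unit FastICA fixed-point update. This is the well-known baseline form, not the improved variant proposed in this paper; the nonlinearity $g = \tanh$, the convergence test, the synthetic uniform sources and all names are assumptions.

```python
import numpy as np

def fastica_one_unit(z, w0, tol=1e-8, max_iter=200):
    """Standard one-unit FastICA on whitened data z (signals x samples), g = tanh."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(max_iter):
        wz = w @ z
        g = np.tanh(wz)
        g_prime = 1.0 - g ** 2
        w_new = (z * g).mean(axis=1) - g_prime.mean() * w  # E{z g(w^T z)} - E{g'(w^T z)} w
        w_new /= np.linalg.norm(w_new)                     # normalization step
        converged = abs(abs(w_new @ w) - 1.0) < tol        # direction unchanged up to sign
        w = w_new
        if converged:
            break
    return w

# Synthetic test: mix two uniform sources, then center and whiten (illustrative).
rng = np.random.default_rng(0)
s = rng.uniform(-1.0, 1.0, size=(2, 5000))
x = np.array([[1.0, 0.5], [0.3, 1.0]]) @ s
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
z = np.diag(d ** -0.5) @ E.T @ x

w = fastica_one_unit(z, np.array([1.0, 0.0]))
y = w @ z
# Best absolute correlation between the estimate and either true source.
corr = max(abs(np.corrcoef(y, s[k])[0, 1]) for k in range(2))
```

The estimated component matches one of the sources up to sign and scale, which is the usual ambiguity of ICA.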

SIMULATION AND TESTING

Evaluation

To satisfy the conditions of the ICA algorithm, we run experiments on ECG data with a simulated mixing matrix A, which makes it convenient to assess the quality of separation and noise removal through the MSE value and the evaluation coefficient E [3 – 4].
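The MSE criterion can be sketched as below, assuming the usual definition $\mathrm{MSE} = \frac{1}{N}\sum_{k=1}^{N} (s_k - \hat{s}_k)^2$ between an original signal and its recovered estimate. The function name and the sample values are hypothetical, and the evaluation coefficient E is not sketched because its definition is not given in this section.

```python
def mean_square_error(original, recovered):
    """MSE = (1/N) * sum((original[k] - recovered[k])^2) over all N samples."""
    if len(original) != len(recovered):
        raise ValueError("signals must have the same length")
    squared_errors = [(a - b) ** 2 for a, b in zip(original, recovered)]
    return sum(squared_errors) / len(original)

# Illustrative values only; real use would compare ECG samples.
mse_value = mean_square_error([0.0, 1.0, 0.5, -0.2], [0.0, 0.9, 0.6, -0.2])
```

Since ICA recovers sources only up to sign and scale, the recovered signal would normally be rescaled and sign-aligned with the reference before this comparison.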
