Kỷ yếu Hội thảo ICT.rda'06 / Proceedings of ICT.rda'06, Hanoi, May 20-21, 2006
Vietnamese Speech Recognition using Subword Models and Test Experiments for Comparing Some Methods of Vietnamese Recognition

Nguyễn Phú Bình, Trịnh Văn Loan
Abstract

This paper presents our results in applying hidden Markov model theory to build a Vietnamese speech recognition system that can model any pronunciation unit. Using this system, we ran experiments comparing recognition methods based on different pronunciation units, and from them we propose a fairly effective method based on semi-syllable models. The approach was tested on a speech database of 10,575 utterances of 888 toneless Vietnamese single words spoken by one male voice, and the recognition accuracy reached 96.89%.

Keywords: speech recognition, hidden Markov model, Vietnamese, subword, semi-syllable
1 INTRODUCTION

To date, the most successful speech recognition systems have relied mainly on the pattern-recognition approach, and the most widely used pattern-recognition technique is the hidden Markov model (HMM). Worldwide, quite a few large-vocabulary speech recognition systems using HMMs achieve very high accuracy. In Vietnam, several HMM-based speech recognition programs have also given fairly promising results. However, most of those programs still use only word-level HMMs ([3],[7],[9]), so their vocabularies remain limited and they are hard to apply to continuously spoken speech. A few recognition systems do use HMMs at the level of acoustic units smaller than the word, such as phonemes or initial consonant + rhyme ([5],[6],[8]), but [...]. In addition, most [...] rely on foreign open-source toolkits such as SPHINX¹, CSLU² and HTK³, so the results have stayed at the research level and are hard to put into practice because of this technological dependence.
With this in mind, the paper presents our studies and experiments on how to choose the acoustic unit for the hidden Markov models, so that with a modest number of models the system can still recognize a relatively large vocabulary with acceptable results.
2 CHOOSING THE ACOUSTIC UNIT FOR MODEL TRAINING

2.1 Common recognition units

2.1.1 Word and syllable models
Choosing the word as the basic recognition unit covers the variability of phonemes: the same phoneme can turn into different acoustic patterns when it belongs to different words. Vietnamese is a monosyllabic language. In Vietnamese the syllable, not the word, is the smallest natural acoustic unit, so the syllable is the real target of Vietnamese speech recognition systems. Syllable-based acoustic models are highly robust because they cover both the coarticulation of the phonemes that make up a syllable and the variation of a phoneme across different syllables. The syllable can therefore be chosen as the basic recognition unit for Vietnamese speech recognition applications with small and medium vocabularies. However, each training sample of a syllable can only be used to train that same syllable and cannot be shared with other syllables, and Vietnamese has more than 10,000 syllables. Syllable models are therefore clearly hard to adopt for large-vocabulary Vietnamese speech recognition, because collecting enough speech samples for training is extremely difficult.

¹ SPHINX: http://cmusphinx.sourceforge.net
² CSLU: http://cslu.cse.ogi.edu
³ HTK: http://htk.eng.cam.ac.uk
2.1.2 Phoneme model

So that training samples from different words can be shared, the phoneme model is often chosen as the basic recognition unit for [...] recognition systems. Vietnamese has about 40 phonemes ([...] if the zero initial consonant is not counted, [...] simple vowels and diphthongs, and 2 semivowels). Even when these are combined with the tones, there are at most 40 x 6 = 240 tonal phonemes, so a large-vocabulary Vietnamese speech recognition system can be built with the phoneme as its basic recognition unit.
However, the phoneme model has a fairly serious weakness: it treats a phoneme as identical no matter which word it is produced in or which position it occupies. Because of this assumption, phoneme-based Vietnamese speech recognition systems do not fully exploit the distinctions between phonemes. For example, in Vietnamese the consonants k, m, n, ng, p and t can be either initial or final sounds, and such a consonant in initial position has acoustic-phonetic properties that differ from those of the same consonant in final position. Similarly, the vowels u and o can be either the medial sound or the main vowel. Clearly, u and o as the main vowel (for example: bún, to) are pronounced strongly and carry the main timbre of the syllable, whereas u and o as the medial sound (for example: quả, xoài) do not carry the main timbre.

Because of this weakness, the phoneme model is rarely used in Vietnamese speech recognition systems.
2.1.3 Diphone and triphone models

The single-phoneme model (phoneme/monophone) is rarely used because of the weaknesses described above. Speech recognition systems for languages such as English and French therefore often use the diphone (biphone) model or the triphone model, which overcome some of the weaknesses of the phoneme model.

The use of diphones in speech recognition rests on the observation that some phonemes pass into one another in a nearly fixed way. A diphone is thus modelled by its 2 component phonemes together with part of the context [...] between those 2 phonemes. If instead we represent the relation between a phoneme and the two phonemes immediately before and after it, we obtain the triphone model. With about 50 phonemes, there are nearly 1500 diphones, and about 7300 triphones can be modelled.
Clearly, diphone and triphone models take into account the change in acoustic-phonetic properties between adjacent phonemes, so they can be applied effectively to large-vocabulary recognition. However, diphone and triphone models are mostly used in speech recognition systems for polysyllabic languages without tones (such as English and French). For these languages, diphone and triphone models represent the linking and transitions between the phonemes of a word fairly well. Moreover, because these languages have no tones, the number of diphones and triphones in use is not too large, and in practice enough audio samples can be collected for training.
Vietnamese, by contrast, is a tonal language, so with diphone or triphone models the actual number of models would grow considerably (about 6 times). On the other hand, because Vietnamese is monosyllabic, other ways of modelling the recognition unit can be used to take full advantage of this property. One of them is presented in section 2.2.
2.1.4 Initial + rhyme model

As noted, Vietnamese is a monosyllabic language. A Vietnamese syllable is pronounced in a single breath, yet it has a composite structure: its parts can be detached and exchanged with the corresponding parts of another syllable. A Vietnamese syllable has 3 parts: the initial part, the final part and the tone. The initial part is the initial sound, and only one phoneme takes part in forming it. The final part of the syllable is called the rhyme. The sounds at the start, middle and end of the rhyme are called the medial sound, the main vowel and the final sound. Given this characteristic structure, some studies have proposed using the initial + rhyme as the basic recognition unit [8].

In terms of robustness, the initial + rhyme model is more robust than the phoneme model, because the rhyme covers the phoneme variation of the medial sound, the main vowel and the final sound. It is less robust than the syllable model, which covers the phoneme variation of the initial, the medial sound, the main vowel and the final sound. In terms of trainability, the initial + rhyme model is worse than the phoneme model because it needs more training samples, but better than the syllable model because it needs fewer.

The initial + rhyme model has 22 initials (not counting the zero initial) and about 150 rhymes (according to [6]). Combined with the tones, the total number of initial + rhyme units must be smaller than (22 + 150) x 6 = 1032, because many rhymes and tones cannot be combined. This number shows that enough training samples can be collected for speech recognition systems that use the initial + rhyme model. In our assessment, however, the initial + rhyme model is still not the best choice, because it does not capture a tight link between the initial and the rhyme. The experiments in a later section show that this model performs worse than the semi-syllable model described next.

2.2 Proposed basic recognition unit: the semi-syllable
If a Vietnamese syllable is divided into two parts with the main vowel of the syllable as the boundary, we obtain 2 semi-syllables. Note that the "semi" syllables here are not two separate halves of a syllable: they are the beginning and ending parts of a syllable, and they share part of the main vowel. The two semi-syllable models therefore have a transition and a link between them, much like two diphone models. For example, the syllable anh consists of the 2 semi-syllables _a and anh. The _ sign stands for silence: _a is a beginning semi-syllable, such as the a in anh, while a_ is an ending semi-syllable, such as the a in cha.
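The split described above can be illustrated with a minimal sketch. It is not the authors' segmentation procedure: it assumes toneless ASCII spellings, treats the whole vowel cluster as the shared main vowel (so medial sounds and digraphs such as qu or gi are not handled), and the function name `split_semi_syllables` is hypothetical.

```python
from typing import Tuple

VOWELS = set("aeiouy")  # assumption: toneless ASCII spelling


def split_semi_syllables(syllable: str) -> Tuple[str, str]:
    """Split a syllable into two halves that overlap on the vowel cluster."""
    letters = syllable.lower()
    vowel_positions = [i for i, ch in enumerate(letters) if ch in VOWELS]
    if not vowel_positions:
        raise ValueError(f"no vowel found in {syllable!r}")
    start, end = vowel_positions[0], vowel_positions[-1] + 1
    initial, nucleus, coda = letters[:start], letters[start:end], letters[end:]
    # "_" marks silence on the side where the syllable has no consonant.
    first = (initial if initial else "_") + nucleus
    second = nucleus + (coda if coda else "_")
    return first, second


if __name__ == "__main__":
    for word in ("anh", "cha", "ban"):
        print(word, split_semi_syllables(word))
```

Both halves contain the vowel, which mirrors the shared main vowel that the text relies on; a real implementation would need proper Vietnamese phonology rules.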
Because both semi-syllables share the main vowel of the word, the semi-syllable model represents the linking and transitions between the phonemes of a word fairly well. The total number of semi-syllables needed to compose all Vietnamese words in this way is 389, of which 61 already carry a tone (for example ác, ép, ...) [4]. When tones are taken into account, the number of semi-syllables needed to model every Vietnamese syllable is therefore at most (389 - 61) x 6 + 61 = 2029. With an inventory of this size, enough audio training samples can be collected for the models.
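The inventory sizes quoted in section 2 can be checked with a few lines of arithmetic. The sketch below only restates the paper's own figures (40 phonemes, 22 initials, about 150 rhymes, 389 semi-syllables of which 61 already carry a tone, 6 tones); the function names are illustrative.

```python
TONES = 6  # number of Vietnamese tones used in the paper's estimates


def tonal_phonemes(phonemes: int = 40) -> int:
    """Upper bound on tonal phonemes: every phoneme with every tone."""
    return phonemes * TONES


def initial_rhyme_upper_bound(initials: int = 22, rhymes: int = 150) -> int:
    """Upper bound on initial + rhyme units combined with tones."""
    return (initials + rhymes) * TONES


def semi_syllable_upper_bound(total: int = 389, already_toned: int = 61) -> int:
    """Semi-syllables that already carry a tone are counted only once."""
    return (total - already_toned) * TONES + already_toned


if __name__ == "__main__":
    print(tonal_phonemes())             # 240
    print(initial_rhyme_upper_bound())  # 1032
    print(semi_syllable_upper_bound())  # 2029
```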
3 THE RECOGNITION SYSTEM

3.1 Main components

To meet the research goals set out above, we built a Vietnamese speech recognition system designed to cope with a large vocabulary. The system consists of several modules, each performing one specific function. Splitting the system into modules in this way makes it easy to integrate them into other systems for different purposes. All modules were developed with Microsoft Visual C++ 6.0 and use only the Windows API functions, so they run very fast; the interface is plain but easy to use.
The main modules of the system are:
• VSRCutter: cuts an audio file containing several utterances of one word into separate files and names them by a fixed rule, for use in model training and system testing.
• VSRAutoSplit: automatically finds all audio files in a directory and performs the VSRCutter function on them.
• VSRTraining: trains the hidden Markov models of the system for any acoustic unit (word or subword), and evaluates the quality of the models after training.
• VSRTiny: a compact program that recognizes Vietnamese utterances in real time.
3.2 Speech database

The speech database of the system contains utterances of about 900 toneless Vietnamese single words spoken by one male voice. It also contains a set of audio files with utterances of the numbers 0 to 9 by several different speakers, which is used to check and evaluate model quality. The audio files were recorded in a normal working environment with a handheld dynamic microphone, sampled at [...] Hz, mono, 16 bit, and stored as uncompressed WAVE PCM.
The speech database is built by the two programs VSRCutter and VSRAutoSplit. Their input is the path to the WAVE files containing the utterances to be processed, a code for the speaker's name, and the label of the utterance for each audio segment (the label of a Vietnamese word is encoded in TELEX form, for example the label "khoang"). With this information, VSRCutter and VSRAutoSplit use the silence detection algorithm of [3] to extract the audio segments that contain speech, and write each one to a WAVE file named by the following rule: <label>_<speaker name>_<sequence number>.wav. This naming rule lets the recognition module compile the recognition accuracy of each word for the different voices.
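A minimal sketch of how the naming rule above makes per-speaker statistics easy to compile. Only the `<label>_<speaker name>_<sequence number>.wav` pattern comes from the paper; the helper names and the sample file names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def parse_utterance_name(filename):
    """Return (label, speaker, index) from '<label>_<speaker>_<index>.wav'."""
    stem = Path(filename).stem
    # Split from the right so a label containing '_' would stay intact.
    label, speaker, index = stem.rsplit("_", 2)
    return label, speaker, int(index)


def accuracy_by_speaker(results):
    """results: iterable of (filename, recognized_label) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for filename, recognized in results:
        label, speaker, _ = parse_utterance_name(filename)
        total[speaker] += 1
        correct[speaker] += int(recognized == label)
    return {speaker: correct[speaker] / total[speaker] for speaker in total}


if __name__ == "__main__":
    demo = [("mot_NPB_1.wav", "mot"), ("hai_NPB_2.wav", "ba"), ("mot_NNV_1.wav", "mot")]
    print(accuracy_by_speaker(demo))
```

Because the true label and the speaker are both recoverable from the file name, no separate annotation file is needed for scoring.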
3.3 Model training

Once the speech database is available, it is used to train the hidden Markov models needed for recognition. The VSRTraining program was designed for this task.

VSRTraining uses the embedded training algorithm [2] to train the models. Its output is therefore not limited to whole-word models; it can also produce models of acoustic units smaller than the word (subword models), such as phonemes or initial + rhyme units.
Training whole-word models uses the set of utterances of one word at a time to train the model of that word. VSRTraining instead uses all the audio data files reserved for the training phase (containing utterances of many different words) to update all the models in the system simultaneously. This is done in a loop. The algorithm is described in detail in [2], but it can be summarized as follows:
First, VSRTraining loads all the audio files into memory, extracts the speech features of each file, and processes the training files one after another. Next, it uses the transcription of each utterance to build a composite HMM that spans the whole utterance. The composite model is built by concatenating the subword models that correspond to the labels in the transcription. The Forward-Backward algorithm is then applied, and the weighted averages are accumulated in the usual way. When all training files have been processed, the new parameter estimates are formed from the accumulated weighted sums, giving the updated set of HMMs.
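The concatenation step can be sketched as follows. This is an illustration under stated assumptions, not the VSRTraining code: each subword model is a plain left-to-right HMM stored as a transition matrix plus per-state exit probabilities (a simplification of the non-emitting connecting states described later in the paper), and the names `left_to_right` and `concatenate` are hypothetical.

```python
def left_to_right(n_states, stay=0.6):
    """Build a toy left-to-right model: stay in a state or move to the next."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    exit_probs = [0.0] * n_states
    for i in range(n_states):
        trans[i][i] = stay
        if i + 1 < n_states:
            trans[i][i + 1] = 1.0 - stay
        else:
            exit_probs[i] = 1.0 - stay  # probability of leaving the model
    return {"trans": trans, "exit": exit_probs}


def concatenate(models):
    """Chain subword models: the exit of model k feeds state 0 of model k+1."""
    total = sum(len(m["trans"]) for m in models)
    trans = [[0.0] * total for _ in range(total)]
    exit_probs = [0.0] * total
    offset = 0
    for k, model in enumerate(models):
        n = len(model["trans"])
        for i in range(n):
            for j in range(n):
                trans[offset + i][offset + j] = model["trans"][i][j]
            if k + 1 < len(models):
                trans[offset + i][offset + n] += model["exit"][i]
            else:
                exit_probs[offset + i] = model["exit"][i]
        offset += n
    return {"trans": trans, "exit": exit_probs}


if __name__ == "__main__":
    word = concatenate([left_to_right(3), left_to_right(3)])
    print(len(word["trans"]), "states in the composite model")
```

Forward-Backward statistics gathered on the composite model are then credited back to the subword models whose states they came from, which is what lets every utterance update several shared models at once.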
The input of VSRTraining is a text file with many lines, each containing the following information: <directory containing the utterances of one word> <label of the word> <transcription of the word>.

Here <transcription of the word> holds the names of the subword models that make up the word. The choice of acoustic unit to model is left to the user (it can be the whole word, the phoneme, the initial + rhyme, and so on).
Besides training models, VSRTraining can automatically recognize all the audio files in a directory and compile the recognition results in order to assess model quality. This function uses the recognition module described in the next section.
3.4 Real-time speech recognition

The output of embedded training is a set of subword models. Combined with a dictionary file that holds the transcriptions of the words to be recognized, these models form a network of linked hidden Markov models. The problem is then to search this network for a sequence of states, running from one subword model to the next, that best matches the observation sequence of the input speech signal. Decoding this path yields a sequence of transcriptions; in other words, the continuously spoken speech segment has been recognized.
Search algorithms for this problem are usually based on the idea of the Viterbi algorithm. However, when the vocabulary to be recognized is relatively large, the number of model nodes in the network is also large, so pruning techniques must be applied to the search tree to speed up recognition.

The program decodes the observation sequence with Viterbi beam search, which is described in detail in [1]. By using a heuristic to discard branches of the search tree, the speed of the Viterbi search is improved considerably. Recognizing one Vietnamese word takes only about 30 ms on average (on a 2 GHz Pentium 4), so the recognition procedure can run in real time.
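The pruning idea can be sketched with a small log-domain Viterbi beam search. This is a generic illustration, not the program's decoder or the exact procedure of [1]: the dictionary-based data layout, the function names and the toy probabilities are all assumptions made for the example.

```python
import math


def prune(hyps, beam):
    """Keep only hypotheses whose score is within `beam` of the best one."""
    if not hyps:
        raise ValueError("no active hypotheses left")
    best = max(score for score, _ in hyps.values())
    return {s: h for s, h in hyps.items() if h[0] >= best - beam}


def viterbi_beam(log_init, log_trans, log_emit, beam):
    """log_init: {state: logp}; log_trans: {state: {next_state: logp}};
    log_emit: one {state: logp} dict per frame; beam: width in log units.
    Returns (best log score, best state path)."""
    if not log_emit:
        raise ValueError("empty observation sequence")
    active = {s: (lp + log_emit[0].get(s, -math.inf), [s])
              for s, lp in log_init.items()}
    active = prune(active, beam)
    for frame in log_emit[1:]:
        candidates = {}
        for state, (score, path) in active.items():
            for nxt, lp in log_trans.get(state, {}).items():
                new_score = score + lp + frame.get(nxt, -math.inf)
                if nxt not in candidates or new_score > candidates[nxt][0]:
                    candidates[nxt] = (new_score, path + [nxt])
        active = prune(candidates, beam)
    best_state = max(active, key=lambda s: active[s][0])
    return active[best_state]


if __name__ == "__main__":
    ln = math.log
    init = {"A": 0.0}
    trans = {"A": {"A": ln(0.5), "B": ln(0.5)}, "B": {"B": 0.0}}
    emit = [{"A": ln(0.9), "B": ln(0.1)},
            {"A": ln(0.2), "B": ln(0.8)},
            {"A": ln(0.1), "B": ln(0.9)}]
    print(viterbi_beam(init, trans, emit, beam=0.5))
```

A narrow beam discards more states per frame and runs faster, at the risk of pruning the path that would have won; the beam width is therefore a speed/accuracy trade-off to be tuned.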
The VSRTiny program implements this. Its processing flow is shown in Figure 1.

[Figure 1. Real-time speech recognition]

The processing can be summarized as follows:
• The speech signal is recorded directly from the microphone into a capture buffer at a sampling rate of 11025 Hz, mono, 16 bit. The signal in this buffer is divided into frames of 256 samples (≈23 ms), with adjacent frames 128 samples apart.
• The program then uses the energy of each frame and a previously estimated background-noise energy threshold to remove the silent parts. Next, for each frame, 12 MFCC values plus the frame energy, together with their first-order and second-order time derivatives (39 coefficients in all), are computed as the feature parameters of each speech segment.
• With these input features, the program uses a set of previously trained subword HMMs, a bigram language model, and a list of transcriptions of the recognizable words to run Viterbi beam search.
• The best path found by the search is the result of the recognition process.
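The framing and silence-removal steps listed above can be sketched as follows. The frame length (256 samples), frame shift (128 samples) and sampling rate (11025 Hz) come from the text; the mean-square energy measure, the `margin` factor applied to the noise estimate and the function names are assumptions made for the sketch.

```python
SAMPLE_RATE = 11025   # Hz, from the text
FRAME_LEN = 256       # samples per frame (about 23 ms)
FRAME_SHIFT = 128     # samples between the starts of adjacent frames


def split_frames(samples):
    """Cut the signal into overlapping frames; a short tail is dropped."""
    last_start = len(samples) - FRAME_LEN
    return [samples[i:i + FRAME_LEN] for i in range(0, last_start + 1, FRAME_SHIFT)]


def frame_energy(frame):
    """Mean-square energy of one frame."""
    return sum(x * x for x in frame) / len(frame)


def drop_silence(frames, noise_energy, margin=4.0):
    """Keep frames whose energy exceeds the noise estimate by `margin`
    (margin is a hypothetical tuning parameter, not a value from the paper)."""
    threshold = noise_energy * margin
    return [f for f in frames if frame_energy(f) > threshold]


if __name__ == "__main__":
    signal = [0] * 512 + [1000] * 512  # silence followed by a loud segment
    frames = split_frames(signal)
    voiced = drop_silence(frames, noise_energy=100.0)
    print(len(frames), "frames,", len(voiced), "kept")
    print(round(1000 * FRAME_LEN / SAMPLE_RATE, 1), "ms per frame")
```

Feature extraction (MFCC, energy and their derivatives) would then run only on the frames that survive this gate.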
Figure 2 also shows [...] information about the language model. The program currently supports a bigram language model and allows continuous speech to be recognized. However, because the probability table of the language model has not yet been optimized, the program has so far been tested only on isolated-word Vietnamese speech recognition.
4 EXPERIMENTAL RESULTS

4.1 Comparison of the initial + rhyme model and the semi-syllable model

• Speaker: one male voice, Hà [...] years old.
• Training data: 888 toneless Vietnamese words ([...] of [4]); the model of each word was trained with 10 utterances [...] (stored in 10 audio files).
• Test data: 10575 audio files containing utterances of the 888 words.
• Model parameters: every subword model in the experiments [...] has 5 states (including the dummy states used to connect the models to one another) and 3 Gaussian components.
• Results: each type of model was trained with 5 iterations; the recognition results on the test data set are given in Table 1.
Table 1: Comparison of the two models

Model type        Subword models   Training time (s)   Recognition time (s)   Accuracy (%)
Initial + rhyme   129              2108                767                    [...]
Semi-syllable     293              1917                529                    [...]
The times in this experiment were measured on a laptop with a 2.0 GHz Pentium 4, 512 MB SDRAM, running Microsoft Windows [...] Professional. The time for the automatic recognition function includes the time to find and read the audio files as well as the recognition processing time.
This experiment shows that, for toneless Vietnamese words, the program using the semi-syllable model recognizes better than the one using the initial + rhyme model. The reason is that in the semi-syllable model both semi-syllables share part of the main vowel of the word, so the model represents the linking and transitions between the acoustic units of a word fairly well. The initial + rhyme model does not express this link clearly.

Table 2: Effect of the number of iterations

Iterations   Accuracy (%)   Training time (s)   Recognition time (s)
 1           40.42          137                 2099
 2           86.60          445                  816
 3           92.18          444                  563
 4           93.31          443                  538
 5           93.74          448                  529
 6           94.03          579                  535
 7           94.17          722                  510
 8           94.18          778                  495
 9           94.31          784                  486
10           94.40          784                  463
11           94.56          760                  445
12           94.91          761                  437
13           95.31          762                  413
14           95.79          761                  384
15           96.35          762                  362
16           96.49          766                  348
17           96.67          774                  335
18           96.77          774                  323
19           96.78          774                  316
20           96.89          775                  312
21           96.89          763                  315
22           96.86          786                  338
23           96.85          787                  338
24           96.85          790                  331
25           96.79          788                  326
4.2 Effect of the number of iterations in the training procedure

With the same test set as in the previous experiment, varying the number of iterations used to train the semi-syllable models gives the results in Table 2.

With a vocabulary of 888 toneless Vietnamese single words, the program reached its highest accuracy of 96.89% in this experiment. As the number of training iterations increases, the recognition accuracy of the models increases as well, but beyond a certain point the accuracy saturates.
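The saturation can be read off the accuracy column of Table 2 with a short script. The accuracy values are those reported in the paper; the 0.15-point tolerance used to decide when the curve is "close enough" to its maximum is an arbitrary choice made for this sketch.

```python
# Accuracy (%) after 1..25 training iterations, copied from Table 2.
ACCURACY = [40.42, 86.60, 92.18, 93.31, 93.74, 94.03, 94.17, 94.18, 94.31,
            94.40, 94.56, 94.91, 95.31, 95.79, 96.35, 96.49, 96.67, 96.77,
            96.78, 96.89, 96.89, 96.86, 96.85, 96.85, 96.79]


def best_iteration(accuracy):
    """Return (iteration, accuracy) of the first maximum; iterations start at 1."""
    best = max(accuracy)
    return accuracy.index(best) + 1, best


def first_within(accuracy, tolerance):
    """First iteration whose accuracy is within `tolerance` of the maximum."""
    target = max(accuracy) - tolerance
    for iteration, value in enumerate(accuracy, start=1):
        if value >= target:
            return iteration
    raise ValueError("empty accuracy list")


if __name__ == "__main__":
    print(best_iteration(ACCURACY))      # (20, 96.89)
    print(first_within(ACCURACY, 0.15))  # 18
```

A rule of this kind could serve as a simple stopping criterion: iterations after the plateau add training time without improving accuracy.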
4.3 Recognizing words that were not trained

Although the training speech data covers only 888 toneless Vietnamese single words, the program is not limited to recognizing those words. Because the recognition unit is smaller than the word (specifically, the semi-syllable), the program can also recognize a large number of other toneless single words formed by joining pairs of semi-syllables.

The VSRTiny program makes this possible. Its input is a file VSR.HMM holding the semi-syllable models, and a text file VSR.DIC holding the label and transcription (which two semi-syllables the word with that label is built from) of every word the program can recognize.

With the input data of the experiment above, VSR.DIC contains the definitions of 1351 single words, of which 888 are in the training list; the rest are formed by joining pairs of semi-syllables. These new words were recorded and recognized directly in real time. Initial observations show that the correct recognition rate is lower than for the trained words but still acceptable.
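How new dictionary entries arise from trained semi-syllables can be illustrated with a small sketch. It assumes each trained word is given as an (initial, main vowel, final) triple and that a beginning and an ending semi-syllable may be joined whenever they share the same main vowel; the function name and the sample words are hypothetical, and not every generated combination is necessarily a real Vietnamese word.

```python
def candidate_words(trained):
    """trained: iterable of (initial, nucleus, coda) triples of trained words.
    Returns untrained triples that reuse one trained beginning half and one
    trained ending half sharing the same main vowel."""
    trained = set(trained)
    first_halves = {(initial, nucleus) for initial, nucleus, _ in trained}
    second_halves = {(nucleus, coda) for _, nucleus, coda in trained}
    combined = {
        (initial, nucleus, coda)
        for initial, nucleus in first_halves
        for other_nucleus, coda in second_halves
        if nucleus == other_nucleus
    }
    return sorted(combined - trained)


if __name__ == "__main__":
    trained_words = [("b", "a", "n"), ("c", "a", "m"), ("t", "o", "")]
    print(candidate_words(trained_words))
```

In the paper's setting, the 463 extra entries of VSR.DIC (1351 - 888) are words of this kind: all of their semi-syllable models exist even though the words themselves were never seen in training.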
4.4 Experiments with a small vocabulary

In this experiment, the speech database and the recognition vocabulary consist only of the numbers 0 to 10.
Table 3: Model training data

No.   Speaker   Gender   Age   Accent       Samples per model
1     NPB       Male     25    Hà Nội       30
2     NNV       Male     25    [...] Bình   30
3     TCT       Male     27    Hà Nội       30
4     VNA       Male     27    [...] Tây    30
5     NHQ       Male     27    [...]        30
6     DTTT      Female   24    [...] Ninh   30
7     DTNH      Female   25    [...]        30
• Training data: see Table 3.
• Recognition database 1: 3411 audio files containing utterances of the numbers 0 to 10 by the 7 voices in the training set (the audio files used to train the models are not reused in the recognition step).
• Recognition database 2: 1466 audio files containing utterances by 6 voices that never took part in training (Table 4).
• The results for the two models, syllable level and semi-syllable level, are given in Table 5.
Table 4: Recognition database 2

No.   Speaker   Gender   Age   Accent
1     NBA       Male     34    Thái Bình
2     NTH       Female   31    Thái Bình
3     PVH       Male     29    Thái Nguyên
4     NTN       Female   25    Thái Nguyên
5     NLT       Male     23    Hải Dương
6     LTV       Male     23    Hải Dương

Table 5: Recognition results on the 2 databases

Model type      Recognition database     Accuracy (%)
Syllable        1 (trained voices)       [...]
Syllable        2 (untrained voices)     96.18
Semi-syllable   1 (trained voices)       [...]
Semi-syllable   2 (untrained voices)     95.09
Although the recognition results are fairly high in all the experiments, the syllable-level model (which in this case is the word level) gives higher accuracy than the model below the word level (here the semi-syllable model). The reason is that syllable-level models are highly robust: they cover both the coarticulation of the phonemes that make up a syllable and the variation of a phoneme across different syllables. For recognition applications with a small vocabulary, the syllable-level model should therefore be used.
5 CONCLUSIONS AND FUTURE WORK

The experimental results above show that a speech recognition system based on subword hidden Markov models, specifically semi-syllable models, gives good results in both cases of isolated-word Vietnamese speech recognition: speaker-dependent recognition with a large vocabulary, and speaker-independent recognition with a small vocabulary.

Recognition errors still occur in some cases. They are due partly to the limited amount of training data we were able to collect, and partly to program parameters that may not have been chosen optimally. The accuracy of the system can therefore be improved by collecting much more training data and by fine-tuning the parameter values.
Although the vocabulary the system can currently recognize is already fairly large, it covers only toneless Vietnamese single words. In future work we will continue to collect audio samples with tones so that enough models can be built to enlarge the vocabulary, with the goal of recognizing all Vietnamese single words. This is feasible because, as shown in section 2.2, a little over 2000 semi-syllables are enough to model every Vietnamese syllable.
Another direction is to combine the system's existing recognition of toneless single words with a tone recognition module, in order to build a system that can recognize all Vietnamese single words. The most obvious advantage of this direction is that it does not require many models. It will, however, require further study of the relation between tonal single words and their toneless counterparts.
References

[1] C. Becchetti and L. P. Ricotti: "Speech Recognition: Theory and C++ Implementation", John Wiley & Sons Ltd, 1999, pp. 309-320.
[2] S. Young et al.: "The HTK Book (for HTK version 3.2.1)", Microsoft Corporation, 2002, pp. 129-133.
[3] Nguyễn Phú Bình, Trịnh Văn Loan, Eric Castelli: "Hệ thống xử lý thời gian thực nhận dạng các từ tiếng Việt phát âm rời", Kỷ yếu hội thảo ICT.rda'03, Hà Nội, 2003, pp. 310-316.
[4] Nguyễn Phú Bình: "Nhận dạng tiếng nói tiếng Việt sử dụng mức dưới từ", Luận văn thạc sỹ Công nghệ thông tin, ĐHBK Hà Nội, 11/2004.
[5] Đặng Ngọc Đức, Lương Chi Mai: "Tăng cường độ chính xác của hệ thống mạng neuron nhận dạng tiếng Việt", Tạp chí Bưu chính Viễn thông, Số 11, 3/2004, pp. 75-81.
[6] Ngô Hoàng Huy, Lương Chi Mai, Bùi Quang Trung, Nguyễn Thị Thanh Mai, Vũ Kim Bảng, Vũ Thị Hải Hà: "Thiết kế các hệ thống nhận dạng tiếng Việt trong thời gian thực", Kỷ yếu hội thảo FAIR, 2003, pp. 349-357.
[7] Báo cáo đề tài nhánh "Tương tác người-máy dùng tiếng nói", đề tài cấp nhà nước mã số KC.01.09: "Nghiên cứu phát triển và ứng dụng các kỹ thuật tương tác người-máy và hệ thống tiên tiến", Khoa CNTT, Trường ĐHBK Hà Nội, 2004.
[8] Nguyễn Thành Phúc: "Một phương pháp nhận dạng lời Việt: Áp dụng phương pháp kết hợp mạng nơ-ron với mô hình Markov ẩn cho các hệ thống nhận dạng lời Việt", Luận án tiến sĩ kỹ thuật, ĐHBK Hà Nội, 2000.
[9] Nguyễn Hồng Quang, Trịnh Văn Loan: "Nhận dạng tiếng nói tiếng Việt phát âm liên tục", Kỷ yếu hội thảo ICT.rda'04, Hà Nội, 2004.
About the authors:

Nguyễn Phú Bình, MSc, graduated in Information Technology in 2002 and defended his master's thesis in Information Processing and Communication in 2004 at Hanoi University of Technology (Đại học Bách Khoa Hà Nội). Since 2002 he has worked at the Faculty of Information Technology, Hanoi University of Technology. Main research interests: speech processing, e-Learning, and techniques for extracting information from the Web.
E-mail: binhnp@it-hut.edu.vn

Dr. Trịnh [...] received his doctorate [...] at [...] (France). Since [...], Dr. Loan has [...] Hanoi University of Technology, where he is currently head of the [...] computer department, Faculty of Information Technology [...]. Dr. Loan's current research interests are speech processing and signal processing.
E-mail: loantv@it-hut.edu.vn