Complement FST based pronunciation lexicon expansion for speech recognition

doi:10.11805/TKYDA201703.0480

Funding: National Natural Science Foundation of China (No. 61673395, No. 61403415, No. 61302107); Natural Science Foundation of Henan Province (No. 162300410331)

Abstract:

The pronunciation lexicon is an essential component of an Automatic Speech Recognition (ASR) system. A lexicon with too small a vocabulary leads to a high Out-Of-Vocabulary (OOV) rate and degrades recognition performance. A novel method is proposed to expand the lexicon automatically: rather than relying on a large text corpus to discover new words, it recovers OOV words from their pronunciations. First, the complement of the Finite State Transducer (FST) representation of the lexicon is combined with phoneme-to-grapheme (P2G) conversion to obtain new word-pronunciation pairs. Then, a two-stage verification strategy, consisting of pronunciation verification and word verification, filters out erroneous entries. Finally, the learned new words are incorporated into the Language Model (LM) by linearly interpolating the base LM with a new LM trained on crawled texts. The method is evaluated on continuous speech recognition tasks in English and Czech. Experiments show that lexicon expansion effectively reduces the OOV rate. For the English Large Vocabulary Continuous Speech Recognition (LVCSR) system, the Word Error Rate (WER) improves by about 9% relative to the baseline system, and keyword search performance, measured by the Actual Term-Weighted Value (ATWV), improves by about 9.7%; for the Czech system the corresponding improvements are 2.3% and 10.0%.
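The final step, linear interpolation of a base LM with an LM trained on new text, can be illustrated with a minimal sketch. It assumes toy unigram models stored as dictionaries and a hypothetical interpolation weight `lam`; the function name, the probabilities and the weight are illustrative only and are not taken from the paper, whose models and weights are not given here.

```python
def interpolate(base_lm, new_lm, lam):
    """Linearly interpolate two unigram models: lam * base + (1 - lam) * new.

    base_lm, new_lm: dicts mapping word -> probability (toy, hypothetical data).
    lam: weight of the base model, in [0, 1] (hypothetical value).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # A word missing from one model contributes probability 0 from that model.
    vocab = set(base_lm) | set(new_lm)
    return {
        w: lam * base_lm.get(w, 0.0) + (1.0 - lam) * new_lm.get(w, 0.0)
        for w in vocab
    }


# Toy example: "transducer" is out of vocabulary for the base model.
base_lm = {"speech": 0.5, "lexicon": 0.5}
new_lm = {"lexicon": 0.25, "transducer": 0.75}
mixed = interpolate(base_lm, new_lm, lam=0.8)
```

In this toy setting the word that was absent from the base model receives non-zero probability after interpolation, while the mixture still sums to one because both inputs do.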

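Cite this article

Shu Fan, Qu Dan, Fan Zhengguang, Zhou Lili, Zhang Wenlin. Complement FST based pronunciation lexicon expansion for speech recognition[J]. Journal of Terahertz Science and Electronic Information Technology, 2017, 15(3): 480-488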

History

Received: 2016-11-29
Last revised: 2017-03-07
Published online: 2017-07-03
