Title 3R: Word and Phoneme Edition based Data Augmentation for Lexical Punctuation Prediction
Authors Zheng, Aihua
Ye, Naipeng
Wang, Xiao
Song, Xiao
Affiliation Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei, Peoples R China
Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei, Peoples R China
Peking Univ, Shenzhen Inst, Shenzhen, Peoples R China
Keywords CAPITALIZATION
Issue Date 2020
Publisher 2020 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS 2020)
Abstract Existing lexical punctuation prediction methods are mainly trained on standard clean data and therefore generalize poorly to practical automatic speech recognition (ASR) systems, where transcription errors are ubiquitous. To bridge the gap between clean training data and noisy testing data, we propose three random (3R) data augmentation strategies applied to the training dataset at both the word and phoneme levels: random word deletion (RWD), random word substitution (RWS), and random phoneme edition (RPE). Specifically, we contribute an acoustically similar vocabulary, built with phoneme-level editions, for acoustically similar word substitution. In addition, we are the first to introduce the RoBERTa-large model into the punctuation prediction task, to capture semantics and long-distance dependencies in language. Extensive experiments on the English dataset IWSLT2011 yield a new state-of-the-art result compared with prevalent punctuation prediction methods.
URI http://hdl.handle.net/20.500.11897/619023
ISBN 978-1-6654-0445-7
DOI 10.1109/CIS52066.2020.00009
Indexed CPCI-S(ISTP)
Appears in Collections: Shenzhen Graduate School (pending claim)
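
Illustrative note on the abstract: the word-level strategies it names (RWD and RWS) can be pictured with a minimal sketch like the one below. This is not the authors' implementation. The function names, the probabilities, and the tiny `SIMILAR` table are hypothetical stand-ins; the paper's acoustically similar vocabulary is built from phoneme-level editions, which this sketch does not attempt, and RPE is omitted.

```python
import random

def random_word_deletion(words, p, rng):
    """Drop each word independently with probability p (RWD-style noise)."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    # Keep at least one word so a training sample never becomes empty.
    return kept if kept else [rng.choice(words)]

def random_word_substitution(words, p, similar, rng):
    """Swap a word for a listed sound-alike with probability p (RWS-style noise)."""
    out = []
    for w in words:
        candidates = similar.get(w.lower())
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(w)
    return out

# Hypothetical sound-alike table, for illustration only.
SIMILAR = {"their": ["there"], "to": ["two", "too"], "see": ["sea"]}

rng = random.Random(0)  # fixed seed so the noise is reproducible
sentence = "we want to see their results".split()
noisy = random_word_substitution(
    random_word_deletion(sentence, 0.1, rng), 0.3, SIMILAR, rng
)
print(" ".join(noisy))
```

In a real punctuation prediction pipeline each word carries a punctuation label, so a deletion step would also need to drop or merge the label of the removed word; the sketch leaves labels out for brevity.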


License: See PKU IR operational policies.