楊茜 袁奧航 胡歡 袁玉 劉鈞鵬 朱奕 阮先玉
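Abstract: With the advance of globalization, cultural exchange between China and other countries has become increasingly frequent, and demand for English-language video has risen sharply. The application of AI speech recognition has greatly promoted innovation in the language industry. To explore whether AI speech recognition can be applied to traditional audio transcription (聽譯) work, this paper runs three tools that support speech recognition, iFlytek Tingjian (訊飛聽見), Tencent Cloud (騰訊云), and Sogou Tingxie (搜狗聽寫), side by side and gives a preliminary analysis and summary of manually transcribed text versus AI-recognized text. We find that AI speech recognition takes less time than manual transcription but that its accuracy still needs improvement, and we propose ideas and methods for combining the strengths of the two.
Keywords: audio transcription; AI speech recognition; speech-to-text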
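Guided by the "bringing in" and "going global" strategies, demand for English-language video in China keeps growing. Audio transcription (聽譯) means transcribing and recognizing the original spoken text of an audio or video recording so that the recording can subsequently be translated. Traditional manual transcription relies on human extraction, places high demands on the transcriber, and is strongly affected by human factors. As artificial intelligence matures, AI speech recognition has gained wider acceptance for recognition and dictation. In August 2017, Microsoft announced that the accuracy of its speech recognition system had risen from 94.1% to 94.9%, higher than that of some professional transcribers. However, the accuracy of speech feature extraction and the stability of recognition still need substantial improvement.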
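1. Characteristics and Problems of Traditional Manual Transcription
Audio transcription is a special type of speech recognition and conversion. It is written, immediate, synchronous, and cross-cultural. When English-language video is transcribed, there is no source-language script to refer to, so converting audio into written text requires strong listening discrimination from the transcriber. The spoken source in English audio, however, is colloquial, non-standard, and hard to recognize, which makes it difficult for the transcriber to make out.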
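2. Analysis and Comparison of AI Speech Recognition and Manual Transcription
The audio and video materials were all drawn from TED talks, BBC News, and clips from well-known films. The AI tools were iFlytek Tingjian (訊飛聽見), Tencent Cloud (騰訊云), and Sogou Tingxie (搜狗聽寫), three products that support AI speech recognition (speech-to-text).
2.1 Time Taken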
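Taking the TED talk "How to Learn a Foreign Language Well" (《如何學好外語》) as an example, manual transcription took an average of 1 hour 37 minutes 27 seconds (1:37:27), while the three AI tools took an average of 11 minutes 9 seconds (11:09); the AI tools recognize speech and generate text almost in sync with the original video. To extend the comparison, the authors had transcribers manually transcribe 50 different audio files and recorded the time taken. Manual transcription took 3 to 14 times as long as the AI tools, and the multiple was positively correlated with the length and difficulty of the source audio. In terms of time taken, the AI tools show a clear advantage.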
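As a quick check of the figures for the TED example, the two reported averages can be converted to seconds and divided. This is only an arithmetic sketch; the helper name `to_seconds` is a hypothetical one introduced here, not something from the study.

```python
def to_seconds(clock: str) -> int:
    """Convert an 'H:MM:SS' or 'MM:SS' string to a number of seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Average times reported for the TED example above.
manual = to_seconds("1:37:27")  # manual transcription
ai = to_seconds("11:09")        # mean of the three AI tools

ratio = manual / ai
print(f"manual / AI = {ratio:.1f}x")  # prints "manual / AI = 8.7x"
```

The resulting multiple of roughly 8.7 falls inside the 3 to 14 range reported for the 50-file comparison.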
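2.2 Accent Correction
When transcribing manually, a transcriber can replay heavily accented audio as many times as needed until the final text is accurate. Most speech recognition software, however, assumes standard American or British pronunciation by default and has difficulty recognizing some accented audio.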
Example 1:
Human: ...talking about how this problem is being addressed...
Sogou/Tencent: ...talking about how this problem is being dangerous...
Example 2:
Human: ... after the third season, seriously, the dialogue started to make sense...
Sogou: ... after they turn a season, seriously, the dialogue started to make sense...
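The materials above were all taken from audio spoken in Indian English. Because Indian English differs from American and British English in both vowels and consonants, the AI tools fail to recognize some pronunciations accurately, which introduces serious errors into the exported text.
2.3 Sentence Segmentation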
Example 1:
Human: A pentagon official said this was to provide president Obama with flexibility.
Tencent: A pentagon official said this was to provide president Obama with flexibility should military options be required to protect American lives and interests.
Example 2:
Human: ...people don't listen to them. Why is that?
Sogou: ...people don't listen to them and why is that?
Tencent: ...people don't listen to them why is that?
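Affected by the speaking rate and stress patterns of the source audio, the AI tools cannot segment sentences as accurately as a human transcriber. Overall, however, segmentation errors made up a small share across the 50 audio files; in the great majority of cases the AI tools segmented the source audio fairly accurately.
2.4 Overall Accuracy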
Example 1:
Human: It's the instrument we all play. It's the most powerful sound in the world. Probably it's the only one that can start a war or say, I love you.
iFlytek: It's the instrument we all play. Probably see anyone that can start a war or say, I love you.
Sogou: It's the most powerful instrument we'll play. It's the most powerful sound in the world. Probably it's the only one that can start a war or say, I love you.
Tencent: Voice instrument we will play its most powerful sound a world probably any one can start a war or say I love you.
Example 2:
Human: Oh no, I can't leave you. I promised I would put your photo up. I promised you would see Coco.
iFlytek: Oh no, I can't leave you. I promised I put your photo up. I promise you would see Coco.
Sogou: It's almost sunrise. Leave you.
Tencent: Oh no, I can't leave you. I promised I'd put your phone up. I promised you would see Coco.
Example 3:
Human: Remember me though I have to say goodbye. Don't let it make you cry. Forever if I'm far away. Look, I sing secret song to you. Each time you hear sad guitar. Know that I'm with you. The only way that I can be until you're in my arm again.
iFlytek: Remember be so I have to travel for Free man army each time you hear cent town with you noise to noise noise yeah yeah noise yeah.
Sogou: Remember be so I have to travel for Free man army each time you hear cent town with you noise to noise yeah noise yeah.
Tencent: real number me! Do I have to say goodbye do not let it make you cry far away. I sings secret song to you. Each time you hear sand it are. The only way that I can be until you're in my arm again.
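During recognition, the AI tools insert words that were never spoken, drop words that were, fail to resolve linked speech, and sometimes cannot recognize parts of a passage at all, so the recognized text matches the source speech poorly. Manual transcription relies mainly on the transcriber's expertise: it takes a long time, but unclear passages can be replayed repeatedly, so the result matches the source speech closely and is more accurate than the output of the AI tools.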
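One common way to quantify the inserted, dropped, and misrecognized words described above is word error rate (WER): the word-level edit distance between a reference transcript and the recognized text, divided by the reference length. The sketch below is an illustration only; the paper does not say how accuracy was scored, the function name `word_error_rate` is hypothetical, and the human transcript is assumed to serve as the reference.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a spoken word was dropped
                curr[j - 1] + 1,     # insertion: an extra word was added
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Example 1 from Section 2.2: one substituted word out of eight.
human = "talking about how this problem is being addressed"
sogou = "talking about how this problem is being dangerous"
print(word_error_rate(human, sogou))  # prints 0.125
```

Punctuation is left attached to words here, so transcripts that differ only in punctuation would also be counted as errors; stripping it first is a reasonable variation when only word recognition is of interest.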
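3. Conclusion
Subtitle transcription is subject to more constraints than text translation. By analyzing and comparing manual transcription with AI speech recognition, the authors found that human transcribers are better at ensuring correct sentence segmentation, accent correction, and overall accuracy, but that manual work takes longer, carries a heavy workload, and demands strong language skills from the transcriber. AI speech recognition, despite the problems currently inherent in the software, has on the whole reached a good level and recognizes the source audio fairly accurately. This suggests that in future transcription work, transcribers can try using the AI-recognized text as a draft for a further careful listening pass, combining AI speech recognition with traditional transcription and adopting more flexible strategies and methods to complete the work faster and more accurately.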
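One possible way to put the draft-then-relisten idea into practice is to compare the output of two engines and send only the spans where they disagree back to the transcriber. The sketch below uses Python's standard `difflib.SequenceMatcher` for the alignment. It is an assumption about how such a workflow could look, not the procedure used in this study, and the function name `disagreements` is hypothetical.

```python
from difflib import SequenceMatcher


def disagreements(text_a: str, text_b: str) -> list[tuple[list[str], list[str]]]:
    """Return the word spans where two transcripts of the same audio differ."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    spans = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            # Keep both readings so the transcriber can check the audio.
            spans.append((words_a[a_start:a_end], words_b[b_start:b_end]))
    return spans


# Example 2 from Section 2.4, as recognized by two of the tools.
iflytek = "Oh no, I can't leave you. I promised I put your photo up. I promise you would see Coco."
tencent = "Oh no, I can't leave you. I promised I'd put your phone up. I promised you would see Coco."

for span_a, span_b in disagreements(iflytek, tencent):
    print(" ".join(span_a), "<->", " ".join(span_b))
```

Agreement between engines does not guarantee correctness (in Example 3, two tools produce nearly the same wrong text), so this kind of filter narrows the relistening work rather than replacing a full pass.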
References
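[1] 林明月, 耿磊. A brief analysis of the characteristics of subtitle translation [J]. 明日風尚, 2016(18): 282.
[2] 路雅芝. Subtitle transcription from the perspective of functional equivalence theory: the case of cross-language interview programs [J]. 校園英語, 2019(14): 229-230.
[3] 艾朝陽, 周祎, 李紅. Boundary conditions of English-Chinese interpreting barriers in second language acquisition: an analysis of listening discrimination barriers in Indian English [J]. 教育現代化, 2015(13): 71-75.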