國立虎尾科技大學 |

Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks.

紀錄類型:	書目-語言資料,手稿 : Monograph/item
正題名/作者:	Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks./
作者:	Qiu, Liang.
面頁冊數:	1 online resource (46 pages)
附註:	Source: Masters Abstracts International, Volume: 57-06.
Contained By:	Masters Abstracts International57-06(E).
標題:	Electrical engineering. -
電子資源:	click for full text (PQDT)
ISBN:	9780438068957

Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks.
Qiu, Liang.

Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks. - 1 online resource (46 pages)

Source: Masters Abstracts International, Volume: 57-06.

Thesis (M.S.)--University of California, Los Angeles, 2018.

Includes bibliographical references

Non-linguistic Vocalization Recognition refers to the detection and classification of non-speech voice such as laughter, sneeze, cough, cry, screaming, etc. It could be seen as a subtask of Acoustic Event Detection (AED). Great progress has been made by previous research to increase the accuracy of AED. On the front end, multiple kinds of features such as Mel-Frequency Cepstral Coefficients (MFCCs), Gammatone Cepstral Coefficients (GTCCs) and many other hand-crafted features were explored. While on the back end, models or methods such as Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Bags-of-Audio-Words (BoAW), Support Vector Machine (SVM) and various types of neural networks were experimented.

Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018

Mode of access: World Wide Web

ISBN: 9780438068957Subjects--Topical Terms:

596380
Electrical engineering.
Index Terms--Genre/Form:

554714
Electronic books.

Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks.
LDR:03073ntm a2200373Ki 4500 001 916927
005 20180928111503.5
006 m o u
007 cr mn||||a|a||
008 190606s2018 xx obm 000 0 eng d
020 $a 9780438068957
035 $a (MiAaPQ)AAI10829436
035 $a (MiAaPQ)ucla:16997
035 $a AAI10829436
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Qiu, Liang. $3 1190800
245 1 0 $a Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks.
264 0 $c 2018
300 $a 1 online resource (46 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 57-06.
500 $a Adviser: Lei He.
502 $a Thesis (M.S.)--University of California, Los Angeles, 2018.
504 $a Includes bibliographical references
520 $a Non-linguistic Vocalization Recognition refers to the detection and classification of non-speech voice such as laughter, sneeze, cough, cry, screaming, etc. It could be seen as a subtask of Acoustic Event Detection (AED). Great progress has been made by previous research to increase the accuracy of AED. On the front end, multiple kinds of features such as Mel-Frequency Cepstral Coefficients (MFCCs), Gammatone Cepstral Coefficients (GTCCs) and many other hand-crafted features were explored. While on the back end, models or methods such as Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Bags-of-Audio-Words (BoAW), Support Vector Machine (SVM) and various types of neural networks were experimented.
520 $a Recent researches on Automatic Speech Recognition (ASR) and Acoustic Scene Classification (ASC) show the advantage of using Convolutional, Long Short-Term Memory, Deep Neural Networks (CLDNNs) on audio processing tasks. In this thesis, I am building a non-linguistic vocalization recognition system using CLDNNs. Log Mel-filterbank coefficients are adopted as input features and data augmentation methods such as random shifting and noise mixture are discussed. The built system is evaluated on a custom dataset collected from several resources and tested for real time application. The performance of CLDNNs for non-linguistic vocalization recognition is also compared with hybrid GMM-SVMs, Convolutional Neural Networks, Long Short-Term Memory and a fully connected Deep Neural Network trained on VGGish embeddings.
520 $a The results indicate that CLDNNs outperform the other models in classification precision and recall. Visualization of CLDNNs are presented to help understand the framework. The model is proved accurate and fast enough for real time applications.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Electrical engineering. $3 596380
650 4 $a Artificial intelligence. $3 559380
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0544
690 $a 0800
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a University of California, Los Angeles. $b Electrical Engineering 0303. $3 845570
773 0 $t Masters Abstracts International $g 57-06(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10829436 $z click for full text (PQDT)