Vishnubhotla, Srikanth.
Segregation of speech signals in noisy environments.
Record type:
Bibliographic - language material, printed : Monograph/item
Title proper/Author:
Segregation of speech signals in noisy environments.
Author:
Vishnubhotla, Srikanth.
Physical description:
110 p.
Notes:
Source: Dissertation Abstracts International, Volume: 72-09, Section: B, page: 5490.
Contained By:
Dissertation Abstracts International 72-09B.
Subject:
Engineering, Electronics and Electrical.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3461249
ISBN:
9781124743950
Thesis (Ph.D.)--University of Maryland, College Park, 2011.
Automatic segregation of overlapping speech signals from single-channel recordings is a challenging problem in speech processing. Similarly, extracting speech signals from noisy speech has attracted a variety of research for several years but remains unsolved. Speech extraction from noisy mixtures, where the background interference could be either speech or noise, is especially difficult when the task is to preserve perceptually salient properties of the recovered acoustic signals for use in human communication. In this work, we propose a speech segregation algorithm that can simultaneously deal with both background noise and interfering speech. We propose a feature-based, bottom-up algorithm which makes no assumptions about the nature of the interference and does not rely on any previously trained source models for speech extraction. As such, the algorithm should be applicable to a wide variety of problems, and should also be useful for human communication, since an aim of the system is to recover the target speech signals in the acoustic domain. The proposed algorithm can be compartmentalized into (1) a multi-pitch detection stage which extracts the pitch of the participating speakers, (2) a segregation stage which teases apart the harmonics of the participating sources, (3) a reliability and add-back stage which scales the estimates based on their reliability and adds back appropriate amounts of aperiodic energy for the unvoiced regions of speech, and (4) a speaker assignment stage which assigns the extracted speech signals to their respective sources. The pitch of two overlapping speakers is extracted using a novel feature, the 2-D Average Magnitude Difference Function, which is also capable of giving a single pitch estimate when the input contains only one speaker.
The segregation algorithm is based on a least squares framework that relies on the estimated pitch values to give estimates of each speaker's contribution to the mixture. The reliability block is based on a non-linear function of the energy of the estimates; this function was learnt from a variety of speech and noise data but is generic in nature and applicable to different databases. With both single- and multiple-pitch extraction and segregation capabilities, the proposed algorithm is amenable to both speech-in-speech and speech-in-noise conditions. The algorithm is evaluated on several objective and subjective tests using both speech and noise interference from different databases. The proposed speech segregation system demonstrates performance comparable to or better than the state of the art on most of the objective tasks. Subjective tests on the speech signals reconstructed by the algorithm, conducted with normal-hearing listeners as well as users of hearing aids, indicate a significant improvement in the perceptual quality of the speech signal after processing by the proposed algorithm, and suggest that the algorithm can be used as a pre-processing block within the signal processing of communication devices. The utility of the algorithm for both perceptual and automatic tasks, based on a single-channel solution, makes it a unique speech extraction tool and a first of its kind in contemporary technology.
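The multi-pitch detection idea in the abstract can be illustrated with a small sketch. This record does not give the dissertation's definition of the 2-D Average Magnitude Difference Function, so the two-lag form used below (the mean of |x[n] - x[n-k] - x[n-l] + x[n-k-l]|, i.e. two cascaded comb filters) is an assumption, and the function names, the synthetic test signal, and the lag range are all hypothetical. The point is only that such a function drops to nearly zero when the two lags match the periods of the two overlapping periodic sources.

```python
import math


def amdf_2d(x, k, l):
    """Assumed two-lag AMDF: mean |x[n] - x[n-k] - x[n-l] + x[n-k-l]|.

    This is an illustrative definition, not necessarily the one used
    in the dissertation.
    """
    start = k + l  # first index where all four samples exist
    total = 0.0
    for n in range(start, len(x)):
        total += abs(x[n] - x[n - k] - x[n - l] + x[n - k - l])
    return total / (len(x) - start)


def estimate_two_periods(x, min_lag, max_lag):
    """Exhaustively search lag pairs (k <= l) for the smallest amdf_2d."""
    best = None
    for k in range(min_lag, max_lag + 1):
        for l in range(k, max_lag + 1):
            score = amdf_2d(x, k, l)
            if best is None or score < best[0]:
                best = (score, k, l)
    return best[1], best[2]


def periodic_source(period, length, num_harmonics=3):
    """Hypothetical test source: a few harmonics, exactly periodic in samples."""
    one_period = [
        sum(math.sin(2 * math.pi * h * n / period) / h
            for h in range(1, num_harmonics + 1))
        for n in range(period)
    ]
    return [one_period[n % period] for n in range(length)]


# Synthetic two-source mixture with periods of 40 and 57 samples.
mixture = [a + b for a, b in zip(periodic_source(40, 400),
                                 periodic_source(57, 400))]

# The lag range is chosen to exclude multiples of either period.
periods = estimate_two_periods(mixture, 30, 70)
print(periods)
```

On this synthetic mixture the search should return the two true periods. Real speech is only quasi-periodic and the search range would normally include period multiples, so a practical detector would need additional logic that this sketch omits.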
LDR 04372nam 2200313 4500
001 712931
005 20121003100301.5
008 121101s2011 ||||||||||||||||| ||eng d
020 $a 9781124743950
035 $a (UMI)AAI3461249
035 $a AAI3461249
040 $a UMI $c UMI
100 1 $a Vishnubhotla, Srikanth. $3 845417
245 1 0 $a Segregation of speech signals in noisy environments.
300 $a 110 p.
500 $a Source: Dissertation Abstracts International, Volume: 72-09, Section: B, page: 5490.
500 $a Adviser: Carol Y. Espy-Wilson.
502 $a Thesis (Ph.D.)--University of Maryland, College Park, 2011.
590 $a School code: 0117.
650 4 $a Engineering, Electronics and Electrical. $3 845382
690 $a 0544
710 2 $a University of Maryland, College Park. $b Electrical Engineering. $3 845418
773 0 $t Dissertation Abstracts International $g 72-09B.
790 1 0 $a Espy-Wilson, Carol Y., $e advisor
790 1 0 $a Shamma, Shihab $e committee member
790 1 0 $a Chellappa, Rama $e committee member
790 1 0 $a Liu, Ray $e committee member
790 1 0 $a Idsardi, William J. $e committee member
790 $a 0117
791 $a Ph.D.
792 $a 2011
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3461249