Deep Learning-Based Human Action Understanding in Videos.
Record type:
Language material, manuscript : Monograph/item
Title/Author:
Deep Learning-Based Human Action Understanding in Videos.
Author:
Vahdani, Elahe.
Physical description:
1 online resource (158 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 85-08, Section: B.
Contained By:
Dissertations Abstracts International, 85-08B.
Subject:
Computer engineering.
Electronic resource:
click for full text (PQDT)
ISBN:
9798381681093
LDR    03373ntm a22003737 4500
001    1152026
005    20241125080214.5
006    m o d
007    cr mn ---uuuuu
008    250605s2024 xx obm 000 0 eng d
020    $a 9798381681093
035    $a (MiAaPQ)AAI30990374
035    $a AAI30990374
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Vahdani, Elahe. $3 1478900
245 10 $a Deep Learning-Based Human Action Understanding in Videos.
264  0 $c 2024
300    $a 1 online resource (158 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertations Abstracts International, Volume: 85-08, Section: B.
500    $a Advisor: Tian, Yingli.
502    $a Thesis (Ph.D.)--City University of New York, 2024.
504    $a Includes bibliographical references
520    $a
The understanding of human actions in videos holds immense potential for technological advancement and societal betterment. This thesis explores fundamental aspects of this field, including action recognition in trimmed clips and action localization in untrimmed videos. Trimmed videos contain only one action instance, with moments before or after the action excluded from the video. However, the majority of videos captured in unconstrained environments, often referred to as untrimmed videos, are naturally unsegmented. Untrimmed videos are typically lengthy and may encompass multiple action instances, along with the moments preceding or following each action, as well as transitions between actions. In the task of action recognition in trimmed clips, the primary objective is to classify action categories. In contrast, action detection in untrimmed videos aims to accurately identify the starting and ending moments of actions within untrimmed videos while also assigning the corresponding action labels. Action understanding in videos has significant implications across various sectors. It is invaluable in surveillance for identifying potential threats and in healthcare for monitoring patient movements. Importantly, it serves as an indispensable tool for interpreting sign language, facilitating communication with the deaf and hard-of-hearing community. This research presents innovative frameworks for video-based action recognition and detection. Annotating temporal boundaries and action labels for all action instances in untrimmed videos is a labor-intensive and expensive process. To mitigate the need for exhaustive annotations, this work introduces pioneering frameworks that rely on limited supervision. The proposed models demonstrate significant performance improvements over the current state-of-the-art on benchmark datasets. Furthermore, the applications of action understanding in sign language videos are explored by pioneering automated detection of signing errors. The effectiveness of the models is evaluated on the collected sign language datasets.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538    $a Mode of access: World Wide Web
650  4 $a Computer engineering. $3 569006
650  4 $a Computer science. $3 573171
653    $a Computer vision
653    $a Deep learning
653    $a Video understanding
655  7 $a Electronic books. $2 local $3 554714
690    $a 0984
690    $a 0464
690    $a 0800
710 2  $a City University of New York. $b Computer Science. $3 1184450
710 2  $a ProQuest Information and Learning Co. $3 1178819
773 0  $t Dissertations Abstracts International $g 85-08B.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30990374 $z click for full text (PQDT)
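
The 520 abstract above draws a concrete distinction between action recognition (one label per trimmed clip) and temporal action detection (start/end times plus labels within an untrimmed video). The following minimal Python sketch, with entirely hypothetical names and toy logic not taken from the thesis, illustrates the difference in task interfaces by reducing detection to recognition over sliding windows:

    from dataclasses import dataclass

    @dataclass
    class ActionInstance:
        start: float   # seconds from the beginning of the untrimmed video
        end: float     # seconds; end of the localized action
        label: str     # action category
        score: float   # model confidence

    def recognize(clip: list[float]) -> tuple[str, float]:
        """Action recognition: a trimmed clip holds one action; return (label, score).
        Toy stand-in: 'wave' if mean motion energy is high, else 'background'."""
        energy = sum(clip) / len(clip)
        return ("wave", energy) if energy > 0.5 else ("background", 1.0 - energy)

    def detect(video: list[float], fps: float = 1.0, win: int = 4) -> list[ActionInstance]:
        """Temporal action detection: localize and label every action instance.
        Reduced here to recognition over sliding windows (no NMS, for brevity)."""
        found = []
        for i in range(len(video) - win + 1):
            label, score = recognize(video[i:i + win])
            if label != "background":
                found.append(ActionInstance(i / fps, (i + win) / fps, label, score))
        return found

    # Untrimmed toy "video": per-frame motion energy with one action in the middle.
    video = [0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1]
    print(detect(video))  # overlapping ActionInstance spans around the high-energy frames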
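
The tagged MARC lines above follow a fixed display layout: a three-character tag, an indicator area, then $-prefixed subfields. Here is a small, self-contained Python sketch that splits one such line into its parts; it assumes the human-readable display layout shown above (not binary ISO 2709 MARC), and the helper name is hypothetical:

    import re

    def parse_display_line(line: str) -> dict:
        """Split a tagged display line like '650  4 $a Computer engineering. $3 569006'
        into tag, indicator area, and (code, value) subfield pairs."""
        tag = line[:3]
        body = line[3:]
        first = body.find("$")  # indicators sit before the first '$'
        indicators = (body[:first] if first >= 0 else body).strip()
        subfields = re.findall(r"\$(\w)\s*([^$]*)", body)
        return {
            "tag": tag,
            "indicators": indicators,
            "subfields": [(code, value.strip()) for code, value in subfields],
        }

    print(parse_display_line("650  4 $a Computer engineering. $3 569006"))
    # -> {'tag': '650', 'indicators': '4',
    #     'subfields': [('a', 'Computer engineering.'), ('3', '569006')]}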