Rochester Institute of Technology.
Automatic Video Captioning using Deep Neural Network.
Record type: Bibliographic - language material, manuscript : Monograph/item
Title/Author: Automatic Video Captioning using Deep Neural Network.
Author: Nguyen, Thang Huy.
Physical description: 1 online resource (90 pages)
Notes: Source: Masters Abstracts International, Volume: 56-06.
Contained by: Masters Abstracts International 56-06(E).
Subject: Computer engineering.
Electronic resource: full text (PQDT)
ISBN: 9780355160499
Thesis (M.S.)--Rochester Institute of Technology, 2017.
Includes bibliographical references
Video understanding has become increasingly important as surveillance, social, and informational videos weave themselves into our everyday lives. Video captioning offers a simple way to summarize, index, and search the data. Most video captioning models utilize a video encoder and captioning decoder framework. Hierarchical encoders can abstractly capture clip level temporal features to represent a video, but the clips are at fixed time steps. This thesis research introduces two models: a hierarchical model with steered captioning, and a Multi-stream Hierarchical Boundary model. The steered captioning model is the first attention model to smartly guide an attention model to appropriate locations in a video by using visual attributes. The Multi-stream Hierarchical Boundary model combines a fixed hierarchy recurrent architecture with a soft hierarchy layer by using intrinsic feature boundary cuts within a video to define clips. This thesis also introduces a novel parametric Gaussian attention which removes the restriction of soft attention techniques which require fixed length video streams. By carefully incorporating Gaussian attention in designated layers, the proposed models demonstrate state-of-the-art video captioning results on recent datasets.
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018
Mode of access: World Wide Web
ISBN: 9780355160499
Subjects--Topical Terms: 569006 Computer engineering.
Index Terms--Genre/Form: 554714 Electronic books.
LDR 02432ntm a2200325Ki 4500
001 918321
005 20181114145235.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355160499
035 $a (MiAaPQ)AAI10618993
035 $a (MiAaPQ)rit:12735
035 $a AAI10618993
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Nguyen, Thang Huy. $3 1192612
245 1 0 $a Automatic Video Captioning using Deep Neural Network.
264 0 $c 2017
300 $a 1 online resource (90 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 56-06.
500 $a Adviser: Raymond Ptucha.
502 $a Thesis (M.S.)--Rochester Institute of Technology, 2017.
504 $a Includes bibliographical references
520 $a Video understanding has become increasingly important as surveillance, social, and informational videos weave themselves into our everyday lives. Video captioning offers a simple way to summarize, index, and search the data. Most video captioning models utilize a video encoder and captioning decoder framework. Hierarchical encoders can abstractly capture clip level temporal features to represent a video, but the clips are at fixed time steps. This thesis research introduces two models: a hierarchical model with steered captioning, and a Multi-stream Hierarchical Boundary model. The steered captioning model is the first attention model to smartly guide an attention model to appropriate locations in a video by using visual attributes. The Multi-stream Hierarchical Boundary model combines a fixed hierarchy recurrent architecture with a soft hierarchy layer by using intrinsic feature boundary cuts within a video to define clips. This thesis also introduces a novel parametric Gaussian attention which removes the restriction of soft attention techniques which require fixed length video streams. By carefully incorporating Gaussian attention in designated layers, the proposed models demonstrate state-of-the-art video captioning results on recent datasets.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer engineering. $3 569006
655 7 $a Electronic books. $2 local $3 554714
690 $a 0464
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Rochester Institute of Technology. $b Computer Engineering. $3 1184443
773 0 $t Masters Abstracts International $g 56-06(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10618993 $z click for full text (PQDT)