Rochester Institute of Technology.
Automatic Video Captioning using Deep Neural Network.
Record Type: Language materials, manuscript : Monograph/item
Title/Author: Automatic Video Captioning using Deep Neural Network./
Author: Nguyen, Thang Huy.
Description: 1 online resource (90 pages)
Notes: Source: Masters Abstracts International, Volume: 56-06.
Contained By: Masters Abstracts International 56-06(E).
Subject: Computer engineering.
Online resource: click for full text (PQDT)
ISBN: 9780355160499
Thesis (M.S.)--Rochester Institute of Technology, 2017.
Includes bibliographical references
Video understanding has become increasingly important as surveillance, social, and informational videos weave themselves into our everyday lives. Video captioning offers a simple way to summarize, index, and search this data. Most video captioning models use a video encoder and caption decoder framework. Hierarchical encoders can capture abstract clip-level temporal features to represent a video, but the clips are defined at fixed time steps. This thesis introduces two models: a hierarchical model with steered captioning, and a Multi-stream Hierarchical Boundary model. The steered captioning model is the first to use visual attributes to guide an attention model to appropriate locations in a video. The Multi-stream Hierarchical Boundary model combines a fixed-hierarchy recurrent architecture with a soft-hierarchy layer, using intrinsic feature boundary cuts within a video to define clips. This thesis also introduces a novel parametric Gaussian attention that removes a restriction of soft attention techniques, which require fixed-length video streams. By carefully incorporating Gaussian attention in designated layers, the proposed models demonstrate state-of-the-art video captioning results on recent datasets.
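The abstract's point that a parametric Gaussian attention does not need fixed-length video streams can be illustrated with a minimal sketch. This record does not give the thesis's actual formulation, so everything below is an assumption made for illustration: the function names `gaussian_attention_weights` and `attend`, the parameters `mu` (center) and `sigma` (width), and the choice to express frame positions on a relative 0-to-1 scale. The idea shown is only that two parameters describe the attention window, so the same parameters apply to a video of any length.

```python
import math


def gaussian_attention_weights(num_frames, mu, sigma):
    """Normalized Gaussian weights over frames (illustrative sketch only).

    mu and sigma are hypothetical parameters on a relative [0, 1] time axis;
    they are not taken from the thesis.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Map frame index t to a relative position in [0, 1] so that mu and sigma
    # mean the same thing for short and long videos.
    denom = max(num_frames - 1, 1)
    logits = [-((t / denom - mu) ** 2) / (2.0 * sigma ** 2)
              for t in range(num_frames)]
    # Subtract the largest logit before exponentiating to avoid underflow.
    peak = max(logits)
    scores = [math.exp(value - peak) for value in logits]
    total = sum(scores)
    return [score / total for score in scores]


def attend(features, mu, sigma):
    """Weighted average of per-frame feature vectors under the Gaussian window."""
    weights = gaussian_attention_weights(len(features), mu, sigma)
    dim = len(features[0])
    return [sum(w * frame[d] for w, frame in zip(weights, features))
            for d in range(dim)]


# The same (mu, sigma) pair works for a 5-frame and a 50-frame video.
short_video = [[float(t), 1.0] for t in range(5)]
long_video = [[float(t), 1.0] for t in range(50)]
short_context = attend(short_video, mu=0.5, sigma=0.2)
long_context = attend(long_video, mu=0.5, sigma=0.2)
```

By contrast, a soft attention layer that learns one score per frame position has its size tied to the number of frames, which is the fixed-length restriction the abstract refers to.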
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018
Mode of access: World Wide Web
ISBN: 9780355160499
Subjects--Topical Terms: 569006 Computer engineering.
Index Terms--Genre/Form: 554714 Electronic books.
Automatic Video Captioning using Deep Neural Network.
LDR 02432ntm a2200325Ki 4500
001 918321
005 20181114145235.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355160499
035 $a (MiAaPQ)AAI10618993
035 $a (MiAaPQ)rit:12735
035 $a AAI10618993
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Nguyen, Thang Huy. $3 1192612
245 1 0 $a Automatic Video Captioning using Deep Neural Network.
264 0 $c 2017
300 $a 1 online resource (90 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 56-06.
500 $a Adviser: Raymond Ptucha.
502 $a Thesis (M.S.)--Rochester Institute of Technology, 2017.
504 $a Includes bibliographical references
520 $a Video understanding has become increasingly important as surveillance, social, and informational videos weave themselves into our everyday lives. Video captioning offers a simple way to summarize, index, and search this data. Most video captioning models use a video encoder and caption decoder framework. Hierarchical encoders can capture abstract clip-level temporal features to represent a video, but the clips are defined at fixed time steps. This thesis introduces two models: a hierarchical model with steered captioning, and a Multi-stream Hierarchical Boundary model. The steered captioning model is the first to use visual attributes to guide an attention model to appropriate locations in a video. The Multi-stream Hierarchical Boundary model combines a fixed-hierarchy recurrent architecture with a soft-hierarchy layer, using intrinsic feature boundary cuts within a video to define clips. This thesis also introduces a novel parametric Gaussian attention that removes a restriction of soft attention techniques, which require fixed-length video streams. By carefully incorporating Gaussian attention in designated layers, the proposed models demonstrate state-of-the-art video captioning results on recent datasets.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer engineering. $3 569006
655 7 $a Electronic books. $2 local $3 554714
690 $a 0464
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Rochester Institute of Technology. $b Computer Engineering. $3 1184443
773 0 $t Masters Abstracts International $g 56-06(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10618993 $z click for full text (PQDT)