Human Activity Analysis using Multi-modalities and Deep Learning.

Record Type: Language materials, manuscript : Monograph/item
Title: Human Activity Analysis using Multi-modalities and Deep Learning.
Author: Zhang, Chenyang.
Description: 1 online resource (115 pages)
Notes: Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
Subject: Computer science.
Online resource: click for full text (PQDT)
ISBN: 9781369148558
Thesis (Ph.D.)--The City College of New York, 2016.
Includes bibliographical references.
With the rapid development of video recording devices and sharing platforms, visual media has become a significant part of everyday life. Computer vision and machine learning have become the key technologies for organizing and understanding this tremendous amount of visual data. Among the topics in computer vision research, human activity analysis is one of the most challenging and promising areas; it is dedicated to detecting, recognizing, and understanding the context and meaning of human activities in visual media. This dissertation focuses on two aspects of human activity analysis: 1) how to utilize a multi-modality approach, including depth sensors and traditional RGB cameras, for human action modeling; and 2) how to utilize more advanced machine learning techniques, such as deep learning and sparse coding, to address more sophisticated problems such as attribute learning and automatic video captioning.
Electronic reproduction. Ann Arbor, Mich.: ProQuest, 2018.
Mode of access: World Wide Web.
ISBN: 9781369148558
Subjects--Topical Terms: Computer science.
Index Terms--Genre/Form: Electronic books.
LDR    03115ntm a2200325K 4500
001    915248
005    20180727125211.5
006    m o u
007    cr mn||||a|a||
008    190606s2016 xx obm 000 0 eng d
020    $a 9781369148558
035    $a (MiAaPQ)AAI10159927
035    $a (MiAaPQ)ccny.cuny:10107
035    $a AAI10159927
040    $a MiAaPQ $b eng $c MiAaPQ
100 1  $a Zhang, Chenyang. $3 1188551
245 10 $a Human Activity Analysis using Multi-modalities and Deep Learning.
264  0 $c 2016
300    $a 1 online resource (115 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
500    $a Adviser: Yingli Tian.
502    $a Thesis (Ph.D.)--The City College of New York, 2016.
504    $a Includes bibliographical references.
520    $a With the rapid development of video recording devices and sharing platforms, visual media has become a significant part of everyday life. Computer vision and machine learning have become the key technologies for organizing and understanding this tremendous amount of visual data. Among the topics in computer vision research, human activity analysis is one of the most challenging and promising areas; it is dedicated to detecting, recognizing, and understanding the context and meaning of human activities in visual media. This dissertation focuses on two aspects of human activity analysis: 1) how to utilize a multi-modality approach, including depth sensors and traditional RGB cameras, for human action modeling; and 2) how to utilize more advanced machine learning techniques, such as deep learning and sparse coding, to address more sophisticated problems such as attribute learning and automatic video captioning.
520    $a To explore the utilization of depth cameras, we first present a depth camera-based image descriptor called the histogram of 3D facets (H3DF), its application to human action and hand gesture recognition, and a holistic depth video representation for human actions. To unify inputs from both depth cameras and RGB cameras, this dissertation first discusses a joint framework that models human affect from both facial expressions and body gestures via multi-modality fusion. We then present deep learning-based frameworks for human attribute learning and automatic video captioning. Compared to human action detection and recognition, automatic video captioning is more challenging because it involves complex language models and visual context. Extensive experiments have also been conducted on several public datasets to demonstrate that the frameworks proposed in this dissertation outperform state-of-the-art approaches in this research area.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538    $a Mode of access: World Wide Web
650  4 $a Computer science. $3 573171
655  7 $a Electronic books. $2 local $3 554714
690    $a 0984
710 2  $a ProQuest Information and Learning Co. $3 1178819
710 2  $a The City College of New York. $b Electrical Engineering. $3 1185904
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10159927 $z click for full text (PQDT)