Nie, Xiaohan.
Spatial-Temporal Hierarchical Model for Joint Learning and Inference of Human Action and Pose.
Record Type:
Language materials, manuscript : Monograph/item
Title/Author:
Spatial-Temporal Hierarchical Model for Joint Learning and Inference of Human Action and Pose.
Author:
Nie, Xiaohan.
Description:
1 online resource (119 pages)
Notes:
Source: Dissertation Abstracts International, Volume: 78-10(E), Section: B.
Contained By:
Dissertation Abstracts International, 78-10B(E).
Subject:
Statistics. - Computer science. - Artificial intelligence.
Online resource:
click for full text (PQDT)
ISBN:
9781369846980
Spatial-Temporal Hierarchical Model for Joint Learning and Inference of Human Action and Pose.
LDR
:04581ntm a2200409Ki 4500
001
918799
005
20181106104111.5
006
m o u
007
cr mn||||a|a||
008
190606s2017 xx obm 000 0 eng d
020
$a
9781369846980
035
$a
(MiAaPQ)AAI10286374
035
$a
(MiAaPQ)ucla:15475
035
$a
AAI10286374
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
$d
NTU
100
1
$a
Nie, Xiaohan.
$3
1193221
245
1 0
$a
Spatial-Temporal Hierarchical Model for Joint Learning and Inference of Human Action and Pose.
264
0
$c
2017
300
$a
1 online resource (119 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertation Abstracts International, Volume: 78-10(E), Section: B.
500
$a
Adviser: Song-Chun Zhu.
502
$a
Thesis (Ph.D.)--University of California, Los Angeles, 2017.
504
$a
Includes bibliographical references
520
$a
In computer vision, human pose estimation and human action recognition are two classic and particularly important tasks. They often serve as basic preprocessing steps for high-level tasks such as group activity analysis, visual search, and human identification, and they are widely used as key components of real applications such as intelligent surveillance and human-computer interaction systems. Although the two tasks are closely related for understanding human motion, most methods learn separate models and combine them sequentially.
520
$a
In this dissertation, we build systems that pursue a unified framework for integrating the training and inference of human pose estimation and action recognition in a spatial-temporal And-Or Graph (ST-AOG) representation. In particular, we study different ways to achieve this goal:
520
$a
(1) A two-level And-Or Tree structure is used to represent an action as an animated pose template (APT). Each action is a sequence of moving pose templates with transition probabilities. Each pose template consists of a shape template, represented by an And-node capturing part appearance, and a motion template, represented by an Or-node capturing part motions. The transitions between moving pose templates are governed by a Hidden Markov Model. Part locations, pose types, and action labels are estimated jointly during inference.
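To make the HMM machinery above concrete, here is a minimal Viterbi-style decoding sketch over pose templates (Python/NumPy). The per-frame template scores and the transition matrix are hypothetical stand-ins, and the dissertation's actual APT inference also estimates part locations jointly, which this toy decoder omits.

```python
import numpy as np

def viterbi_pose_templates(frame_scores, log_trans):
    """Most likely pose-template sequence under an HMM.

    frame_scores: (T, K) per-frame log-scores of K pose templates
                  (stand-ins for the And-node appearance and Or-node
                  motion scores described above).
    log_trans:    (K, K) log transition probabilities between templates.
    """
    T, K = frame_scores.shape
    dp = np.empty((T, K))
    back = np.zeros((T, K), dtype=int)
    dp[0] = frame_scores[0]
    for t in range(1, T):
        cand = dp[t - 1][:, None] + log_trans  # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        dp[t] = cand.max(axis=0) + frame_scores[t]
    path = [int(dp[-1].argmax())]              # backtrack from the best end state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```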
520
$a
(2) To tackle actions from unknown and unseen views, we present a multi-view spatial-temporal And-Or Graph (MST-AOG) for cross-view action recognition. As a compositional model, the MST-AOG compactly represents the hierarchical combinatorial structures of cross-view actions by explicitly modeling geometry, appearance, and motion variations. Model training takes advantage of 3D human skeleton data obtained from Kinect cameras to avoid annotating video frames, and efficient inference enables action recognition from novel views. A new Multi-view Action3D dataset has been created and released.
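The record says only that training uses Kinect 3D skeletons instead of annotated video frames. One plausible ingredient of such a pipeline, offered here as an assumption rather than as the dissertation's method, is synthesizing 2D skeletons from virtual viewpoints by rotating the 3D skeleton, sketched below with an orthographic projection.

```python
import numpy as np

def project_skeleton(joints_3d, yaw):
    """Rotate a 3D skeleton about the vertical axis and drop depth,
    giving the 2D skeleton seen from a virtual camera at angle `yaw`.
    """
    c, s = np.cos(yaw), np.sin(yaw)
    rot_y = np.array([[  c, 0.0,   s],
                      [0.0, 1.0, 0.0],
                      [ -s, 0.0,   c]])
    return (joints_3d @ rot_y.T)[:, :2]  # keep x, y; discard depth

skeleton = np.random.rand(20, 3)         # placeholder 20-joint Kinect skeleton
virtual_views = [project_skeleton(skeleton, np.deg2rad(a))
                 for a in range(0, 360, 30)]
```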
520
$a
(3) To represent parts, poses, and actions jointly and further improve performance, we represent an action at three scales with an ST-AOG model. Each action is decomposed into poses, which are further divided into mid-level spatial-temporal parts (ST-parts) and then into parts. The hierarchical model structure captures the geometric and appearance variations of pose at each frame, while lateral connections between ST-parts at adjacent frames capture action-specific motions. The model parameters at all three scales are learned discriminatively, and dynamic programming is used for efficient inference. Experiments demonstrate the large benefit of jointly modeling the two tasks.
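The dynamic-programming inference mentioned above can be sketched as a max-sum recursion along the temporal chain. The unary scores (pose evidence pooled from ST-parts and parts) and lateral scores (motion compatibility between adjacent frames) below are random placeholders for the discriminatively learned three-scale scores.

```python
import numpy as np

def action_score(unary, lateral):
    """Score of the best pose sequence under one action model.

    unary:   (T, K) score of pose type k at frame t.
    lateral: (K, K) compatibility of pose types j -> k in adjacent
             frames (the "lateral connections" between ST-parts).
    """
    dp = unary[0].copy()
    for t in range(1, len(unary)):
        dp = (dp[:, None] + lateral).max(axis=0) + unary[t]
    return dp.max()

# recognize by scoring every action's model and taking the best
rng = np.random.default_rng(0)
T, K, A = 8, 4, 3                        # frames, pose types, action classes
models = [(rng.normal(size=(T, K)), rng.normal(size=(K, K))) for _ in range(A)]
predicted_action = max(range(A), key=lambda a: action_score(*models[a]))
```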
520
$a
(4) Last but not least, we study a novel framework for full-body 3D human pose estimation, an essential task for human attention recognition, robot-based human action prediction, and interaction. We build a two-level hierarchy of tree-structured Long Short-Term Memory (LSTM) networks to predict the depth of 2D human joints and then reconstruct the 3D pose. Our two-level model uses two cues for depth prediction: (1) global features from the 2D skeleton, and (2) local features from image patches of body parts.
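A minimal PyTorch sketch of the two-cue idea, with made-up layer sizes and the tree-structured hierarchy simplified to a fixed linear joint ordering; it illustrates combining global and local cues, not the dissertation's actual architecture.

```python
import torch
import torch.nn as nn

class DepthLSTM(nn.Module):
    """Predict one depth value per 2D joint from two cues: a global
    encoding of the full 2D skeleton and a local encoding of
    per-joint image-patch features."""
    def __init__(self, n_joints=17, patch_dim=64, hidden=128):
        super().__init__()
        self.global_fc = nn.Linear(n_joints * 2, hidden)  # 2D-skeleton cue
        self.local_fc = nn.Linear(patch_dim, hidden)      # image-patch cue
        self.lstm = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, joints_2d, patch_feats):
        # joints_2d: (B, J, 2); patch_feats: (B, J, patch_dim)
        B, J, _ = joints_2d.shape
        g = self.global_fc(joints_2d.reshape(B, -1))      # (B, hidden)
        g = g.unsqueeze(1).expand(B, J, -1)               # broadcast to joints
        l = self.local_fc(patch_feats)                    # (B, J, hidden)
        h, _ = self.lstm(torch.cat([g, l], dim=-1))       # (B, J, hidden)
        return self.head(h).squeeze(-1)                   # (B, J) joint depths

depths = DepthLSTM()(torch.randn(2, 17, 2), torch.randn(2, 17, 64))
```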
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2018
538
$a
Mode of access: World Wide Web
650
4
$a
Statistics.
$3
556824
650
4
$a
Computer science.
$3
573171
650
4
$a
Artificial intelligence.
$3
559380
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0463
690
$a
0984
690
$a
0800
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
710
2
$a
University of California, Los Angeles.
$b
Statistics 0891.
$3
1183048
773
0
$t
Dissertation Abstracts International
$g
78-10B(E).
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10286374
$z
click for full text (PQDT)