National Formosa University

Joint Multiple Visual Task Understanding from a Single Image via Deep Learning and Conditional Random Field.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title/Author:	Joint Multiple Visual Task Understanding from a Single Image via Deep Learning and Conditional Random Field./
Author:	Wang, Peng.
Physical description:	1 online resource (99 pages)
Notes:	Source: Dissertation Abstracts International, Volume: 78-08(E), Section: B.
Subject:	Statistics.
Electronic resource:	click for full text (PQDT)
ISBN:	9781369657098

Thesis (Ph.D.)--University of California, Los Angeles, 2017.

Includes bibliographical references

Humans interpret the visual world with very rich understanding. For example, when observing the world through our eyes, we not only understand the high-level semantic meaning of each region or pixel but, more importantly, also understand 3D properties, such as how far away each object is and what its 3D shape is, in order to interact with the world. In the field of computer vision, however, visual understanding is separated into multiple tasks, e.g. segmentation, 3D reconstruction, or object detection, due to its high complexity. This separation means that the results produced by different strategies lack compatibility across tasks. For example, semantic object detection cannot account for 3D occlusion regions, while 3D reconstruction does not consider the overall semantic context. Thus, to achieve good visual understanding, it is critical to address the different tasks jointly while maintaining their compatibility.

Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018

Mode of access: World Wide Web

ISBN: 9781369657098
Subjects--Topical Terms:

556824
Statistics.
Index Terms--Genre/Form:

554714
Electronic books.

Joint Multiple Visual Task Understanding from a Single Image via Deep Learning and Conditional Random Field.
LDR:04013ntm a2200361K 4500
001 914360
005 20180703084808.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9781369657098
035 $a (MiAaPQ)AAI10259737
035 $a (MiAaPQ)ucla:15243
035 $a AAI10259737
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Wang, Peng. $3 1187591
245 1 0 $a Joint Multiple Visual Task Understanding from a Single Image via Deep Learning and Conditional Random Field.
264 0 $c 2017
300 $a 1 online resource (99 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 78-08(E), Section: B.
500 $a Adviser: Alan Loddon Yuille.
502 $a Thesis (Ph.D.)--University of California, Los Angeles, 2017.
504 $a Includes bibliographical references
520 $a Humans interpret the visual world with very rich understanding. For example, when observing the world through our eyes, we not only understand the high-level semantic meaning of each region or pixel but, more importantly, also understand 3D properties, such as how far away each object is and what its 3D shape is, in order to interact with the world. In the field of computer vision, however, visual understanding is separated into multiple tasks, e.g. segmentation, 3D reconstruction, or object detection, due to its high complexity. This separation means that the results produced by different strategies lack compatibility across tasks. For example, semantic object detection cannot account for 3D occlusion regions, while 3D reconstruction does not consider the overall semantic context. Thus, to achieve good visual understanding, it is critical to address the different tasks jointly while maintaining their compatibility.
520 $a Fortunately, thanks to the rise of deep learning (here, convolutional neural networks (CNNs)), which dramatically outperforms traditional strategies in many visual tasks by using hierarchically learned features within essentially a single framework, we are able to unify different kinds of understanding in a more compact and efficient way by designing reasonable output and interaction terms.
520 $a However, a CNN is not a magic key that solves all problems. One obvious limitation is that its convolutional kernel sizes and number of layers are chosen arbitrarily, yielding non-adaptive receptive fields that cannot match the variation in object scales. In addition, it is not straightforward to add arbitrary connections inside each layer based on intuition. Thus, we further embed a conditional random field (CRF) into the system to compensate for these deficiencies, unify different cues, and perform multiple tasks simultaneously.
520 $a In this thesis, we demonstrate the concept by estimating multiple tasks jointly, including joint part and object segmentation and joint segmentation and geometry estimation, among others. First, we show that deep convolutional networks can be fitted to many different tasks to achieve superior performance compared to traditional shallow features. Second, by unifying different tasks with our designed compatibility constraints, we make the tasks mutually regularized and mutually beneficial. Finally, to evaluate the results, we run our experiments on standard evaluation benchmarks such as PASCAL for segmentation and the NYU v2 dataset for depth estimation. Last but not least, we not only apply existing metrics to show the performance gain from our design but also introduce reasonable new metrics to better show which aspects improved.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Statistics. $3 556824
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0463
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a University of California, Los Angeles. $b Statistics 0891. $3 1183048
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10259737 $z click for full text (PQDT)