National Formosa University

Visual Commonsense Reasoning : = Functionality, Physics, Causality, and Utility.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:	Visual Commonsense Reasoning :/
Other title:	Functionality, Physics, Causality, and Utility.
Author:	Zhu, Yixin.
Extent:	1 online resource (234 pages)
Note:	Source: Dissertation Abstracts International, Volume: 79-09(E), Section: B.
Subject:	Statistics. -
Electronic resource:	click for full text (PQDT)
ISBN:	9780355856767

Thesis (Ph.D.)--University of California, Los Angeles, 2018.

Includes bibliographical references

Reasoning about commonsense from visual input remains an important and challenging problem in computer vision. It is important because the ability to reason about commonsense, and to plan and act accordingly, is the most distinctive competence that sets humans apart from other animals---the ability of analogy. It is challenging partly because not all typical examples in a given category can be observed: objects often exhibit enormous intra-class variation, leading to a long-tail distribution in appearance and geometry. This dissertation focuses on four largely orthogonal dimensions---functionality, physics, causality, and utility---in computer vision, robotics, and cognitive science, and it makes six major contributions:

Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018

Mode of access: World Wide Web

ISBN: 9780355856767

Subjects--Topical Terms:
556824 Statistics.

Index Terms--Genre/Form:
554714 Electronic books.

LDR:05013ntm a2200397K 4500
001 914805
005 20180724121430.5
006 m o u
007 cr mn||||a|a||
008 190606s2018 xx obm 000 0 eng d
020 $a 9780355856767
035 $a (MiAaPQ)AAI10790595
035 $a (MiAaPQ)ucla:16671
035 $a AAI10790595
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Zhu, Yixin. $3 1188164
245 1 0 $a Visual Commonsense Reasoning : $b Functionality, Physics, Causality, and Utility.
264 0 $c 2018
300 $a 1 online resource (234 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 79-09(E), Section: B.
500 $a Adviser: Song-Chun Zhu.
502 $a Thesis (Ph.D.)--University of California, Los Angeles, 2018.
504 $a Includes bibliographical references
520 $a Reasoning about commonsense from visual input remains an important and challenging problem in computer vision. It is important because the ability to reason about commonsense, and to plan and act accordingly, is the most distinctive competence that sets humans apart from other animals---the ability of analogy. It is challenging partly because not all typical examples in a given category can be observed: objects often exhibit enormous intra-class variation, leading to a long-tail distribution in appearance and geometry. This dissertation focuses on four largely orthogonal dimensions---functionality, physics, causality, and utility---in computer vision, robotics, and cognitive science, and it makes six major contributions:
520 $a We rethink object recognition from the perspective of an agent: how objects are used as "tools" or "containers" in actions to accomplish a "task". Here a task is defined as changing the physical state of a target object through actions, such as cracking a nut or painting a wall. A tool is a physical object used in a human action to achieve the task, such as a hammer or a brush; it can be any everyday object and is not restricted to conventional hardware tools. This leads us to a new framework---task-oriented object modeling, learning, and recognition---which aims at understanding the underlying functions, physics, and causality in using objects as tools in various task categories.
520 $a We propose to go beyond visible geometric compatibility to infer, through physics-based simulation, the forces/pressures on various body parts as people interact with objects. By observing people's choices in videos, we can learn the comfort intervals of the pressures on body parts as well as human preferences in distributing these pressures among body parts. Thus, our system is able to "feel", in numerical terms, discomfort when the forces/pressures on body parts exceed comfort intervals. We argue that this is an important step in representing human utilities---the pleasure and satisfaction defined in economics and ethics (e.g., by the philosopher Jeremy Bentham) that drive human activities at all levels.
520 $a We propose to go beyond modeling the direct and short-term human interaction with individual objects. By accurately simulating thermodynamics and air fluid dynamics, our method can infer indoor room temperature distribution and air flow dynamics at arbitrary times and locations, thus establishing a form of indirect and long-term affordance. Unlike chairs in a sitting scenario, the objects (heating/cooling sources) that provide affordance do not directly interact with a person. Instead, the air in a room serves as an invisible medium that passes the affordance from an object to a person. We call this new form of affordance intangible affordance.
520 $a By fusing functionality and affordance into indoor scene generation, we propose a systematic learning-based approach to the generation of massive quantities of synthetic 3D scenes and numerous photorealistic 2D images thereof, with associated ground truth information, for the purposes of training, benchmarking, and diagnosing learning-based computer vision and robotics algorithms.
520 $a We present four case studies on integrating forces and functionality in object manipulations in the field of robotics, showcasing the significance and benefits of explicit modeling of the functionality in task executions.
520 $a We introduce an intuitive substance engine (ISE) model employing probabilistic simulation, which supports the hypothesis that humans infer future states of perceived physical situations by propagating noisy representations forward in time using approximated rational physics.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Statistics. $3 556824
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0463
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a University of California, Los Angeles. $b Statistics. $3 1186550
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10790595 $z click for full text (PQDT)