Carnegie Mellon University.
Learning to Learn for Small Sample Visual Recognition.
Record type:
Bibliographic - Language material, manuscript : Monograph/item
Title/Author:
Learning to Learn for Small Sample Visual Recognition.
Author:
Wang, Yu-Xiong.
Physical description:
1 online resource (206 pages)
Notes:
Source: Dissertation Abstracts International, Volume: 79-12(E), Section: B.
Contained By:
Dissertation Abstracts International, 79-12B(E).
Subject:
Robotics.
Electronic resource:
click for full text (PQDT)
ISBN:
9780438257733
LDR  05380ntm a2200397Ki 4500
001  916938
005  20180928111503.5
006  m o u
007  cr mn||||a|a||
008  190606s2018 xx obm 000 0 eng d
020    $a 9780438257733
035    $a (MiAaPQ)AAI10956329
035    $a AAI10956329
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Wang, Yu-Xiong. $3 1190814
245 10 $a Learning to Learn for Small Sample Visual Recognition.
264  0 $c 2018
300    $a 1 online resource (206 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertation Abstracts International, Volume: 79-12(E), Section: B.
500    $a Adviser: Martial Hebert.
502    $a Thesis (Ph.D.)--Carnegie Mellon University, 2018.
504    $a Includes bibliographical references.
520    $a Understanding how humans and machines recognize novel visual concepts from few examples remains a fundamental challenge. Humans are remarkably able to grasp a new concept and make meaningful generalizations from just a few examples. By contrast, state-of-the-art machine learning techniques and visual recognition systems typically require thousands of training examples and often break down if the training sample set is too small.
520    $a This dissertation aims to endow visual recognition systems with low-shot learning ability, so that they learn consistently well on data of different sample sizes. Our key insight is that the visual world is well structured and highly predictable not only in data and feature spaces but also in task and model spaces. Such structures and regularities enable the systems to learn how to learn new recognition tasks rapidly by reusing previous experiences. This philosophy of learning to learn, or meta-learning, is one of the underlying tenets towards versatile agents that can continually learn a wide variety of tasks throughout their lifetimes. In this spirit, we address key technical challenges and explore complementary perspectives.
520    $a We begin by learning from extremely limited data (e.g., one-shot learning). We cast the problem as supervised knowledge distillation and explore structures within model pairs. We introduce a meta-network that operates on the space of model parameters and encodes a generic transformation from "student" models learned from few samples to "teacher" models learned from large enough sample sets. By learning a series of transformations as more training data is gradually added, we further capture a notion of model dynamics to facilitate long-tail recognition with categories of different sample sizes. Moreover, by viewing the meta-network as an effective model adaptation strategy, we combine it with learning a generic model initialization and extend the use in few-shot human motion prediction tasks.
520    $a To further decouple a recognition model from ties to a specific set of categories, we introduce self-supervision using meta-data. We expose the model to a large amount of unlabeled real-world images through an unsupervised meta-training phase. By learning diverse sets of low-density separators across auxiliary pseudo-classes, we capture a more generic, richer description of the visual world. Since they are informative across different categories, we alternatively use the low-density separators to constitute an "off-the-shelf" library as external memory, enabling generation of new models on-the-fly for a variety of tasks, including object detection, hypothesis transfer learning, domain adaptation, and image retrieval. By doing so, we have essentially leveraged structures within a large collection of models.
520    $a We then move on to learning from a medium-sized number of examples and explore structures within an evolving model when learning from continuously changing data streams and tasks. We rethink the dominant knowledge transfer paradigm that fine-tunes a fixed-size pre-trained model on new labeled target data. Inspired by developmental learning, we progressively grow a convolutional neural network with increased model capacity, which significantly outperforms classic fine-tuning approaches. Furthermore, we address unsupervised fine-tuning by transferring knowledge from a discriminative to a generative model on unlabeled target data. We thus make progress towards a lifelong learning process.
520    $a From a different perspective, humans can imagine what novel objects look like from different views. Incorporating this ability to hallucinate novel instances of new concepts and leveraging joint structures in both data and task spaces might help recognition systems perform better low-shot learning. We then combine a meta-learner with a "hallucinator" that produces additional training examples, and optimize both models jointly, leading to significant performance gains. Finally, combining these approaches, we suggest a broader picture of learning to learn predictive structures through exploration and exploitation.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538    $a Mode of access: World Wide Web.
650  4 $a Robotics. $3 561941
650  4 $a Artificial intelligence. $3 559380
650  4 $a Computer science. $3 573171
655  7 $a Electronic books. $2 local $3 554714
690    $a 0771
690    $a 0800
690    $a 0984
710 2  $a ProQuest Information and Learning Co. $3 1178819
710 2  $a Carnegie Mellon University. $3 845406
773 0  $t Dissertation Abstracts International $g 79-12B(E).
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10956329 $z click for full text (PQDT)