National Formosa University

Toward Perception Models Beyond Internet Applications.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title/Author:	Toward Perception Models Beyond Internet Applications./
Author:	Phoo, Cheng Perng.
Physical description:	1 online resource (350 pages)
Notes:	Source: Dissertations Abstracts International, Volume: 85-12, Section: A.
Contained By:	Dissertations Abstracts International 85-12A.
Subject:	Computer science.
Electronic resource:	click for full text (PQDT)
ISBN:	9798382842929

Thesis (Ph.D.)--Cornell University, 2024.

Includes bibliographical references

For the past decades, we have observed tremendous success in developing perception models - computational models that could perceive our world through images, videos, LiDAR point clouds, and so on. Currently, we have perception models that can recognize thousands of concepts commonly seen on the Internet. The ability of these models to recognize concepts is undeniably impressive, but their successes are only limited to concepts or data modalities (e.g. images) commonly seen on the Internet.
Beyond applications in the Internet domain such as remote sensing or medical imagery, perception models have yet to show their prowess. The key challenge in building perception models beyond Internet applications is the requirement of extensive expert involvement. Training performant perception models in these domains often requires non-trivial involvement from experts, especially during the data collection process.
In this dissertation, we investigate how we could reduce experts' burden when developing perception models. Specifically, we will focus on the angle of label efficiency, i.e., developing perception models that could be trained with fewer annotations. We will present two broad categories of approaches. The first category relies on minimal assumptions and could be applied to various problem domains; along this vein, we will examine how we could leverage pre-trained models, unlabeled data, and coarsely-labeled data to enhance label efficiency. The second category leverages domain knowledge to enhance label efficiency. For this category of approaches, we will look at two specific domains: autonomous driving and remote sensing. We will investigate how repeated traversals of the same location could be used to improve perception models for self-driving vehicles and how ground images could be used to train vision-language models for remote sensing without any textual annotations.
We will end this dissertation with a brief discussion of how we could further reduce experts' burden when developing perception models, enabling broader success of perception models beyond Internet applications.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2024

Mode of access: World Wide Web

ISBN: 9798382842929

Subjects--Topical Terms:

573171
Computer science.
Subjects--Index Terms:

Computer vision

Index Terms--Genre/Form:

554714
Electronic books.

Toward Perception Models Beyond Internet Applications.
LDR:03470ntm a22003977 4500
001 1151742
005 20241118085748.5
006 m o d
007 cr mn ---uuuuu
008 250605s2024 xx obm 000 0 eng d
020 $a 9798382842929
035 $a (MiAaPQ)AAI31243387
035 $a AAI31243387
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Phoo, Cheng Perng. $3 1478564
245 1 0 $a Toward Perception Models Beyond Internet Applications.
264 0 $c 2024
300 $a 1 online resource (350 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 85-12, Section: A.
500 $a Advisor: Hariharan, Bharath.
502 $a Thesis (Ph.D.)--Cornell University, 2024.
504 $a Includes bibliographical references
520 $a For the past decades, we have observed tremendous success in developing perception models - computational models that could perceive our world through images, videos, LiDAR point clouds, and so on. Currently, we have perception models that can recognize thousands of concepts commonly seen on the Internet. The ability of these models to recognize concepts is undeniably impressive, but their successes are only limited to concepts or data modalities (e.g. images) commonly seen on the Internet. Beyond applications in the Internet domain such as remote sensing or medical imagery, perception models have yet to show their prowess. The key challenge in building perception models beyond Internet applications is the requirement of extensive expert involvement. Training performant perception models in these domains often requires non-trivial involvement from experts, especially during the data collection process. In this dissertation, we investigate how we could reduce experts' burden when developing perception models. Specifically, we will focus on the angle of label efficiency, i.e., developing perception models that could be trained with fewer annotations. We will present two broad categories of approaches. The first category relies on minimal assumptions and could be applied to various problem domains; along this vein, we will examine how we could leverage pre-trained models, unlabeled data, and coarsely-labeled data to enhance label efficiency. The second category leverages domain knowledge to enhance label efficiency. For this category of approaches, we will look at two specific domains: autonomous driving and remote sensing. We will investigate how repeated traversals of the same location could be used to improve perception models for self-driving vehicles and how ground images could be used to train vision-language models for remote sensing without any textual annotations. We will end this dissertation with a brief discussion of how we could further reduce experts' burden when developing perception models, enabling broader success of perception models beyond Internet applications.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
650 4 $a Web studies. $3 1148502
653 $a Computer vision
653 $a Fewer annotations
653 $a Machine perception
653 $a Internet applications
653 $a Data collection
655 7 $a Electronic books. $2 local $3 554714
690 $a 0800
690 $a 0984
690 $a 0646
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Cornell University. $b Computer Science. $3 1179602
773 0 $t Dissertations Abstracts International $g 85-12A.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31243387 $z click for full text (PQDT)