Towards Comprehensive Visual Understanding.
Record type:
Bibliographic - Language material, manuscript : Monograph/item
Title/Author:
Towards Comprehensive Visual Understanding. /
Author:
Jia, Menglin.
Physical description:
1 online resource (186 pages)
Note:
Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
Contained By:
Dissertations Abstracts International, 84-12B.
Subject:
Information science.
Electronic resource:
click for full text (PQDT)
ISBN:
9798379722371
LDR  02972ntm a22003977 4500
001  1144460
005  20240611104233.5
006  m o d
007  cr mn ---uuuuu
008  250605s2023 xx obm 000 0 eng d
020  $a 9798379722371
035  $a (MiAaPQ)AAI30248533
035  $a AAI30248533
040  $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Jia, Menglin. $3 1469500
245 10 $a Towards Comprehensive Visual Understanding.
264  0 $c 2023
300  $a 1 online resource (186 pages)
336  $a text $b txt $2 rdacontent
337  $a computer $b c $2 rdamedia
338  $a online resource $b cr $2 rdacarrier
500  $a Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
500  $a Advisor: Cardie, Claire.
502  $a Thesis (Ph.D.)--Cornell University, 2023.
504  $a Includes bibliographical references
520  $a An image is worth a thousand words, conveying information that goes beyond the visual content therein. Traditional computer vision tasks focus on the recognition of tangible properties of images, such as objects and scenes. Relatively little attention has been paid to tasks that involve private states where subjectivity analysis is relevant. This area includes detecting cyberbullying and hate speech, identifying emotions, and understanding rhetoric and intentions. This dissertation presents our work in exploring new challenges and approaches towards comprehensive visual understanding, with both subjectivity and objectivity in images in mind. Specifically, on the challenge side, we focus on a specific aspect of subjectivity: the intent behind social media images. We introduce an intent dataset, Intentonomy, annotated with 28 intent categories derived from a social psychology taxonomy. We then systematically study whether, and to what extent, commonly used visual information, i.e., object and context, contributes to human intent understanding. On the approach side, we present three approaches: (1) an intent classifier that attends to object and context classes in images as well as textual information in the form of hashtags; (2) a streamlined pre-training method that uses pseudo labels derived from human responses to social media posts; and (3) a parameter-efficient transfer learning method for adapting ever-increasing pre-trained vision models. We find our dataset to be very challenging for visual recognition systems and our approaches to be empirically effective on representative visual understanding tasks.
533  $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538  $a Mode of access: World Wide Web
650  4 $a Information science. $3 561178
650  4 $a Computer science. $3 573171
653  $a Computer vision
653  $a Machine learning
653  $a Visual information
653  $a Social media posts
653  $a Textual information
655  7 $a Electronic books. $2 local $3 554714
690  $a 0984
690  $a 0800
690  $a 0723
710 2  $a Cornell University. $b Information Science. $3 1179518
710 2  $a ProQuest Information and Learning Co. $3 1178819
773 0  $t Dissertations Abstracts International $g 84-12B.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30248533 $z click for full text (PQDT)