Learning Counterfactual Reasoning by Answering Counterfactual Questions from Videos.
Record type:
Bibliographic - Language material, manuscript : Monograph/item
Title/Author:
Learning Counterfactual Reasoning by Answering Counterfactual Questions from Videos./
Author:
Hu, Qingyuan.
Physical description:
1 online resource (48 pages)
Notes:
Source: Masters Abstracts International, Volume: 85-01.
Contained By:
Masters Abstracts International 85-01.
Subject:
Computer science.
Electronic resource:
click for full text (PQDT)
ISBN:
9798379951870
LDR    02917ntm a22003977 4500
001    1146242
005    20240812064350.5
006    m o d
007    cr bn ---uuuuu
008    250605s2023 xx obm 000 0 eng d
020    $a 9798379951870
035    $a (MiAaPQ)AAI30524075
035    $a AAI30524075
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Hu, Qingyuan. $3 1471600
245 10 $a Learning Counterfactual Reasoning by Answering Counterfactual Questions from Videos.
264  0 $c 2023
300    $a 1 online resource (48 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Masters Abstracts International, Volume: 85-01.
500    $a Advisor: Peng, Nanyun.
502    $a Thesis (M.S.)--University of California, Los Angeles, 2023.
504    $a Includes bibliographical references
520    $a Multimodal counterfactual reasoning is a vital yet challenging ability for AI systems. It involves predicting the outcomes of hypothetical circumstances from vision and language inputs, which enables AI models to learn from failures and to explore hypothetical scenarios. Despite its importance, only a few datasets target the counterfactual reasoning abilities of multimodal models, and they cover only synthetic environments or specific event types (e.g., traffic collisions), making it hard to reliably benchmark model generalization across diverse real-world scenarios and reasoning dimensions. To overcome these limitations, we develop ACQUIRED, a video question answering dataset of 3.9K annotated videos that encompasses a wide range of event types and incorporates both first- and third-person viewpoints, ensuring a focus on real-world diversity. In addition, each video is annotated with questions spanning three distinct reasoning dimensions (physical, social, and temporal) so that models' counterfactual abilities can be evaluated comprehensively along multiple aspects. We benchmark several state-of-the-art language-only and multimodal models on our dataset, and the experimental results demonstrate a significant performance gap (> 13%) between models and humans. These findings suggest that multimodal counterfactual reasoning remains an open challenge and that ACQUIRED is a comprehensive and reliable benchmark for inspiring future research in this direction.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538    $a Mode of access: World Wide Web
650  4 $a Computer science. $3 573171
650  4 $a Information technology. $3 559429
653    $a Counterfactual reasoning
653    $a AI models
653    $a Language inputs
653    $a Performance gap
653    $a Real-world diversity
655  7 $a Electronic books. $2 local $3 554714
690    $a 0984
690    $a 0800
690    $a 0489
710 2  $a ProQuest Information and Learning Co. $3 1178819
710 2  $a University of California, Los Angeles. $b Computer Science 0201. $3 1182286
773 0  $t Masters Abstracts International $g 85-01.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30524075 $z click for full text (PQDT)
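
For anyone who needs to consume this record programmatically, the following is a minimal sketch using the open-source pymarc library. The file name record.mrc is a hypothetical placeholder for a binary MARC (ISO 2709) export of the record above; the field and subfield access shown follows pymarc's standard get_fields/get_subfields API.

    from pymarc import MARCReader

    # Hypothetical input: a binary MARC export of the record shown above.
    with open("record.mrc", "rb") as fh:
        for record in MARCReader(fh):
            # Control fields such as 001 expose their raw data directly.
            print("Control number:", record["001"].data)
            # Data fields are addressed by tag, subfields by code.
            print("Title:", record["245"].get_subfields("a")[0])
            print("ISBN:", record["020"].get_subfields("a")[0])
            # Repeatable fields, e.g. the 653 index terms, come back as a list.
            for field in record.get_fields("653"):
                print("Index term:", field.get_subfields("a")[0])

Run against this record, the final loop would print the five 653 index terms: Counterfactual reasoning, AI models, Language inputs, Performance gap, and Real-world diversity.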