Building the Next Generation of Multimodal Models.
Record Type: Bibliographic - Language material, manuscript : Monograph/item
Title/Author: Building the Next Generation of Multimodal Models.
Author: Ilharco, Gabriel.
Description: 1 online resource (143 pages)
Notes: Source: Dissertations Abstracts International, Volume: 85-10, Section: A.
Contained By: Dissertations Abstracts International, 85-10A.
Subject: Information science.
Electronic Resource: click for full text (PQDT)
ISBN: 9798382213705
Ilharco, Gabriel.
Building the Next Generation of Multimodal Models. - 1 online resource (143 pages)
Source: Dissertations Abstracts International, Volume: 85-10, Section: A.
Thesis (Ph.D.)--University of Washington, 2024.
Includes bibliographical references
One of the fundamental goals of machine learning is to create systems capable of processing data from a variety of modalities such as images and text. I argue that the next generation of multimodal models will be enabled by a deeper understanding of how to design pretraining datasets, and by techniques that offer better control over models after pretraining. Towards the first goal, I introduce a fully open-source benchmark for designing multimodal datasets. This benchmark provides a shared experimental setting for research on dataset curation, allowing researchers to conduct rigorous and controlled experiments. Our experiments highlight the potential of rigorous empirical work on dataset curation, finding pretraining datasets that outperform existing datasets by a large margin. Towards the second goal, I present multiple techniques for improving models after pretraining. Our fine-tuning techniques improve accuracy without overspecialization and without increasing inference costs. Moreover, I present a modular framework for steering the behavior of trained models, designed to efficiently add or delete capabilities while operating directly within the models' weight space. Altogether, these new techniques pave the way for the next generation of multimodal models.
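The last contribution in the abstract, adding or deleting capabilities by operating directly in a model's weight space, can be made concrete with a minimal sketch. The sketch below assumes a task-vector style edit (subtract pretrained from fine-tuned weights, then scale); the function names, the coefficient alpha, and the dict-of-tensors representation are illustrative assumptions, not details taken from this record.

# Minimal sketch of weight-space editing (assumed task-vector style approach).
# Weights are represented as dicts mapping parameter names to tensors/arrays,
# e.g. a PyTorch state_dict; all names here are hypothetical.

def task_vector(pretrained, finetuned):
    # Per-parameter difference between fine-tuned and pretrained weights.
    return {name: finetuned[name] - pretrained[name] for name in pretrained}

def edit_in_weight_space(pretrained, vector, alpha):
    # alpha > 0 adds the capability captured by the vector; alpha < 0 removes it.
    return {name: pretrained[name] + alpha * vector[name] for name in pretrained}

# Hypothetical usage with state dicts:
# base, tuned = base_model.state_dict(), finetuned_model.state_dict()
# tv = task_vector(base, tuned)
# base_model.load_state_dict(edit_in_weight_space(base, tv, alpha=0.5))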
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2024
Mode of access: World Wide Web
ISBN: 9798382213705
Subjects--Topical Terms: Information science.
Subjects--Index Terms: Multimodal models
Index Terms--Genre/Form: Electronic books.
LDR  02667ntm a22004097 4500
001  1144970
005  20240617111730.5
006  m o d
007  cr mn ---uuuuu
008  250605s2024 xx obm 000 0 eng d
020    $a 9798382213705
035    $a (MiAaPQ)AAI30991151
035    $a AAI30991151
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Ilharco, Gabriel. $3 1470163
245 10 $a Building the Next Generation of Multimodal Models.
264  0 $c 2024
300    $a 1 online resource (143 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertations Abstracts International, Volume: 85-10, Section: A.
500    $a Advisor: Hajishirzi, Hannaneh; Farhadi, Ali.
502    $a Thesis (Ph.D.)--University of Washington, 2024.
504    $a Includes bibliographical references
520    $a One of the fundamental goals of machine learning is to create systems capable of processing data from a variety of modalities such as images and text. I argue that the next generation of multimodal models will be enabled by a deeper understanding of how to design pretraining datasets, and by techniques that offer better control over models after pretraining. Towards the first goal, I introduce a fully open-source benchmark for designing multimodal datasets. This benchmark provides a shared experimental setting for research on dataset curation, allowing researchers to conduct rigorous and controlled experiments. Our experiments highlight the potential of rigorous empirical work on dataset curation, finding pretraining datasets that outperform existing datasets by a large margin. Towards the second goal, I present multiple techniques for improving models after pretraining. Our fine-tuning techniques improve accuracy without overspecialization and without increasing inference costs. Moreover, I present a modular framework for steering the behavior of trained models, designed to efficiently add or delete capabilities while operating directly within the models' weight space. Altogether, these new techniques pave the way for the next generation of multimodal models.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538    $a Mode of access: World Wide Web
650  4 $a Information science. $3 561178
650  4 $a Computer science. $3 573171
653    $a Multimodal models
653    $a Next generation
653    $a Machine learning
653    $a Benchmarks
653    $a Multimodal datasets
655  7 $a Electronic books. $2 local $3 554714
690    $a 0800
690    $a 0796
690    $a 0984
690    $a 0723
710 2  $a University of Washington. $b Computer Science and Engineering. $3 1182238
710 2  $a ProQuest Information and Learning Co. $3 1178819
773 0  $t Dissertations Abstracts International $g 85-10A.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30991151 $z click for full text (PQDT)