Lin, Christopher H.
The Intelligent Management of Crowd-Powered Machine Learning.
Record type:
Bibliographic: language material, manuscript : Monograph/item
Title proper/Author:
The Intelligent Management of Crowd-Powered Machine Learning./
Author:
Lin, Christopher H.
Physical description:
1 online resource (175 pages)
Notes:
Source: Dissertation Abstracts International, Volume: 79-02(E), Section: B.
Contained By:
Dissertation Abstracts International 79-02B(E).
Subject:
Computer science.
Electronic resource:
click for full text (PQDT)
ISBN:
9780355355741
Thesis (Ph.D.)--University of Washington, 2017.
Includes bibliographical references
Artificial intelligence and machine learning power many technologies today, from spam filters to self-driving cars to medical decision assistants. While this revolution has hugely benefited from algorithmic developments, it also could not have occurred without data, which nowadays is frequently procured at massive scale from crowds. Because data is so crucial, a key next step towards truly autonomous agents is the design of better methods for intelligently managing now-ubiquitous crowd-powered data-gathering processes. This dissertation takes this key next step by developing algorithms for the online and dynamic control of these processes. We consider how to gather data for its two primary purposes: training and evaluation.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018
Mode of access: World Wide Web
ISBN: 9780355355741
Subjects--Topical Terms:
573171
Computer science.
Index Terms--Genre/Form:
554714
Electronic books.
LDR
:04384ntm a2200361Ki 4500
001
916801
005
20180928111501.5
006
m o u
007
cr mn||||a|a||
008
190606s2017 xx obm 000 0 eng d
020
$a
9780355355741
035
$a
(MiAaPQ)AAI10619626
035
$a
(MiAaPQ)washington:17736
035
$a
AAI10619626
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
$d
NTU
100
1
$a
Lin, Christopher H.
$3
1190640
245
1 4
$a
The Intelligent Management of Crowd-Powered Machine Learning.
264
0
$c
2017
300
$a
1 online resource (175 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertation Abstracts International, Volume: 79-02(E), Section: B.
500
$a
Advisers: Daniel S. Weld; Mausam Mausam.
502
$a
Thesis (Ph.D.)--University of Washington, 2017.
504
$a
Includes bibliographical references
520
$a
Artificial intelligence and machine learning power many technologies today, from spam filters to self-driving cars to medical decision assistants. While this revolution has hugely benefited from algorithmic developments, it also could not have occurred without data, which nowadays is frequently procured at massive scale from crowds. Because data is so crucial, a key next step towards truly autonomous agents is the design of better methods for intelligently managing now-ubiquitous crowd-powered data-gathering processes. This dissertation takes this key next step by developing algorithms for the online and dynamic control of these processes. We consider how to gather data for its two primary purposes: training and evaluation.
520
$a
In the first part of the dissertation, we develop algorithms for obtaining data for testing. The most important requirement of testing data is that it must be extremely clean. Thus to deal with noisy human annotations, machine learning practitioners typically rely on careful workflow design and advanced statistical techniques for label aggregation. A common process involves designing and testing multiple crowdsourcing workflows for their tasks, identifying the single best-performing workflow, and then aggregating worker responses from redundant runs of that single workflow. We improve upon this process by building two control models: one that allows for switching between many workflows depending on how well a particular workflow is performing for a given example and worker; and one that can aggregate labels from tasks that do not have a finite predefined set of multiple choice answers (e.g., counting tasks). We then implement agents that use our new models to dynamically choose whether to acquire more labels from the crowd or stop, and show that they can produce higher quality labels at a cheaper cost than state-of-the-art baselines.
520
$a
In the second part of the dissertation, we shift to tackle the second purpose of data: training. Because learning algorithms are often robust to noise, training sets do not necessarily have to be clean, but they have more complex requirements. We first investigate a tradeoff between size and noise. We survey how inductive bias, worker accuracy, and budget affect whether a larger and noisier training set or a smaller and cleaner one will train better classifiers. We then set up a formal framework for dynamically choosing the next example to label or relabel by generalizing active learning to allow for relabeling, which we call re-active learning, and we design new algorithms for re-active learning that outperform active learning baselines. Finally, we leave the noisy setting and investigate how to collect balanced training sets in domains of varying skew, by considering a setting in which workers can not only label examples, but also generate examples with various distributions. We design algorithms that can intelligently switch between deploying these various worker tasks depending on the skew in the dataset, and show that our algorithms can result in significantly better performance than state-of-the-art baselines.
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2018
538
$a
Mode of access: World Wide Web
650
4
$a
Computer science.
$3
573171
650
4
$a
Artificial intelligence.
$3
559380
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0984
690
$a
0800
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
710
2
$a
University of Washington.
$b
Computer Science and Engineering.
$3
1182238
773
0
$t
Dissertation Abstracts International
$g
79-02B(E).
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10619626
$z
click for full text (PQDT)