The Role of Model Implementation in Neuroscientific Applications of Machine Learning.
Record type: Bibliographic - language material, manuscript : Monograph/item
Title / Author: The Role of Model Implementation in Neuroscientific Applications of Machine Learning.
Author: Abe, Taiga.
Physical description: 1 online resource (233 pages)
Notes: Source: Dissertations Abstracts International, Volume: 85-07, Section: B.
Contained By: Dissertations Abstracts International, 85-07B.
Subject: Neurosciences.
Electronic resource: click for full text (PQDT)
ISBN: 9798381276411
Thesis (Ph.D.)--Columbia University, 2024.
Includes bibliographical references
In modern neuroscience, large-scale machine learning models are becoming increasingly critical components of data analysis. Despite the accelerating adoption of these large-scale machine learning tools, there are fundamental challenges to their use in scientific applications that remain largely unaddressed. In this thesis, I focus on one such challenge: variability in the predictions of large-scale machine learning models relative to seemingly trivial differences in their implementation. Existing research has shown that the performance of large-scale machine learning models (more so than traditional models like linear regression) is meaningfully entangled with design choices such as the hardware components, operating system, software dependencies, and random seed that the corresponding model depends upon. Within the bounds of current practice, there are few ways of controlling this kind of implementation variability across the broad community of neuroscience researchers (making data analysis less reproducible), and little understanding of how data analyses might be designed to mitigate these issues (making data analysis unreliable). This dissertation will present two broad research directions that address these shortcomings. First, I will describe a novel, cloud-based platform for sharing data analysis tools reproducibly and at scale. This platform, called NeuroCAAS, enables developers of novel data analyses to precisely specify an implementation of their entire data analysis, which can then be used automatically by any other user on custom-built cloud resources. I show that this approach is able to efficiently support a wide variety of existing data analysis tools, as well as novel tools which would not be feasible to build and share outside of a platform like NeuroCAAS. Second, I conduct two large-scale studies on the behavior of deep ensembles.
Deep ensembles are a class of machine learning model which uses implementation variability to improve the quality of model predictions; in particular, by aggregating the predictions of deep networks over stochastic initialization and training. Deep ensembles simultaneously provide a way to control the impact of implementation variability (by aggregating predictions across random seeds) and also to understand what kind of predictive diversity is generated by this particular form of implementation variability. I present a number of surprising results that contradict widely held intuitions about the performance of deep ensembles as well as the mechanisms behind their success, and show that in many respects, the behavior of deep ensembles is similar to that of an appropriately chosen single neural network. As a whole, this dissertation presents novel methods and insights focused on the role of implementation variability in large-scale machine learning models, and more generally upon the challenges of working with such large models in neuroscience data analysis. I conclude by discussing other ongoing efforts to improve the reproducibility and accessibility of large-scale machine learning in neuroscience, as well as long-term goals to speed the adoption and reliability of such methods in a scientific context.
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2024
Mode of access: World Wide Web
ISBN: 9798381276411
Subjects--Topical Terms: Neurosciences. (593561)
Subjects--Index Terms: Data analysis
Index Terms--Genre/Form: Electronic books. (554714)
LDR 04652ntm a22004337 4500
001 1148784
005 20240930100147.5
006 m o d
007 cr bn ---uuuuu
008 250605s2024 xx obm 000 0 eng d
020 $a 9798381276411
035 $a (MiAaPQ)AAI30818660
035 $a AAI30818660
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Abe, Taiga. $3 1474841
245 1 4 $a The Role of Model Implementation in Neuroscientific Applications of Machine Learning.
264 0 $c 2024
300 $a 1 online resource (233 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 85-07, Section: B.
500 $a Advisor: Cunningham, John P.
502 $a Thesis (Ph.D.)--Columbia University, 2024.
504 $a Includes bibliographical references
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Neurosciences. $3 593561
650 4 $a Computer science. $3 573171
650 4 $a Bioinformatics. $3 583857
653 $a Data analysis
653 $a Ensembles
653 $a Infrastructure
653 $a Machine learning
653 $a Reproducibility
653 $a Robustness
653 $a Accessibility
655 7 $a Electronic books. $2 local $3 554714
690 $a 0317
690 $a 0984
690 $a 0800
690 $a 0715
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Columbia University. $b Neurobiology and Behavior. $3 1186904
773 0 $t Dissertations Abstracts International $g 85-07B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30818660 $z click for full text (PQDT)