Combinatorial Tasks as Model Systems of Deep Learning.
Record type: Bibliographic - Language material, manuscript : Monograph/item
Title/Author: Combinatorial Tasks as Model Systems of Deep Learning.
Author: Edelman, Benjamin L.
Description: 1 online resource (397 pages)
Notes: Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
Contained By: Dissertations Abstracts International, 85-12B.
Subject: Computer science.
Electronic resource: click for full text (PQDT)
ISBN: 9798382784939
Combinatorial Tasks as Model Systems of Deep Learning.
LDR 02942ntm a22003977 4500
001 1148253
005 20240916070039.5
006 m o d
007 cr bn ---uuuuu
008 250605s2024 xx obm 000 0 eng d
020 $a 9798382784939
035 $a (MiAaPQ)AAI31297036
035 $a AAI31297036
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Edelman, Benjamin L. $3 1474183
245 1 0 $a Combinatorial Tasks as Model Systems of Deep Learning.
264 0 $c 2024
300 $a 1 online resource (397 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
500 $a Advisor: Kakade, Sham; Valiant, Leslie.
502 $a Thesis (Ph.D.)--Harvard University, 2024.
504 $a Includes bibliographical references
520 $a This dissertation is about a particular style of research. The philosophy of this style is that in order to scientifically understand deep learning, it is fruitful to investigate what happens when neural networks are trained on simple, mathematically well-defined tasks. Even though the training data is simple, the training algorithm can end up producing rich, unexpected results; and understanding these results can shed light on fundamental mysteries of high relevance to contemporary deep learning. First, we situate this methodological approach in a broader scientific context, discussing and systematizing the role of model systems in science and in the science of deep learning in particular. We then present five intensive case studies, each of which uses a particular combinatorial task as a lens through which to demystify puzzles of deep learning. The combinatorial tasks employed are sparse Boolean functions, sparse parities, learning finite group operations, performing modular addition, and learning Markov chains in-context. Topics of explanatory interest include the inductive biases of the transformer architecture, the phenomenon of emergent capabilities during training, the nuances of deep learning in the presence of statistical-computational gaps, the tradeoffs between different resources of training, the effect of network width on optimization, the relationship between symmetries in training data and harmonic structure in trained networks, the origins of the mechanisms of in-context learning in transformers, and the influence of spurious solutions on optimization.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
650 4 $a Mathematics. $3 527692
650 4 $a Information technology. $3 559429
653 $a Neural networks
653 $a Finite group
653 $a Deep learning
653 $a Transformer architecture
655 7 $a Electronic books. $2 local $3 554714
690 $a 0984
690 $a 0489
690 $a 0800
690 $a 0405
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Harvard University. $b Engineering and Applied Sciences - Computer Science. $3 1467916
773 0 $t Dissertations Abstracts International $g 85-12B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31297036 $z click for full text (PQDT)