Convex Neural Networks.
Record type:
Bibliographic - Language material, manuscript : Monograph/item
Title / Author:
Convex Neural Networks./
Author:
Sahiner, Arda Ege.
Physical description:
1 online resource (229 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 85-06, Section: B.
Contained By:
Dissertations Abstracts International, 85-06B.
Subject:
Neural networks.
Electronic resource:
click for full text (PQDT)
ISBN:
9798381028164
LDR   03334ntm a22003257 4500
001   1148754
005   20240930100137.5
006   m o d
007   cr bn ---uuuuu
008   250605s2023 xx obm 000 0 eng d
020   $a 9798381028164
035   $a (MiAaPQ)AAI30742170
035   $a (MiAaPQ)STANFORDzs047ch0365
035   $a AAI30742170
040   $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Sahiner, Arda Ege. $3 1474798
245 1 0 $a Convex Neural Networks.
264 0 $c 2023
300   $a 1 online resource (229 pages)
336   $a text $b txt $2 rdacontent
337   $a computer $b c $2 rdamedia
338   $a online resource $b cr $2 rdacarrier
500   $a Source: Dissertations Abstracts International, Volume: 85-06, Section: B.
500   $a Advisor: Pauly, John; Pilanci, Mert; Vasanawala, Shreyas.
502   $a Thesis (Ph.D.)--Stanford University, 2023.
504   $a Includes bibliographical references
520   $a Neural networks have made tremendous advancements in a variety of machine learning tasks across different fields. Typically, neural networks have relied on heuristically optimizing a non-convex objective, raising doubts about their transparency, efficiency, and empirical performance. In this thesis, we show that a wide variety of neural network architectures are amenable to convex optimization, meaning that their non-convex objectives can be reformulated as convex optimization problems using semi-infinite dual formulations. We first show that for two-layer fully connected neural networks with ReLU activations, the optimization problem is convex and demonstrates a unique link to copositive programming, with a regularizer that promotes both sparsity in the number of activation patterns used in the network and sparsity in the number of neurons that are active for each activation pattern. We show that this formulation admits closed-form solutions in certain data regimes, and use copositive programming to relax the problem into one that is solvable in time polynomial in the problem dimensions for data matrices of fixed rank. We show that solving the convex reformulation results in a better solution than that found by heuristic algorithms such as gradient descent applied to the original non-convex objective. In the rest of this thesis, we explore different neural network architectures and training regimes which pose new challenges to the convex optimization formulation. We show that for convolutional neural networks and transformer architectures, the optimization problem also admits a convex reformulation. We also show that for neural networks with batch normalization and generative adversarial networks, the same convex reformulation techniques can disentangle uninterpretable aspects of non-convex optimization and admit faster and more robust solutions to practical problems in the field. Finally, we show that these approaches can be scaled to deeper networks using a Burer-Monteiro factorization of the convex objective, which maintains convexity guarantees while allowing convex sub-networks to be stacked layerwise in a scalable fashion.
533   $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538   $a Mode of access: World Wide Web
650 4 $a Neural networks. $3 1011215
655 7 $a Electronic books. $2 local $3 554714
690   $a 0800
710 2 $a Stanford University. $3 1184533
710 2 $a ProQuest Information and Learning Co. $3 1178819
773 0 $t Dissertations Abstracts International $g 85-06B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30742170 $z click for full text (PQDT)
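
The 520 abstract above summarizes a family of convex reformulations of neural network training. As a rough illustration of the two-layer ReLU case it describes, the following LaTeX sketch states a non-convex training objective alongside its finite convex counterpart, in the style of the hyperplane-arrangement formulation of Pilanci and Ergen (2020), which this thesis builds on. All symbols (data matrix X, labels y, regularization weight beta, activation-pattern matrices D_i, variables v_i and u_i) are illustrative notation assumed for this sketch, not taken from the record or the thesis text.

% Sketch only: two-layer ReLU network with m hidden neurons, data X in R^{n x d},
% labels y in R^n, weight-decay parameter beta > 0. The diagonal matrices
% D_i = diag(1[X h_i >= 0]) enumerate the ReLU activation patterns induced by X.
\begin{align*}
\text{non-convex:}\quad
  & \min_{W_1,\,w_2}\ \tfrac12 \Big\| \sum_{j=1}^{m} (X w_{1j})_+\, w_{2j} - y \Big\|_2^2
    + \tfrac{\beta}{2} \sum_{j=1}^{m} \big( \|w_{1j}\|_2^2 + w_{2j}^2 \big), \\
\text{convex:}\quad
  & \min_{\{v_i,\,u_i\}}\ \tfrac12 \Big\| \sum_{i} D_i X (v_i - u_i) - y \Big\|_2^2
    + \beta \sum_{i} \big( \|v_i\|_2 + \|u_i\|_2 \big) \\
  & \text{s.t.}\quad (2D_i - I) X v_i \ge 0, \qquad (2D_i - I) X u_i \ge 0 \quad \text{for all } i.
\end{align*}
% The group-L2 penalty on (v_i, u_i) is the regularizer the abstract describes as
% promoting sparsity over activation patterns and over active neurons per pattern.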