國立虎尾科技大學 |

New Geometric Approaches for Several Machine Learning Problems.

紀錄類型:	書目-語言資料,手稿 : Monograph/item
正題名/作者:	New Geometric Approaches for Several Machine Learning Problems./
作者:	Liu, Yangwei.
面頁冊數:	1 online resource (70 pages)
附註:	Source: Dissertation Abstracts International, Volume: 79-10(E), Section: B.
Contained By:	Dissertation Abstracts International79-10B(E).
標題:	Computer science. -
電子資源:	click for full text (PQDT)
ISBN:	9780438047730

New Geometric Approaches for Several Machine Learning Problems.
Liu, Yangwei.

New Geometric Approaches for Several Machine Learning Problems. - 1 online resource (70 pages)

Source: Dissertation Abstracts International, Volume: 79-10(E), Section: B.

Thesis (Ph.D.)--State University of New York at Buffalo, 2018.

Includes bibliographical references

Computational geometry has found applications in many different real world problems--both theoretical and applicational ones. In this dissertation, we apply geometric techniques to multiple problems, including traditional machine learning problems, deep learning related problems, and a biological problem that can be modeled as a graph matching problem.

Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018

Mode of access: World Wide Web

ISBN: 9780438047730Subjects--Topical Terms:

573171
Computer science.
Index Terms--Genre/Form:

554714
Electronic books.

New Geometric Approaches for Several Machine Learning Problems.
LDR:05543ntm a2200349Ki 4500 001 919196
005 20181116131021.5
006 m o u
007 cr mn||||a|a||
008 190606s2018 xx obm 000 0 eng d
020 $a 9780438047730
035 $a (MiAaPQ)AAI10813824
035 $a (MiAaPQ)buffalo:15683
035 $a AAI10813824
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Liu, Yangwei. $3 1193710
245 1 0 $a New Geometric Approaches for Several Machine Learning Problems.
264 0 $c 2018
300 $a 1 online resource (70 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 79-10(E), Section: B.
500 $a Adviser: Jinhui Xu.
502 $a Thesis (Ph.D.)--State University of New York at Buffalo, 2018.
504 $a Includes bibliographical references
520 $a Computational geometry has found applications in many different real world problems--both theoretical and applicational ones. In this dissertation, we apply geometric techniques to multiple problems, including traditional machine learning problems, deep learning related problems, and a biological problem that can be modeled as a graph matching problem.
520 $a 1. Distributed SVM and Online SVM. Support Vector Machine (SVM) is an effective classification technique widely used in various machine learning applications. While having a theoretically guaranteed good classification accuracy, training an SVM can be challenging when the size of the training set is large, which is often the case in the big data era. For instance, a huge amount of data might be obtained in distributed sites and communication between different sites could be expensive. Thus a distributed SVM algorithm with low communication cost is desired. It is also possible that the size of the training set might be too large to fit into RAM. Thus an online SVM algorithm with low space complexity is expected. a. For distributed SVM, we aim to design communication-efficient algorithms. In this work, we first provide a lower bound on the communication complexity of distributed SVM in the coordinator model, based on a reduction from the k-OR problem. Then, we present an algorithm that reaches this bound, and has the desired optimality of both convergence speed and quality of solution. Finally, we give a robust version of the algorithm which can explicitly avoid the influence of outliers in the training set with an improved communication cost, based on a faster distributed top-t selection algorithm. b. For online SVM, we aim to find a space-efficient algorithm. In this work, we give an online SVM algorithm that requires only constant space with respect to the size of the training set.
520 $a Both developed algorithms are theoretically guaranteed to output a (1-epsilon)-approximation of the corresponding problem, and converge in optimal number of rounds. 2. Clustering in the Presence of Large Amount of Sparse Noise. We consider the problem of clustering with a heavy amount of sparse background noise (over 80\% of the data). We propose a practical density based clustering method, combining with the strength of Minimum Spanning Tree based clustering, Locality Sensitive Hashing and Deep Neural Networks. Our algorithm first uses MST based clustering for high confidence label initialization, and then learns a mapping from the original data space to a relatively low dimensional latent space as a deep neural network, based on the labels obtained from the MST clustering. In the latent space, data are clustered based on local density, following the assumption that a point is more likely to belong to a cluster if its distance to the nearest neighbor in that cluster is small. We use locality sensitive hashing as the nearest neighbor search engine to handle the possibly high dimensional data. Experiments on both synthetic and real world datasets (with added noise) suggest that our method is capable of handling heavy noise and significantly outperforms popular existing methods. 3. Alignment of Protein-Protein-Interaction (PPI) Network. In biology, two proteins within a cell are said to be interacting if they are close to each other geometrically. Thus the PPI network is an undirected node-edge graph with some extra biological information stored at each node. Studying the alignment of two PPI networks are promising in discovering new functionality of different proteins. Since the alignment of PPI network can be treated as a generalized graph isomorphism problem which is NP-hard and notoriously challenging, current research has focused on finding algorithms with good practical performance. We developed a geometry based alignment algorithm for the global alignment of two PPI networks, which consists of a geometric step of embedding the network into a low dimensional Euclidean space and using geometric matching methods to find a matching, and a min-cost-max-flow step to handle the sequence similarity information stored in each node. Unlike other popular alignment algorithms which are either greedy or incremental, our algorithm globally optimizes the problem to yield an alignment with better quality.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a State University of New York at Buffalo. $b Computer Science and Engineering. $3 1180201
773 0 $t Dissertation Abstracts International $g 79-10B(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10813824 $z click for full text (PQDT)