National Formosa University

On Learning from Collective Data.

Record Type:	Language materials, manuscript : Monograph/item
Title/Author:	On Learning from Collective Data.
Author:	Xiong, Liang.
Description:	1 online resource (170 pages)
Notes:	Source: Dissertation Abstracts International, Volume: 78-03(E), Section: B.
Subject:	Computer science.
Online resource:	click for full text (PQDT)
ISBN:	9781369298727

Thesis (Ph.D.)--Carnegie Mellon University, 2013.

Includes bibliographical references

In many machine learning problems and application domains, the data are naturally organized into groups. For example, a video sequence is a group of images, an image is a group of patches, a document is a group of paragraphs or words, and a community is a group of people. We call such data collective data.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018

Mode of access: World Wide Web

ISBN: 9781369298727

Subjects--Topical Terms:
573171 Computer science.

Index Terms--Genre/Form:
554714 Electronic books.

LDR:03515ntm a2200373K 4500
001 913660
005 20180622095235.5
006 m o u
007 cr mn||||a|a||
008 190606s2013 xx obm 000 0 eng d
020 $a 9781369298727
035 $a (MiAaPQ)AAI10181258
035 $a AAI10181258
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Xiong, Liang. $3 1186589
245 1 0 $a On Learning from Collective Data.
264 0 $c 2013
300 $a 1 online resource (170 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 78-03(E), Section: B.
500 $a Adviser: Jeff Schneider.
502 $a Thesis (Ph.D.)--Carnegie Mellon University, 2013.
504 $a Includes bibliographical references
520 $a In many machine learning problems and application domains, the data are naturally organized into groups. For example, a video sequence is a group of images, an image is a group of patches, a document is a group of paragraphs or words, and a community is a group of people. We call such data collective data.
520 $a In this thesis, we study how and what we can learn from collective data. Machine learning usually focuses on individual objects, each of which is described by a feature vector and studied as a point in some metric space. When approaching collective data, researchers often reduce the groups to vectors so that traditional methods can be applied. We instead develop machine learning methods that respect the collective nature of the data and learn from the groups directly.
520 $a We take several approaches to this learning problem. When a group consists of unordered discrete data points, it can naturally be characterized by its sufficient statistic, the histogram. For this case we develop efficient methods, based on matrix and tensor factorization, to address outliers and temporal effects in the data.
520 $a To learn from groups that contain multi-dimensional real-valued vectors, we develop both generative methods based on hierarchical probabilistic models and discriminative methods using group kernels based on new divergence estimators. With these tools, we can accomplish various tasks such as classification, regression, clustering, anomaly detection, and dimensionality reduction on collective data.
520 $a We further consider the practical side of the divergence-based algorithms. To reduce their time and space requirements, we evaluate and identify methods that can effectively reduce the size of the groups with little impact on accuracy. We also propose the conditional divergence, along with an efficient estimator, to correct sampling biases that might be present in the data. Finally, we develop methods to learn in cases where some divergences are missing, whether because of insufficient computational resources or extreme sampling biases.
520 $a In addition to designing new learning methods, we use them to support the scientific discovery process. In our collaborations with astronomers and physicists, we see that the new techniques can indeed help scientists make the best of their data.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
650 4 $a Artificial intelligence. $3 559380
655 7 $a Electronic books. $2 local $3 554714
690 $a 0984
690 $a 0800
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Carnegie Mellon University. $3 845406
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10181258 $z click for full text (PQDT)