National Formosa University

Probabilistic Inference When the Model Is Wrong.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title/Author:	Probabilistic Inference When the Model Is Wrong./
Author:	Cai, Diana.
Physical description:	1 online resource (181 pages)
Notes:	Source: Dissertations Abstracts International, Volume: 85-03, Section: B.
Contained By:	Dissertations Abstracts International 85-03B.
Subject:	Computer science. -
Electronic resource:	click for full text (PQDT)
ISBN:	9798380413916


Thesis (Ph.D.)--Princeton University, 2023.

Includes bibliographical references

By simplifying complex real-world phenomena, probabilistic methods have proven able to accelerate applications in discovery and design. However, classical theory often evaluates models under the assumption that they are perfect representations of the observed data. There remains the danger that these simplifications might sometimes lead to failure under real-world conditions. This dissertation identifies popular data analyses that can yield unreliable conclusions, in some cases arbitrarily unreliable ones, under such "misspecification." But we also show how to practically navigate misspecification. To begin, we consider clustering, a mainstay of modern unsupervised data analysis, using Bayesian finite mixture models. Some scientists are interested not only in finding meaningful groups of data but also in learning the number of such clusters. We provide novel theoretical results and empirical studies showing that, no matter how small the misspecification, some common approaches, including Bayesian robustness procedures, for learning the number of clusters give increasingly wrong answers as one receives more data. But using imperfect models need not be hopeless. For instance, we consider a popular Bayesian modeling framework for graphs based on the assumption of vertex exchangeability. A consequence of this assumption is that the resulting graph models generate dense graphs with probability 1 and are therefore misspecified for sparse graphs, a common property of many real-world graphs. To address this undesirable scaling behavior, we introduce an alternative generative modeling framework and prove that it generates a range of sparse and dense scaling behaviors; we also show empirically that it can generate graphs with sparse power law scaling behavior.
Finally, we consider the case where a researcher has access to a sequence of approximate models that become arbitrarily more complex at the cost of more computation, which is common in applications with simulators of physical dynamics or models requiring numerical approximations of some fidelity. In this case, we show how to obtain estimates as though one had access to the most complex model. In particular, we propose a framework for constructing Markov chain Monte Carlo algorithms that asymptotically simulate from the most complex model while only ever evaluating models from the sequence of approximate models.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2024

Mode of access: World Wide Web

ISBN: 9798380413916

Subjects--Topical Terms: Computer science. (573171)

Subjects--Index Terms: Bayesian inference

Index Terms--Genre/Form: Electronic books. (554714)

LDR:03804ntm a22003977 4500
001 1145292
005 20240618081810.5
006 m o d
007 cr mn ---uuuuu
008 250605s2023 xx obm 000 0 eng d
020 $a 9798380413916
035 $a (MiAaPQ)AAI30572387
035 $a AAI30572387
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Cai, Diana. $3 1470558
245 1 0 $a Probabilistic Inference When the Model Is Wrong.
264 0 $c 2023
300 $a 1 online resource (181 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 85-03, Section: B.
500 $a Advisor: Adams, Ryan P.;Engelhardt, Barbara E.
502 $a Thesis (Ph.D.)--Princeton University, 2023.
504 $a Includes bibliographical references
520 $a By simplifying complex real-world phenomena, probabilistic methods have proven able to accelerate applications in discovery and design. However, classical theory often evaluates models under the assumption that they are perfect representations of the observed data. There remains the danger that these simplifications might sometimes lead to failure under real-world conditions. This dissertation identifies popular data analyses that can yield unreliable conclusions, in some cases arbitrarily unreliable ones, under such "misspecification." But we also show how to practically navigate misspecification. To begin, we consider clustering, a mainstay of modern unsupervised data analysis, using Bayesian finite mixture models. Some scientists are interested not only in finding meaningful groups of data but also in learning the number of such clusters. We provide novel theoretical results and empirical studies showing that, no matter how small the misspecification, some common approaches, including Bayesian robustness procedures, for learning the number of clusters give increasingly wrong answers as one receives more data. But using imperfect models need not be hopeless. For instance, we consider a popular Bayesian modeling framework for graphs based on the assumption of vertex exchangeability. A consequence of this assumption is that the resulting graph models generate dense graphs with probability 1 and are therefore misspecified for sparse graphs, a common property of many real-world graphs. To address this undesirable scaling behavior, we introduce an alternative generative modeling framework and prove that it generates a range of sparse and dense scaling behaviors; we also show empirically that it can generate graphs with sparse power law scaling behavior.
Finally, we consider the case where a researcher has access to a sequence of approximate models that become arbitrarily more complex at the cost of more computation, which is common in applications with simulators of physical dynamics or models requiring numerical approximations of some fidelity. In this case, we show how to obtain estimates as though one had access to the most complex model. In particular, we propose a framework for constructing Markov chain Monte Carlo algorithms that asymptotically simulate from the most complex model while only ever evaluating models from the sequence of approximate models.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
650 4 $a Statistics. $3 556824
650 4 $a Applied mathematics. $3 1069907
653 $a Bayesian inference
653 $a Finite mixture models
653 $a Markov chain Monte Carlo
653 $a Model misspecification
653 $a Probabilistic modeling
655 7 $a Electronic books. $2 local $3 554714
690 $a 0984
690 $a 0463
690 $a 0364
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Princeton University. $b Computer Science. $3 1179801
773 0 $t Dissertations Abstracts International $g 85-03B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30572387 $z click for full text (PQDT)